Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
The get operations change the data pointed to by the structure, but do
not update the contents of the struct.
Mark the struct mac arguments of mac_[gs]etsockopt_*label() and
mac_check_structmac_consistent() const to prevent this from changing
in the future.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14488
The mac_ipacl policy module enables fine-grained control over IP address
configuration within VNET jails from the base system.
It allows the root user to define rules governing IP addresses for
jails and their interfaces using the sysctl interface.
Requested by: multiple
Sponsored by: Google, Inc. (GSoC 2019)
MFC after: 2 months
Reviewed by: bz, dch (both earlier versions)
Differential Revision: https://reviews.freebsd.org/D20967
Ensure MAC modules are inserted in order that they are registered.
Reviewed by: markj
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39589
Currently, sysctls which enable KDB in some way are flagged with
CTLFLAG_SECURE, meaning that you can't modify them if securelevel > 0.
This is so that KDB cannot be used to lower a running system's
securelevel, see commit 3d7618d8bf. However, the newer mac_ddb(4)
restricts DDB operations which could be abused to lower securelevel
while retaining some ability to gather useful debugging information.
To enable the use of KDB (specifically, DDB) on systems with a raised
securelevel, change the KDB sysctl policy: rather than relying on
CTLFLAG_SECURE, add a check of the current securelevel to kdb_trap().
If the securelevel is raised, only pass control to the backend if MAC
specifically grants access; otherwise simply check to see if mac_ddb
vetoes the request, as before.
Add a new secure sysctl, debug.kdb.enter_securelevel, to override this
behaviour. That is, the sysctl lets one enter a KDB backend even with a
raised securelevel, so long as it is set before the securelevel is
raised.
Reviewed by: mhorne, stevek
MFC after: 1 month
Sponsored by: Juniper Networks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37122
Summary:
Port the MAC modules to use the IfAPI APIs as part of this.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38197
2449b9e5fe introduced API changes
that require ensuring that loadable MAC modules use the matching API.
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
o Assert that every protosw has pr_attach. Now this structure is
only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw. This was suggested
in 1996 by wollman@ (see 7b187005d1), and later reiterated
in 2006 by rwatson@ (see 6fbb9cf860).
o Make struct domain hold a variable sized array of protosw pointers.
For most protocols these pointers are initialized statically.
Those domains that may have loadable protocols have spacers. IPv4
and IPv6 have 8 spacers each (andre@ dff3237ee5).
o For inetsw and inet6sw leave a comment noting that many protosw
entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36232
Add three simple hooks to the debugger allowing for a loaded MAC policy
to intervene if desired:
1. Before invoking the kdb backend
2. Before ddb command registration
3. Before ddb command execution
We extend struct db_command with a private pointer and two flag bits
reserved for policy use.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35370
When packet is a SYN packet, we don't need to modify any existing PCB.
Normally SYN arrives on a listening socket, we either create a syncache
entry or generate syncookie, but we don't modify anything with the
listening socket or associated PCB. Thus create a new PCB lookup
mode - rlock if listening. This removes the primary contention point
under SYN flood - the listening socket PCB.
Sidenote: when SYN arrives on a synchronized connection, we still
don't need write access to PCB to send a challenge ACK or just to
drop. There is only one exclusion - tcptw recycling. However,
existing entanglement of tcp_input + stacks doesn't allow to make
this change small. Consider this patch as first approach to the problem.
Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D29576
place -- in the VOP rather than vn_setexttr() -- and that it is for historic
reasons. We might wish to relocate it in due course, but this way at least
we document the asymmetry.
pipes get stated all thet time and this avoidably contributed to contention.
The pipe lock is only held to accomodate MAC and to check the type.
Since normally there is no probe for pipe stat depessimize this by having the
flag.
The pipe_state field gets modified with locks held all the time and it's not
feasible to convert them to use atomic store. Move the type flag away to a
separate variable as a simple cleanup and to provide stable field to read.
Use short for both fields to avoid growing the struct.
While here short-circuit MAC for pipe_poll as well.
I have such a custom kernel configuration and its build failed with:
linking kernel.full
ld: error: undefined symbol: mac_vnode_assert_locked
>>> referenced by mac_framework.h:556 (/usr/devel/git/apu2c4/sys/security/mac/mac_framework.h:556)
>>> tmpfs_vnops.o:(mac_vnode_check_stat)
>>> referenced by mac_framework.h:556 (/usr/devel/git/apu2c4/sys/security/mac/mac_framework.h:556)
>>> vfs_default.o:(mac_vnode_check_stat)
>>> referenced by mac_framework.h:556 (/usr/devel/git/apu2c4/sys/security/mac/mac_framework.h:556)
>>> ufs_vnops.o:(mac_vnode_check_stat)
The code would unconditionally lock the vnode to audit or call the
mac hoook, even if neither want to do anything. Pre-check the state
to avoid locking in the common case of nothing to do.
Note this code should not be normally executed anyway as vnodes are
always return ready. However, poll1/2 from will-it-scale use regular
files for benchmarking, presumably to focus on the interface itself
as the vnode handler is not supposed to do almost anything.
This in particular fixes poll2 which passes 128 fds.
$ ./poll2_processes -s 10
before: 134411
after: 271572
This lock was made unnecessary by the addition of mac_policy_rms in r356120.
Reviewed by: mjg, kib
Differential Revision: https://reviews.freebsd.org/D24283
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
All checking routines walk a linked list of all modules in order to determine
if given hook is installed. This became a significant problem after mac_ntpd
started being loaded by default.
Implement a way perform checks for select hooks by testing a boolean.
Use it for priv_check and priv_grant, which are constantly called from priv_check.
The real fix would use hotpatching, but the above provides a way to know when
to do it.
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427
If any non-static modules are loaded (and mac_ntpd tends to be), the lock is
taken all the time al over the kernel. On platforms like arm64 this results in
an avoidable significant performance degradation. Since write-locking is almost
never needed, use a primitive optimized towards read-locking.
Sample result of building the kernel on tmpfs 11 times:
stock 11142.80s user 6704.44s system 4924% cpu 6:02.42 total
patched 11118.95s user 2374.94s system 4547% cpu 4:56.71 total
entry, when that entry has been seen already, keep the
already-looked-up value in a variable and use that instead of looking
it up again.
Approved by: alc, markj (earlier version), kib (earlier version)
Differential Revision: https://reviews.freebsd.org/D22348
around entry->{next,prev} when those are used for ordered list
traversal, and use those wrapper functions everywhere. Where the next
field is used for maintaining a stack of deferred operations, #define
defer_next to make that different usage clearer, and then use the
'right' pointer instead of 'next' for that purpose.
Approved by: markj
Tested by: pho (as part of a larger patch)
Differential Revision: https://reviews.freebsd.org/D22347
In case the implementation ever changes from using a chain of next pointers,
then changing the macro definition will be necessary, but changing all the
files that iterate over vm_map entries will not.
Drop a counter in vm_object.c that would have an effect only if the
vm_map entry count was wrong.
Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision: https://reviews.freebsd.org/D21882
neighbors, and is used in a way so that if entries a and b cannot be
merged, we consider them twice, first not-merging a with its successor
b, and then not-merging b with its predecessor a. This change replaces
vm_map_simplify_entry with vm_map_try_merge_entries, which compares
two adjacent entries only, and uses it to avoid duplicated
merge-checks.
Tested by: pho
Reviewed by: alc
Approved by: markj (implicit)
Differential Revision: https://reviews.freebsd.org/D20814
lock mac_ifnet_mtx, which protects labels on struct ifnet, unless at least
one policy is actively using labels on ifnets. This avoids a global mutex
acquire in certain fast paths -- most noticeably ifnet transmit. This was
previously invisible by default, as no MAC policies were loaded by default,
but recently became visible due to mac_ntpd being enabled by default.
gallatin@ reports a reduction in PPS overhead from 300% to 2.2% with this
change. We will want to explore further MAC Framework optimisation to
reduce overhead further, but this brings things more back into the world
of the sane.
MFC after: 3 days
Prior to the change the code would branch on return value and then check
if probes are enabled. Since vast majority of the time they are not, this
is clearly wasteful. Check probes first.
Sponsored by: The FreeBSD Foundation
The buffer size may be used to initialize an sbuf in
MAC_POLICY_EXTERNALIZE, and without this constraint it's possible to
trigger an assertion failure in the sbuf code. With INVARIANTS
disabled, the first attempt to write to the sbuf will fail.
Reported by: pho
Reviewed by: delphij
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D16527
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size). This is believed to be sufficent to
fully support ifconfig on 32-bit systems.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.
Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385