and non-P_SUGID cases, simplify p_cansignal() logic so that the
P_SUGID masking of possible signals is independent from uid checks,
removing redundant code and generally improving readability.
Reviewed by: tmm
Obtained from: TrustedBSD Project
the ability of unprivileged processes to deliver arbitrary signals
to daemons temporarily taking on unprivileged effective credentials
when P_SUGID is not set on the target process:
Removed:
(p1->p_cred->cr_ruid != ps->p_cred->cr_uid)
(p1->p_ucred->cr_uid != ps->p_cred->cr_uid)
o Replace two "allow this" exceptions in p_cansignal() restricting
the ability of unprivileged processes to deliver arbitrary signals
to daemons temporarily taking on unprivileged effective credentials
when P_SUGID is set on the target process:
Replaced:
(p1->p_cred->p_ruid != p2->p_ucred->cr_uid)
(p1->p_cred->cr_uid != p2->p_ucred->cr_uid)
With:
(p1->p_cred->p_ruid != p2->p_ucred->p_svuid)
(p1->p_ucred->cr_uid != p2->p_ucred->p_svuid)
o These changes have the effect of making the uid-based handling of
both P_SUGID and non-P_SUGID signal delivery consistent, following
these four general cases:
p1's ruid equals p2's ruid
p1's euid equals p2's ruid
p1's ruid equals p2's svuid
p1's euid equals p2's svuid
The P_SUGID and non-P_SUGID cases can now be largely collapsed,
and I'll commit this in a few days if no immediate problems are
encountered with this set of changes.
o These changes remove a number of warning cases identified by the
proc_to_proc inter-process authorization regression test.
o As these are new restrictions, we'll have to watch out carefully for
possible side effects on running code: they seem reasonable to me,
but it's possible this change might have to be backed out if problems
are experienced.
Submitted by: src/tools/regression/security/proc_to_proc/testuid
Reviewed by: tmm
Obtained from: TrustedBSD Project
ability of unprivileged processes to modify the scheduling properties
of daemons temporarily taking on unprivileged effective credentials.
These cases (p1->p_cred->p_ruid == p2->p_ucred->cr_uid) and
(p1->p_ucred->cr_uid == p2->p_ucred->cr_uid), respectively permitting
a subject process to influence the scheduling of a daemon if the subject
process has the same real uid or effective uid as the daemon's effective
uid. This removes a number of the warning cases identified by the
proc_to_proc iner-process authorization regression test.
o As these are new restrictions, we'll have to watch out carefully for
possible side effects on running code: they seem reasonable to me,
but it's possible this change might have to be backed out if problems
are experienced.
Reported by: src/tools/regression/security/proc_to_proc/testuid
Obtained from: TrustedBSD Project
by p_can(...P_CAN_SEE), rather than returning EACCES directly. This
brings the error code used here into line with similar arrangements
elsewhere, and prevents the leakage of pid usage information.
Reviewed by: jlemon
Obtained from: TrustedBSD Project
p_can(...P_CAN_SEE...) to getpgid(), getsid(), and setpgid(),
blocking these operations on processes that should not be visible
by the requesting process. Required to reduce information leakage
in MAC environments.
Obtained from: TrustedBSD Project
from signal authorization checking.
o p_cansignal() takes three arguments: subject process, object process,
and signal number, unlike p_cankill(), which only took into account
the processes and not the signal number, improving the abstraction
such that CANSIGNAL() from kern_sig.c can now also be eliminated;
previously CANSIGNAL() special-cased the handling of SIGCONT based
on process session. privused is now deprecated.
o The new p_cansignal() further limits the set of signals that may
be delivered to processes with P_SUGID set, and restructures the
access control check to allow it to be extended more easily.
o These changes take into account work done by the OpenBSD Project,
as well as by Robert Watson and Thomas Moestl on the TrustedBSD
Project.
Obtained from: TrustedBSD Project
toggle the P_SUGID bit explicitly, rather than relying on it being
set implicitly by other protection and credential logic. This feature
is introduced to support inter-process authorization regression testing
by simplifying userland credential management allowing the easy
isolation and reproduction of authorization events with specific
security contexts. This feature is enabled only by "options REGRESSION"
and is not intended to be used by applications. While the feature is
not known to introduce security vulnerabilities, it does allow
processes to enter previously inaccessible parts of the credential
state machine, and is therefore disabled by default. It may not
constitute a risk, and therefore in the future pending further analysis
(and appropriate need) may become a published interface.
Obtained from: TrustedBSD Project
enable easy access to the hash chain stats. The raw prefixed versions
dump an integer array to userland with the chain lengths. This cheats
and calls it an array of 'struct int' rather than 'int' or sysctl -a
faithfully dumps out the 128K array on an average machine. The non-raw
versions return 4 integers: count, number of chains used, maximum chain
length, and percentage utilization (fixed point, multiplied by 100).
The raw forms are more useful for analyzing the hash distribution, while
the other form can be read easily by humans and stats loggers.
because:
- it used a better namespace (smp_ipi_* rather than *_ipi),
- it used better constant names for the IPI's (IPI_* rather than
X*_OFFSET), and
- this API also somewhat exists for both alpha and ia64 already.
count drops to 0 in witness_destroy, set the w_name and w_file pointers
to point to the string "(dead)" and the w_line field to 0. This way,
if a mutex of a given name is used only in a module, then as long as
all mutexes in the module are destroyed when the module is unloaded,
witness will not maintain stale references to the mutex's name in the
module's data section causing a panic later on when the w_name or w_file
field's are examined.
list into a public witness_list_locks() function. Call this function
twice in witness_list() instead of using an evil goto.
- Adjust the 'show locks' command to take an optional parameter which
specifies the pid of a process to list the locks of. By default the
locks held by the current process are displayed.
The mbuf and mcluster free lists now each "own" a condition variable,
m_starved.
- Clean up minor indentention issues in sys/mbuf.h caused by previous
commit.
Don't use atomic operations for the stats updating, instead protect
the counts with the mbuf mutex. Most twiddling of the stats was
done right before or after releasing a mutex. By doing this we
reduce the number of locked ops needed as well as allow a sysctl
to gain a consitant view of the entire stats structure.
In the future...
This will allow us to chain common mbuf operations that would
normally need to aquire/release 2 or 3 of the locks to build an
mbuf with a cluster or external data attached into a single op
requiring only one lock.
Simplify the per-cpu locks that are planned.
There's also some if (1) code that should check if the "how"
operation specifies blocking/non-blocking behavior, we _could_ make
it so that we hold onto the mutex through calls into kmem_alloc
when non-blocking requests are made, but for safety reasons we
currently drop and reaquire the mutex around the calls.
Also, note that calling kmem_alloc is rare and only happens during
a shortage so drop/re-getting the mutex will not be a common
occurance.
Remove some #define's that seemed to obfuscate the code to me.
Remove an extranious comment.
Remove an XXX, including mutex.h isn't a crime.
Reviewed by: bmilekic
avoid silly lock contention on sched_lock since in 2 out of the 3 places
that we call stop(), we get sched_lock right after calling it and we were
locking sched_lock inside of stop() anyways.
SIGCHLD to our parent process. Otherwise, we could block while obtaining
the process lock for our parent process and switch out while we were
in SSTOP. Even worse, when we try to resume from the mutex being blocked
on our p_stat will be SRUN, not SSTOP.
- Fix a comment above stop() to indicate that it requires that the proc lock
be held, not a proctree lock.
Reported by: markm
Sleuthing by: jake
operations on file descriptors, which complement the existing set of
calls, extattr_{delete,get,set}_file() which act on paths. In doing
so, restructure the system call implementation such that the two sets
of functions share most of the relevant code, rather than duplicating
it. This pushes the vnode locking into the shared code, but keeps
the copying in of some arguments in the system call code. Allowing
access via file descriptors reduces the opportunity for race
conditions when managing extended attributes.
Obtained from: TrustedBSD Project
ps_showallprocs such that if superuser is present to override process
hiding, the search falls through [to success]. When additional
restrictions are placed on process visibility, such as MAC, new clauses
will be placed above the return(0).
Obtained from: TrustedBSD Project
two subject ucreds. Unlike p_cansee(), u_cansee() doesn't have
process lock requirements, only valid ucred reference requirements,
so is prefered as process locking improves. For now, back p_cansee()
into u_cansee(), but eventually p_cansee() will go away.
Reviewed by: jhb, tmm
Obtained from: TrustedBSD Project
locks were held, we could be preempted and switch CPU's in between the time
that we set a variable to the list of spin locks on our CPU and the time
that we checked that variable to ensure no spinlocks were held while
grabbing a sleep lock. Losing the race resulted in checking some other
CPU's spin lock list and bogusly panicing.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.
and change the u_int mtx_saveintr member of struct mtx to a critical_t
mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules. Change all the
_mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
parameters and rename them to use a prefix of two underscores. Inside
of kern_mutex.c, generate wrapper functions for
_mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
that are called from modules. The macros mtx_{un,}lock_{spin,}_flags()
are mapped to the __mtx_* macros inside of the kernel to inline the
usual case of mutex operations and map to the internal _mtx_* functions
in the module case so that modules will use WITNESS and KTR logging if
the kernel is compiled with support for it.
process we're looking for. (I don't think this can currently
happen, but it depends how the function is called).
PR: 25932
Submitted by: David Xu <davidx@viasoft.com.cn>