mirror of https://git.FreeBSD.org/src.git synced 2024-12-20 11:11:24 +00:00
3942 Commits

Author SHA1 Message Date
Ruslan Ermilov
4589be70fe Unbreak setregid(2).
Spotted by:	Alexander Leidinger <Alexander@Leidinger.net>
2001-06-06 13:58:03 +00:00
John Baldwin
262c9f8a3b Don't hold sched_lock across addupc_task().
Reported by:	David Taylor <davidt@yadt.co.uk>
Submitted by:	bde
2001-06-06 00:57:24 +00:00
Dima Dorfman
ddf5b79683 Add a line discipline close routine which restores some functionality
I accidentally nuked in rev. 1.54.  Also rework the error handling in
snplwrite a little.
2001-06-05 05:07:53 +00:00
Dima Dorfman
f09f49f136 Style and cosmetic cleanups. This driver is now reasonably style(9)
compliant.  All the variable definitions and function names are
reasonably consistent, and the functions which should be static (i.e.,
all of them) are.  Other assorted fixes were made.  The majority of
the delta is indentation fixes.

Partially reviewed by:	bde
2001-06-05 05:00:17 +00:00
Dima Dorfman
7fd72392d9 Use the l_nullioctl exported from tty_conf.c rather than rolling our own. 2001-06-04 23:31:21 +00:00
Dima Dorfman
22cf0fb34d Unstaticize l_nullioctl; it is needed elsewhere (like in tty_snoop.c).
Suggested by:	bde
2001-06-04 23:30:47 +00:00
Matthew Dillon
1b3e974a71 The pipe_write() code was locking the pipe without busying it first in
certain cases, and a close() by another process could potentially rip the
pipe out from under the (blocked) locking operation.

Reported-by: Alexander Viro <viro@math.psu.edu>
2001-06-04 04:04:45 +00:00
Dima Dorfman
87826386e0 Remove unused includes, use *min() inline functions rather than a
home-grown macro, rewrite a confusing conditional in snpdevtotty(),
and change ibuf to 512 bytes instead of 1024 bytes in dsnwrite().

Reviewed by:	bde
2001-06-03 05:17:39 +00:00
Dima Dorfman
b8edb44cc3 When trying to find out if this is a request for a write in
kernel_sysctl and userland_sysctl, check for whether new is NULL, not
whether newlen is 0.  This allows one to set a string sysctl to "".
2001-06-03 04:58:51 +00:00
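The distinction drawn in b8edb44cc3 can be sketched in userland C. This is a minimal illustration, not the kernel code: `struct sysctl_req_sketch` and both helper functions are hypothetical names, assumed here only to contrast the two tests.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a sysctl request: "newp" points at the value
 * the caller wants to store (NULL for a pure read) and "newlen" is the
 * length of that value in bytes.
 */
struct sysctl_req_sketch {
	const void *newp;
	size_t newlen;
};

/* Old-style test: a zero-length new value is mistaken for "not a write". */
static int
is_write_by_len(const struct sysctl_req_sketch *req)
{
	return (req->newlen != 0);
}

/* New-style test: any non-NULL new pointer is a write, even with length 0. */
static int
is_write_by_ptr(const struct sysctl_req_sketch *req)
{
	return (req->newp != NULL);
}
```

With a request of `{ "", 0 }` the length-based test reports a read while the pointer-based test reports a write, which is what allows a string sysctl to be set to the empty string.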
Dima Dorfman
c0b824f97d Include sys/mutex.h to silence a warning. 2001-06-03 02:19:07 +00:00
Jesper Skriver
5b86eac4e5 Revert the last bits of my bogus move of NMBCLUSTERS
to <sys/param.h>
2001-06-01 21:47:34 +00:00
Thomas Moestl
d279178df7 Clean up the code exporting interrupt statistics via sysctl a bit:
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine
  the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt
  from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs

Requested by:	bde
Reviewed by:	bde (some time ago)
2001-06-01 13:23:28 +00:00
Ruslan Ermilov
0b381bf1fd Remove vestiges of MFS. 2001-06-01 10:07:28 +00:00
David E. O'Brien
240ef84277 Back out jesper's 2001/05/31 14:58:11 PDT commit. It does not compile. 2001-06-01 09:51:14 +00:00
Jesper Skriver
e916d96e64 Move the definition of NMBCLUSTERS from src/sys/kern/uipc_mbuf.c
to <sys/param.h>, so it's available to src/sys/netinet/ip_input.c,
and remove the now unneeded includes of "opt_param.h".

MFC after:	1 week
2001-05-31 21:56:44 +00:00
Dima Dorfman
a723c4e173 Export via sysctl:
* all members of msginfo from sysv_msg.c;
* msqids from sysv_msg.c;
* sema from sysv_sem.c; and
* shmsegs from sysv_shm.c;

These will be used by ipcs(1) in non-kvm mode.

Reviewed by:	tmm
2001-05-30 03:28:59 +00:00
Poul-Henning Kamp
22628ccf96 Remove the hack-around for the slice/label code, it didn't
cover the hole.
2001-05-29 18:19:57 +00:00
Ian Dowse
5f558fa42f Since the netexport struct was centralised to 'struct mount',
attempting to remove nonexistent exports with MNT_DELEXPORT returns
an error; before this change it always succeeded. This caused
mountd(8) to log "can't delete exports for /whatever" warnings.

Change the error code from EINVAL to a more specific ENOENT, and
make mountd ignore this error when deleting the export list. I
could have just restored the previous behaviour of returning success,
but I think an error return is a useful diagnostic.

Reviewed by:	phk
2001-05-29 17:46:52 +00:00
Poul-Henning Kamp
b63436919d Remove a comment which was past its shelf life.
PR:		18750
Submitted by:	Tony Finch <dot@dotat.at>
2001-05-29 09:22:22 +00:00
Poul-Henning Kamp
c01a009dc5 With the new kernel dev_t conversions done at release 4.X,
it becomes possible to trap in ptsstop() in kern/tty_pty.c
     if the slave side has never been opened during the life of a kernel.

     What happens is that calls to ttyflush() done from ptyioctl() for the
     controlling side end up calling ptsstop() [via (*tp->t_stop)(tp, <X>)]
     which evaluates the following:

	     struct pt_ioctl *pti = tp->t_dev->si_drv1;

     In order for tp->t_dev to be set, the slave device must first be
     opened in ttyopen() [kern/tty.c].

     It appears that the only problem is calls to (*tp->t_stop)(tp, <n>),
     so this could also happen with other ioctls initiated by the
     controlling side before the slave has been opened.

PR:		27698
Submitted by:	David Bein bein@netapp.com
MFC after:	6 days
2001-05-28 20:22:12 +00:00
Poul-Henning Kamp
507fbee0ad The disklabel/slice code is more twisted than I thought. Revert to
calling the cdevsw_add() unconditionally.
2001-05-28 16:12:55 +00:00
Brian Somers
04bd20e31d Handle NULL struct device *s 2001-05-28 01:00:03 +00:00
Robert Watson
823c224e95 o uifree() the cr_ruidinfo in crfree() as well as cr_uidinfo now that the real uid
info is in the credential also.

Submitted by:	egge
2001-05-27 21:43:46 +00:00
Robert Watson
7cb8e4d277 o pcred-removal changes included modifications to optimize the setting of
the saved uid and gid during execve().  Unfortunately, the optimizations
  were incorrect in the case where the credential was updated, skipping
  the setting of the saved uid and gid when new credentials were generated.
  This change corrects that problem by handling the newcred!=NULL case
  correctly.

Reported/tested by:	David Malone <dwmalone@maths.tcd.ie>

Obtained from:	TrustedBSD Project
2001-05-26 19:59:44 +00:00
Poul-Henning Kamp
3344c5a17e Create a general facility for making dev_t's depend on another
dev_t.  The dev_depends(dev_t, dev_t) function is for tying them
to each other.

When destroy_dev() is called on a dev_t, all dev_t's depending
on it will also be destroyed (depth first order).

Rewrite the make_dev_alias() to use this dependency facility.

kern/subr_disk.c:
Make the disk mini-layer use dependencies to make sure all
relevant dev_t's are removed when the disk disappears.

Make the disk mini-layer precreate some magic sub devices
which the disk/slice/label code expects to be there.

kern/subr_disklabel.c:
Remove some now unneeded variables.

kern/subr_diskmbr.c:
Remove some ancient, commented out code.

kern/subr_diskslice.c:
Minor cleanup.  Use name from dev_t instead of dsname()
2001-05-26 08:27:58 +00:00
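The depth-first teardown described in 3344c5a17e can be sketched as a small userland tree. This is an illustration under stated assumptions only: `struct devnode`, `node_create()`, `node_depends()` and `node_destroy()` are hypothetical names and do not reproduce the kernel's `dev_t`, `dev_depends()` or `destroy_dev()` implementations.

```c
#include <stdlib.h>

/* Hypothetical device node; not the kernel's dev_t. */
struct devnode {
	struct devnode *parent;		/* node this one depends on */
	struct devnode *children;	/* first dependent node */
	struct devnode *sibling;	/* next dependent of the same parent */
	int id;
};

static struct devnode *
node_create(int id)
{
	struct devnode *n = calloc(1, sizeof(*n));

	if (n != NULL)
		n->id = id;
	return (n);
}

/* Record that "child" depends on "parent". */
static void
node_depends(struct devnode *parent, struct devnode *child)
{
	child->parent = parent;
	child->sibling = parent->children;
	parent->children = child;
}

/*
 * Destroy all dependents first (depth first), then the node itself.
 * Returns the number of nodes freed.
 */
static int
node_destroy(struct devnode *n)
{
	int freed = 0;

	while (n->children != NULL) {
		struct devnode *c = n->children;

		n->children = c->sibling;
		c->parent = NULL;	/* already unlinked from our list */
		freed += node_destroy(c);
	}
	if (n->parent != NULL) {
		/* Unlink from the list of the node we depended on. */
		struct devnode **pp = &n->parent->children;

		while (*pp != n)
			pp = &(*pp)->sibling;
		*pp = n->sibling;
	}
	free(n);
	return (freed + 1);
}
```

Destroying a "disk" node in this sketch also frees every "slice" and "label" node hanging off it, which is the property the disk mini-layer relies on in the commit message.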
John Baldwin
9d127f9ffb Add vm locking to sendfile(2) and sf_buf_free().
Reported by:	Tamiji Homma <thomma@BayNetworks.com>
Tested by:	Tamiji Homma <thomma@BayNetworks.com>
2001-05-25 19:23:04 +00:00
Robert Watson
b1fc0ec1a7 o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  the original macro that pointed p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restructure many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparison.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00
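The newcred/oldcred arrangement quoted in b1fc0ec1a7 can be sketched in userland C. All names below (`struct cred_sketch`, `cred_dup()`, `cred_free()`, `change_euid_sketch()`, `proc_seteuid()`) are hypothetical stand-ins for illustration; the sketch ignores locking entirely, in line with the commit's own note that the pattern is not race-free.

```c
#include <stdlib.h>

/* Hypothetical reference-counted credential. */
struct cred_sketch {
	int ref;
	unsigned ruid, euid, svuid;
};

/* Hypothetical process that points at a (possibly shared) credential. */
struct proc_sketch {
	struct cred_sketch *ucred;
};

/* Return a private copy of "old" holding a single reference. */
static struct cred_sketch *
cred_dup(const struct cred_sketch *old)
{
	struct cred_sketch *n = malloc(sizeof(*n));

	if (n == NULL)
		return (NULL);
	*n = *old;
	n->ref = 1;
	return (n);
}

/* Drop one reference; free the credential when the last one goes away. */
static void
cred_free(struct cred_sketch *c)
{
	if (--c->ref == 0)
		free(c);
}

/* Helper that acts on a credential, not on a process. */
static void
change_euid_sketch(struct cred_sketch *c, unsigned euid)
{
	c->euid = euid;
}

/* Always duplicate, modify the copy, swap it in, release the old one. */
static int
proc_seteuid(struct proc_sketch *p, unsigned euid)
{
	struct cred_sketch *oldcred = p->ucred;
	struct cred_sketch *newcred = cred_dup(oldcred);

	if (newcred == NULL)
		return (-1);
	change_euid_sketch(newcred, euid);
	p->ucred = newcred;
	cred_free(oldcred);
	return (0);
}
```

Because the helper only ever touches the fresh copy, other holders of the old credential keep seeing the old ids, which is why the helpers can take a credential rather than a process.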
Poul-Henning Kamp
5696db457d Make the PTY drivers cloning algorithm create "CHEAPCLONE" dev_t,
so that some twit cannot allocate all 256 PTY's with "ls -l".
2001-05-25 13:23:42 +00:00
Poul-Henning Kamp
2613d3fec9 Use the name given to the dev_t, rather than creating our own.
This makes it possible to give sensible information for /dev/fd.720
and similar "special" devices.
2001-05-25 09:06:52 +00:00
Ruslan Ermilov
1166fb516b - sys/msdosfs moved to sys/fs/msdosfs
- msdos.ko renamed to msdosfs.ko
- /usr/include/msdosfs moved to /usr/include/fs/msdosfs
2001-05-25 08:14:14 +00:00
Poul-Henning Kamp
25e0288d07 Don't rely on cdevsw_add() when we hack about with dev_t's. 2001-05-24 20:28:06 +00:00
Poul-Henning Kamp
8576c652b4 Don't take the detour around devsw() to find out if the proto-cdevsw
is already initialized.
2001-05-24 20:27:16 +00:00
Alfred Perlstein
0cea693084 whitespace/style 2001-05-24 18:06:22 +00:00
Matthew Dillon
ac8f990bde This patch implements O_DIRECT about 80% of the way. It takes a patchset
Tor created a while ago, removes the raw I/O piece (that has cache coherency
problems), and adds a buffer cache / VM freeing piece.

Essentially this patch causes O_DIRECT I/O to not be left in the cache, but
does not prevent it from going through the cache, hence the 80%.  For
the last 20% we need a method by which the I/O can be issued directly to
buffer supplied by the user process and bypass the buffer cache entirely,
but still maintain cache coherency.

I also have the code working under -stable but the changes made to sys/file.h
may not be MFCable, so an MFC is not on the table yet.

Submitted by:	tegge, dillon
2001-05-24 07:22:27 +00:00
Dima Dorfman
028f979d1d Correct style bugs with regards to long lines and comments.
Reviewed by:	bde
2001-05-23 23:38:05 +00:00
John Baldwin
0dfefe6829 Don't acquire Giant just to call trap_fatal(), we are about to panic
anyway so we'd rather see the printf's than block if the system is
hosed.
2001-05-23 22:58:09 +00:00
John Baldwin
bdc60f5bd3 Don't release Giant around vm_object_page_clean() in fsync() as the pager
putpages called will need Giant.
2001-05-23 22:55:13 +00:00
John Baldwin
8aa66068ed - Always call bfreekva() w/o vm_mtx held.
- Always call vfs_setdirty() with vm_mtx held.
- Fix an old comment: vm_hold_unload_pages is called vm_hold_free_pages()
  nowadays.
- Always call vm_hold_free_pages() w/o vm_mtx held.
2001-05-23 22:24:49 +00:00
John Baldwin
1b2555b243 - Lock the VM when initializing the vmspace for proc0.
- Don't bother releasing Giant while doing a lookup on the vm_map of
  initproc while starting up init.  We have to grab it again right after
  the lookup anyways.
2001-05-23 22:06:47 +00:00
John Baldwin
613c83cbf1 Lock the VM while twiddling the vmspace. 2001-05-23 22:05:08 +00:00
Bosko Milekic
629db60492 Increment mbstat.m_mpfail, not mbstat.m_mcfail, when m_pullup() fails.
This slipped in accidentally a few commits back.
2001-05-23 20:44:54 +00:00
John Baldwin
5bd57bc8b7 Don't release the vm lock just to turn around and grab it again. 2001-05-23 19:51:12 +00:00
John Baldwin
b516d2f5e1 Add in assertions to ensure that we always call msleep or mawait with
either a timeout or a held mutex to detect unprotected infinite sleeps
that can easily lead to deadlock.

Submitted by:	alfred
2001-05-23 19:38:26 +00:00
Poul-Henning Kamp
4787f91d6b syslogd gets kernel log messages only once every 30 seconds or
at the top of the minute, whichever comes first.  It seems
logtimeout() is only called once after the kernel log is opened
and then never again after that.  So I guess syslogd only gets
kernel log messages by virtue of syncer(4)'s flushes ...?

PR:		27361
Submitted by:	pkern@utcc.utoronto.ca
MFC after:	1 week
2001-05-23 19:02:50 +00:00
Alfred Perlstein
53240603ee acquire vm_mutex a little bit earlier to protect a pmap call. 2001-05-23 10:26:36 +00:00
Ruslan Ermilov
99d300a1ec - FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file
systems were repo-copied from sys/miscfs to sys/fs.

- Renamed the following file systems and their modules:
  fdesc -> fdescfs, portal -> portalfs, union -> unionfs.

- Renamed corresponding kernel options:
  FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.

- Install header files for the above file systems.

- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
  Makefiles.
2001-05-23 09:42:29 +00:00
Dima Dorfman
0150c6e83d Unifdef DEV_SNP; snp(4) no longer requires these ugly hacks.
Silence by:	-hackers, -audit
2001-05-22 22:16:18 +00:00
Dima Dorfman
47eaa5f542 Convert this driver to (ab?)use line disciplines to get the input it
needs instead of relying on idiosyncratic hacks in the tty subsystem.
Also add module code since this can now be compiled as a module.

Silence by:	-hackers, -audit
2001-05-22 22:13:14 +00:00
Bruce Evans
1c1771cb5b Convert npx interrupts into traps instead of vice versa. This is much
simpler for npx exceptions that start as traps (no assembly required...)
and works better for npx exceptions that start as interrupts (there is
no longer a problem for nested interrupts).

Submitted by:	original (pre-SMPng) version by luoqi
2001-05-22 21:20:49 +00:00
Dima Dorfman
a8dbafbe87 Correct the vm_mtx handling; specifically, don't acquire it in
shm_deallocate_segment because shmexit_myhook calls it, and the latter
should always be called with it already held.

Submitted by:	dwmalone, dd
Approved by:	alfred
2001-05-22 03:56:26 +00:00
Alfred Perlstein
a4d22b8035 Remove KASSERT test for sleeping on vm_mtx, instead let WITNESS catch
it.

Requested by: jhb
2001-05-22 00:58:20 +00:00
John Baldwin
9dceb26b23 Sort includes. 2001-05-21 18:52:02 +00:00
John Baldwin
270b041d95 - Assert that the vm mutex is held in pipe_free_kmem().
- Don't release the vm mutex early in pipespace() but instead hold it
  across vm_object_deallocate() if vm_map_find() returns an error and
  across pipe_free_kmem() if vm_map_find() succeeds.
- Add a XXX above a zfree() since zalloc already has its own locking,
  one would hope that zfree() wouldn't need the vm lock.
2001-05-21 18:47:17 +00:00
John Baldwin
d8aad40c88 Axe unneeded spl()'s. 2001-05-21 18:30:50 +00:00
Alfred Perlstein
67d1f21cbe Acquire vm mutex when releasing sysv shm segments.
Obtained from: Dima Dorfman <dima@unixfreak.org>
2001-05-20 20:37:47 +00:00
Jonathan Lemon
1890520a77 Add convenience function kernel_sysctlbyname() for kernel consumers,
so they don't have to roll their own sysctlbyname function.
2001-05-19 05:45:55 +00:00
Alfred Perlstein
5ee5c3aa1f remove my private assertions from tsleep.
add one assertion to ensure we don't sleep while holding vm.
2001-05-19 01:40:48 +00:00
Alfred Perlstein
2c3c846931 Regen syscalls that were made mpsafe via vm_mtx
obreak, getpagesize, sbrk, sstk, mmap, ovadvise, munmap, mprotect,
madvise, mincore, mmap, mlock, munlock, minherit, msync, mlockall,
munlockall
2001-05-19 01:37:12 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
John Baldwin
1ad5401134 - Don't panic on a try lock operation for a sleep lock if we hold a spin
lock.  Since we won't actually block on a try lock operation, it's not
  a problem.  Add a comment explaining why it is safe to skip lock order
  checking with try locks.
- Remove the ithread list lock spin lock from the order list.
2001-05-17 22:44:56 +00:00
John Baldwin
4d29cb2db9 - Remove the global ithread_list_lock spin lock in favor of per-ithread
sleep locks.
- Delay returning from ithread_remove_handler() until we are certain that
  the interrupt handler being removed has in fact been removed from the
  ithread.
- XXX: There is still a problem in that nothing protects the kernel from
  adding a new handler while the ithread is running, though with our
  current architectures this is not a problem.

Requested by:	gibbs (2)
2001-05-17 22:43:26 +00:00
John Baldwin
7a08bae6ec - Move the setting of bootverbose to a MI SI_SUB_TUNABLES SYSINIT.
- Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be
  toggled after boot.
- Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just
  after the display of the copyright message instead of doing it by hand in
  three MD places.
2001-05-17 22:28:46 +00:00
Robert Watson
6bd1912df4 o Modify access control checks in p_candebug() such that the policy is as
follows: the effective uid of p1 (subject) must equal the real, saved,
  and effective uids of p2 (object), p2 must not have undergone a
  credential downgrade.  A subject with appropriate privilege may override
  these protections.

  In the future, we will extend these checks to require that p1 effective
  group membership must be a superset of p2 effective group membership.

Obtained from:	TrustedBSD Project
2001-05-17 21:48:44 +00:00
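The policy stated in 6bd1912df4 can be written out as a small predicate. This is a sketch under assumptions: `struct cred_ids`, `can_debug_sketch()` and the two flag parameters are hypothetical, and the function returns 1/0 rather than the errno-style result a kernel routine would use.

```c
/* Hypothetical subset of a credential: just the three uids. */
struct cred_ids {
	unsigned ruid;	/* real uid */
	unsigned euid;	/* effective uid */
	unsigned svuid;	/* saved uid */
};

/*
 * Return 1 if the subject may debug the object, 0 otherwise.
 * "obj_downgraded" is nonzero when the object has undergone a credential
 * downgrade; "subj_privileged" is nonzero when the subject holds
 * appropriate privilege and may override the protections.
 */
static int
can_debug_sketch(const struct cred_ids *subj, const struct cred_ids *obj,
    int obj_downgraded, int subj_privileged)
{
	if (subj_privileged)
		return (1);
	/* Subject's effective uid must match all three uids of the object. */
	if (subj->euid != obj->ruid || subj->euid != obj->svuid ||
	    subj->euid != obj->euid)
		return (0);
	if (obj_downgraded)
		return (0);
	return (1);
}
```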
Alfred Perlstein
0fd061c0c4 Cleanup
Remove comment about setting error for reads on EOF, read returns 0 on
EOF so the code should be ok.

Remove non-effective priority boost, PRIO+1 doesn't do anything
(according to McKusick), if a real priority boost is needed it should
have been +4.

Style fixes:
.) return foo -> return (foo)
.) FLAG1|FLAG2 -> FLAG1 | FLAG2
.) wrap long lines
.) unwrap short lines
.) for(i=0;i=foo;i++) -> for (i = 0; i=foo; i++)
.) remove braces for some conditionals with a single statement
.) fix continuation lines.

md5 couldn't verify the binary because some code had to
be shuffled around to address the style issues.
2001-05-17 19:47:09 +00:00
Alfred Perlstein
2deb4a20c3 initialize pipe pointers 2001-05-17 18:22:58 +00:00
Alfred Perlstein
82a283fcf3 pipe_create has to zero out the select record earlier to avoid
returning a half-initialized pipe and causing pipeclose() to follow
a junk pointer.

Discovered by: "Nick S" <snicko@noid.org>
2001-05-17 17:59:28 +00:00
Ian Dowse
0864ef1e8a Change the second argument of vflush() to an integer that specifies
the number of references on the filesystem root vnode to be both
expected and released. Many filesystems hold an extra reference on
the filesystem root vnode, which must be accounted for when
determining if the filesystem is busy and then released if it isn't
busy. The old `skipvp' approach required individual filesystem
xxx_unmount functions to re-implement much of vflush()'s logic to
deal with the root vnode.

All 9 filesystems that hold an extra reference on the root vnode
got the logic wrong in the case of forced unmounts, so `umount -f'
would always fail if there were any extra root vnode references.
Fix this issue centrally in vflush(), now that we can.

This commit also fixes a vnode reference leak in devfs, which could
result in idle devfs filesystems that refuse to unmount.

Reviewed by:	phk, bp
2001-05-16 18:04:37 +00:00
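The root-vnode accounting that 0864ef1e8a centralizes can be reduced to a few lines. This is a deliberately simplified sketch, not vflush() itself: `struct vnode_sketch` and `flush_root_sketch()` are hypothetical, and only the reference-count decision is modelled (no vnode list walk, no locking).

```c
#include <errno.h>

/* Hypothetical root vnode: only its reference count matters here. */
struct vnode_sketch {
	int usecount;
};

/*
 * "rootrefs" is the number of references the filesystem itself holds on
 * its root vnode.  Without force, any reference beyond those means the
 * filesystem is busy.  With force, the extra references are ignored.
 * On success the filesystem's own references are released.
 */
static int
flush_root_sketch(struct vnode_sketch *root, int rootrefs, int forced)
{
	if (!forced && root->usecount > rootrefs)
		return (EBUSY);
	root->usecount -= rootrefs;
	return (0);
}
```

The point of passing a count instead of a vnode to skip is that the busy test and the release happen in one place, so a forced unmount no longer depends on each filesystem getting the same logic right.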
Alfred Perlstein
a428c5ffef remove include of ipl.h because it no longer exists 2001-05-16 02:52:06 +00:00
John Baldwin
8bd57f8fc2 Remove unneeded includes of sys/ipl.h and machine/ipl.h. 2001-05-15 23:22:29 +00:00
John Baldwin
74fc745594 - Remove unneeded include of sys/ipl.h.
- Lock the process before calling killproc() to kill it for exceeding the
  maximum CPU limit.
2001-05-15 23:15:06 +00:00
John Baldwin
9081e5e826 - Remove unneeded include of sys/ipl.h.
- Require the proc lock be held for killproc() to allow for the vmdaemon to
  kill a process when memory is exhausted while holding the lock of the
  process to kill.
2001-05-15 23:13:58 +00:00
Brian Somers
eeee064735 Support /dev/ctty again
Submitted by:	peter
2001-05-15 18:12:38 +00:00
Seigo Tanimura
1b36970495 Back out scanning file descriptors with holding a process lock.
selrecord() requires allproc sx in pfind(), resulting in lock order
reversal between allproc and a process lock.
2001-05-15 10:19:57 +00:00
Jonathan Lemon
97f6754ff1 When calling poll() on a fd associated with a filesystem, let POLLIN/POLLOUT
behave identically to POLLRDNORM/POLLWRNORM.

Submitted by: bde
PR: 27287
merge after: 1 week
2001-05-14 14:37:25 +00:00
Poul-Henning Kamp
241e77c8a5 Use the new ability to avoid practically all the gunk in this file.
When people access /dev/tty, locate their controlling tty and return
the dev_t of it to them.  This basically makes /dev/tty act like
a variant symlink sort of thing which is much simpler than all the
mucking about with vnodes.
2001-05-14 08:22:56 +00:00
Seigo Tanimura
265fc98f36 - Convert msleep(9) in select(2) and poll(2) to cv_*wait*(9).
- Since polling should not involve sleeping, keep holding a
  process lock upon scanning file descriptors.

- Hold a reference to every file descriptor prior to entering
  polling loop in order to avoid lock order reversal between
  lockmgr and p_mtx upon calling fdrop() in fo_poll().
  (NOTE: this work has not been done for netncp and netsmb
  yet because a socket itself has no reference counts.)

Reviewed by:	jhb
2001-05-14 05:26:48 +00:00
John Baldwin
1efb92b7ca Simplify the vm fault trap handling code a bit by using if-else instead of
duplicating code in the then case and then using a goto to jump around
the else case.
2001-05-11 23:50:08 +00:00
Ian Dowse
1feb7a6efa In vrele() and vput(), avoid triggering the confusing "missed vn_close"
KASSERT when vp->v_usecount is zero or negative. In this case, the
"v*: negative ref cnt" panic that follows is much more appropriate.

Reviewed by:	mckusick
2001-05-11 20:42:41 +00:00
John Baldwin
9e5620599e Check witness_dead in more functions to avoid panic'ing when assertions
fail due to witness exhausting its internal resources and shutting down.

Reported by:	Szilveszter Adam <sziszi@petra.hos.u-szeged.hu>
Tested by:	David Wolfskill <david@catwhisker.org>
2001-05-11 20:25:29 +00:00
Tor Egge
dd1c45f3ca Regenerate. 2001-05-11 17:05:47 +00:00
Tor Egge
b4b469e6bb gettimeofday() is MP safe on both -current and -stable. 2001-05-11 17:05:12 +00:00
John Baldwin
ba228f6d96 - Split out the support for per-CPU data from the SMP code. UP kernels
have per-CPU data and gdb on the i386 at least needs access to it.
- Clean up includes in kern_idle.c and subr_smp.c.

Reviewed by:	jake
2001-05-10 17:45:49 +00:00
Alfred Perlstein
97d4578662 Remove an 'optimization' I hope to never see again.
The pipe code could not handle running out of kva, it would panic
if that happened.  Instead return ENFILE to the application which
is an acceptable error return from pipe(2).

There were some slightly tricky things that needed to be worked on,
namely that the pipe code can 'realloc' the size of the buffer if
it detects that the pipe could use a bit more room.  However if it
failed the reallocation it could not cope and would panic.  Fix
this by attempting to grow the pipe while holding onto our old
resources.  If all goes well free the old resources and use the
new ones, otherwise continue to use the smaller buffer already
allocated.

While I'm here add a few blank lines for style(9) and remove
'register'.
2001-05-08 09:09:18 +00:00
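The "grow while holding onto the old resources" approach from 97d4578662 can be sketched with an ordinary heap buffer. This is an illustration only: `struct pipebuf_sketch` and `pipebuf_grow()` are hypothetical names, and `malloc()` stands in for the kernel virtual address allocation the real pipe code performs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pipe buffer: "cnt" bytes of "size" are in use. */
struct pipebuf_sketch {
	char *buf;
	size_t size;
	size_t cnt;
};

/*
 * Try to enlarge the buffer.  The new buffer is allocated before the old
 * one is released, so a failed allocation leaves the pipe usable with
 * its smaller buffer.  Returns 0 on success (or nothing to do) and -1 if
 * the larger buffer could not be obtained.
 */
static int
pipebuf_grow(struct pipebuf_sketch *pb, size_t newsize)
{
	char *nbuf;

	if (newsize <= pb->size)
		return (0);
	nbuf = malloc(newsize);
	if (nbuf == NULL)
		return (-1);		/* old buffer is untouched */
	if (pb->cnt > 0)
		memcpy(nbuf, pb->buf, pb->cnt);
	free(pb->buf);			/* all went well: drop old resources */
	pb->buf = nbuf;
	pb->size = newsize;
	return (0);
}
```

A caller in this sketch can treat a -1 return as non-fatal and keep writing into the existing buffer, which mirrors the commit's choice to continue with the smaller allocation instead of panicking.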
Poul-Henning Kamp
e0e0b6610e Always initialize bio_resid from bio_bcount in the disk mini-layer so
that the drivers don't have to do it umpteen times.
2001-05-08 08:24:54 +00:00
Akinori MUSHA
3b26be6ae1 Properly copy the P_ALTSTACK flag in struct proc::p_flag to the child
process on fork(2).

It is the supposed behavior stated in the manpage of sigaction(2), and
Solaris, NetBSD and FreeBSD 3-STABLE correctly do so.

The previous fix against libc_r/uthread/uthread_fork.c fixed the
problem only for the programs linked with libc_r, so back it out and
fix fork(2) itself to help those not linked with libc_r as well.

PR:		kern/26705
Submitted by:	KUROSAWA Takahiro <fwkg7679@mb.infoweb.ne.jp>
Tested by:	knu, GOTOU Yuuzou <gotoyuzo@notwork.org>,
		and some other people
Not objected by:	hackers
MFC in:		3 days
2001-05-07 18:07:29 +00:00
Poul-Henning Kamp
079f2df393 Make the disk mini-layer check for and handle zero-length transfers
instead of the underlying drivers.
2001-05-06 21:55:22 +00:00
Poul-Henning Kamp
a468031ce8 Actually biofinish(struct bio *, struct devstat *, int error) is more general
than the bioerror().

Most of this patch is generated by scripts.
2001-05-06 20:00:03 +00:00
Poul-Henning Kamp
b966319db7 Fix return type of vop_stdputpages()
Noticed by:	rwatson
2001-05-06 17:40:22 +00:00
Robert Watson
29b2efeb6b o First step in cleaning up authorization code for the posix4
implementation.  Move from direct uid 0 comparison to using suser_xxx()
  call with the same semantics.  Simplify CAN_AFFECT() macro as passed
  pcred was redundant.  The checks here still aren't "right", but they
  are probably "better".

Obtained from:	TrustedBSD Project
2001-05-06 16:15:42 +00:00
Matthew Dillon
1766b2e5fa Raise the SysV shared memory defaults to more reasonable values.
Mainly increases the shared memory limit from 4M to 32M (approx).
Many more programs these days use SysV shared memory, especially X-related
programs.
2001-05-04 18:43:19 +00:00
John Baldwin
6c49a8e295 Fix a bug in the pfind() changes due to confusing the process returned by
pfind() ('pp') with the process being detached from ptrace.

Reported by:	bde
2001-05-04 18:13:11 +00:00
John Baldwin
2d96f0b145 - Move state about lock objects out of struct lock_object and into a new
struct lock_instance that is stored in the per-process and per-CPU lock
  lists.  Previously, the lock lists just kept a pointer to each lock held.
  That pointer is now replaced by a lock instance which contains a pointer
  to the lock object, the file and line of the last acquisition of a lock,
  and various flags about a lock including its recursion count.
- If we sleep while holding a sleepable lock, then mark that lock instance
  as having slept and ignore any lock order violations that occur while
  acquiring Giant when we wake up with slept locks.  This is ok because of
  Giant's special nature.
- Allow witness to differentiate between shared and exclusive locks and
  unlocks of a lock.  Witness will now detect the case when a lock is
  acquired first in one mode and then in another.  Mutexes are always
  locked and unlocked exclusively.  Witness will also now detect the case
  where a process attempts to unlock a shared lock while holding an
  exclusive lock and vice versa.
- Fix a bug in the lock list implementation where we used the wrong
  constant to detect the case where a lock list entry was full.
2001-05-04 17:15:16 +00:00
John Baldwin
ac07d659c3 Don't hold the process mutex across calls to FREE() since the vm system
uses lockmgr locks and this leads to a lock order reversal.  At this point
in wait1() the process is not on any process lists or in the process tree,
so no other process should be able to find it or have a reference to it
anyways, so the locking is not needed.
2001-05-04 16:13:28 +00:00
Poul-Henning Kamp
a62615e59b Implement vop_std{get|put}pages() and add them to the default vop[].
Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but
the default.
2001-05-01 08:34:45 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h from kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Alfred Perlstein
aad7597ce0 When panic()'ing because of recursion on a non-recursive mutex, print
out the location it was initially locked.

Ok'd by: jake
2001-04-30 01:01:52 +00:00
Jake Burkholder
e6af1080c2 Make rtprio work again.
- add a missing break which caused RTP_SET to always return EINVAL
- break instead of returning if p_can fails so proc_lock is always
  dropped correctly
- only copyin data that is actually needed
- use break instead of goto
- make rtp_to_pri return EINVAL instead of -1 if the values are out
  of range so we don't have to translate
2001-04-29 22:09:26 +00:00
Robert Watson
46157a65d7 o As part of the move to not maintaining copies of the vnode owning uid
and gid in the ACL, vaccess_acl_posix1e() was changed to accept
  explicit file_uid and file_gid as arguments.  However, in making the
  change, I explicitly checked file_gid against cr->cr_groups[0], rather
  than using groupmember, resulting in ACL_GROUP_OBJ entries being
  compared to the caller's effective gid only, not the remainder of
  its groups.  This was recently corrected for the version of the
  group call without privilege, but the second test (when privilege is
  added) was missed.  This change replaces an additional cr->cr_groups[0]
  check with groupmember().

Pointed out by:	jedgar
Reviewed by:	jedgar
Obtained from:	TrustedBSD Project
2001-04-29 19:53:50 +00:00
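The difference between the two group tests in 46157a65d7 can be shown with a toy credential. The names `struct ucred_sketch`, `egid_only_sketch()` and `groupmember_sketch()` are hypothetical; the sketch assumes, as the commit message implies, that the first entry of the group list is the effective gid.

```c
/* Hypothetical credential: groups[0] is taken to be the effective gid. */
struct ucred_sketch {
	int ngroups;
	unsigned groups[16];
};

/* Buggy variant: only the effective gid is compared. */
static int
egid_only_sketch(unsigned gid, const struct ucred_sketch *cr)
{
	return (cr->ngroups > 0 && cr->groups[0] == gid);
}

/* Corrected variant: every group in the credential is considered. */
static int
groupmember_sketch(unsigned gid, const struct ucred_sketch *cr)
{
	int i;

	for (i = 0; i < cr->ngroups; i++)
		if (cr->groups[i] == gid)
			return (1);
	return (0);
}
```

For a caller whose groups are {10, 20, 30} and a file group of 20, the first test denies the ACL_GROUP_OBJ match and the second grants it.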
Poul-Henning Kamp
855aa097af VOP_BALLOC was never really a VOP in the first place, so convert it
to UFS_BALLOC like the other "between UFS and FFS function interfaces".
2001-04-29 12:36:52 +00:00
Poul-Henning Kamp
b7ebffbc08 Add a vop_stdbmap(), and make it part of the default vop vector.
Make 7 filesystems which don't really know about VOP_BMAP rely
on the default vector, rather than more or less complete local
vop_nopbmap() implementations.
2001-04-29 11:48:41 +00:00
Greg Lehey
60fb0ce365 Revert consequences of changes to mount.h, part 2.
Requested by:	bde
2001-04-29 02:45:39 +00:00
Alfred Perlstein
6157b69f4a Instead of asserting that a mutex is not still locked after unlocking it,
assert that the mutex is owned and not recursed prior to unlocking it.

This should give a clearer diagnostic when a programming error is caught.
2001-04-28 12:11:01 +00:00
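The idea in 6157b69f4a, checking the precondition before the unlock rather than the postcondition after it, can be sketched as follows. The lock structure and function names are hypothetical and much simpler than the kernel's mutex code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified mutex state; not the kernel's struct mtx. */
struct mtx_sketch {
	int	owner;		/* owning thread id, or -1 if unowned */
	int	recurse;	/* recursion depth beyond the first acquire */
};

/* Precondition from the commit: owned by the caller and not recursed. */
static bool
mtx_unlock_ok(const struct mtx_sketch *m, int self)
{
	return (m->owner == self && m->recurse == 0);
}

static void
mtx_unlock_sketch(struct mtx_sketch *m, int self)
{
	/* Check before touching the lock so a failure names the misuse. */
	assert(mtx_unlock_ok(m, self));
	m->owner = -1;
}
```

Asserting up front reports "not owned" or "recursed" at the faulty call site, instead of reporting only that the lock is unexpectedly still held afterwards.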
John Baldwin
6caa8a1501 Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
  into hardclock_process() and statclock_process() respectively.  hardclock()
  and statclock() call the *_process() functions for the current process so
  that UP systems will run as before.  For SMP systems, it is simply necessary
  to ensure that all other processors execute the *_process() functions when the
  main clock functions are triggered on one CPU by an interrupt.  For the alpha
  4100, clock interrupts are delivered in a staggered broadcast fashion, so
  we simply call hardclock/statclock on the boot CPU and call the *_process()
  functions on the secondaries.  For x86, we call statclock and hardclock as
  usual and then call forward_hardclock/statclock in the MD code to send an IPI
  to cause the AP's to execute forwarded_hardclock/statclock which then call the
  *_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
  involve less hackery.  Now the cpu doing the forward sets any flags, etc. and
  sends a very simple IPI_AST to the other cpu(s).  AST IPIs now just basically
  return so that they can execute ast() and don't bother with setting the
  astpending or needresched flags themselves.  This also removes the loop in
  forward_signal() as sched_lock closes the race condition that the loop worked
  around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
  a process to act on rather than assuming curproc so that they can be used to
  implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
  header sys/smp.h declares MI SMP variables and API's.   The IPI API's from
  machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
  SLIST of globaldata structures has become MI and moved into subr_smp.c.
  Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by:	jake, peter
Looked over by:	eivind
2001-04-27 19:28:25 +00:00
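The hardclock()/hardclock_process() split described in 6caa8a1501 can be illustrated with a small simulation. This is only a sketch: the counters, the CPU limit, and the loop standing in for the forwarding IPI are assumptions, not the kernel's implementation:

```c
#include <assert.h>

#define MAXCPU_SKETCH	4	/* hypothetical CPU limit */

static int global_ticks;			/* system-wide timekeeping */
static int proc_ticks[MAXCPU_SKETCH];		/* per-CPU process accounting */

/* Per-process part: must run once per tick on every CPU. */
static void
hardclock_process_sketch(int cpu)
{
	proc_ticks[cpu]++;
}

/* Full handler: runs only on the CPU that took the clock interrupt. */
static void
hardclock_sketch(int cpu)
{
	global_ticks++;
	hardclock_process_sketch(cpu);
}

/* One clock tick on an ncpus-way system where CPU 0 takes the interrupt. */
static void
clock_tick_sketch(int ncpus)
{
	hardclock_sketch(0);
	for (int cpu = 1; cpu < ncpus; cpu++)
		hardclock_process_sketch(cpu);	/* stand-in for the forward */
}
```

On a uniprocessor the loop body never runs, so the behaviour matches the old single-function hardclock().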
Alfred Perlstein
3abedb4e01 Actually show the values that tripped the assertion "receive 1" 2001-04-27 13:42:50 +00:00
Robert Watson
80c9c40df9 o Remove the disabled p_cansched() test cases that permitted users to
modify the scheduling properties of processes with a different real
  uid but the same effective uid (i.e., daemons, et al).  (note: these
  cases were previously commented out, so this does not change the
  compiled code at all)

Obtained from:	TrustedBSD Project
2001-04-27 01:56:32 +00:00
Poul-Henning Kamp
8ee8b21b48 vfs_subr.c is getting rather fat. The underlying repocopy and this
commit move the filesystem export handling code to vfs_export.c
2001-04-26 20:47:14 +00:00
Alfred Perlstein
06336fb26d Sendfile is documented to return 0 on success, however when a
sf_hdtr is used to provide writev(2) style headers/trailers on the
sent data the return value is actually the result of writev(2) for
the trailers, or for the headers if no trailers are specified.

Fix sendfile to comply with the documentation, by returning 0 on
success.

Ok'd by: dg
2001-04-26 00:14:14 +00:00
Seigo Tanimura
ebdc3f1d2d Do not leave a process with no credential in zombproc.
Reviewed by:	jhb
2001-04-25 10:22:35 +00:00
Kirk McKusick
112f737245 When closing the last reference to an unlinked file, it is freed
by the inactive routine. Because the freeing causes the filesystem
to be modified, the close must be held up during periods when the
filesystem is suspended.

For snapshots to be consistent across crashes, they must write
blocks that they copy and claim those written blocks in their
on-disk block pointers before the old blocks that they referenced
can be allowed to be written.

Close a loophole that allowed unwritten blocks to be skipped when
doing ffs_sync with a request to wait for all I/O activity to be
completed.
2001-04-25 08:11:18 +00:00
Poul-Henning Kamp
a13234bb35 Move the netexport structure from the fs-specific mountstructure
to struct mount.

This makes the "struct netexport *" parameter to the vfs_export
and vfs_checkexport interface unneeded.

Consequently, all non-stacking filesystems can use
vfs_stdcheckexp().

At the same time, make it a pointer to a struct netexport
in struct mount, so that we can remove the bogus AF_MAX
and #include <net/radix.h> from <sys/mount.h>
2001-04-25 07:07:52 +00:00
Thomas Moestl
83f3198b2b Change uipc_sockaddr so that a sockaddr_un without a path is returned
in nam for an unbound socket instead of leaving nam untouched in that case.
This way, the getsockname() output can be used to determine the address
family of such sockets (AF_LOCAL).

Reviewed by:	iedowse
Approved by:	rwatson
2001-04-24 19:09:23 +00:00
John Baldwin
33a9ed9d0e Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by:	-smp, dfr, jake
2001-04-24 00:51:53 +00:00
Thomas Moestl
e15480f8dd Fix a bug introduced in the last commit: vaccess_acl_posix1e only checked
the file gid against the egid of the accessing process for the
ACL_GROUP_OBJ case, and ignored supplementary groups.

Approved by:	rwatson
2001-04-23 22:52:26 +00:00
Greg Lehey
d98dc34f52 Correct #includes to work with fixed sys/mount.h. 2001-04-23 09:05:15 +00:00
Robert Watson
5ea6583e2d o Remove comment indicating policy permits loop-back debugging, but
semantics don't: in practice, both policy and semantics permit
  loop-back debugging operations, only it's just a subset of debugging
  operations (i.e., a proc can open its own /dev/mem), and that's at a
  higher layer.
2001-04-21 22:41:45 +00:00
John Baldwin
9d4f526475 Spelling nit: acquring -> acquiring.
Reported by:	T. William Wells <bill@twwells.com>
2001-04-21 01:50:32 +00:00
Alfred Perlstein
98689e1e70 Assert that when using an interlock mutex it is not recursed when lockmgr()
is called.

Ok'd by: jhb
2001-04-20 22:38:40 +00:00
John Baldwin
242d02a13f Make the ap_boot_mtx mutex static. 2001-04-20 01:09:05 +00:00
John Baldwin
d8915a7f34 - Whoops, forgot to enable the clock lock in the spin order list on the
alpha.
- Change the Debugger() functions to pass in the real function name.
2001-04-19 15:49:54 +00:00
Bosko Milekic
d04d50d1f7 Fix inconsistency in setup of kernel_map: we need to make sure that
we also reserve _adequate_ space for the mb_map submap; i.e. we need
space for nmbclusters, nmbufs, _and_ nmbcnt. Furthermore, we need to
rounddown, and not roundup, so that we are consistent.

Pointed out by: bde
2001-04-18 23:54:13 +00:00
Alfred Perlstein
2f3cf91876 Check validity of signal callback requested via aio routines.
Also move the insertion of the request to after the request is validated.
It still looks like there may be some problems if an invalid address
is passed to the aio routines: a leak, or a not completely initialized
structure on the queue, may still be possible.

A new sig macro was made _SIG_VALID to check the validity of a signal,
it would be advisable to use it from now on (in kern/kern_sig.c) rather
than rolling your own.

PR: kern/17152
2001-04-18 22:18:39 +00:00
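A range-checking macro of the kind 2f3cf91876 introduces could look like the sketch below. The macro name, the upper bound, and the helper are hypothetical and are not copied from the kernel headers:

```c
#include <assert.h>
#include <errno.h>

#define SIG_MAXSIG_SKETCH	128	/* hypothetical highest signal number */

/* Valid signals are 1..SIG_MAXSIG_SKETCH; 0 and negatives are rejected. */
#define SIG_VALID_SKETCH(sig) \
	((sig) > 0 && (sig) <= SIG_MAXSIG_SKETCH)

/* Validate a caller-supplied signal number before queueing a request. */
static int
validate_notify_signal(int signo)
{
	return (SIG_VALID_SKETCH(signo) ? 0 : EINVAL);
}
```

Validating first and queueing second is the ordering the commit message describes: a rejected request never reaches the queue.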
Seigo Tanimura
759cb26335 Reclaim directory vnodes held in namecache if few free vnodes are
available.

Only directory vnodes holding no child directory vnodes held in
v_cache_src are recycled, so that directory vnodes near the root of
the filesystem hierarchy remain in namecache and directory vnodes are
not reclaimed in cascade.

The period of vnode reclaiming attempt and the number of vnodes
attempted to reclaim can be tuned via sysctl(2).

Suggested by:	tegge
Approved by:	phk
2001-04-18 11:19:50 +00:00
Poul-Henning Kamp
793d6d5d57 bread() is a special case of breadn(), so don't replicate code. 2001-04-18 07:16:07 +00:00
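The relationship stated in 793d6d5d57 can be sketched as one function delegating to the other with an empty read-ahead list. The signatures and the read counter are simplified assumptions, not the kernel's buffer cache interface:

```c
#include <assert.h>
#include <stddef.h>

static int reads_issued;	/* counts simulated block reads */

/* Read one block plus cnt read-ahead blocks (simulated). */
static int
breadn_sketch(long blkno, const long *rablkno, int cnt)
{
	(void)blkno;
	reads_issued++;			/* the requested block */
	for (int i = 0; i < cnt; i++) {
		(void)rablkno[i];
		reads_issued++;		/* one read per read-ahead block */
	}
	return (0);
}

/* A plain read is a read with no read-ahead: no duplicated code. */
static int
bread_sketch(long blkno)
{
	return (breadn_sketch(blkno, NULL, 0));
}
```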
Dima Dorfman
25c7870e5d Make this driver play ball with devfs(5).
Reviewed by:	brian
2001-04-17 20:53:11 +00:00
Alfred Perlstein
e04670b734 Add a sanity check on ucred refcount.
Submitted by: Terry Lambert <terry@lambert.org>
2001-04-17 20:50:43 +00:00
Alfred Perlstein
603c86672c Implement client side NFS locks.
Obtained from: BSD/os
Import Ok'd by: mckusick, jkh, motd on builder.freebsd.org
2001-04-17 20:45:23 +00:00
Poul-Henning Kamp
0dfba3cef1 Write a switch statement as less obscure if statements. 2001-04-17 20:22:07 +00:00
John Baldwin
e3ee8974e3 Fix an old bug related to BETTER_CLOCK. Call forward_*clock if SMP
and __i386__ are defined rather than if SMP and BETTER_CLOCK are defined.
The removal of BETTER_CLOCK would have broken this except that kern_clock.c
doesn't include <machine/smptests.h>, so it doesn't see the definition of
BETTER_CLOCK, and forward_*clock aren't called, even on 4.x.  This seems to
fix the problem where an n-way SMP system would see 100 * n clk interrupts
and 128 * n rtc interrupts.
2001-04-17 17:53:36 +00:00
Poul-Henning Kamp
f84e29a06c This patch removes the VOP_BWRITE() vector.
VOP_BWRITE() was a hack which made it possible for NFS client
side to use struct buf with non-bio backing.

This patch takes a more general approach and adds a bp->b_op
vector where more methods can be added.

The success of this patch depends on bp->b_op being initialized
all relevant places for some value of "relevant" which is not
easy to determine.  For now the buffers have grown a b_magic
element which will make such issues a tiny bit easier to debug.
2001-04-17 08:56:39 +00:00
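The per-buffer method vector and magic value described in f84e29a06c can be sketched as below. All structure, field, and constant names here are hypothetical illustrations rather than the kernel's struct buf layout:

```c
#include <assert.h>
#include <stddef.h>

#define B_MAGIC_SKETCH	0x10b10b10u	/* hypothetical magic value */

struct buf_sketch;

/* Method vector: further operations can be added alongside the write. */
struct buf_ops_sketch {
	const char	*bop_name;
	int		(*bop_write)(struct buf_sketch *);
};

struct buf_sketch {
	unsigned int			b_magic;	/* debugging aid */
	const struct buf_ops_sketch	*b_op;		/* must be initialized */
	int				b_written;
};

static int
bio_write_sketch(struct buf_sketch *bp)
{
	bp->b_written = 1;
	return (0);
}

static const struct buf_ops_sketch bio_ops_sketch = {
	"bio", bio_write_sketch
};

/* Dispatch through b_op; reject buffers that were never set up. */
static int
bwrite_sketch(struct buf_sketch *bp)
{
	if (bp->b_magic != B_MAGIC_SKETCH || bp->b_op == NULL)
		return (-1);
	return (bp->b_op->bop_write(bp));
}
```

The magic check shows why an uninitialized b_op is easier to diagnose: the bad buffer is caught at dispatch time instead of through a wild function pointer.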
Kirk McKusick
5819ab3f12 Add debugging option to always read/write cylinder groups as full
sized blocks. To enable this option, use: `sysctl -w debug.bigcgs=1'.
Add debugging option to disable background writes of cylinder
groups. To enable this option, use: `sysctl -w debug.dobkgrdwrite=0'.
These debugging options should be tried on systems that are panicking
with corrupted cylinder group maps to see if it makes the problem
go away. The set of panics in question are:

	ffs_clusteralloc: map mismatch
	ffs_nodealloccg: map corrupted
	ffs_nodealloccg: block not in map
	ffs_alloccg: map corrupted
	ffs_alloccg: block not in map
	ffs_alloccgblk: cyl groups corrupted
	ffs_alloccgblk: can't find blk in cyl
	ffs_checkblk: partially free fragment

The following panics are less likely to be related to this problem,
but might be helped by these debugging options:

	ffs_valloc: dup alloc
	ffs_blkfree: freeing free block
	ffs_blkfree: freeing free frag
	ffs_vfree: freeing free inode

If you try these options, please report whether they helped reduce your
bitmap corruption panics to Kirk McKusick at <mckusick@mckusick.com>
and to Matt Dillon <dillon@earth.backplane.com>.
2001-04-17 05:37:51 +00:00
Robert Watson
b114e127e6 In my first reading of POSIX.1e, I misinterpreted handling of the
ACL_USER_OBJ and ACL_GROUP_OBJ fields, believing that modification of the
access ACL could be used by privileged processes to change file/directory
ownership.  In fact, this is incorrect; ACL_*_OBJ (+ ACL_MASK and
ACL_OTHER) should have undefined ae_id fields; this commit attempts
to correct that misunderstanding.

o Modify arguments to vaccess_acl_posix1e() to accept the uid and gid
  associated with the vnode, as those can no longer be extracted from
  the ACL passed as an argument.  Perform all comparisons against
  the passed arguments.  This actually has the effect of simplifying
  a number of components of this call, as well as reducing the indent
  level, but now separates handling of ACL_GROUP_OBJ from ACL_GROUP.

o Modify acl_posix1e_check() to return EINVAL if the ae_id field of
  any of the ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} entries is a value
  other than ACL_UNDEFINED_ID.  As a temporary work-around to allow
  clean upgrades, set the ae_id field to ACL_UNDEFINED_ID before
  each check so that this cannot cause a failure in the short term
  (this work-around will be removed when the userland libraries and
  utilities are updated to take this change into account).

o Modify ufs_sync_acl_from_inode() so that it forces
  ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} ae_id fields to ACL_UNDEFINED_ID
  when synchronizing the ACL from the inode.

o Modify ufs_sync_inode_from_acl to not propagate uid and gid
  information to the inode from the ACL during ACL update.  Also
  modify the masking of permission bits that may be set from
  ALLPERMS to (S_IRWXU|S_IRWXG|S_IRWXO), as ACLs currently do not
  carry non-ACCESSPERMS (S_ISUID, S_ISGID, S_ISTXT).

o Modify ufs_getacl() so that when it emulates an access ACL from
  the inode, it initializes the ae_id fields to ACL_UNDEFINED_ID.

o Clean up ufs_setacl() substantially since it is no longer possible
  to perform chown/chgrp operations using vop_setacl(), so all the
  access control for that can be eliminated.

o Modify ufs_access() so that it passes owner uid and gid information
  into vaccess_acl_posix1e().

Pointed out by:	jedgar
Obtained from:	TrustedBSD Project
2001-04-17 04:33:34 +00:00
John Baldwin
abd9053ee4 Blow away the panic mutex in favor of using a single atomic_cmpset() on a
panic_cpu shared variable.  I used a simple atomic operation here instead
of a spin lock, as a spin lock seemed to be excessive overhead.  Also, this can avoid
recursive panics if, for example, witness is broken.
2001-04-17 04:18:08 +00:00
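Claiming the panic with a single compare-and-set, as abd9053ee4 describes, can be sketched in portable C11 atomics. The kernel uses its own atomic_cmpset() primitive; the variable, the NOCPU value, and the handling of a repeated claim by the same CPU are assumptions of this sketch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NOCPU_SKETCH	(-1)	/* hypothetical "no CPU is panicking" value */

static atomic_int panic_cpu_sketch = NOCPU_SKETCH;

/*
 * Return true if `cpu` owns the panic, either because it just claimed it
 * or because it already held it (a recursive panic on the same CPU).
 */
static bool
panic_claim_sketch(int cpu)
{
	int expected = NOCPU_SKETCH;

	/* One atomic step: succeeds only for the first claimant. */
	if (atomic_compare_exchange_strong(&panic_cpu_sketch, &expected, cpu))
		return (true);
	/* On failure, `expected` holds the current owner. */
	return (expected == cpu);
}
```

No lock is taken, so nothing in the lock-checking machinery has to work for the panic path to make progress.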
John Baldwin
3c41f323c9 Check to see if enroll() returns NULL in the witness initialization. This
can happen if witness runs out of resources during initialization or if
witness_skipspin is enabled.

Sleuthing by:	Peter Jeremy <peter.jeremy@alcatel.com.au>
2001-04-17 03:35:38 +00:00
John Baldwin
7141f2ad46 Exit and re-enter the critical section while spinning for a spinlock so
that interrupts can come in while we are waiting for a lock.
2001-04-17 03:34:52 +00:00
John Hay
24dbea46a9 Update to the 2001-04-02 version of the nanokernel code from Dave Mills. 2001-04-16 13:05:05 +00:00
Brian Somers
56700d4634 Call strlen() once instead of twice. 2001-04-14 21:33:58 +00:00
Robert Watson
e9e7ff5b22 o Since uid checks in p_cansignal() are now identical between P_SUGID
and non-P_SUGID cases, simplify p_cansignal() logic so that the
  P_SUGID masking of possible signals is independent from uid checks,
  removing redundant code and generally improving readability.

Reviewed by:	tmm
Obtained from:	TrustedBSD Project
2001-04-13 14:33:45 +00:00
Alfred Perlstein
1375ed7eb7 convert if/panic -> KASSERT, explain what triggered the assertion 2001-04-13 10:15:53 +00:00
Murray Stokely
a4e6da691f Generate useful error messages. 2001-04-13 09:37:25 +00:00
Mark Murray
f0b60d7560 Handle a rare but fatal race invoked sometimes when SIGSTOP is
invoked.
2001-04-13 09:29:34 +00:00
John Baldwin
7a9aa5d372 - Add a comment at the start of the spin locks list.
- The alpha SMP code uses an "ap boot" spinlock as well.
2001-04-13 08:31:38 +00:00
Robert Watson
44c3e09cdc o Disallow two "allow this" exceptions in p_cansignal() restricting
the ability of unprivileged processes to deliver arbitrary signals
  to daemons temporarily taking on unprivileged effective credentials
  when P_SUGID is not set on the target process:
  Removed:
     (p1->p_cred->cr_ruid != ps->p_cred->cr_uid)
     (p1->p_ucred->cr_uid != ps->p_cred->cr_uid)
o Replace two "allow this" exceptions in p_cansignal() restricting
  the ability of unprivileged processes to deliver arbitrary signals
  to daemons temporarily taking on unprivileged effective credentials
  when P_SUGID is set on the target process:
  Replaced:
     (p1->p_cred->p_ruid != p2->p_ucred->cr_uid)
     (p1->p_cred->cr_uid != p2->p_ucred->cr_uid)
  With:
     (p1->p_cred->p_ruid != p2->p_ucred->p_svuid)
     (p1->p_ucred->cr_uid != p2->p_ucred->p_svuid)
o These changes have the effect of making the uid-based handling of
  both P_SUGID and non-P_SUGID signal delivery consistent, following
  these four general cases:
     p1's ruid equals p2's ruid
     p1's euid equals p2's ruid
     p1's ruid equals p2's svuid
     p1's euid equals p2's svuid
  The P_SUGID and non-P_SUGID cases can now be largely collapsed,
  and I'll commit this in a few days if no immediate problems are
  encountered with this set of changes.
o These changes remove a number of warning cases identified by the
  proc_to_proc inter-process authorization regression test.
o As these are new restrictions, we'll have to watch out carefully for
  possible side effects on running code: they seem reasonable to me,
  but it's possible this change might have to be backed out if problems
  are experienced.

Submitted by:		src/tools/regression/security/proc_to_proc/testuid
Reviewed by:		tmm
Obtained from:	TrustedBSD Project
2001-04-13 03:06:22 +00:00
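The four uid cases listed in 44c3e09cdc can be written as a single predicate. This sketch covers only the uid comparison, with a hypothetical credential structure; the real p_cansignal() applies further checks (privilege, P_SUGID signal masking) that are omitted here:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical, reduced view of a process's uids. */
struct uids_sketch {
	uid_t	ruid;	/* real uid */
	uid_t	euid;	/* effective uid */
	uid_t	svuid;	/* saved uid */
};

/* p1 is the sender (subject), p2 the target (object). */
static bool
uid_may_signal(const struct uids_sketch *p1, const struct uids_sketch *p2)
{
	return (p1->ruid == p2->ruid ||		/* p1's ruid equals p2's ruid */
	    p1->euid == p2->ruid ||		/* p1's euid equals p2's ruid */
	    p1->ruid == p2->svuid ||		/* p1's ruid equals p2's svuid */
	    p1->euid == p2->svuid);		/* p1's euid equals p2's svuid */
}
```

Note that the target's effective uid never appears: a daemon that temporarily takes on a user's effective uid is no longer signalable by that user on that basis alone.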
Robert Watson
0489082737 o Disable two "allow this" exceptions in p_cansched(), restricting the
ability of unprivileged processes to modify the scheduling properties
  of daemons temporarily taking on unprivileged effective credentials.
  These cases (p1->p_cred->p_ruid == p2->p_ucred->cr_uid) and
  (p1->p_ucred->cr_uid == p2->p_ucred->cr_uid), respectively permitting
  a subject process to influence the scheduling of a daemon if the subject
  process has the same real uid or effective uid as the daemon's effective
  uid.  This removes a number of the warning cases identified by the
  proc_to_proc inter-process authorization regression test.
o As these are new restrictions, we'll have to watch out carefully for
  possible side effects on running code: they seem reasonable to me,
  but it's possible this change might have to be backed out if problems
  are experienced.

Reported by:	src/tools/regression/security/proc_to_proc/testuid
Obtained from:	TrustedBSD Project
2001-04-12 22:46:07 +00:00
Robert Watson
e386f9bda3 o Make kqueue's filt_procattach() function use the error value returned
by p_can(...P_CAN_SEE), rather than returning EACCES directly.  This
  brings the error code used here into line with similar arrangements
  elsewhere, and prevents the leakage of pid usage information.

Reviewed by:	jlemon
Obtained from:	TrustedBSD Project
2001-04-12 21:32:02 +00:00
Robert Watson
d34f8d3030 o Limit process information leakage by introducing a p_can(...P_CAN_SEE...)
in rtprio()'s RTP_LOOKUP implementation.

Obtained from:	TrustedBSD Project
2001-04-12 20:46:26 +00:00
Robert Watson
eb9e5c1d72 o Reduce information leakage into jails by adding invocations of
p_can(...P_CAN_SEE...) to getpgid(), getsid(), and setpgid(),
  blocking these operations on processes that should not be visible
  by the requesting process.  Required to reduce information leakage
  in MAC environments.

Obtained from:	TrustedBSD Project
2001-04-12 19:39:00 +00:00
Robert Watson
4c5eb9c397 o Replace p_cankill() with p_cansignal(), remove wrappage of p_can()
from signal authorization checking.
o p_cansignal() takes three arguments: subject process, object process,
  and signal number, unlike p_cankill(), which only took into account
  the processes and not the signal number, improving the abstraction
  such that CANSIGNAL() from kern_sig.c can now also be eliminated;
  previously CANSIGNAL() special-cased the handling of SIGCONT based
  on process session.  privused is now deprecated.
o The new p_cansignal() further limits the set of signals that may
  be delivered to processes with P_SUGID set, and restructures the
  access control check to allow it to be extended more easily.
o These changes take into account work done by the OpenBSD Project,
  as well as by Robert Watson and Thomas Moestl on the TrustedBSD
  Project.

Obtained from:  TrustedBSD Project
2001-04-12 02:38:08 +00:00
Robert Watson
40829dd2dc o Regenerated following introduction of __setugid() system call for
"options REGRESSION".

Obtained from:	TrustedBSD Project
2001-04-11 20:21:37 +00:00
Robert Watson
130d0157d1 o Introduce a new system call, __setugid(), which allows a process to
toggle the P_SUGID bit explicitly, rather than relying on it being
  set implicitly by other protection and credential logic.  This feature
  is introduced to support inter-process authorization regression testing
  by simplifying userland credential management allowing the easy
  isolation and reproduction of authorization events with specific
  security contexts.  This feature is enabled only by "options REGRESSION"
  and is not intended to be used by applications.  While the feature is
  not known to introduce security vulnerabilities, it does allow
  processes to enter previously inaccessible parts of the credential
  state machine, and is therefore disabled by default.  It may not
  constitute a risk, and therefore in the future pending further analysis
  (and appropriate need) may become a published interface.

Obtained from:	TrustedBSD Project
2001-04-11 20:20:40 +00:00
John Baldwin
7b531e6037 Stick proc0 in the PID hash table. 2001-04-11 18:50:50 +00:00