1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-20 11:11:24 +00:00
Commit Graph

3942 Commits

Author SHA1 Message Date
Robert Watson
e8f7a95298 Add error checking for copyin() operations in posix4 scheduling code. 2001-06-28 22:53:42 +00:00
John Baldwin
ec178c1e4c Don't check witness assertions if the lock doesn't use witness or witness
is dead.
2001-06-28 22:22:20 +00:00
John Baldwin
cd2f721557 - Fix a mntvnode and vnode interlock reversal.
- Protect the mnt_vnode list with the mntvnode lock.
2001-06-28 04:05:54 +00:00
John Baldwin
5f36700a32 - Add trylock variants of shared and exclusive locks.
- The sx assertions don't actually need the internal sx mutex lock, so
  don't bother doing so.
- Add a new assertion SX_ASSERT_LOCKED() that asserts that either a
  shared or exclusive lock should be held.  This assertion should be used
  instead of SX_ASSERT_SLOCKED() in almost all cases.
- Adjust some KASSERT()'s to include file and line information.
- Use the new witness_assert() function in the WITNESS case for sx slock
  asserts to verify that the current thread actually owns a slock.
2001-06-27 06:39:37 +00:00
John Baldwin
04297fe609 - Add a new witness_assert() to perform arbitrary locking assertions.
- Clean up the KTR tracepoints to be slighlty more consistent and useful
- Fix a bug in WITNESS where we would recurse indefinitely and blow the
  stack when acquiring Giant after sleeping with a sleepable lock held.

Reported by:	tanimura (3)
2001-06-27 06:27:29 +00:00
John Baldwin
776e0b3693 - Always use the proc lock of the task leader to protect the peers list of
processes.
- Don't construct fake call args and then call kill().  psignal is not
  anymore complicated and is quicker and not prone to locking problems.
  Calling psignal() avoids having to do a pfind() since we already have a
  proc pointer and also allows us to keep the task leader locked while we
  kill all the peer processes so the list is kept coherent.
- When a kthread exits, do a wakeup() on its proc pointers.  This can be
  used by kernel modules that have kthreads and want to ensure they have
  safely exited before completely the MOD_UNLOAD event.

Connectivity provided by:	Usenix wireless
2001-06-27 06:15:44 +00:00
John Baldwin
b7e554f5d6 - Move the 'clk' spinlock below other spin locks since KTR trace events
may need the clock lock for nanotime().
- Add KTR trace events for lock list manipulations and other witness
  operations.
- Use a temporary variable instead of setting the lock list head directly
  and then setting up the links to add a new lock list entry to the lock
  list.  This small race could result in witness "forgetting" about all
  the locks held by this process temporarily during an interrupt.
- Close a more fatal race condition when removing a lock from a list.
  Removing a lock from the list entails both decrementing the count of
  items in this bucket as well as shuffling items in the current bucket up
  a notch to replace the gap left by the removed item.  Wrap these
  operations in a critical section.
2001-06-25 23:17:52 +00:00
John Baldwin
1715f07da3 - Replace the unused KTR_IDLELOOP trace class with a new KTR_WITNESS trace
class to trace witness events.
- Make the ktr_cpu field of ktr_entry be a standard field rather than one
  present only in the KTR_EXTEND case.
- Move the default definition of KTR_ENTRIES from sys/ktr.h to
  kern/kern_ktr.c.  It has not been needed in the header file since KTR
  was un-inlined.
- Minor include cleanup in kern/kern_ktr.c.
- Fiddle with the ktr_cpumask in ktr_tracepoint() to disable KTR events
  on the current CPU while we are processing an event.
- Set the current CPU inside of the critical section to ensure we don't
  migrate CPU's after the critical section but before we set the CPU.
2001-06-25 23:09:31 +00:00
John Baldwin
1d79f1bb9a - Sort includes.
- Count the context switches during shutdown when we give ithreads a chance
  to run as volutary context switches.

Submitted by:	bde (2)
2001-06-25 18:30:42 +00:00
John Baldwin
c4f7a18726 Count the context switch when blocking on a mutex as a voluntary context
switch.  Count the context switch when preempting the current thread to let
a higher priority thread blocked on a mutex we just released run as an
involuntary context switch.

Reported by:	bde
2001-06-25 18:29:32 +00:00
John Baldwin
84bbc4dbda Count the switch when an ithread goes idle as a voluntary context switch.
Submitted by:	bde
2001-06-25 18:27:33 +00:00
David Malone
db3cc2d09f Don't dereference a NULL pointer if we fail to get a sendfilebuf. 2001-06-24 12:27:30 +00:00
Matthew Dillon
c7503f60c4 After exhaustive discussions and some meandering and confusion, enough
people are on track with the cause and effect of this, and although
fixing this severely degenerate case appears to violate the letter of
POSIX.1-200x, Bruce and I (and enough others) agree that it should be
comitted.

So, this patch generates an ENOENT error for any attempt to do a path lookup
through an empty symlink (e.g. open(), stat()).

Submitted by: "Andrey A. Chernov" <ache@nagual.pp.ru>
Reviewed by: bde
Discussed exhaustively on: freebsd-current
Previously committed to: NetBSD 4 years ago
2001-06-24 05:24:41 +00:00
John Baldwin
1df95969b5 - Lock CURSIG() with the proc lock to close the signal race with psignal.
- Grab Giant around ktrace points.
- Clean up KTR_PROC tracepoints to not display the value of
  sched_lock.mtx_lock as it isn't really needed anymore and just obfuscates
  the messages.
- Add a few if conditions to replace gotos.
- Ensure that every msleep KTR event ends up with a matching msleep resume
  KTR event (this was broken when we didn't do a mi_switch()).
- Only note via ktrace that we resumed from a switch once rather than twice
  in several places in msleep().
- Remove spl's rom asleep and await as the proc lock and sched_lock provide
  all the needed locking.
- In mawait() add in a needed ktrace point for noting that we are about to
  switch out.
2001-06-22 23:11:26 +00:00
John Baldwin
87f9ffb805 - Lock CURSIG with the proc lock and don't release the proc lock until
after grabbing the sched lock to close a race.
- Lock ktrace points with Giant.
2001-06-22 23:06:38 +00:00
John Baldwin
06c836bbca - Grab the proc lock around CURSIG and postsig(). Don't release the proc
lock until after grabbing the sched_lock to avoid CURSIG racing with
  psignal.
- Don't grab Giant for addupc_task() as it isn't needed.

Reported by:	tegge (signal race), bde (addupc_task a while back)
2001-06-22 23:05:11 +00:00
John Baldwin
2ad7d3049a - Change CURSIG() and postsig() to require that the proc lock is held
rather than grabbing it and releasing it themselves.  This allows callers
  of these functions to get the lock to close race conditions.
- Grab Giant around ktrace in postsig.
- Count the switches performed on SIGSTOP's as involuntary context switches
  in the resource usage stats.

Reported by:	tegge (signal race), bde (missing csw stats)
2001-06-22 23:02:37 +00:00
Matt Jacob
2f7f966cb8 int -> size_t fix 2001-06-22 19:54:38 +00:00
Matt Jacob
8f5a1742c2 Temporary fix at least- define NCPU_PRESENT which will be mp_npcus for
SMP kernels, one (1) for non-SMP.
2001-06-22 16:03:23 +00:00
Jim Pirzyk
f83ae79fbe changed hostid from long to unsigned long to be able to store values > 2GB
on i386 platforms.  Also changed SYSCTL type from INT to ULONG and removed
comment about it.

PR:		kern/21132
MFC after:	1 month
2001-06-22 16:03:14 +00:00
Bosko Milekic
08442f8a82 Introduce numerous SMP friendly changes to the mbuf allocator. Namely,
introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of resource
reclamation to be implemented in the future. Notable advantages:

 o Reduce contention for SMP by offering per-CPU pools and locks.
 o Better use of data cache due to per-CPU pools.
 o Much less code cache pollution due to excessively large allocation macros.
 o Framework for `grouping' objects from same page together so as to be able
   to possibly free wired-down pages back to the system if they are no longer
   needed by the network stacks.

 Additional things changed with this addition:

  - Moved some mbuf specific declarations and initializations from
    sys/conf/param.c into mbuf-specific code where they belong.
  - m_getclr() has been renamed to m_get_clrd() because the old name is really
    confusing. m_getclr() HAS been preserved though and is defined to the new
    name. No tree sweep has been done "to change the interface," as the old
    name will continue to be supported and is not depracated. The change was
    merely done because m_getclr() sounds too much like "m_get a cluster."
  - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
    systat(1) (see TODO below).
  - Fixed systat(1) to display number of "free mbufs" based on new per-CPU
    stat structures.
  - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
    per-CPU stat structures. All infos are fetched via sysctl.

 TODO (in order of priority):

  - Re-enable mbtypes statistics in both netstat(1) and systat(1) after
    introducing an SMP friendly way to collect the mbtypes stats under the
    already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
    seems too costly for a mere stat update, especially when other locks are
    already present).
  - Optionally have systat(1) display not only "total free mbufs" but also
    "total free mbufs per CPU pool."
  - Fix minor length-fetching issues in netstat(1) related to recently
    re-enabled option to read mbuf stats from a core file.
  - Move reference counters at least for mbuf clusters into an unused portion
    of the cluster itself, to save space and need to allocate a counter.
  - Look into introducing resource freeing possibly from a kproc.

Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/
2001-06-22 06:35:32 +00:00
John Baldwin
fbd26f7594 Fix some lock order reversals where we called free() while holding a proc
lock.  We now use temporary variables to save the process argument pointer
and just update the pointer while holding the lock.  We then perform the
free on the cached pointer after releasing the lock.
2001-06-20 23:10:06 +00:00
Bosko Milekic
f5eece3fb9 Change m_devget()'s outdated and unused `offset' argument to actually mean
something: offset into the first mbuf of the target chain before copying
the source data over.

Make drivers using m_devget() with a first argument "data - ETHER_ALIGN"
to use the offset argument to pass ETHER_ALIGN in. The way it was previously
done is potentially dangerous if the source data was at the top of a page
and the offset caused the previous page to be copied (if the
previous page has not yet been appropriately mapped).

The old `offset' argument in m_devget() is not used anywhere (it's always
0) and dates back to ~1995 (and earlier?) when support for ethernet trailers
existed. With that support gone, it was merely collecting dust.

Tested on alpha by: jlemon
Partially submitted by: jlemon
Reviewed by: jlemon
MFC after: 3 weeks
2001-06-20 19:48:35 +00:00
John Baldwin
2e1aacccac Preemption by an interrupt thread is an involuntary switch, not a voluntary
one.

Pointy-hat to:	me
2001-06-20 18:26:41 +00:00
Dag-Erling Smørgrav
0e79fe6f0e Constify (silence warnings introduced by last commit to sys/module.h) 2001-06-20 16:08:45 +00:00
Garrett Wollman
37336173d3 After one too many PRs on the subject, bite the bullet and define IOV_MAX
and its associated constants.  Implement _SC_IOV_MAX in the usual way.
Be a bit sloppy about the namespace question; this should get cleared up
in time for 5.0.

MFC after:	1 month
2001-06-18 20:24:54 +00:00
John Baldwin
6fad32afc9 Lock Giant in postsig() for the KTRACE case as ktrpsig() needs Giant when
it writes out to the trace file.

Reported by:	peter, gallatin, and others
2001-06-18 19:23:43 +00:00
Brian Somers
09dbb40410 Add linker_reference_module().
This function loads a module if required, otherwise bumps the reference
count -- the opposite of linker_file_unload().
2001-06-18 15:09:33 +00:00
Brian Somers
21ff14e0f9 Don't remove the SI_CHEAPCLONE for unsupported minors 2001-06-18 09:22:30 +00:00
Peter Wemm
b85db19691 Move setugid() a little sooner to before we release tracing in case
crdup() or change_e*id() block on malloc() or mutex.
2001-06-16 23:34:23 +00:00
Peter Wemm
5a280d9cd1 Add INTR_TYPE_AV so that we can get to the PI_AV priority in the ithread
handlers.  This is beneficial since it means that pcm's MPSAFE handler
can get run before things that will block on Giant in the shared irq
case.
2001-06-16 22:42:19 +00:00
Jonathan Lemon
9fa416ca19 Fix warnings:
112: warning: cast to pointer from integer of different size
125: warning: cast to pointer from integer of different size
2001-06-16 07:02:47 +00:00
Jonathan Lemon
7b748f0a21 Correctly hook up the write kqfilter to pipes.
Submitted by:  Niels Provos <provos@citi.umich.edu>
2001-06-15 20:45:01 +00:00
Peter Wemm
b93c3c5ed6 Fix some warnings in kern_environment.c. Make the getenv*() family
take a const 'name', since they dont modify anything.
159: warning: passing arg 1 of `getenv_int' discards qualifiers...
167: warning: passing arg 1 of `getenv' discards qualifiers from pointer..
2001-06-15 07:29:17 +00:00
Peter Wemm
ee24290963 As per comments in sys/linker_set.h:
BANG! BANG! BANG! BANG! BANG! BANG! CLICK! CLICK! CLICK! CLICK! CLICK!
<reload>
BANG! BANG! BANG! BANG! BANG! BANG! CLICK! CLICK! CLICK! CLICK! CLICK!
2001-06-14 01:28:56 +00:00
Peter Wemm
f41325db5f With this commit, I hereby pronounce gensetdefs past its use-by date.
Replace the a.out emulation of 'struct linker_set' with something
a little more flexible.  <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set.  They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>).  Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated.  This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by:	eivind
2001-06-13 10:58:39 +00:00
Peter Wemm
db957588c9 Patch up a blunder I made a few days ago. nmbcnt was being initialized
too late.

Noted by:      bmilekic
Pointy-hat to: peter
2001-06-13 00:36:41 +00:00
Peter Wemm
2398f0cd1d Hints overhaul:
- Replace some very poorly thought out API hacks that should have been
  fixed a long while ago.
- Provide some much more flexible search functions (resource_find_*())
- Use strings for storage instead of an outgrowth of the rather
  inconvenient temporary ioconf table from config().  We already had a
  fallback to using strings before malloc/vm was running anyway.
2001-06-12 09:40:04 +00:00
Dag-Erling Smørgrav
8f7e4eb568 Rename nextpid to lastpid and externalize it. 2001-06-11 21:54:19 +00:00
Dag-Erling Smørgrav
fe46349692 Blah, I cut out a tad too much in the previous commit. (thanks again, Jake!) 2001-06-11 18:43:32 +00:00
Dag-Erling Smørgrav
e3b373228c copyin(9) doesn't return ENAMETOOLONG. (thanks, Jake!) 2001-06-11 18:36:18 +00:00
Dag-Erling Smørgrav
b0def2b548 Add sbuf_copyin(). Also add 'b' variants of sbuf_{cat,copyin,cpy}() which
ignore NUL bytes in the source string.
2001-06-11 17:05:52 +00:00
Hajimu UMEMOTO
3384154590 Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
  - The definitions of SADB_* in sys/net/pfkeyv2.h are still different
    from RFC2407/IANA assignment because of binary compatibility
    issue.  It should be fixed under 5-CURRENT.
  - ip6po_m member of struct ip6_pktopts is no longer used.  But, it
    is still there because of binary compatibility issue.  It should
    be removed under 5-CURRENT.

Reviewed by:	itojun
Obtained from:	KAME
MFC after:	3 weeks
2001-06-11 12:39:29 +00:00
David Malone
c7fd62da6c Try to make the setting of the SIGCHLD handler the same as setting of
the NOCLDWAI flag. Susv2 seems to require this.

Submitted by:	Cejka Rudolf <cejkar@dcse.fee.vutbr.cz>
Reviewed by:	dillon
2001-06-11 09:15:41 +00:00
Dag-Erling Smørgrav
d647935801 sbuf_new(9) now returns a struct sbuf * instead of an int. If the caller
does not provide a struct sbuf, sbuf_new(9) will allocate one and return
a pointer to it.
2001-06-10 15:48:04 +00:00
Peter Wemm
0978669829 "Fix" the previous initial attempt at fixing TUNABLE_INT(). This time
around, use a common function for looking up and extracting the tunables
from the kernel environment.  This saves duplicating the same function
over and over again.  This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.
2001-06-08 05:24:21 +00:00
Peter Wemm
4422746fdf Back out part of my previous commit. This was a last minute change
and I botched testing.  This is a perfect example of how NOT to do
this sort of thing. :-(
2001-06-07 03:17:26 +00:00
Thomas Moestl
c0a0fb85e2 Fix an instance of NDINIT in the extattrctl syscall: LOCKLEAF was or'ed
to the operation parameter, not to the flags as it should be.

Reviewed by:	rwatson
2001-06-06 23:34:38 +00:00
Peter Wemm
81930014ef Make the TUNABLE_*() macros look and behave more consistantly like the
SYSCTL_*() macros.  TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.
2001-06-06 22:17:08 +00:00
John Baldwin
5beb572b41 We don't need to hold a lock just to test a flag. 2001-06-06 22:05:48 +00:00
Ruslan Ermilov
4589be70fe Unbreak setregid(2).
Spotted by:	Alexander Leidinger <Alexander@Leidinger.net>
2001-06-06 13:58:03 +00:00
John Baldwin
262c9f8a3b Don't hold sched_lock across addupc_task().
Reported by:	David Taylor <davidt@yadt.co.uk>
Submitted by:	bde
2001-06-06 00:57:24 +00:00
Dima Dorfman
ddf5b79683 Add a line discipline close routine which restores some functionality
I accidently nuked in rev. 1.54.  Also rework the error handling in
snplwrite a little.
2001-06-05 05:07:53 +00:00
Dima Dorfman
f09f49f136 Style and cosmetic cleanups. This driver is now reasonably stlye(9)
compliant.  All the variable definitions and function names are
reasonably consistent, and the functions which should be static (i.e.,
all of them) are.  Other assorted fixes were made.  The majority of
the delta is indentation fixes.

Partially reviewed by:	bde
2001-06-05 05:00:17 +00:00
Dima Dorfman
7fd72392d9 Use the l_nullioctl exported from tty_conf.c rather than rolling our own. 2001-06-04 23:31:21 +00:00
Dima Dorfman
22cf0fb34d Unstaticize l_nullioctl; it is needed elsewhere (like in tty_snoop.c).
Suggested by:	bde
2001-06-04 23:30:47 +00:00
Matthew Dillon
1b3e974a71 The pipe_write() code was locking the pipe without busying it first in
certain cases, and a close() by another process could potentially rip the
pipe out from under the (blocked) locking operation.

Reported-by: Alexander Viro <viro@math.psu.edu>
2001-06-04 04:04:45 +00:00
Dima Dorfman
87826386e0 Remove unused includes, use *min() inline functions rather than a
home-grown macro, rewrite a confusing conditional in snpdevtotty(),
and change ibuf to 512 bytes instead of 1024 bytes in dsnwrite().

Reviewed by:	bde
2001-06-03 05:17:39 +00:00
Dima Dorfman
b8edb44cc3 When tring to find out if this is a request for a write in
kernel_sysctl and userland_sysctl, check for whether new is NULL, not
whether newlen is 0.  This allows one to set a string sysctl to "".
2001-06-03 04:58:51 +00:00
Dima Dorfman
c0b824f97d Include sys/mutex.h to silence a warning. 2001-06-03 02:19:07 +00:00
Jesper Skriver
5b86eac4e5 Revert the last bits of my bogus move of NMBCLUSTERS
to <sys/param.h>
2001-06-01 21:47:34 +00:00
Thomas Moestl
d279178df7 Clean up the code exporting interrupt statistics via sysctl a bit:
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine
  the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt
  from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs

Requested by:	bde
Reviewed by:	bde (some time ago)
2001-06-01 13:23:28 +00:00
Ruslan Ermilov
0b381bf1fd Remove vestiges of MFS. 2001-06-01 10:07:28 +00:00
David E. O'Brien
240ef84277 Back out jesper's 2001/05/31 14:58:11 PDT commit. It does not compile. 2001-06-01 09:51:14 +00:00
Jesper Skriver
e916d96e64 Move the definition of NMBCLUSTERS from src/sys/kern/uipc_mbuf.c
to <sys/param.h>, so it's available to src/sys/netinet/ip_input.c,
and remove the now unneeded includes of "opt_param.h".

MFC after:	1 week
2001-05-31 21:56:44 +00:00
Dima Dorfman
a723c4e173 Export via sysctl:
* all members of msginfo from sysv_msg.c;
  * msqids from sysv_msg.c;
  * sema from sysv_sem.c; and
  * shmsegs from sysv_shm.c;

These will be used by ipcs(1) in non-kvm mode.

Reviewed by:	tmm
2001-05-30 03:28:59 +00:00
Poul-Henning Kamp
22628ccf96 Remove the hack-around for the slice/label code, it didn't
cover the hole.
2001-05-29 18:19:57 +00:00
Ian Dowse
5f558fa42f Since the netexport struct was centralised to 'struct mount',
attempting to remove nonexistant exports with MNT_DELEXPORT returns
an error; before this change it always succeeded. This caused
mountd(8) to log "can't delete exports for /whatever" warnings.

Change the error code from EINVAL to a more specific ENOENT, and
make mountd ignore this error when deleting the export list. I
could have just restored the previous behaviour of returning success,
but I think an error return is a useful diagnostic.

Reviewed by:	phk
2001-05-29 17:46:52 +00:00
Poul-Henning Kamp
b63436919d Remove a comment which was past its shelf life.
PR:		18750
Submitted by:	Tony Finch <dot@dotat.at>
2001-05-29 09:22:22 +00:00
Poul-Henning Kamp
c01a009dc5 With the new kernel dev_t conversions done at release 4.X,
it becomes possible to trap in ptsstop() in kern/tty_pty.c
     if the slave side has never been opened during the life of a kernel.

     What happens is that calls to ttyflush() done from ptyioctl() for the
     controlling side end up calling ptsstop() [via (*tp->t_stop)(tp, <X>)]
     which evaluates the following:

	     struct pt_ioctl *pti = tp->t_dev->si_drv1;

     In order for tp->t_dev to be set, the slave device must first be
     opened in ttyopen() [kern/tty.c].

     It appears that the only problem is calls to (*tp->t_stop)(tp, <n>),
     so this could also happen with other ioctls initiated by the
     controlling side before the slave has been opened.

PR:		27698
Submitted by:	David Bein bein@netapp.com
MFC after:	6 days
2001-05-28 20:22:12 +00:00
Poul-Henning Kamp
507fbee0ad The disklabel/slice code is more twisted than I thought. Revert to
calling the cdevsw_add() unconditionally.
2001-05-28 16:12:55 +00:00
Brian Somers
04bd20e31d Handle NULL struct device *s 2001-05-28 01:00:03 +00:00
Robert Watson
823c224e95 o uifree() the cr_ruidinfo in crfree() as well as cr_uidinfo now that the real uid
info is in the credential also.

Submitted by:	egge
2001-05-27 21:43:46 +00:00
Robert Watson
7cb8e4d277 o pcred-removal changes included modifications to optimize the setting of
the saved uid and gid during execve().  Unfortunately, the optimizations
  were incorrect in the case where the credential was updated, skipping
  the setting of the saved uid and gid when new credentials were generated.
  This change corrects that problem by handling the newcred!=NULL case
  correctly.

Reported/tested by:	David Malone <dwmalone@maths.tcd.ie>

Obtained from:	TrustedBSD Project
2001-05-26 19:59:44 +00:00
Poul-Henning Kamp
3344c5a17e Create a general facility for making dev_t's depend on another
dev_t.  The dev_depends(dev_t, dev_t) function is for tying them
to each other.

When destroy_dev() is called on a dev_t, all dev_t's depending
on it will also be destroyed (depth first order).

Rewrite the make_dev_alias() to use this dependency facility.

kern/subr_disk.c:
Make the disk mini-layer use dependencies to make sure all
relevant dev_t's are removed when the disk disappears.

Make the disk mini-layer precreate some magic sub devices
which the disk/slice/label code expects to be there.

kern/subr_disklabel.c:
Remove some now unneeded variables.

kern/subr_diskmbr.c:
Remove some ancient, commented out code.

kern/subr_diskslice.c:
Minor cleanup.  Use name from dev_t instead of dsname()
2001-05-26 08:27:58 +00:00
John Baldwin
9d127f9ffb Add vm locking to sendfile(2) and sf_buf_free().
Reported by:	Tamiji Homma <thomma@BayNetworks.com>
Tested by:	Tamiji Homma <thomma@BayNetworks.com>
2001-05-25 19:23:04 +00:00
Robert Watson
b1fc0ec1a7 o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  original macro that pointed.
  p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00
Poul-Henning Kamp
5696db457d Make the PTY drivers cloning algorithm create "CHEAPCLONE" dev_t,
so that some twit cannot allocate all 256 PTY's with "ls -l".
2001-05-25 13:23:42 +00:00
Poul-Henning Kamp
2613d3fec9 Use the name given to the dev_t, rather than creating our own.
This makes it possible to give sensible information for /dev/fd.720
and similar "special" devices.
2001-05-25 09:06:52 +00:00
Ruslan Ermilov
1166fb516b - sys/msdosfs moved to sys/fs/msdosfs
- msdos.ko renamed to msdosfs.ko
- /usr/include/msdosfs moved to /usr/include/fs/msdosfs
2001-05-25 08:14:14 +00:00
Poul-Henning Kamp
25e0288d07 Don't rely on cdevsw_add() when we hack about with dev_t's. 2001-05-24 20:28:06 +00:00
Poul-Henning Kamp
8576c652b4 Don't take the detour around devsw() to find out if the proto-cdevsw
is already initialized.
2001-05-24 20:27:16 +00:00
Alfred Perlstein
0cea693084 whitespace/style 2001-05-24 18:06:22 +00:00
Matthew Dillon
ac8f990bde This patch implements O_DIRECT about 80% of the way. It takes a patchset
Tor created a while ago, removes the raw I/O piece (that has cache coherency
problems), and adds a buffer cache / VM freeing piece.

Essentially this patch causes O_DIRECT I/O to not be left in the cache, but
does not prevent it from going through the cache, hence the 80%.  For
the last 20% we need a method by which the I/O can be issued directly to
buffer supplied by the user process and bypass the buffer cache entirely,
but still maintain cache coherency.

I also have the code working under -stable but the changes made to sys/file.h
may not be MFCable, so an MFC is not on the table yet.

Submitted by:	tegge, dillon
2001-05-24 07:22:27 +00:00
Dima Dorfman
028f979d1d Correct style bugs with regards to long lines and comments.
Reviewed by:	bde
2001-05-23 23:38:05 +00:00
John Baldwin
0dfefe6829 Don't acquire Giant just to call trap_fatal(), we are about to panic
anyway so we'd rather see the printf's then block if the system is
hosed.
2001-05-23 22:58:09 +00:00
John Baldwin
bdc60f5bd3 Don't release Giant around vm_oject_page_clean() in fsync() as the pager
putpages called will need Giant.
2001-05-23 22:55:13 +00:00
John Baldwin
8aa66068ed - Always call bfreekva() w/o vm_mtx held.
- Always call vfs_setdirty() with vm_mtx held.
- Fix an old comment: vm_hold_unload_pages is called vm_hold_free_pages()
  nowadays.
- Always call vm_hold_free_pages() w/o vm_mtx held.
2001-05-23 22:24:49 +00:00
John Baldwin
1b2555b243 - Lock the VM when initializing the vmspace for proc0.
- Don't bother releasing Giant while doing a lookup on the vm_map of
  initproc while starting up init.  We have to grab it again right after
  the lookup anyways.
2001-05-23 22:06:47 +00:00
John Baldwin
613c83cbf1 Lock the VM while twiddling the vmspace. 2001-05-23 22:05:08 +00:00
Bosko Milekic
629db60492 Increment mbstat.m_mpfail, not mbstat.m_mcfail, when m_pullup() fails.
This slipped in accidently a few commits back.
2001-05-23 20:44:54 +00:00
John Baldwin
5bd57bc8b7 Don't release the vm lock just to turn around and grab it again. 2001-05-23 19:51:12 +00:00
John Baldwin
b516d2f5e1 Add in assertions to ensure that we always call msleep or mawait with
either a timeout or a held mutex to detect unprotected infinite sleeps
that can easily lead to deadlock.

Submitted by:	alfred
2001-05-23 19:38:26 +00:00
Poul-Henning Kamp
4787f91d6b syslogd gets kernel log messages only once every 30 seconds or
at the top of the minute, whichever comes first.  It seems
logtimeout() is only called once after the kernel log is opened
and then never again after that.  So I guess syslogd only gets
kernel log messages by virtue of syncer(4)'s flushes ...?

PR:		27361
Submitted by:	pkern@utcc.utoronto.ca
MFC after:	1 week
2001-05-23 19:02:50 +00:00
Alfred Perlstein
53240603ee aquire vm_mutex a little bit earlier to protect a pmap call. 2001-05-23 10:26:36 +00:00
Ruslan Ermilov
99d300a1ec - FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file
systems were repo-copied from sys/miscfs to sys/fs.

- Renamed the following file systems and their modules:
  fdesc -> fdescfs, portal -> portalfs, union -> unionfs.

- Renamed corresponding kernel options:
  FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.

- Install header files for the above file systems.

- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
  Makefiles.
2001-05-23 09:42:29 +00:00
Dima Dorfman
0150c6e83d Unifdef DEV_SNP; snp(4) no longer requires these ugly hacks.
Silence by:	-hackers, -audit
2001-05-22 22:16:18 +00:00
Dima Dorfman
47eaa5f542 Convert this driver to (ab?)use line disciplines to get the input it
needs instead of relying on idiosyncratic hacks in the tty subsystem.
Also add module code since this can now be compiled as a module.

Silence by:	-hackers, -audit
2001-05-22 22:13:14 +00:00
Bruce Evans
1c1771cb5b Convert npx interrupts into traps instead of vice versa. This is much
simpler for npx exceptions that start as traps (no assembly required...)
and works better for npx exceptions that start as interrupts (there is
no longer a problem for nested interrupts).

Submitted by:	original (pre-SMPng) version by luoqi
2001-05-22 21:20:49 +00:00
Dima Dorfman
a8dbafbe87 Correct the vm_mtx handling; specifically, don't acquire it in
shm_deallocate_segment because shmexit_myhook calls it, and the latter
should always be called with it already held.

Submitted by:	dwmalone, dd
Approved by:	alfred
2001-05-22 03:56:26 +00:00