mirror of https://git.FreeBSD.org/src.git synced 2024-12-16 10:20:30 +00:00

3598 Commits

Author SHA1 Message Date
John Baldwin
45e889896e Fix a typo.
Reported by:	albert
2001-01-24 08:42:39 +00:00
Kirk McKusick
5ed57d323b Never reuse AUTO_OID values.
Approved by:	Alfred Perlstein <bright@wintelcom.net>
2001-01-24 04:35:13 +00:00
John Baldwin
0d6d6aa373 Don't grab Giant when calling kmem_alloc/kmem_free as this is just
encouraging other people to follow the same practice.  If this is going
to be done, then it should be done inside of those two functions instead.
2001-01-24 00:36:03 +00:00
John Baldwin
e5690aadaa Proc locking. 2001-01-24 00:35:12 +00:00
John Baldwin
a914fb6b27 - Proc locking.
- Protect calcru() with sched_lock.
2001-01-24 00:33:44 +00:00
John Baldwin
762dba203e - Proc locking.
- Protect calcru() with sched_lock.
2001-01-24 00:28:07 +00:00
John Baldwin
611d940790 Proc locking. 2001-01-24 00:27:28 +00:00
Matt Jacob
15516f16d2 Do not do the commenting out the way that saves bytes and looks cleaner
to you. Do it the way Vox Populi wants it.
2001-01-23 16:35:33 +00:00
Hajimu UMEMOTO
5d22597f3a Add mibs to hold the number of forks since boot. New mibs are:
	vm.stats.vm.v_forks
	vm.stats.vm.v_vforks
	vm.stats.vm.v_rforks
	vm.stats.vm.v_kthreads
	vm.stats.vm.v_forkpages
	vm.stats.vm.v_vforkpages
	vm.stats.vm.v_rforkpages
	vm.stats.vm.v_kthreadpages

Submitted by:	Paul Herman <pherman@frenchfries.net>
Reviewed by:	alfred
2001-01-23 14:32:01 +00:00
Robert Watson
02b65ffb64 o The move to using VADMIN under vaccess() resulted in some system
  calls returning EACCES instead of EPERM.  This patch modifies vaccess()
  to return EPERM instead of EACCES if VADMIN is among the requested
  rights.  This affects functions normally limited to the owners of
  a file, such as chmod(), as EPERM is the error indicating that
  privilege would allow the operation, rather than a change in mandatory
  or discretionary rights.

Reported by:	bde
2001-01-23 04:15:19 +00:00
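
The errno choice described in the message above can be sketched in userland C. This is a minimal illustration, not the kernel's vaccess(): the flag values and the name denied_errno() are hypothetical, and only the EPERM/EACCES distinction is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical access-mode bits for illustration; not the kernel's values. */
#define X_VREAD   0x1
#define X_VWRITE  0x2
#define X_VADMIN  0x4

/*
 * Pick the errno for a denied request: EPERM when an owner-only (admin)
 * right was among those requested, EACCES otherwise.
 */
int
denied_errno(int requested)
{
	return ((requested & X_VADMIN) != 0 ? EPERM : EACCES);
}
```

Under these assumptions, a denied chmod()-style request (which asks for the admin right) maps to EPERM, while a denied plain read maps to EACCES.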
Matt Jacob
462574faf5 Move (now) unused variable declaration inside the block (now commented out). 2001-01-22 22:22:38 +00:00
Jason Evans
56771ca74b Print correct file name and line number in mtx_assert().
Noticed by:	jake
2001-01-22 05:56:55 +00:00
Jason Evans
0cde2e34af Move most of sys/mutex.h into kern/kern_mutex.c, thereby making the mutex
inline functions non-inlined.  Hide parts of the mutex implementation that
should not be exposed.

Make sure that WITNESS code is not executed during boot until the mutexes
are fully initialized by SI_SUB_MUTEX (the original motivation for this
commit).

Submitted by:	peter
2001-01-21 22:34:43 +00:00
Dag-Erling Smørgrav
a3ea6d41b9 First step towards an MP-safe zone allocator:
- have zalloc() and zfree() always lock the vm_zone.
- remove zalloci() and zfreei(), which are now redundant.

Reviewed by:	bmilekic, jasone
2001-01-21 22:23:11 +00:00
Poul-Henning Kamp
1fd7b93f3f Convert a Debugger(3) to a panic(9) and an EINVAL.
Reminded by:	bde
2001-01-21 21:19:49 +00:00
Jake Burkholder
a448b62ac9 Make intr_nesting_level per-process, rather than per-cpu. Setup
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context.  This is also necessary
in order to context switch from sched_ithd() directly.

Reviewed By:	peter
2001-01-21 19:25:07 +00:00
Jason Evans
527c2fd277 Make the order of the static initializer for all_mtx match the order of
fields in struct mtx.

Found by:	jake
2001-01-21 11:05:02 +00:00
Peter Wemm
654c30a008 Remove APIC_INTR_DIAGNOSTIC - this has been disabled for some time now.
Remove some leftovers of removed SMP options.
2001-01-21 07:54:10 +00:00
Jason Evans
d1c1b8413e Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.
2001-01-21 07:52:20 +00:00
Jake Burkholder
3e899e1063 Remove the per-cpu pages used for copy and zero-ing pages of memory
for SMP; just use the same ones as UP.  These weren't used without
holding Giant anyway, and the routines that use them would have to
be protected from pre-emption to avoid migrating cpus.
2001-01-21 06:50:03 +00:00
John Baldwin
27e864e300 - All of proc_compare needs sched_lock, so hold it for the for loop that
calls it rather than obtaining and releasing it a lot in proc_compare.
- Collect all of the data gathering and stick it just after the
  proc_compare loop.  This way, we only have to grab sched_lock once now
  when handling SIGINFO.  All the printf's are done after the values are
  calculated.

Submitted mostly by:	bde
2001-01-20 23:03:20 +00:00
Bosko Milekic
56acb799b2 When short of mbufs or mbuf clusters, we sleep on appropriate "counters."
The counters are incremented when a thread goes to sleep and decremented
either when a thread is woken up by another thread or when the sleep
times out. There existed a race where the sleep count could be decremented
twice resulting in an eventual underflow.
Move the decrementing of the "counters" to the thread initiating the sleep
and thus remedy the problem.
2001-01-20 21:29:10 +00:00
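
The accounting change in the message above can be modelled in a few lines of C. This is a single-threaded sketch under stated assumptions, not the mbuf code: struct wait_counter and the function names are hypothetical. The point it shows is that when only the sleeper adjusts the counter, a wakeup racing with a timeout cannot decrement it twice.

```c
#include <assert.h>

/* Hypothetical model of a "threads sleeping for a buffer" counter. */
struct wait_counter {
	int sleepers;
};

enum wake_reason { WOKEN_BY_FREE, WOKEN_BY_TIMEOUT };

/* Called by the sleeping thread just before it goes to sleep. */
void
sleep_begin(struct wait_counter *wc)
{
	wc->sleepers++;
}

/*
 * Called by the sleeping thread itself once it is running again,
 * whatever the reason.  Only the sleeper decrements, so each sleep
 * is matched by exactly one decrement.
 */
void
sleep_end(struct wait_counter *wc, enum wake_reason why)
{
	(void)why;	/* same accounting for both reasons */
	assert(wc->sleepers > 0);
	wc->sleepers--;
}

/* The waker only inspects the counter; it no longer modifies it. */
int
waker_should_signal(const struct wait_counter *wc)
{
	return (wc->sleepers > 0);
}
```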
John Baldwin
049ebc15a1 Temporarily disable the printf() for micruptime() going backwards, the
SIGXCPU signal, and killing of processes that exceed their allowed run
time until they can play nice with sched_lock.  Right now they are just
potential panics waiting to happen.  The printf() has bitten several
people.
2001-01-20 02:57:59 +00:00
Jake Burkholder
c1ef8aac9e - Make npx_intr INTR_MPSAFE and move acquiring Giant into the
function itself.
- Remove a hack to allow acquiring Giant from the npx asm trap
  vector.
2001-01-20 02:30:58 +00:00
John Baldwin
4848fbae35 Be more careful with sched_lock in the SIGINFO handler. Specifically, do
not hold sched_lock while calling ttyprintf().  If we are on a serial
console, then ttyprintf() will end up getting the sio lock, resulting in
a lock order violation.

Noticed by:	des
2001-01-20 02:04:44 +00:00
Peter Wemm
558226eae7 Use #ifdef DEV_NPX from opt_npx.h instead of #if NNPX > 0 from npx.h 2001-01-19 13:19:02 +00:00
Peter Wemm
f7b6e45d5b apic_itrace_splz[] is unused 2001-01-19 10:48:35 +00:00
Peter Wemm
198c5b0891 Remove the static splXXX functions and replace them by static __inline
stubs.  Remove the xxx_imask variables which have been all but gone for
a while.
2001-01-19 09:57:29 +00:00
John Baldwin
568ae39fd5 Revert revision 1.102. I don't think p_nice needs to be protected with
sched_lock, and I'm fairly certain P_TRACED will be protected with the
proc lock instead.

Pointed out indirectly by:	bde
2001-01-19 08:23:22 +00:00
Matthew Dillon
bcc740c453 Do not cluster with B_LOCKED buffers.
This is an odd one.  This patch appears to fix a panic related to background
bitmap writes (for FFS), though neither Kirk, Ian, nor I can figure out how
B_CLUSTEROK could possibly be set on a bitmap block to cause the clustering
code to improperly cluster with a buffer undergoing a background write.

In any case, the clustering code is very fragile and this patch helps with
that, as well as possibly fixing a bug Andre was having.

Suggested by: Ian Dowse <iedowse@maths.tcd.ie>
Testing by: Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
2001-01-19 05:31:07 +00:00
Bosko Milekic
08812b3925 Implement MTX_RECURSE flag for mtx_init().
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument variable. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.

The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.

The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.

Reviewed by: jhb
2001-01-19 01:59:14 +00:00
John Baldwin
dcfc09d931 Protect p_stat and p_oncpu with sched_lock in forward_signal(). 2001-01-18 08:19:25 +00:00
Bosko Milekic
35c05ac61b Add some KASSERTs valid if WITNESS is defined to verify that the mbuf
allocation routines are being called safely. Since we drop our relevant
mbuf mutex and acquire Giant before we call kmem_malloc(), we have
to make sure that this does not pave the way for a fatal lock order
reversal. Check that either Giant is already held (in which case it's safe
to grab it again and recurse on it) or, if Giant is not held, that no
other locks are held before we try to acquire Giant.

Similarly, add a KASSERT valid in the WITNESS case in m_reclaim() to
nail callers who end up in m_reclaim() and hold a lock.

Pointed out by: jhb
2001-01-16 01:53:13 +00:00
Jason Evans
238510fc46 Implement condition variables. 2001-01-16 01:00:43 +00:00
Poul-Henning Kamp
9039f19fa0 A bit of sanity-checking in bioqdisksort(): panic if we recurse. 2001-01-14 18:48:42 +00:00
Dag-Erling Smørgrav
faa784b70c Use predictable internal names for the sysvipc modules, so we have a
chance of getting dependencies working.
2001-01-14 18:04:30 +00:00
John Baldwin
b947e93403 - Use sched_lock to prevent the mutex name from changing out from under us
while we are copying it to the kinfo_proc structure.
- Test against p_stat to see if we are blocked on a mutex.
- Terminate ki_mtxname with a null char rather than ki_wmesg.
2001-01-13 23:08:34 +00:00
Ben Smithurst
4c061a9da1 Fix getsid() to use "=" instead of "==".
Not objected to by:	audit
2001-01-13 22:49:59 +00:00
Jake Burkholder
063415120b Change return ??? to return -1 in some #if 0'ed code. 2001-01-12 08:24:25 +00:00
David Malone
3b54736e19 Style improvements for last fix. Should be functionally the same.
Submitted by:	bde
2001-01-11 00:13:54 +00:00
Jake Burkholder
ef73ae4b0c Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables
other than curproc.
2001-01-10 04:43:51 +00:00
Bosko Milekic
d113d3857e In m_mballoc_wait(), drop the mmbfree mutex lock prior to calling
m_reclaim() and re-acquire it when m_reclaim() returns. This means that
we now call the drain routines without holding the mutex lock and
recursing into it. This was done for mainly two reasons:

(i) Avoid the long recursion; long recursions are typically bad and this
    is the case here because we block all other code from freeing mbufs
    if they need to. Doing that is kind of counter-productive, since we're
    really hoping that someone will free.

(ii) More importantly, avoid a potential lock order reversal. Right now,
     not all the locks have been added to our networking code; but
     without this change, we're introducing the possibility for deadlock.
     Consider for example ip_drain(). We will likely eventually introduce
     a lock for ipq there, and so ip_freef() will be called with ipq lock
     held. But, ip_freef() calls m_freem() which in turn acquires the
     mmbfree lock. Since we were previously calling ip_drain() with mmbfree
     held, our lock order would be: mmbfree->ipq->mmbfree. Some other code
     may very well lock ipq first and then call ip_freef(). This would
     result in the regular lock order, ipq->mmbfree. Clearly, we have
     deadlock if one thread acquires the ipq lock and sits waiting for
     mmbfree while another thread calling m_reclaim() acquires mmbfree
     and sits waiting for the ipq lock.

Also, make sure to add a comment above m_reclaim()'s definition briefly
explaining this. Also document this above the call to m_reclaim() in
m_mballoc_wait().

Suggested and reviewed by: alfred
2001-01-09 23:58:56 +00:00
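
The lock-order argument in the message above can be illustrated with a toy model. This is a sketch under stated assumptions, not the mbuf code: the locks here assert instead of blocking, and free_list_lock, queue_lock, drain() and alloc_wait() are hypothetical stand-ins for mmbfree, a future ipq lock, a drain routine and m_mballoc_wait().

```c
#include <assert.h>

/* Toy non-recursive lock: asserts instead of blocking.  Hypothetical. */
struct toy_lock {
	int held;
};

void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
void toy_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct toy_lock free_list_lock;	/* stands in for mmbfree */
struct toy_lock queue_lock;	/* stands in for a future ipq lock */
int free_count;			/* buffers on the free list */

/* Drain routine: takes queue_lock, then free_list_lock (queue -> free). */
void
drain(void)
{
	toy_acquire(&queue_lock);
	toy_acquire(&free_list_lock);
	free_count++;			/* give one buffer back */
	toy_release(&free_list_lock);
	toy_release(&queue_lock);
}

/*
 * Allocation slow path: entered with free_list_lock held.  Drop it across
 * drain() so the only order ever used is queue -> free, then retake it.
 */
int
alloc_wait(void)
{
	assert(free_list_lock.held);
	toy_release(&free_list_lock);
	drain();
	toy_acquire(&free_list_lock);
	if (free_count > 0) {
		free_count--;
		return (1);
	}
	return (0);
}
```

If alloc_wait() kept free_list_lock held across drain(), the toy_acquire() inside drain() would trip its assertion, which is the single-threaded analogue of the free -> queue -> free recursion the message warns about.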
Garrett Wollman
0a2c3d48c6 select() DKI is now in <sys/selinfo.h>. 2001-01-09 04:33:49 +00:00
Nick Hibma
11a8d6c202 Unset the devclass if the attach fails and the devclass was not set to
begin with.

Reviewed by:	dfr
2001-01-08 22:16:26 +00:00
David Malone
2ebaaccd47 If we failed to allocate the file descriptor for the write end of
the pipe, then we were corrupting the pipe_zone free list by calling
pipeclose on rpipe twice. NULL out rpipe to avoid this.

Reviewed by:	dillon
Reviewed by:	iedowse
2001-01-08 22:14:48 +00:00
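
The "NULL out the pointer" fix described above is a general C idiom and can be sketched in userland. This is not the pipe code: struct endpoint, endpoint_close() and make_pair() are hypothetical names, and the failure is injected by a parameter rather than by a real descriptor allocation.

```c
#include <assert.h>
#include <stdlib.h>

struct endpoint { int id; };	/* hypothetical stand-in for a pipe end */
int close_calls;		/* counts real releases, for illustration */

/* Release an endpoint; a NULL argument is a harmless no-op. */
void
endpoint_close(struct endpoint *ep)
{
	if (ep == NULL)
		return;
	close_calls++;
	free(ep);
}

/*
 * Create a pair.  If the second step fails, release the first endpoint
 * once and NULL the pointer so the shared cleanup cannot release it again.
 */
int
make_pair(int fail_second)
{
	struct endpoint *r = calloc(1, sizeof(*r));
	struct endpoint *w = calloc(1, sizeof(*w));
	int error = 0;

	if (r == NULL || w == NULL || fail_second) {
		endpoint_close(r);	/* early release on the error path */
		r = NULL;		/* the fix: forget the freed pointer */
		error = -1;
	}
	/* Common cleanup; a real caller would keep the pair on success. */
	endpoint_close(r);		/* no-op when r was already released */
	endpoint_close(w);
	return (error);
}
```

Without the `r = NULL` line, the common cleanup would pass the already-freed pointer to endpoint_close() a second time.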
Jake Burkholder
bb5c0622b7 Fix a warning. The type of globaldata.gd_prvspace has changed. 2001-01-08 15:25:45 +00:00
Alfred Perlstein
bdfa4f04d9 Don't use SCARG.
Pointed out by: bde
2001-01-08 07:22:06 +00:00
Alfred Perlstein
0bad156a91 Limit size of passed in data for utrace function.
Requested by: rwatson
Obtained from: NetBSD
2001-01-06 09:34:20 +00:00
John Baldwin
a0a7328bb0 - Move all of the KTR sysctl's under a new debug.ktr mib.
- Provide TUNABLE_INT() hooks for ktr_cpumask, ktr_mask, and ktr_verbose
  so that they can be set from the loader by their respective sysctl names.
  For example, to turn on KTR_INTR and KTR_PROC in ktr_mask, one could
  stick 'debug.ktr.mask="0x1200"' in /boot/loader.conf.
2001-01-06 06:51:43 +00:00
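
The loader.conf example above passes the mask as a string. A minimal sketch of how such a value decomposes into trace-class bits follows; the bit values 0x0200 and 0x1000 are assumptions chosen to be consistent with the 0x1200 in the message, and parse_mask() is a hypothetical helper, not the kernel's tunable code.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed bit values, chosen to be consistent with the 0x1200 example. */
#define X_KTR_INTR 0x0200u
#define X_KTR_PROC 0x1000u

/* Parse a tunable string such as "0x1200"; base 0 accepts hex or decimal. */
unsigned int
parse_mask(const char *value)
{
	return ((unsigned int)strtoul(value, NULL, 0));
}
```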
John Baldwin
5192404af2 Protect p_nice and P_TRACED in psignal() above the switch statement with
sched_lock.
2001-01-06 00:08:39 +00:00
Warner Losh
38c490a10f Make this file conform mostly to style(9):
o Use 8 space hard tabs
o Eliminate trailing white space (while I'm here, just in a couple of places)
o wrap mostly at 80 columns (printf literal strings being the notable
  exception)
o use return (foo) consistently
o use 0 vs NULL more consistently
o use queue(3) xxx_FOREACH macros where appropriate (some places used it
  before, others didn't).
o use BSD line continuation parameters

Pedants will likely notice minor style(9) violations, but for the
most part the file now looks much much closer to style(9) and is
mostly self-consistent.

Approved in principle by: dfr
Reviewed by: md5 (no changes to the .o)
2001-01-05 07:29:54 +00:00
John Baldwin
c5f9b6d075 - For dynamic sysctl's added at runtime, don't assume that the name passed
to the SYSCTL_ADD_FOO() macros is a constant that should be turned into
  a string via the pre-processor.  Instead, require it to be an explicit
  string so that names can be generated on the fly.
- Make some of the char * arguments to sysctl_add_oid() const to quiet
  warnings.
2001-01-05 07:00:45 +00:00
Nick Hibma
f0b8108f1d Fix a bug in both scripts: HEADER sections were not emitted to the header
file.

While there, fix the layout of function headers (noticeable in long headers).

Fix up some style nits. It's Perl and should be written in that style.
2001-01-04 13:41:24 +00:00
John Baldwin
3e6831f510 The previous commit wasn't entirely correct. At least one goto to the
out: label in psignal() did not grab sched_lock before trying to release
it.  Also, the previous version had several cases where it grabbed
sched_lock before jumping to out: unnecessarily, so rework this a bit.
The runfast: and out: labels must be called with sched_lock released, and
the run: label must be called with it held.  Appropriate mtx_assert()'s
have been added that should catch any bugs that may still be in this
code.

Noticed by:	bde
2001-01-02 18:54:09 +00:00
Poul-Henning Kamp
1550c317bf Fix the <sys/queue.h> abuse.
Submitted by:	Dima Dorfman <dima@unixfreak.org>
Reviewed by:	/sbin/md5
2001-01-02 11:51:55 +00:00
Poul-Henning Kamp
7f9cb01893 Add an XXX about a <sys/queue.h> transgression which needs to be cleaned up. 2001-01-02 10:34:09 +00:00
Poul-Henning Kamp
8eb6e436e8 Remove a bogus #ifdef KTR stanza.
Noticed by:	Alexander Langer <alex@big.endian.de>
2001-01-01 23:09:53 +00:00
John Baldwin
4bfba0cf19 Push down sched_lock in psignal(). sched_lock was being held across
recursive calls into psignal() as well as calls to signotify(),
forward_signal(), etc.
2001-01-01 02:31:08 +00:00
John Baldwin
ef8294075b Add in a missing release of the proctree lock.
Submitted by:	Sja <sakari.jalovaara@eqonline.fi>
2001-01-01 02:19:51 +00:00
Matt Jacob
56f29ddd1e there is no more miscfs/devfs 2000-12-31 23:12:20 +00:00
Seigo Tanimura
338c0bc664 Ignore a net interrupt if the corresponding handler is not
registered.

This fixes panic on my laptop where a spurious arp packet
is received when arp is not ready to run.
2000-12-31 01:31:55 +00:00
Paul Saab
e9df486f0a Backout rev 1.57 & 1.58. While the previous revisions fixed
attaching to running processes, it completely breaks normal debugging.
A better fix is in the works, but cannot be properly tested until
the problem with gdb hanging the system in -current is solved.
2000-12-31 01:30:27 +00:00
Paul Saab
894653d6fa Pass me the pointy hat. Do not hold sched_lock over psignal.
Submitted by:	alfred
2000-12-30 00:44:44 +00:00
Greg Lehey
7fcd0cb31a Partially revert revision 1.7: Only use getnanotime instead of
nanotime if we would run into trouble with nanotime (i.e. if we are
tracing KTR_LOCK).

Reviewed by: 	jhb
2000-12-29 06:27:39 +00:00
Dag-Erling Smørgrav
dd488b6dd8 Retire kernfs (kernel part). 2000-12-28 12:17:35 +00:00
Paul Saab
6a10f299b9 Send a SIGCONT when detaching or continuing the execution of a traced
process.  This fixes a problem when attaching to a process in gdb
and the process staying in the STOP'd state after quitting gdb.
This whole process seems a bit suspect, but this seems to work.

Reviewed by:	peter
2000-12-28 08:34:21 +00:00
Peter Wemm
4058c0f013 Pull out the module path from the loader. ie: if you boot from
/boot/kernel.foobar/* then that had better be in the path ahead of the
others.

Submitted by:  Daniel J. O'Connor <darius@dons.net.au>
PR: 23662
2000-12-28 08:14:58 +00:00
Matthew Dillon
2b6b0df712 This implements a better launder limiting solution. There was a solution
in 4.2-REL which I ripped out in -stable and -current when implementing the
low-memory handling solution.  However, maxlaunder turns out to be the saving
grace in certain very heavily loaded systems (e.g. newsreader box).  The new
algorithm limits the number of pages laundered in the first pageout daemon
pass.  If that is not sufficient then successive passes will be run without any
limit.

Write I/O is now pipelined using two sysctls, vfs.lorunningspace and
vfs.hirunningspace.  This prevents excessive buffered writes in the
disk queues which cause long (multi-second) delays for reads.  It leads
to more stable (less jerky) and generally faster I/O streaming to disk
by allowing required read ops (e.g. for indirect blocks and such) to occur
without interrupting the write stream, among other things.

NOTE: eventually, filesystem write I/O pipelining needs to be done on a
per-device basis.  At the moment it is globalized.
2000-12-26 19:41:38 +00:00
Jake Burkholder
98f03f9030 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
Matt Jacob
661f2768f4 Make sure we have a non-null proc pointer before referring to fields
off of it.
2000-12-23 07:33:32 +00:00
Bosko Milekic
2a0c503e7a * Rename M_WAIT mbuf subsystem flag to M_TRYWAIT.
This is because calls with M_WAIT (now M_TRYWAIT) may not wait
  forever when nothing is available for allocation, and may end up
  returning NULL. Hopefully we now communicate more of the right thing
  to developers and make it very clear that it's necessary to check whether
  calls with M_(TRY)WAIT also resulted in a failed allocation.
  M_TRYWAIT basically means "try harder, block if necessary, but don't
  necessarily wait forever." The time spent blocking is tunable with
  the kern.ipc.mbuf_wait sysctl.
  M_WAIT is now deprecated but still defined for the next little while.

* Fix a typo in a comment in mbuf.h

* Fix some code that was actually passing the mbuf subsystem's M_WAIT to
  malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the
  value of the M_WAIT flag, this could have become a big problem.
2000-12-21 21:44:31 +00:00
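
The "try harder, but don't necessarily wait forever" contract described above means every caller must handle a NULL result. A minimal userland sketch follows; the flag names, buf_get(), pool_avail and max_wait_ticks are hypothetical, and the retry loop only stands in for a bounded sleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flags and pool; names are illustrative only. */
#define X_DONTWAIT 0
#define X_TRYWAIT  1

int pool_avail;			/* buffers currently available */
int max_wait_ticks = 3;		/* stand-in for a tunable wait bound */

/*
 * Try to allocate; with X_TRYWAIT, retry for a bounded number of ticks.
 * Either way the result may be NULL, so callers must check it.
 */
void *
buf_get(int how)
{
	int tick;

	for (tick = 0; ; tick++) {
		if (pool_avail > 0) {
			pool_avail--;
			return (malloc(64));
		}
		if (how == X_DONTWAIT || tick >= max_wait_ticks)
			return (NULL);
		/* a real implementation would sleep here until woken */
	}
}
```

A caller in this model would write `if ((p = buf_get(X_TRYWAIT)) == NULL) return (ENOBUFS);` rather than assuming the waiting variant always succeeds.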
Poul-Henning Kamp
b80e3b4191 A last minute brucification resulted in syntax errors in the previous commit. 2000-12-20 22:07:59 +00:00
Poul-Henning Kamp
e2a09b2649 Replace logwakeup() with "int msgbuftrigger". There is little
point in calling a function just to set a flag.

Keep better track of the syslog FAC/PRI code and try to DTRT if
they mingle.

Log all writes to /dev/console to syslog with <console.info>
priority.  The formatting is not preserved; there is no robust
way of doing it.  (Ideas with patches welcome).
2000-12-20 21:50:37 +00:00
John Baldwin
48786ef412 Fix another sched_sihand -> sched_swi in a KTR trace message. 2000-12-18 23:59:34 +00:00
Jake Burkholder
1156bc4de2 Whitespace. Fix a comment block and an if statement that were wider
than 80 characters.
2000-12-18 07:10:04 +00:00
Marcel Moolenaar
d96cfeae0c Fix a typo that allowed signals caused by traps to be delivered
to the process when said signal is masked.

PR: 23457
Submitted by: Yasuhiko Watanabe <yasu@mrit.mei.co.jp>
2000-12-16 21:03:48 +00:00
John Baldwin
a9b1370731 Delay waking up processes select'ing on the log device directly from
the kernel console.  Instead, change logwakeup() to set a flag in the
softc.  A callout then wakes up every so often and wakes up any processes
selecting on /dev/log (such as syslogd) if the flag is set.  By default
this callout fires 5 times a second, but that can be adjusted by the
sysctl kern.log_wakeups_per_second.

Reviewed by:	phk
2000-12-15 21:23:32 +00:00
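
The deferred-wakeup scheme described above (producers set a flag, a periodic timer does the actual wakeup) can be modelled as follows. This is a single-threaded sketch under stated assumptions, not the log driver: struct log_state and the function names are hypothetical, and the timer is driven by explicit calls instead of a callout.

```c
#include <assert.h>

/* Hypothetical model: names and structure are illustrative only. */
struct log_state {
	int wakeup_pending;	/* set by producers, cleared by the timer */
	int wakeups_delivered;	/* counts actual wakeups of readers */
};

/* Producer side: cheap, just records that new data arrived. */
void
log_note_data(struct log_state *ls)
{
	ls->wakeup_pending = 1;
}

/* Periodic timer: delivers at most one wakeup per tick, if needed. */
void
log_timer_tick(struct log_state *ls)
{
	if (ls->wakeup_pending) {
		ls->wakeup_pending = 0;
		ls->wakeups_delivered++;
	}
}

/* Timer period in milliseconds for a given wakeups-per-second rate. */
int
tick_period_ms(int wakeups_per_second)
{
	assert(wakeups_per_second > 0);
	return (1000 / wakeups_per_second);
}
```

With the default of 5 wakeups per second mentioned in the message, the period in this model is 200 ms, and any number of messages logged within one period cost the readers a single wakeup.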
John Baldwin
ffc831da27 Stick the kthread API in a kthread_* namespace, and the specialized kproc
functions in a kproc_* namespace.

Reviewed by:	-arch
2000-12-15 20:08:20 +00:00
Poul-Henning Kamp
f84ee0ff00 Don't clone impossible unit numbers for disks. 2000-12-15 17:55:24 +00:00
John Baldwin
de3622188a Add in MI implementations of the KTR trace buffer ddb commands. The
commands have also been slightly updated as follows:
- Use ktr_idx to find the newest entry rather than walking the buffer
  comparing timespecs.  Timespecs are not always unique after the change
  to use getnanotime(9).
- Add a new verbose setting.  When the verbose setting is on, then the
  timestamp is printed with each message.  If KTR_EXTEND is on, then the
  filename and line number are output as well.  By default this option is
  off.  It can be turned on with the 'v' modifier passed to the 'tbuf'
  and 'tall' commands.  For the 'tnext' command, the 'v' modifier toggles
  the verbose mode.
- Only display the cpu number for each message on SMP systems.
- Don't display anything for an empty entry that hasn't been used yet.
2000-12-15 00:01:20 +00:00
John Baldwin
562e4ffe86 - Add a new flag MTX_QUIET that can be passed to the various mtx_*
functions.  If this flag is set, then no KTR log messages are issued.
  This is useful for blocking excessive logging, such as with the internal
  mutex used by the witness code.
- Use MTX_QUIET on all of the mtx_enter/exit operations on the internal
  mutex used by the witness code.
- If we are in a panic, don't do witness checks in witness_enter(),
  witness_exit(), and witness_try_enter(), just return.
2000-12-13 21:53:42 +00:00
Dag-Erling Smørgrav
60ec413038 String buffer API 2000-12-13 19:51:07 +00:00
John Baldwin
05f9877c15 If we fail to emulate a vm86 trap in kernel mode, then we use
vm86_trap() to return to the calling program directly.  vm86_trap()
doesn't return, thus it was never returning to trap() to release
Giant.  Thus, release Giant before calling vm86_trap().
2000-12-13 18:57:15 +00:00
Kirk McKusick
0bf3b91d8a Use proper mutex locking when calling setrunnable from speedup_syncer().
Submitted by:	Tor.Egge@fast.no
2000-12-13 01:06:53 +00:00
Jake Burkholder
c0c2557090 - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead
of explicit calls to lockmgr.  Also provides macros for the flags
  passed to specify shared, exclusive or release which map to the
  lockmgr flags.  This is so that the use of lockmgr can be easily
  replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
2000-12-13 00:17:05 +00:00
Matt Jacob
1426b70df8 only include sys/proc.h once 2000-12-12 21:20:48 +00:00
David E. O'Brien
184265fd42 Include sys/proc.h so this compiles [on the Alpha]. 2000-12-12 21:18:13 +00:00
Matt Jacob
093d32e535 We reference curproc, ergo need <sys/proc.h> 2000-12-12 21:14:29 +00:00
Kirk McKusick
1f7d250182 Change the proc information returned from the kernel so that it
no longer contains kernel specific data structures, but rather
only scalar values and structures that are already part of the
kernel/user interface, specifically rusage and rtprio. It no
longer contains proc, session, pcred, ucred, procsig, vmspace,
pstats, mtx, sigiolst, klist, callout, pasleep, or mdproc. If
any of these changed in size, ps, w, fstat, gcore, systat, and
top would all stop working. The new structure has over 200 bytes
of unassigned space for future values to be added, yet is nearly
100 bytes smaller per entry than the structure that it replaced.
2000-12-12 07:25:57 +00:00
John Baldwin
06592dd188 - Convert the per-eventhandler list mutex to a lockmgr lock so that it can
be safely held across an eventhandler function call.
- Fix an instance of the head of an eventhandler list being read without
  the lock being held.
- Break down and use a SYSINIT at the new SI_SUB_EVENTHANDLER to initialize
  the eventhandler global mutex and the eventhandler list of lists rather
  than using a non-MP safe initialization during the first call to
  eventhandler_register().
- Add in a KASSERT() to eventhandler_register() to ensure that we don't try
  to register an eventhandler before things have been initialized.
2000-12-12 04:01:35 +00:00
Jake Burkholder
92cf772d8d - Add code to detect if a system call returns with locks other than Giant
held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
  panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
  a system call, which were missing for alpha and ia64.
2000-12-12 01:14:32 +00:00
John Baldwin
d664747bfa - Don't bother taking a trace message if we have panic'd since doing so
can lead to further panics.
- Call getnanotime() instead of nanotime() for the timestamp.  nanotime()
  is more precise, but it also calls into the timer code, which results
  in mutex operations on the i386 arch.  If KTR_LOCK is turned on, then
  ktr_tracepoint() recurses on itself until it exhausts the kernel stack.
  Eventually this should change to use get_cyclecount() instead, but that
  can't happen if get_cyclecount() is calling nanotime() instead of
  getnanotime().
2000-12-12 00:43:50 +00:00
John Baldwin
428b4b5562 Oops, the witness mutex is a spin lock, so use MTX_SPIN in the call to
mtx_init().  Since the witness code ignores its internal mutex, this
doesn't result in any functional change.
2000-12-12 00:37:18 +00:00
David E. O'Brien
1a37aa566b Add `_PATH_DEVZERO'.
Use _PATH_* where possible.
2000-12-09 09:35:55 +00:00
David Malone
7cc0979fd6 Convert more malloc+bzero to malloc+M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
Poul-Henning Kamp
959b7375ed Staticize some malloc M_ instances. 2000-12-08 20:09:00 +00:00
Poul-Henning Kamp
06b6617e0b Kill some bogus "register" keywords.
Go Ansi on the functions.
2000-12-08 06:57:39 +00:00
Matthew Dillon
a41ce5d30b Only call bwillwrite() for vnodes. Do not penalize devices or pipes. 2000-12-07 23:45:57 +00:00
Poul-Henning Kamp
5e1aea9fd7 Hide intrstate in the #ifdef where it belongs. 2000-12-07 22:38:22 +00:00
Matthew Dillon
9440653d07 Add necessary bwillwrite() in writev() entry point.
Deal with excessive dirty buffers when msync() syncs non-contiguous
dirty buffers by checking for the case in UFS *before* checking for
clusterability.
2000-12-06 20:55:09 +00:00
Peter Wemm
138e514cb5 Untangle vfsinit() a bit. Use separate sysinit functions rather than
having a super-function calling bits all over the place.
2000-12-06 07:09:08 +00:00
Peter Wemm
7ca7bbb36b Simplify this a bit so that it doesn't have to generate silly redundant
__P() prototypes when an ansi-style static inline is a prototype already.
Since vnode_if.[ch] are generated on the fly, there are no CVS diffs to
mess up.
2000-12-06 06:59:38 +00:00
Peter Wemm
4366ac52ad This is kind of a nasty hack, but it appears to solve the Compaq DL360
SMP problem.  Compaq, in their infinite wisdom, forgot to put the IO apic
intpin #0 connection to the 8259 PIC into the mptable.  This hack is to
look and see if intpin #0 has *no* table entry and adds a fake ExtInt
entry for the remap routines to use.  isa/clock.c will still test the
interrupts.  This entry is only ever used on an already broken system.
2000-12-06 03:47:14 +00:00
John Baldwin
960d3c68ed Pass RFSTOPPED to fork1() in kthread_create() to avoid a race condition
where fork1() could put the process on the run queue where it could be
snatched up by another CPU before kthread_create() had set the proper
fork handler.  Instead, we put the new kthread on the runqueue after its
fork handler has been set.

Noticed by:	jake
Looked over by:	peter
2000-12-06 03:45:15 +00:00
John Baldwin
7b29322c25 Add in #include of <sys/lock.h> since it was axed from <sys/proc.h>.
Noticed by:	Wesley Morgan <morganw@chemikals.org>
Pointy hat to:	me
2000-12-06 00:33:58 +00:00
Alfred Perlstein
89b54bffe9 Add forgotten SYSCALL_MODULE_HELPER() for msgsys() syscall.
Discovered by: Valentin Chopov <valentin@valcho.net>
2000-12-05 23:05:45 +00:00
Jake Burkholder
1eb44f0270 Remove the last of the MD netisr code. It is now all MI. Remove
spending, which was unused now that all software interrupts have
their own thread.  Make the legacy schednetisr use an atomic op
for setting bits in the netisr mask.

Reviewed by:	jhb
2000-12-05 00:36:00 +00:00
Peter Wemm
5ee171d264 Cleanup some leftover lint from the old interrupt system.
Also, while here, run up to 32 interrupt sources on APIC systems.
Normalize INTREN/INTRDIS so they are the same on both UP and SMP systems
rather than sometimes a macro, and sometimes a function.

Reviewed by:  jhb, jakeb
2000-12-04 21:15:14 +00:00
Jake Burkholder
8dd431fcf7 Whitespace. Fix indentation, align comments. 2000-12-04 10:23:29 +00:00
Jake Burkholder
f6a6e37a2c Whitespace. Fix an overly long line. 2000-12-04 09:52:39 +00:00
Jake Burkholder
85b039fe64 Remove if defined(tahoe) cobwebs. 2000-12-04 09:49:34 +00:00
David Greenman
8f9a5273a3 Changed second argument in a call to sf_buf_free() to be NULL instead of
PAGE_SIZE to match the prototype better. The argument is ignored, so this
is just to silence the compile-time warning.

Pointed out by:	jhb
2000-12-03 01:35:46 +00:00
John Baldwin
4971f62a86 - Add a mutex to the proc structure p_mtx that will be used to lock accesses
to each individual proc.
- Initialize the lock during fork1(), and destroy it in wait1().
2000-12-03 01:22:34 +00:00
Andrew Gallatin
19f085228f Correct int/long type mismatch in the proper place this time. freevnodes
and numvnodes are longs in the kernel.  They should remain longs in systat,
what really needs to change is that they should be using SYSCTL_LONG rather
than SYSCTL_INT.   I also changed wantfreevnodes to SYSCTL_LONG because I
happened to notice it.

I wish there was a way to find all of these automatically..

Pointed out by: bde
2000-12-02 20:08:33 +00:00
Jake Burkholder
a4bd171dbf Regen. 2000-12-02 05:45:32 +00:00
Jake Burkholder
86360fee54 Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup
from struct proc, which are now unused (p_nthread already was).
Remove process flag P_KTHREADP which was untested and only set
in vfs_aio.c (it should use kthread_create).  Move the yield
system call to kern_synch.c as kern_threads.c has been removed
completely.

moral support from:	alfred, jhb
2000-12-02 05:41:30 +00:00
John Baldwin
0ebabc93a4 Protect p_stat with sched_lock. 2000-12-02 01:32:51 +00:00
Bosko Milekic
794cd879fe Make sure to free the sf_buf if we've allocated it but fail to allocate
an mbuf (ENOBUFS) before returning so that we don't leak sf_bufs in
the case where we're out of mbufs.

Submitted by: David Greenman (dg)
2000-12-02 00:40:57 +00:00
John Baldwin
1c32c37c06 Protect p_stat with sched_lock. 2000-12-01 23:43:15 +00:00
John Baldwin
2925cbe569 Protect p_stat with sched_lock. 2000-12-01 16:59:02 +00:00
Alfred Perlstein
78525ce318 sysvipc loadable.
new syscall entry lkmressys - "reserved loadable syscall"

Make syscall_register allow overwriting of such entries (lkmressys).
2000-12-01 08:57:47 +00:00
Alfred Perlstein
3a4d365463 Add reserved lkmressys keyword. I swear, this script will die the
next time I need to hack on it.
2000-12-01 08:47:54 +00:00
Alfred Perlstein
1dc8643099 implement NOSTD syscall type, this creates the syscall args, but sticks
a lkmnosys into the sysent table so that SYSCALL_MODULE() works
2000-12-01 07:40:20 +00:00
Alfred Perlstein
c5a86b0ab9 Translate alfred to english.
Submitted by: bde
2000-12-01 06:59:18 +00:00
Jake Burkholder
1512b5d6ab Use an mp-safe callout for endtsleep. 2000-12-01 04:55:52 +00:00
John Baldwin
2191340786 Use msleep() instead of mtx_exit()/tsleep() so that we release the lock and
go to sleep as an "atomic" operation.
2000-12-01 03:43:33 +00:00
John Baldwin
472fd56ea5 Don't update p_stat in exit1() to SZOMB until after releasing the allproc
lock.  Otherwise, if we block on the backing mutex while releasing the
allproc lock, then when we resume, we will be at SRUN, and we will stay
that way all the way through cpu_exit.  As a result, our parent will never
harvest us.
2000-12-01 03:42:17 +00:00
Jake Burkholder
96fde7da19 Use msleep instead of mtx_exit; tsleep; mtx_enter, which is not safe. 2000-12-01 02:18:38 +00:00
John Baldwin
6936206ebd Split the WITNESS and MUTEX_DEBUG options apart so that WITNESS does not
depend on MUTEX_DEBUG.  The MUTEX_DEBUG option turns on extra assertions
and checks to verify that mutexes themselves are implemented properly.
The WITNESS option uses extra checks and diagnostics to verify that other
code is using mutexes properly.
2000-12-01 00:10:59 +00:00
Robert Watson
cf64863a1e o Add a comment to exec_check_permissions() to indicate that the
  passed vnode must be locked; this is the case because of calls
  to VOP_GETATTR(), VOP_ACCESS(), and VOP_OPEN().  This becomes
  more of an issue when VOP_ACCESS() gets a bit more complicated,
  which it does when you introduce ACL, Capability, and MAC
  support.

Obtained from:	TrustedBSD Project
2000-11-30 21:06:05 +00:00
Alfred Perlstein
c6ab5768aa only call bwillwrite() to stall on IO when dealing with VNODEs otherwise
we will stall on non-disk IO for things like fifos and sockets
2000-11-30 20:23:14 +00:00
Alfred Perlstein
237710275e This is a fix for a problem described in PR kern/19572. It was
    recently discussed at -hackers. The problem is a null-pointer
    dereference that happens in kern/vfs_lookup.c when accessing ".."
    with a v_mount entry for the current directory vnode of NULL. This
    happens when a volume is forcibly unmounted, and the vnode for a
    working directory in the mounted volume is cleared.

PR: 23191
Submitted by: Thomas Moestl <tmoestl@gmx.net>
2000-11-30 20:04:44 +00:00
Alfred Perlstein
1baf4aabbc use an opportunistic locking strategy with the uidinfo structures to avoid
locking the global hash on each uifree()

make struct uidinfo only visible to the kernel

make uihold() a function rather than a macro to reduce bloat

swap the order of a spl/mutex to maintain consistency
2000-11-30 19:15:22 +00:00
Alfred Perlstein
5c3f70d7c0 make crfree into a function rather than a macro to avoid bloat because of
the mutex acquire/release

reorder struct ucred
2000-11-30 19:09:48 +00:00
Kirk McKusick
6d984dfa6a Get rid of a bogus mtx_exit (it was attempting to release an
already released mutex).

Submitted by:	"Chris Knight" <chris@aims.com.au>
2000-11-30 19:09:29 +00:00
Marcel Moolenaar
d034d459da Don't use p->p_sigstk.ss_flags to keep state of whether the
process is on the alternate stack or not. For compatibility
with sigstack(2) state is being updated if such is needed.

We now determine whether the process is on the alternate
stack by looking at its stack pointer. This allows a process
to siglongjmp from a signal handler on the alternate stack
to the place of the sigsetjmp on the normal stack. When
maintaining state, this would have invalidated the state
information and caused a subsequent signal to be delivered
on the normal stack instead of the alternate stack.

PR: 22286
2000-11-30 05:23:49 +00:00
John Baldwin
1bd0eefb4c Fix up priority propagation:
- Use a better test for determining when a process is running.
- Convert some checks to assertions.
- Remove unnecessary tests.
- Save the priority before acquiring a mutex rather than in msleep(9).
2000-11-30 00:51:16 +00:00
John Baldwin
86327ad8a4 Set p_mtxname when blocking on a mutex and clear it when waking up. 2000-11-29 20:17:15 +00:00
John Baldwin
62ca2477d8 Save a copy of p_mtxname in e_mtxname when creating an eproc. 2000-11-29 20:14:50 +00:00
John Baldwin
f404050e44 Use an atomic operation with an appropriate memory barrier when releasing
a contested sleep mutex in the case that at least two processes are blocked
on the contested mutex.
2000-11-29 18:41:19 +00:00
John Baldwin
8f838cb563 The sched_lock mutex goes after the sio mutex in the locking order since
a software interrupt can be scheduled in the sio interrupt handler while
the sio mutex is held.
2000-11-29 18:38:14 +00:00
John Baldwin
bbc7a98a31 Save the line number and filename of the last mtx_enter operation for
spin locks.  We already do this for sleep locks.
2000-11-29 18:37:01 +00:00
John Baldwin
e2979dcc85 Don't drop Giant and the passed in mutex incorrectly in the
cold || panicstr case.  Do drop the passed in mutex in that case if
PDROP is specified.
2000-11-29 18:32:50 +00:00
John Baldwin
2bcc63c545 Only print out APIC info on an SMP system during a panic if APIC_IO is
defined.
2000-11-29 01:33:15 +00:00
John Baldwin
8d9888d37a Don't wait forever for CPUs to stop or restart. Instead, give up after a
timeout.  If DIAGNOSTIC is turned on, then display a message to the console
with a map of which CPUs failed to stop or restart.  This gives an SMP box
at least a fighting chance of getting into DDB if one of the other CPUs has
interrupts disabled.
2000-11-28 23:52:36 +00:00
Jordan K. Hubbard
7022a92395 Kernel support for erase2 character.
Submitted by:	Rui Pedro Mendes Salgueiro <rps@mat.uc.pt>
2000-11-28 20:03:23 +00:00
Matthew N. Dodd
46aa504e42 Alter the return value and arguments of the GET_RESOURCE_LIST bus method.
Alter consumers of this method to conform to the new convention.
Minor cosmetic adjustments to bus.h.

This isn't of concern as this interface isn't in use yet.
2000-11-28 06:49:15 +00:00
Jake Burkholder
4f55983606 Use callout_reset instead of timeout(9). Most callouts are statically
allocated, 2 have been added to struct proc for setitimer and sleep.

Reviewed by:	jhb, jlemon
2000-11-27 22:52:31 +00:00
John Baldwin
91b7c97713 Drop Giant around the mi_switch() call in yield().
Submitted by:	tegge
2000-11-27 18:48:13 +00:00
Alfred Perlstein
1e5d626ad9 ucred system overhaul:
1) mpsafe (protect the refcount with a mutex).
2) reduce duplicated code by removing the inlined crdup() from crcopy()
   and make crcopy() call crdup().
3) use M_ZERO flag when allocating initial structs instead of calling bzero
   after allocation.
4) expand the size of the refcount from a u_short to a u_int; by using
   shorts we might have an overflow.

Glanced at by: jake
2000-11-27 00:09:16 +00:00
Alfred Perlstein
0931dcefb3 Move the #define of _KERN_MUTEX_C_ so that it's before any system headers
are included.  System headers can include sys/mutex.h and then certain
macros do not get defined.

Reviewed by: jake
2000-11-26 21:14:17 +00:00
Poul-Henning Kamp
a52585d77e Simplify the tprintf() API.
Lose the special <sys/tprintf.h> #include file.
2000-11-26 20:35:21 +00:00
Poul-Henning Kamp
4d88c4598f Make log(-1, ...) do what addlog(...) did.
Replace all uses of addlog(...) with log(-1, ...)

Remove bogus "register" keywords in subr_prf.c

Make log() return void.
2000-11-26 19:34:06 +00:00
Poul-Henning Kamp
cb7e609a3c Make diskerr() always log with printf. 2000-11-26 19:29:15 +00:00
Jake Burkholder
a5d5c61c12 Add uidinfo hash and uidinfo struct to the witness order list. 2000-11-26 15:05:46 +00:00
Alfred Perlstein
9c19bcddf0 Make uidinfo subsystem mpsafe
use a mutex lock when looking up/deleting entries on the hashlist
use a mutex lock on each uidinfo when updating fields

make uifree() a void function rather than 'int' since no one cares

allocate uidinfo structs with the M_ZERO flag and don't explicitly initialize
them

Assisted by: eivind, jhb, jakeb
2000-11-26 12:08:17 +00:00
Jonathan Lemon
e82ac18e52 Revert the last commit to the callout interface, and add a flag to
callout_init() indicating whether the callout is safe or not.  Update
the callers of callout_init() to reflect the new interface.

Okayed by: Jake
2000-11-25 06:22:16 +00:00
Jake Burkholder
249849e0b9 - Rename callout_reset to _callout_reset and add a flags argument.
- Add macros callout_reset, which does the obvious, and
  mp_callout_reset, which passes the CALLOUT_MPSAFE flag.
2000-11-25 03:34:49 +00:00
Jake Burkholder
553629ebc9 Protect the following with a lockmgr lock:
	allproc
	zombproc
	pidhashtbl
	proc.p_list
	proc.p_hash
	nextpid

Reviewed by:	jhb
Obtained from:	BSD/OS and NetBSD
2000-11-22 07:42:04 +00:00
John Baldwin
0959cc6680 Ahem, fix the disclaimer portion of the copyright so it disclaims the
voices in my head.  You can sue the voices in Bill Paul's head all you
want.

Noticed by:	jhb
2000-11-21 21:10:15 +00:00
Jonathan Lemon
4a476efa51 Protect p_wchan with sched_lock in selwakeup(). 2000-11-21 20:22:34 +00:00
Alan Cox
c6fa9f78d2 Provide a new interface for the user of aio_read() and aio_write() to request
a kevent upon completion of the I/O.  Specifically, introduce a new type
of sigevent notification, SIGEV_EVENT.  If sigev_notify is SIGEV_EVENT,
then sigev_notify_kqueue names the kqueue that should receive the event
and sigev_value contains the "void *" that is copied into the kevent's udata
field.

In contrast to the existing interface, this one: 1) works on
the Alpha 2) avoids the extra copyin() call for the kevent because all
of the information needed is in the sigevent and 3) could be
applied to request a single kevent upon completion of an entire lio_listio().

Reviewed by:	jlemon
2000-11-21 19:36:36 +00:00
Alfred Perlstein
830fedd28f Accept filters broke kernels compiled without options INET.
Make accept filters conditional on INET support to fix.

Pointed out by: bde
Tested and assisted by: Stephen J. Kiernan <sab@vegamuse.org>
2000-11-20 01:35:25 +00:00
Robert Watson
7f112b0489 o Export cp_time ("CPU time statistics") using SYSCTL_OPAQUE.
  This removes a reason that systat requires setgid kmem.  More to
  come.
2000-11-20 00:44:58 +00:00
Robert Watson
aa5429970c o Export nchstats ("VFS cache effectiveness statistics") using
  SYSCTL_OPAQUE.  This removes a reason that systat requires
  setgid kmem.  More to come.
2000-11-20 00:41:11 +00:00
David Malone
32af0d74f0 Make sbcompress use the new M_WRITABLE macro. Previously sbcompress
could not compress into clusters. This could result in lots of
wasted clusters while receiving small packets from an interface
that uses clusters for all its packets.

Patch is partially from BSDi (limiting the size of the copy) and
based on a patch for 4.1 by Ian Dowse <iedowse@maths.tcd.ie> and
myself.

Reviewed by:	bmilekic
Obtained From:	BSDi
Submitted by:	iedowse
2000-11-19 22:22:47 +00:00
Jake Burkholder
fa2fbc3dac - Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
  softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
  CALLOUT_MPSAFE, which specifies that a callout handler does not
  need giant.  There is still no way to set this flag when
  registering a callout.

Reviewed by:	-smp@, jlemon
2000-11-19 06:02:32 +00:00
Matthew Dillon
936524aa02 Implement a low-memory deadlock solution.
    Removed most of the hacks that were trying to deal with low-memory
    situations prior to now.

    The new code is based on the concept that I/O must be able to function in
    a low memory situation.  All major modules related to I/O (except
    networking) have been adjusted to allow allocation out of the system
    reserve memory pool.  These modules now detect a low memory situation but
    rather than block they instead continue to operate, then return resources
    to the memory pool instead of caching them or leaving them wired.

    Code has been added to stall in a low-memory situation prior to a vnode
    being locked.

    Thus situations where a process blocks in a low-memory condition while
    holding a locked vnode have been reduced to near nothing.  Not only will
    I/O continue to operate, but many prior deadlock conditions simply no
    longer exist.

Implement a number of VFS/BIO fixes

	(found by Ian): in biodone(), bogus-page replacement code, the loop
        was not properly incrementing loop variables prior to a continue
        statement.  We do not believe this code can be hit anyway but we
        aren't taking any chances.  We'll turn the whole section into a
        panic (as it already is in brelse()) after the release is rolled.

	In biodone(), the foff calculation was incorrectly
        clamped to the iosize, causing the wrong foff to be calculated
        for pages in the case of an I/O error or biodone() called without
        initiating I/O.  The problem always caused a panic before.  Now it
        doesn't.  The problem is mainly an issue with NFS.

	Fixed casts for ~PAGE_MASK.  This code worked properly before only
        because the calculations use signed arithmetic.  Better to properly
        extend PAGE_MASK first before inverting it for the 64 bit masking
        op.

	In brelse(), the bogus_page fixup code was improperly throwing
        away the original contents of 'm' when it did the j-loop to
        fix the bogus pages.  The result was that it would potentially
        invalidate parts of the *WRONG* page(!), leading to corruption.

	There may still be cases where a background bitmap write is
        being duplicated, causing potential corruption.  We have identified
        a potentially serious bug related to this but the fix is still TBD.
        So instead this patch contains a KASSERT to detect the problem
  	and panic the machine rather than continue to corrupt the filesystem.
	The problem does not occur very often..  it is very hard to
	reproduce, and it may or may not be the cause of the corruption
	people have reported.

Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>)
Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
2000-11-18 23:06:26 +00:00
Matthew Dillon
279d722604 This patchset fixes a large number of file descriptor race conditions.
    Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
John Baldwin
b6b55e27a4 Release sched_lock very briefly to give interrupts a chance to fire if we
are in softclock() for a long time.  The old code already did an
splx()/splhigh() pair here, I just missed adding in the equivalent mutex
operations on sched_lock earlier.
2000-11-18 00:21:00 +00:00
Tor Egge
e5c5b82950 Don't attempt to cluster write buffers where the VMIO flag isn't set. 2000-11-17 23:40:08 +00:00
Jake Burkholder
7da6f97772 - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
John Baldwin
cb799bfef9 The recent changes to msleep() and mawait() resulted in timeout() and
untimeout() not being called with Giant in those functions.  For now,
use the sched_lock to protect the callout wheel in softclock() and in
the various timeout and callout functions.

Noticed by:	tegge
2000-11-16 21:20:52 +00:00
John Baldwin
20cdcc5b73 Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
John Baldwin
92c79c7e3e Argh, add in a missing release of the sched_lock. 2000-11-16 01:16:54 +00:00
John Baldwin
95de685572 CURSIG() calls functions that acquire sleep mutexes, so it is not a good
idea to be holding the sched_lock while we are calling it.  As such,
release sched_lock before calling CURSIG() in msleep() and mawait() and
reacquire it after CURSIG() returns.

Submitted by:	witness
2000-11-16 01:07:19 +00:00
John Baldwin
b84988521c - Rename await() to mawait(). mawait() is to await() as msleep() is to
  tsleep().  Namely, mawait() takes an extra argument which is a mutex
  to drop when going to sleep.  Just as with msleep(), if the priority
  argument includes the PDROP flag, then the mutex will be dropped and will
  not be reacquired when the process wakes up.
- Add in a backwards compatible macro await() that passes in NULL as the
  mutex argument to mawait().
2000-11-15 22:39:35 +00:00
John Baldwin
3ae4dd935b - Replace a KASSERT() that knew too much about mutex internals with a
  mtx_assert() that ensures the mutex we release during msleep() is both
  not recursed and owned by the current process.
2000-11-15 22:30:48 +00:00
John Baldwin
f33a072eb9 - Convert references from tsleep() -> msleep()
- Fix a buglet in a comment above await()
2000-11-15 22:27:38 +00:00
John Baldwin
9c36c934a1 Include the right headers to get the DDB #define and the db_active variable. 2000-11-15 22:08:16 +00:00
John Baldwin
896c2303d4 - Replace some instances of sched_ithd with sched_swi in KTR tracepoints.
- Assert that Giant is not owned during the main loop of sithd_loop().
2000-11-15 22:05:23 +00:00
John Baldwin
59f857e4ea Declare the 'witness_spin_check' properly as a per-CPU variable in the
non-SMP case.
2000-11-15 22:02:05 +00:00
John Baldwin
ecbd8e3710 Don't perform witness checks in witness_enter() during a panic. 2000-11-15 22:00:31 +00:00
John Baldwin
22f1b34223 Make ktr_verbose a bit more useful:
- On SMP systems display the cpu number with each message
- If ktr_verbose > 1, then include the filename and line number with each
  trace message
2000-11-15 21:51:53 +00:00
Kirk McKusick
324d6bacc3 Bug fix for revision 1.14 on the replacement of CIRCLEQ with TAILQ.
Submitted by:	Warner Losh <imp@village.org>
2000-11-15 20:07:16 +00:00
Kirk McKusick
a077f63555 In preparation for deprecating CIRCLEQ macros in favor of TAILQ
macros which provide the same functionality and are a bit more
efficient, convert use of CIRCLEQ's in resource manager to TAILQ's.

Approved by:	Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
2000-11-14 20:46:02 +00:00
David Greenman
866746b6a6 Fixed a certain panic on IO error in sendfile(): Page must be set PG_BUSY
before calling vm_page_free() on it.
2000-11-12 14:51:15 +00:00
Bosko Milekic
e778918123 * Have m_pulldown() use the new M_WRITABLE() macro in order to determine
whether the given ext_buf is shared.

* Have the sf_bufs be setup with the mbuf subsystem using MEXTADD() with the
two new arguments.

Note: m_pulldown() is somewhat crotchy; the added comment explains the
situation.

Reviewed by: jlemon
2000-11-11 23:04:15 +00:00
Robert Watson
7f73938e96 o Fix a mis-transcription of sef's -STABLE protection fixes--only root
  could debug processes after the commit that introduced the typo.
  Security is good, but security is not always the same as turning things
  off :-).

PR:		kern/22711
Obtained from:	brooks@one-eyed-alien.net
2000-11-10 23:57:48 +00:00
John Baldwin
20af769e69 Don't overwrite the filename for KTR_EXTEND with "../../kern/kern_ktr.c". 2000-11-10 22:30:44 +00:00
John Baldwin
9842fc8dda Axe some unused variables. 2000-11-10 21:54:19 +00:00
John Baldwin
bf619f9506 Fix SMP kernel compiles by #include'ing machine/globals.h to get the
cpuid variable.
2000-11-10 21:52:04 +00:00
John Baldwin
0fe4e534b1 Minor whitespace nit in a comment. 2000-11-10 21:21:20 +00:00
John Baldwin
b5d09a79b5 Ignore the INTR_MPSAFE flag when calculating the priority of an interrupt
thread.
2000-11-10 21:19:14 +00:00
Mike Smith
edcb5775ec Implement a trivial but effective interface for obtaining the kernel's
device tree and resource manager contents.  This is the kernel side of
the upcoming libdevinfo, which will expose this information to userspace
applications in a trivial fashion.

Remove the now-obsolete DEVICE_SYSCTLS code.
2000-11-09 10:21:23 +00:00
Marcel Moolenaar
806d7daafe Make MINSIGSTKSZ machine dependent, and have the sigaltstack
syscall compare against a variable sv_minsigstksz in struct
sysentvec so as to properly take the size of the machine- and
ABI dependent struct sigframe into account.

The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.

The native MINSIGSTKSZ is now defined as follows:

Arch		MINSIGSTKSZ
----		-----------
alpha		    4096
i386		    2048
ia64		   12288

Reviewed by: mjacob
Suggested by: bde
2000-11-09 08:25:48 +00:00
John Baldwin
d8f03321bd - Remove much of the inlining of the KTR tracepoints into a ktr_tracepoint()
  function declared in kern_ktr.c.  The only inline checks left are the
  checks that compare KTR_COMPILE with the supplied mask and thus should
  be optimized away into either nothing or a direct call to ktr_tracepoint().
- Move several KTR-related options to opt_ktr.h now that they are only
  needed by kern_ktr.c and not by ktr.h.
- Add in the ktr_verbose functionality if KTR_EXTEND is turned on.  If the
  global variable 'ktr_verbose' is non-zero, then KTR messages will be
  dumped to the console.  This variable can be set by either kernel code
  or via the 'debug.ktr_verbose' sysctl.  It defaults to off unless the
  KTR_VERBOSE kernel option is specified in which case it defaults to on.
  This can be useful when the machine locks up spinning in a loop with
  interrupts disabled as you might be able to see what it is doing when it
  locks up.

Requested by:	phk
2000-11-07 01:49:48 +00:00
John Baldwin
a924ab9741 Minor nit: missed ithd_loop -> sithd_loop in the KTR tracepoints. 2000-11-07 00:45:18 +00:00
David E. O'Brien
00910f2882 ELF kernels should use an ELF sysvec. This allows us to move a.out
specific files to those platforms that actually support a.out.
2000-11-05 10:41:35 +00:00
Bosko Milekic
fe27eea9d1 Change the sf_bufs wakeups to be wakeup_one(), because we don't want to
wake up all of the sleeping threads when we free only one buffer. This
avoids us having to needlessly try again (and fail, and go back to
sleep) for all the threads sleeping. We will now only wake up the
thread we know will succeed.

Reviewed by: green
2000-11-04 21:55:25 +00:00