freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-24 11:29:10 +00:00

Author	SHA1	Message	Date
Jonathan Mini	d8f4f6a404	Remove trace_req(). Reviewed by: alfred, jhb, peter	2002-05-09 04:13:41 +00:00
Alfred Perlstein	8b43b53530	expand_name fixes: .) don't use MAXPATHLEN + 1, fix logic to compensate. .) style(9) function parameters. .) fix line wrapping. .) remove duplicated error and string handling code. .) don't NUL terminate already NUL terminated string. .) all string length variables changed from int to size_t. .) constify variables. .) catch when corename would be truncated. .) cast pid_t and uid_t args for format string. .) add parens around return arguments. Help and suggestions from: bde	2002-05-08 09:06:47 +00:00
Alfred Perlstein	b2bc3101a8	M_ZERO the temp buffer in expand_name() otherwise if an error occurs while logging we may pass a non NUL terminated string to log(9) for a %s format arg.	2002-05-07 23:37:07 +00:00
Bruce Evans	f5216b9a19	Return the correct error code (ENOSYS, not EINVAL) from nosys(). Getting killed by SIGSYS for unimplemented syscalls is bad enough. Obtained from: Lite2 branch. The Lite2 branch has some other interesting unmerged (?) bits in this file. They are well hidden among cosmetic regressions.	2002-05-05 04:50:47 +00:00
John Baldwin	9b3b1c5fdf	- Reorder execve() so that it performs blocking operations before it locks the process. - Defer other blocking operations such as vrele()'s until after we release locks. - execsigs() now requires the proc lock to be held when it is called rather than locking the process internally.	2002-05-02 15:00:14 +00:00
Alfred Perlstein	f132072368	Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.	2002-05-01 20:44:46 +00:00
Ian Dowse	ba1551ca81	Avoid the user-visible effect of setting SA_NOCLDWAIT when the SIGCHLD handler is SIG_IGN. This is a reimplementation of the problematic revision 1.131 of kern_exit.c. To avoid accessing process UPAGES, we set a new procsig flag when the SIGCHLD handler is SIG_IGN and use that instead.	2002-04-27 22:41:41 +00:00
John Baldwin	ba626c1db2	Lock proctree_lock instead of pgrpsess_lock.	2002-04-16 17:11:34 +00:00
John Baldwin	9c1ab3e04a	- Change killpg1()'s first argument to be a thread instead of a process so we can use td_ucred. - In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB or not. We don't need sched_lock. - Close some races in psignal(). In psignal() there is a big switch statement based on p_stat. All the different cases are assuming that the process (or thread) isn't going to change state out from under it. To ensure this is true, just lock sched_lock for the entire switch. We practically held it the entire time already anyways. This also simplifies the locking somewhat and actually results in fewer lock operations. - Allow signotify() to be called with the sched_lock held since psignal() now does that. - Use td_ucred in a couple of places.	2002-04-13 23:33:36 +00:00
Bruce Evans	79065dba2a	Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask. Avoid locking in userret() in most of the remaining cases. Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon	2002-04-04 17:49:48 +00:00
Bruce Evans	179235b38b	Optimized the check for unmasked pending signals in CURSIG() using a new inline function sigsetmasked() and a new macro SIGPENDING(). CURSIG() will soon be moved out of the normal path of execution for syscalls and traps. Then its efficiency will be less important but the new interfaces will be useful for checking for unmasked pending signals in more places. Submitted by: luoqi (long ago, in a slightly different form) Assert that sched_lock is not held in CURSIG().	2002-04-04 15:19:41 +00:00
Bruce Evans	70f52b4845	Fixed some style bugs in the removal of __P(()). The main ones were not removing tabs before "__P((", and not outdenting continuation lines to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting and/or rewrap the whole prototype in some cases.	2002-03-24 05:09:11 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
Poul-Henning Kamp	0f5c7c4b1c	Fix warning in !SMP case. Submitted by: Maxime Henrion <mux@mu.org>	2002-02-26 09:21:52 +00:00
Seigo Tanimura	f591779bb5	Lock struct pgrp, session and sigio. New locks are: - pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members. Please refer to sys/proc.h for the coverage of these locks. Changes on the pgrp/session interface: - pgfind() needs the pgrpsess_lock held. - The caller of enterpgrp() is responsible to allocate a new pgrp and session. - Call enterthispgrp() in order to enter an existing pgrp. - pgsignal() requires a pgrp lock held. Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)	2002-02-23 11:12:57 +00:00
Bruce Evans	8c3d74f4bf	Fixed a typo in rev.1.65 that gave a reference to a nonexistent variable. This was not detected by LINT because LINT is missing COMPAT_SUNOS.	2002-02-15 03:54:01 +00:00
Julian Elischer	2c1007663f	In a threaded world, different priorities become properties of different entities. Make it so. Reviewed by: jhb@freebsd.org (john baldwin)	2002-02-11 20:37:54 +00:00
Robert Watson	5da271f5a6	Add a comment indicating that VOP_GETATTR() is called without appropriate locking in the core dump code. This should be fixed.	2002-02-10 21:45:16 +00:00
Julian Elischer	079b7badea	Pre-KSE/M3 commit. This is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there, but this removes all the code that assumes that, in preparation for the next commit, which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice	2002-02-07 20:58:47 +00:00
Robert Watson	2b87b6d4f4	o Revert kern_sig.c#1.143, as cr_cansignal() doesn't currently permit a number of desirable cases in which SIGIO/SIGURG are delivered. We'll keep tweaking. Reported by: Alexander Kabaev <ak03@gte.com>	2002-01-10 01:25:35 +00:00
Robert Watson	f8efde8991	- Teach SIGIO code to use cr_cansignal() instead of a custom CANSIGIO() macro. As a result, mandatory signal delivery policies will be applied consistently across the kernel. - Note that this subtly changes the protection semantics, and we should watch out for any resulting breakage. Previously, delivery of SIGIO in this circumstance was limited to situations where the subject was privileged, or where one of the subject's (ruid, euid) matched one of the object's (ruid, euid). In the new scenario, subject (ruid, euid) are matched against the object's (ruid, svuid), and the object uid's must be a subset of the subject uid's. Likewise, jail now affects delivery, and special handling for P_SUGID of the object is present. This change can always be reversed or tweaked if it proves to disrupt application behavior substantially. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-01-06 00:54:46 +00:00
John Baldwin	c86b6ff551	Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed: The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex release and swi schedule, respectively, when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.) I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly. Reviewed by: peter Tested on: i386, alpha	2002-01-05 08:47:13 +00:00
Robert Watson	48f1ba5b0d	o Wording fix in comment. Submitted by: tanimura via p4	2001-12-14 00:38:01 +00:00
Peter Wemm	6c1534a73e	_SIG_MAXSIG (128) is the highest legal signal. The arrays are offset by one - see _SIG_IDX(). Revert part of my mis-correction in kern_sig.c (but signal 0 still has to be allowed) and fix _SIG_VALID() (it was rejecting signal 128).	2001-11-03 13:26:15 +00:00
Peter Wemm	049954de94	Partial reversion of rev 1.138. kill and killpg allow a signal argument of 0. You cannot return EINVAL for signal 0. This broke (in 5 minutes of testing) at least ssh-agent and screen. However, there was a bug in the original code. Signal 128 is not valid. Pointy-hat to: des, jhb	2001-11-03 12:36:16 +00:00
Dag-Erling Smørgrav	2899d60638	We have a _SIG_VALID() macro, so use it instead of duplicating the test all over the place. Also replace a printf() + panic() with a KASSERT(). Reviewed by: jhb	2001-11-02 23:50:00 +00:00
Ian Dowse	80f42b555d	Fix a typo in do_sigaction() where sa_sigaction and sa_handler were confused. Since sa_sigaction and sa_handler alias each other in a union, the bug was completely harmless. This had been fixed as part of the SIGCHLD changes in revision 1.125, but it was reverted when they were backed out in revision 1.126.	2001-10-07 16:11:37 +00:00
Paul Saab	88b1d98f31	Lock the vnode while truncating the corefile. This fixes a panic with softupdates dangling deps. Submitted by: peter MFC: ASAP :)	2001-09-26 01:24:07 +00:00
Julian Elischer	fdd4e5c652	Replace line accidentally deleted during KSE additions. Symptom.. Stopped program unable to be restarted if it was stopped while already sleeping.	2001-09-17 20:42:25 +00:00
Robert Watson	9844fbc3b5	o Correct authorization check in CANSIGIO(), which suffered from incorrect transcription during the (pcred,ucred) merge; this was not used for the kill() system call, so does not affect direct explicit process signalling. Pointed out by: fenner	2001-09-15 22:34:46 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Matthew Dillon	06ae1e91c4	This brings in a Yahoo coredump patch from Paul, with additional mods by me (addition of vn_rdwr_inchunks). The problem Yahoo is solving is that if you have large process images core dumping, or you have a large number of forked processes all core dumping at the same time, the original coredump code would leave the vnode locked throughout. This can cause the directory vnode to get locked up, which can cause the parent directory vnode to get locked up, and so on all the way to the root node, locking the entire machine up for extremely long periods of time. This patch solves the problem in two ways. First it uses an advisory non-blocking lock to abort multiple processes trying to core to the same file. Second (my contribution) it chunks up the writes and uses bwillwrite() to avoid holding the vnode locked while blocking in the buffer cache. Submitted by: ps Reviewed by: dillon MFC after: 2 weeks	2001-09-08 20:02:33 +00:00
John Baldwin	df53e91c18	Call sendsig() with the proc lock held and return with it held.	2001-09-06 22:20:41 +00:00
Matthew Dillon	fb99ab8811	Giant Pushdown clock_gettime() clock_settime() nanosleep() settimeofday() adjtime() getitimer() setitimer() __sysctl() ogetkerninfo() sigaction() osigaction() sigpending() osigpending() osigvec() osigblock() osigsetmask() sigsuspend() osigsuspend() osigstack() sigaltstack() kill() okillpg() trapsignal() nosys()	2001-09-01 18:19:21 +00:00
Matthew Dillon	356861db03	Remove the MPSAFE keyword from the parser for syscalls.master. Instead introduce the [M] prefix to existing keywords. e.g. MSTD is the MP SAFE version of STD. This is preparatory for a massive Giant lock pushdown. The old MPSAFE keyword made syscalls.master too messy. Begin comments for MP-Safe procedures with the comment: /* * MPSAFE */ This comment means that the procedure may be called without Giant held (the procedure itself may still need to obtain Giant temporarily to do its thing). sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE. sv_transtrap() is now MP SAFE and assumed to be MP SAFE. ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown). trapsignal() is now MP SAFE (Giant Pushdown). Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant) test in syscall[2]() in /*/trap.c now do not. Instead they explicitly unlock Giant if they previously obtained it, and then assert that it is no longer held to catch broken system calls. Rebuild syscall tables.	2001-08-30 18:50:57 +00:00
Peter Pentchev	ccdbd10cb7	Prevent passing a null pointer as a filename to vn_open(), if for some reason expand_name() failed to build a core file name. PR: 29931 Submitted by: Foldi Tamas <crow@kapu.hu> Reviewed by: dd, -arch MFC after: 1 month	2001-08-24 15:49:30 +00:00
Peter Wemm	e8ebc08f80	Make COMPAT_43 optional again. XXX we need COMPAT_FBSD3 etc for this stuff.	2001-08-21 02:32:59 +00:00
Peter Wemm	aa7a4dae6d	Temporarily back out kern_sig.c rev 1.125 and kern_exit.c rev 1.131. This panicked one of my machines one time too many :-( and there is no sign of a solution in the pipeline. The deltas are still easily available in cvs. The problem is that if the parent has been swapped out, the child process cannot grope around in the parent's UPAGES to see the sigact[] array or it will fault. This probably is a showstopper for this implementation anyway.	2001-08-01 20:35:24 +00:00
Matthew Dillon	4fec48c6fe	As per further discussions on hackers redo the SIGCHLD patch to not generate an unexpected user-visible side effect with the sigaction flags. Also cleanup a minor union issue. Submitted by: Rudolf Cejka <cejkar@dcse.fee.vutbr.cz> MFC addendum: MFC will be combined w/ original commit MFC after: 3 days	2001-07-22 18:47:31 +00:00
John Baldwin	64acb05b1c	Grab Giant around postsig() since sendsig() can call into the vm to grow the stack and we already needed Giant for KTRACE.	2001-07-03 05:27:53 +00:00
John Baldwin	2ad7d3049a	- Change CURSIG() and postsig() to require that the proc lock is held rather than grabbing it and releasing it themselves. This allows callers of these functions to get the lock to close race conditions. - Grab Giant around ktrace in postsig. - Count the switches performed on SIGSTOP's as involuntary context switches in the resource usage stats. Reported by: tegge (signal race), bde (missing csw stats)	2001-06-22 23:02:37 +00:00
John Baldwin	6fad32afc9	Lock Giant in postsig() for the KTRACE case as ktrpsig() needs Giant when it writes out to the trace file. Reported by: peter, gallatin, and others	2001-06-18 19:23:43 +00:00
David Malone	c7fd62da6c	Try to make the setting of the SIGCHLD handler the same as setting of the NOCLDWAIT flag. Susv2 seems to require this. Submitted by: Cejka Rudolf <cejkar@dcse.fee.vutbr.cz> Reviewed by: dillon	2001-06-11 09:15:41 +00:00
Robert Watson	b1fc0ec1a7	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing the original macro that pointed p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restructure many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparison. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
John Baldwin	9081e5e826	- Remove unneeded include of sys/ipl.h. - Require the proc lock be held for killproc() to allow for the vmdaemon to kill a process when memory is exhausted while holding the lock of the process to kill.	2001-05-15 23:13:58 +00:00
Akinori MUSHA	3b26be6ae1	Properly copy the P_ALTSTACK flag in struct proc::p_flag to the child process on fork(2). It is the supposed behavior stated in the manpage of sigaction(2), and Solaris, NetBSD and FreeBSD 3-STABLE correctly do so. The previous fix against libc_r/uthread/uthread_fork.c fixed the problem only for the programs linked with libc_r, so back it out and fix fork(2) itself to help those not linked with libc_r as well. PR: kern/26705 Submitted by: KUROSAWA Takahiro <fwkg7679@mb.infoweb.ne.jp> Tested by: knu, GOTOU Yuuzou <gotoyuzo@notwork.org>, and some other people Not objected by: hackers MFC in: 3 days	2001-05-07 18:07:29 +00:00
John Baldwin	6caa8a1501	Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP. - It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the _process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the _process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delivered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the _process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwarded_hardclock/statclock which then call the _process() functions. - forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around. - need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above. - Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h. - The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in. Reviewed by: jake, peter Looked over by: eivind	2001-04-27 19:28:25 +00:00
John Baldwin	33a9ed9d0e	Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake	2001-04-24 00:51:53 +00:00
Robert Watson	4c5eb9c397	o Replace p_cankill() with p_cansignal(), remove wrappage of p_can() from signal authorization checking. o p_cansignal() takes three arguments: subject process, object process, and signal number, unlike p_cankill(), which only took into account the processes and not the signal number, improving the abstraction such that CANSIGNAL() from kern_sig.c can now also be eliminated; previously CANSIGNAL() special-cased the handling of SIGCONT based on process session. privused is now deprecated. o The new p_cansignal() further limits the set of signals that may be delivered to processes with P_SUGID set, and restructures the access control check to allow it to be extended more easily. o These changes take into account work done by the OpenBSD Project, as well as by Robert Watson and Thomas Moestl on the TrustedBSD Project. Obtained from: TrustedBSD Project	2001-04-12 02:38:08 +00:00
John Baldwin	5b3047d59f	Change stop() to require the sched_lock as well as p's process lock to avoid silly lock contention on sched_lock since in 2 out of the 3 places that we call stop(), we get sched_lock right after calling it and we were locking sched_lock inside of stop() anyways.	2001-04-03 01:39:23 +00:00
John Baldwin	1333047621	- Move the second stop() of process 'p' in issignal() to be after we send SIGCHLD to our parent process. Otherwise, we could block while obtaining the process lock for our parent process and switch out while we were in SSTOP. Even worse, when we try to resume from the mutex being blocked on our p_stat will be SRUN, not SSTOP. - Fix a comment above stop() to indicate that it requires that the proc lock be held, not a proctree lock. Reported by: markm Sleuthing by: jake	2001-04-02 17:26:51 +00:00
John Baldwin	1005a129e5	Convert the allproc and proctree locks from lockmgr locks to sx locks.	2001-03-28 11:52:56 +00:00
John Baldwin	c31146a14e	- Resort some includes to deal with the new witness code coming in shortly. - Make sure we have Giant locked before calling coredump() in sigexit(). Spotted by: peter (2)	2001-03-28 08:41:04 +00:00
John Baldwin	628d2653d6	- Proc locking. Most of signal handling is now MP safe and doesn't require Giant. The only exception is the CANSIGNAL() macro. Unlocking the proc lock around sendsig() in trapsignal() is also questionable. Note that the functions sigexit(), psignal(), and issignal() must be called with the proc lock of the process in question held. postsig() and trapsignal() should not be called with the proc lock held, but they also do not require Giant anymore either. - Remove spl's that are now no longer needed as they are fully replaced.	2001-03-07 02:59:54 +00:00
Bruce Evans	d2ef4060d7	Fixed a longstanding latency bug in signal delivery. When a signal is sent to a process, psignal() needs to schedule an AST for the process if the process is runnable, not just if it is current, so that pending signals get checked for on the next return of the process to user mode. This wasn't practical until recently because the AST flag was per-cpu so setting it for a non-current process would usually just cause a bogus AST for the current process. For non-current processes looping in user mode, it took accidental (?) magic to deliver signals at all. Signals were usually delivered late as a side effect of rescheduling (need_resched() sets astpending, etc.). In pre-SMPng, delivery was delayed by at most 1 quantum (the need_resched() call in roundrobin() is certain to occur within 1 quantum for looping processes). In -current, things are complicated by normal interrupt handlers being threads. Missing handling of the complications makes roundrobin() a bogus no-op, but preemptive scheduling sort of works anyway due to even larger bogons elsewhere.	2001-02-19 09:40:58 +00:00
Jake Burkholder	d5a08a6065	Implement a unified run queue and adjust priority levels accordingly. - All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propagated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propagation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propagate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propagate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at appropriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.	2001-02-12 00:20:08 +00:00
John Baldwin	142ba5f3d7	- Make astpending and need_resched process attributes rather than CPU attributes. This is needed for AST's to be properly posted in a preemptive kernel. They are backed by two new flags in p_sflag: PS_ASTPENDING and PS_NEEDRESCHED. They are still accessed by their old macros: aston(), astoff(), etc. For completeness, an astpending() macro has been added to check for a pending AST, and clear_resched() has been added to clear need_resched(). - Rename syscall2() on the x86 back to syscall() to be consistent with other architectures.	2001-02-10 02:20:34 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarly, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
John Baldwin	40447cd4aa	- Proc locking. - Catch up to proc flag changes.	2001-01-24 11:08:02 +00:00
John Baldwin	568ae39fd5	Revert revision 1.102. I don't think p_nice needs to be protected with sched_lock, and I'm fairly certain P_TRACED will be protected with the proc lock instead. Pointed out indirectly by: bde	2001-01-19 08:23:22 +00:00
Jason Evans	238510fc46	Implement condition variables.	2001-01-16 01:00:43 +00:00
John Baldwin	5192404af2	Protect p_nice and P_TRACED in psignal() above the switch statement with sched_lock.	2001-01-06 00:08:39 +00:00
John Baldwin	3e6831f510	The previous commit wasn't entirely correct. At least one goto to the out: label in psignal() did not grab sched_lock before trying to release it. Also, the previous version had several cases where it grabbed sched_lock before jumping to out: unnecessarily, so rework this a bit. The runfast: and out: labels must be called with sched_lock released, and the run: label must be called with it held. Appropriate mtx_assert()'s have been added that should catch any bugs that may still be in this code. Noticed by: bde	2001-01-02 18:54:09 +00:00
John Baldwin	4bfba0cf19	Push down sched_lock in psignal(). sched_lock was being held across recursive calls into psignal() as well as calls to signotify(), forward_signal(), etc.	2001-01-01 02:31:08 +00:00
John Baldwin	ef8294075b	Add in a missing release of the proctree lock. Submitted by: Sja <sakari.jalovaara@eqonline.fi>	2001-01-01 02:19:51 +00:00
Jake Burkholder	98f03f9030	Protect proc.p_pptr and proc.p_children/p_sibling with the proctree_lock. linprocfs not locked pending response from informal maintainer. Reviewed by: jhb, -smp@	2000-12-23 19:43:10 +00:00
Marcel Moolenaar	d96cfeae0c	Fix a typo that allowed signals caused by traps to be delivered to the process when said signal is masked. PR: 23457 Submitted by: Yasuhiko Watanabe <yasu@mrit.mei.co.jp>	2000-12-16 21:03:48 +00:00
Jake Burkholder	c0c2557090	- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags passed to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.	2000-12-13 00:17:05 +00:00
John Baldwin	1c32c37c06	Protect p_stat with sched_lock.	2000-12-01 23:43:15 +00:00
Marcel Moolenaar	d034d459da	Don't use p->p_sigstk.ss_flags to keep state of whether the process is on the alternate stack or not. For compatibility with sigstack(2) state is being updated if such is needed. We now determine whether the process is on the alternate stack by looking at its stack pointer. This allows a process to siglongjmp from a signal handler on the alternate stack to the place of the sigsetjmp on the normal stack. When maintaining state, this would have invalidated the state information, causing a subsequent signal to be delivered on the normal stack instead of the alternate stack. PR: 22286	2000-11-30 05:23:49 +00:00
Jake Burkholder	553629ebc9	Protect the following with a lockmgr lock: allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd	2000-11-22 07:42:04 +00:00
Jake Burkholder	7da6f97772	- Split the run queue and sleep queue linkage, so that a process may block on a mutex while on the sleep queue without corrupting it. - Move dropping of Giant to after the acquire of sched_lock. Tested by: John Hay <jhay@icomtek.csir.co.za> jhb	2000-11-17 18:09:18 +00:00
John Baldwin	20cdcc5b73	Don't release and acquire Giant in mi_switch(). Instead, release and acquire Giant as needed in functions that call mi_switch(). The releases need to be done outside of the sched_lock to avoid potential deadlocks from trying to acquire Giant while interrupts are disabled. Submitted by: witness	2000-11-16 02:16:44 +00:00
Marcel Moolenaar	806d7daafe	Make MINSIGSTKSZ machine dependent, and have the sigaltstack syscall compare against a variable sv_minsigstksz in struct sysentvec as to properly take the size of the machine- and ABI dependent struct sigframe into account. The SVR4 and iBCS2 modules continue to have a minsigstksz of 8192 to preserve behavior. The real values (if different) are not known at this time. Other ABI modules use the real values. The native MINSIGSTKSZ is now defined as follows (Arch: MINSIGSTKSZ): alpha: 4096; i386: 2048; ia64: 12288. Reviewed by: mjacob Suggested by: bde	2000-11-09 08:25:48 +00:00
John Baldwin	35e0e5b311	Catch up to moving headers: - machine/ipl.h -> sys/ipl.h - machine/mutex.h -> sys/mutex.h	2000-10-20 07:58:15 +00:00
Bruce Evans	33510ef17a	Unpessimized CURSIG(). The fast path through CURSIG() was broken in the 128-bit sigset_t changes by moving conditionally (rarely) executed code to the beginning where it is always executed, and since this code now involves 3 128-bit operations, the pessimization was relatively large. This change speeds up lmbench's pipe latency benchmark by 3.5%. Fixed style bugs in CURSIG().	2000-09-17 15:12:04 +00:00
Bruce Evans	fbbeeb6cd6	Uninlined CURSIG() and unpolluted <sys/signalvar.h>. CURSIG() had become very bloated, first with 128-bit sigset_t's, then with locking in the SMP case, then with locking in all cases. The space bloat was probably also time bloat, partly because the fast path through CURSIG() was pessimized by the sigset_t changes. This change speeds up lmbench's pipe-based latency benchmark by 4% on a Celeron. <sys/signalvar.h> had become very polluted to support the bloat.	2000-09-17 14:28:33 +00:00
Doug Rabson	36240ea5bf	Move the include of <sys/systm.h> so that KTR gets a declaration for snprintf().	2000-09-10 13:54:52 +00:00
Jason Evans	0384fff8c5	Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh	2000-09-07 01:33:02 +00:00
Robert Watson	387d2c036b	o Centralize inter-process access control, introducing: int p_can(p1, p2, operation, privused) which allows specification of subject process, object process, inter-process operation, and an optional call-by-reference privused flag, allowing the caller to determine if privilege was required for the call to succeed. This allows jail, kern.ps_showallprocs and regular credential-based interaction checks to occur in one block of code. Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL, and P_CAN_DEBUG. p_can currently breaks out as a wrapper to a series of static function checks in kern_prot, which should not be invoked directly. o Commented out capabilities entries are included for some checks. o Update most inter-process authorization to make use of p_can() instead of manual checks, PRISON_CHECK(), P_TRESPASS(), and kern.ps_showallprocs. o Modify suser{,_xxx} to use const arguments, as it no longer modifies process flags due to the disabling of ASU. o Modify some checks/errors in procfs so that ENOENT is returned instead of ESRCH, further improving concealment of processes that should not be visible to other processes. Also introduce new access checks to improve hiding of processes for procfs_lookup(), procfs_getattr(), procfs_readdir(). Correct a bug reported by bp concerning not handling the CREATE case in procfs_lookup(). Remove volatile flag in procfs that caused apparently spurious qualifier warnings (approved by bde). o Add comment noting that ktrace() has not been updated, as its access control checks are different from ptrace(), whereas they should probably be the same. Further discussion should happen on this topic. Reviewed by: bde, green, phk, freebsd-security, others Approved by: bde Obtained from: TrustedBSD Project	2000-08-30 04:49:09 +00:00
Marcel Moolenaar	31c8f3f0af	Make this file compile again when COMPAT_43 has not been defined. This boils down to conditionally compiling the old signal syscalls. We might want to extend the types in syscalls.master to make these syscalls conditional on something more appropriate than COMPAT_43.	2000-08-26 02:27:01 +00:00
Kirk McKusick	f2a2857bb3	Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).	2000-07-11 22:07:57 +00:00
Kirk McKusick	e6796b67d9	Move the truncation code out of vn_open and into the open system call after the acquisition of any advisory locks. This fix corrects a case in which a process tries to open a file with a non-blocking exclusive lock. Even if it fails to get the lock it would still truncate the file even though its open failed. With this change, the truncation is done only after the lock is successfully acquired. Obtained from: BSD/OS	2000-07-04 03:34:11 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Poul-Henning Kamp	2c9b67a8df	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude	2000-04-30 18:52:11 +00:00
Jonathan Lemon	cb679c385e	Introduce kqueue() and kevent(), a kernel event notification facility.	2000-04-16 18:53:38 +00:00
Matthew Dillon	7c8fdcbd19	Make the sigprocmask() and geteuid() system calls MP SAFE. Expand commentary for copyin/copyout to indicate that they are MP SAFE as well. Reviewed by: msmith	2000-04-02 17:52:43 +00:00
Matthew Dillon	db6a426158	The SMP cleanup commit broke UP compiles. Make UP compiles work again.	2000-03-28 18:06:49 +00:00
Matthew Dillon	36e9f877df	Commit major SMP cleanups and move the BGL (big giant lock) in the syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it). A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated. Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion. This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur. This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later. Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch	2000-03-28 07:16:37 +00:00
Paul Saab	e5a28db9f5	Add sysctl kern.coredump to enable/disable core dumps system wide.	2000-03-21 07:10:42 +00:00
Eivind Eklund	762e6b856c	Introduce NDFREE (and remove VOP_ABORTOP)	1999-12-15 23:02:35 +00:00
Poul-Henning Kamp	a9e0361b4a	Introduce the new function p_trespass(struct proc *p1, struct proc *p2) which returns zero or an errno depending on the legality of p1 trespassing on p2. Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one extra signal related check. Replace procfs.h:CHECKIO() macros with calls to p_trespass(). Only show command lines to process which can trespass on the target process.	1999-11-21 19:03:20 +00:00
Poul-Henning Kamp	da654d9070	s/p_cred->pc_ucred/p_ucred/g	1999-11-21 12:38:21 +00:00
Poul-Henning Kamp	2e3c8fcbd0	This is a partial commit of the patch from PR 14914: A lot of the code in sys/kern directly accesses the Q_HEAD and Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914	1999-11-16 10:56:05 +00:00
Sean Eric Fagan	35a2598f80	Bail out of the process early if the coredumpfile limit is 0. PR: kern/14540 Reviewed by: Nate Williams	1999-10-30 18:55:11 +00:00
Marcel Moolenaar	6f841fb79d	Don't let osigaction and osigvec accept the new signal numbers. Fix style bugs caused by the sigset_t in general while I'm here. Submitted by: bde	1999-10-12 13:14:18 +00:00
Luoqi Chen	645682fd40	Add a per-signal flag to mark handlers registered with osigaction, so we can provide the correct context to each signal handler. Fix broken sigsuspend(): don't use p_oldsigmask as a flag, use SAS_OLDMASK as we did before the linuxthreads support merge (submitted by bde). Move ps_sigstk from p_sigacts to the main proc structure since signal stack should not be shared among threads. Move SAS_OLDMASK and SAS_ALTSTACK flags from sigacts::ps_flags to proc::p_flag. Move PS_NOCLDSTOP and PS_NOCLDWAIT flags from proc::p_flag to procsig::ps_flag. Reviewed by: marcel, jdp, bde	1999-10-11 20:33:17 +00:00
Marcel Moolenaar	2c42a14602	sigset_t change (part 2 of 5). The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements. The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t. struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done through function sigprop(). The latter because the table doesn't hold properties for all signals, but only for the first NSIG signals. signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace pollution. Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types. NOTE: kdump (and port linux_kdump) must be recompiled. Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.	1999-09-29 15:03:48 +00:00
Sean Eric Fagan	f3a6cf7052	Make prototype match function.	1999-09-01 16:21:57 +00:00

213 Commits