freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-29 12:03:03 +00:00

Author	SHA1	Message	Date
Dag-Erling Smørgrav	f1ea6d813d	Mechanical whitespace cleanup + other minor style nits.	2004-01-11 19:56:42 +00:00
Dag-Erling Smørgrav	012b5531f4	Mechanical whitespace cleanup + minor style nits.	2004-01-11 19:43:14 +00:00
Dag-Erling Smørgrav	d41457da80	More unparenthesized return values.	2004-01-10 17:14:53 +00:00
Dag-Erling Smørgrav	b91a599717	Style: parenthesize return values.	2004-01-10 13:03:43 +00:00
Don Lewis	2b77864f1e	Add a somewhat redundant check on the len argument to getsockaddr() to avoid relying on the minimum memory allocation size to avoid problems. The check is somewhat redundant because the consumers of the returned structure will check that sa_len is a protocol-specific larger size. Submitted by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: nectar MFC after: 30 days	2004-01-10 08:28:54 +00:00
Mike Silbersack	ddeb5b242e	Track three new sendfile-related statistics: - The number of times sendfile had to do disk I/O - The number of times sfbuf allocation failed - The number of times sfbuf allocation had to wait	2003-12-28 08:57:09 +00:00
David Malone	9322078275	In socket(2) we only need Giant around the call to socreate, so just grab it there.	2003-12-25 23:44:38 +00:00
Alfred Perlstein	9f144cff85	Add restrict qualifiers. PR: 44394 Submitted by: Craig Rodrigues <rodrige@attbi.com>	2003-12-24 18:47:43 +00:00
David Greenman	186e347f2c	Fixed a bug in sendfile(2) where the sent data would be corrupted due to sendfile(2) being erroneously automatically restarted after a signal is delivered. Fixed by converting ERESTART to EINTR prior to exiting. Updated manual page to indicate the potential EINTR error, its cause and consequences. Approved by: re@freebsd.org	2003-12-01 22:12:50 +00:00
Alan Cox	e45db9b837	- Modify alpha's sf_buf implementation to use the direct virtual-to-physical mapping. - Move the sf_buf API to its own header file; make struct sf_buf's definition machine dependent. In this commit, we remove an unnecessary field from struct sf_buf on the alpha, amd64, and ia64. Ultimately, we may eliminate struct sf_buf on those architectures except as an opaque pointer that references a vm page.	2003-11-16 06:11:26 +00:00
David Malone	e1419c08e2	falloc allocates a file structure and adds it to the file descriptor table, acquiring the necessary locks as it works. It usually returns two references to the new descriptor: one in the descriptor table and one via a pointer argument. As falloc releases the FILEDESC lock before returning, there is a potential for a process to close the reference in the file descriptor table before falloc's caller gets to use the file. I don't think this can happen in practice at the moment, because Giant indirectly protects closes. To stop the file being completely closed in this situation, this change makes falloc set the refcount to two when both references are returned. This makes life easier for several of falloc's callers, because the first thing they previously did was grab an extra reference on the file. Reviewed by: iedowse Idea run past: jhb	2003-10-19 20:41:07 +00:00
Alan Cox	411d10a600	Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy sockets into machine-dependent files. The rationale for this migration is illustrated by the modified amd64 allocator. It uses the amd64's direct map to avoid ephemeral mappings in the kernel's address space. On an SMP, the ephemeral mappings result in an IPI for TLB shootdown for each transmitted page. Yuck. Maintainers of other 64-bit platforms with direct maps should be able to use the amd64 allocator as a reference implementation.	2003-08-29 20:04:10 +00:00
Alexander Kabaev	660ebf0ef2	Drop Giant in recvit before returning an error to the caller to avoid leaking the Giant on the syscall exit.	2003-08-11 19:37:11 +00:00
Yaroslav Tykhiy	b81694ed13	If connect(2) has been interrupted by a signal and therefore the connection is to be established asynchronously, behave as in the case of non-blocking mode: - keep the SS_ISCONNECTING bit set thus indicating that the connection establishment is in progress, which is the case (clearing the bit in this case was just a bug); - return EALREADY, instead of the confusing and unreasonable EADDRINUSE, upon further connect(2) attempts on this socket until the connection is established (this also brings our connect(2) into accord with IEEE Std 1003.1.)	2003-08-06 14:04:47 +00:00
David Malone	d2cce3d6e8	Do some minor Giant pushdown made possible by copyin, fget, fdrop, malloc and mbuf allocation all not requiring Giant. 1) ostat, fstat and nfstat don't need Giant until they call fo_stat. 2) accept can copyin the address length without grabbing Giant. 3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit. 4) move Giant grabbing from each individual recv* syscall to recvit.	2003-08-04 21:28:57 +00:00
Alan Cox	efd02757c2	Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in sf_buf_init(). (See revision 1.140 of kern/sys_pipe.c for a detailed rationale.) Submitted by: tegge	2003-08-02 04:18:56 +00:00
Don Lewis	8d5f9131fc	VOP_GETVOBJECT() wants to be called with the vnode lock held.	2003-06-19 03:55:01 +00:00
Alan Cox	c10c537816	Finish the vm object locking in sendfile(2). More generally, the vm locking in sendfile(2) is complete.	2003-06-12 05:52:09 +00:00
Alan Cox	2ab3670aad	Lock the vm object when removing a page.	2003-06-11 21:23:04 +00:00
David E. O'Brien	677b542ea2	Use __FBSDID().	2003-06-11 00:56:59 +00:00
David Malone	de1cab2b60	Grab giant in sendit rather than kern_sendit because sockargs may allocate mbufs with M_TRYWAIT, which may require Giant. Reviewed by: bmilekic Approved by: re (scottl)	2003-05-29 18:36:26 +00:00
David Malone	710c5645af	Split sendit into two parts. The first part, still called sendit, does the copyin stuff and then calls the second part, kern_sendit, to do the hard work. Don't bother holding Giant during the copyin phase. The intent of this is to allow the Linux emulator to implement send* syscalls without using the stackgap.	2003-05-05 20:33:38 +00:00
Alan Cox	7be80f55ba	Recent changes to uipc_cow.c have eliminated the need for some sf_buf-related variables to be global. Make them either local to sf_buf_init() or static.	2003-03-31 06:25:42 +00:00
Alan Cox	9f6d45b1a4	Pass the vm_page's address to sf_buf_alloc(); map the vm_page as part of sf_buf_alloc() instead of expecting sf_buf_alloc()'s caller to map it. The ultimate reason for this change is to enable two optimizations: (1) that there never be more than one sf_buf mapping a vm_page at a time and (2) 64-bit architectures can transparently use their 1-1 virtual to physical mapping (e.g., "K0SEG") avoiding the overhead of pmap_qenter() and pmap_qremove().	2003-03-29 06:14:14 +00:00
Alan Cox	42de97a50a	Pass the sf buf to MEXTADD() as the optional argument. This permits the simplification of socow_iodone() and sf_buf_free(); they don't have to reverse engineer the sf buf from the data's address.	2003-03-16 07:19:12 +00:00
Alan Cox	7c4351aabd	Remove GIANT_REQUIRED from sf_buf_free().	2003-03-06 04:48:19 +00:00
Tor Egge	6a07a13944	Sync new socket nonblocking/async state with file flags in accept(). PR: 1775 Reviewed by: mbr	2003-02-23 23:00:28 +00:00
Olivier Houchard	d6bf23783f	Remove duplicate includes. Submitted by: Cyril Nguyen-Huu <cyril@ci0.org>	2003-02-20 03:26:11 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Hajimu UMEMOTO	12e4397ea3	Break out the bind and connect syscalls to make calling these syscalls internally easy. This is preparation for forthcoming IPv6 support for Linuxlator. Submitted by: dwmalone MFC after: 10 days	2003-02-03 17:36:52 +00:00
Alfred Perlstein	8deebb0160	Consolidate MIN/MAX macros into one place (param.h). Submitted by: Hiten Pandya <hiten@unixdaemons.com>	2003-02-02 13:17:30 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Matthew Dillon	48e3128b34	Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.	2003-01-13 00:33:17 +00:00
Matthew Dillon	cd72f2180b	Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.	2003-01-12 01:37:13 +00:00
Poul-Henning Kamp	08c7670a8b	Move the declaration of the socket fileops from socketvar.h to file.h. This allows us to use the new typedefs and removes the needs for a number of forward struct declarations in socketvar.h	2002-12-23 22:46:47 +00:00
Robert Watson	b371c939ce	Integrate mac_check_socket_send() and mac_check_socket_receive() checks from the MAC tree: allow policies to perform access control for the ability of a process to send and receive data via a socket. At some point, we might also pass in additional address information if an explicit address is requested on send. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-06 14:39:15 +00:00
Don Lewis	91e97a8266	In an SMP environment post-Giant it is no longer safe to blindly dereference the struct sigio pointer without any locking. Change fgetown() to take a reference to the pointer instead of a copy of the pointer and call SIGIO_LOCK() before copying the pointer and dereferencing it. Reviewed by: rwatson	2002-10-03 02:13:00 +00:00
Archie Cobbs	f2f03122c3	accept(2) on a socket that has been shutdown(2) normally returns ECONNABORTED. Make this happen in the non-blocking case as well. The previous behavior was to return EAGAIN, which (a) is not consistent with the blocking case and (b) causes the application to think the socket is still valid. PR: bin/42100 Reviewed by: freebsd-net MFC after: 3 days	2002-08-28 20:56:01 +00:00
Robert Watson	9ca435893b	In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 20:55:08 +00:00
Robert Watson	4b9c2fa1fb	Fix return case for negative namelen by jumping to normal exit processing rather than immediately returning, or we may not unlock necessary locks. Noticed by: Mike Heffner <mheffner@acm.vt.edu>	2002-08-15 17:34:03 +00:00
David Greenman	9e63574ea4	Moved sf_buf_alloc and sf_buf_free function declarations to sys/socketvar.h so that they can be seen by external callers.	2002-08-13 19:03:19 +00:00
David Greenman	a370c70055	Remove obsolete comment about sf_buf_* functions being static. They were made un-static in rev 1.114.	2002-08-13 18:20:08 +00:00
Semen Ustimenko	87df4f8f18	Fix sendfile(), which was calling vn_rdwr() without the aresid parameter and thus hitting EIO at the end of file. This is believed to be a feature (not a bug) of vn_rdwr(), so we turn it off by supplying the aresid param. Reviewed by: rwatson, dg	2002-08-11 20:33:11 +00:00
Jacques Vidrine	5b770403b5	While we're at it, add range checks similar to those in previous commit to getsockname() and getpeername(), too.	2002-08-09 12:58:11 +00:00
Robert Watson	82d9ad331a	Add additional range checks for copyout targets. Submitted by: Silvio Cesare <silvio@qualys.com>	2002-08-09 05:50:32 +00:00
Robert Watson	f9d0d52459	Include file cleanup; mac.h and malloc.h at one point had ordering relationship requirements, and no longer do. Reminded by: bde	2002-08-01 17:47:56 +00:00
Robert Watson	62f5f684fb	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument connect(), listen(), and bind() system calls to invoke MAC framework entry points to permit policies to authorize these requests. This can be useful for policies that want to limit the activity of processes involving particular types of IPC and network activity. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 16:39:49 +00:00
Alan Cox	1161b86a15	o In do_sendfile(), replace vm_page_sleep_busy() by vm_page_sleep_if_busy() and extend the scope of the page queues lock to cover all accesses to the page's flags and busy fields.	2002-07-30 18:51:07 +00:00
Andrew R. Reiter	5d3232048e	- Make use of the VM_ALLOC_WIRED flag in the call to vm_page_alloc() in do_sendfile(). This allows us to rearrange an if statement in order to avoid doing an unnecessary call to vm_page_lock_queues(), and an attempt at re-wiring the pages (which were wired in the vm_page_alloc() call). Reviewed by: alc, jhb	2002-07-23 01:09:34 +00:00
Alan Cox	ae0ffa73cc	Lock accesses to the page queues by sendfile() and friends.	2002-07-13 03:10:55 +00:00
Alfred Perlstein	9c34129662	Create a bug-for-bug FreeBSD4 compatible version of sendfile and move the fixed sendfile over. This is needed to preserve binary compatibility from 4.x to 5.x.	2002-07-12 06:51:57 +00:00
Alfred Perlstein	a551e20e27	nuke more instances of caddr_t	2002-06-29 00:02:01 +00:00
Alfred Perlstein	64f0b9d749	remove or replace caddr_t with void. make the mbuf external free function take a void * rather than caddr_t.	2002-06-28 23:48:23 +00:00
Kenneth D. Merry	98cb733c67	At long last, commit the zero copy sockets code. MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes. ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls. man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links. jumbo.9: New man page describing the jumbo buffer allocator interface and operation. zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality. NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT. conf/files: Add uipc_jumbo.c and uipc_cow.c. conf/options: Add the 5 options mentioned above. kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1. uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack. uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive. uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on. Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions. uipc_syscalls.c: Un-staticize some of the sf* functions so that they can be used elsewhere. (uipc_cow.c) if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails. The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)). ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives. if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers. Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface. Add header splitting support to the ti(4) driver. Tweak some of the default interrupt coalescing parameters to more useful defaults. Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off. if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13. Add defines needed for debugging. Remove the ti_stats structure, it is now defined in sys/tiio.h. ti_fw.h: 12.4.11 firmware. ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.) sys/jumbo.h: Jumbo buffer allocator interface. sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process. socketvar.h: Add prototype for socow_setup. tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions. uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable. ufs_readwrite.c: Update for new prototype of uiomoveco(). vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault. vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object_allocate() does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structure. This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.) vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK. vm_object.h: Add prototype for vm_object_allocate_wait(). vm_page.c: Add page-based copy on write setup, clear and fault routines. vm_page.h: Add page based COW function prototypes and variable in the vm_page structure. Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.	2002-06-26 03:37:47 +00:00
Alfred Perlstein	c33c825169	Implement SO_NOSIGPIPE option for sockets. This allows one to request that an EPIPE error return not generate SIGPIPE on sockets. Submitted by: lioux Inspired by: Darwin	2002-06-20 18:52:54 +00:00
John Baldwin	60a9bb197d	Catch up to changes in ktrace API.	2002-06-07 05:37:18 +00:00
Seigo Tanimura	4cc20ab1f0	Back out my last commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
Seigo Tanimura	243917fe3b	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
Robert Watson	89e9e6e7c5	In sendfile(), use the vn_rdwr() helper function, rather than manually constructing a struct aio and invoking VOP_READ() directly. This cleans up the code a little, but also has the advantage of making sure almost all vnode read/write access in the kernel goes through the helper function, meaning that instrumentation of that helper function can impact almost all relevant read/write operations. In this case, it permits us to put MAC hooks into vn_rdwr() and not modify uipc_syscalls.c (yet). In general, if helper vn_*() functions exist, they should be used in preference to direct VOP's in system call service code. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-19 13:46:24 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Bruce Evans	70f52b4845	Fixed some style bugs in the removal of __P(()). The main ones were not removing tabs before "__P((", and not outdenting continuation lines to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting and/or rewrap the whole prototype in some cases.	2002-03-24 05:09:11 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
David Greenman	7228268aaa	Fixed bug in calculation of amount of file to send when nbytes !=0 and headers or trailers are supplied. Reported by Vladislav Shabanov <vs@rambler-co.ru>. PR: 33771 Submitted by: Maxim Konovalov <maxim@macomnet.ru> MFC after: 3 days	2002-01-22 17:32:10 +00:00
Alfred Perlstein	426da3bcfb	SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file *fhold(struct file *fp); /* increments reference count on a file */ struct file *fhold_locked(struct file *fp); /* like fhold but expects file to be locked */ struct file *ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file *ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.	2002-01-13 11:58:06 +00:00
Alfred Perlstein	078a4e8939	Sockets are called 'so' not 'sp'.	2002-01-09 02:47:00 +00:00
Robert Watson	9c4d63da6d	o Make the credential used by socreate() an explicit argument to socreate(), rather than getting it implicitly from the thread argument. o Make NFS cache the credential provided at mount-time, and use the cached credential (nfsmount->nm_cred) when making calls to socreate() on initially connecting, or reconnecting the socket. This fixes bugs involving NFS over TCP and ipfw uid/gid rules, as well as bugs involving NFS and mandatory access control implementations. Reviewed by: freebsd-arch	2001-12-31 17:45:16 +00:00
Matthew Dillon	b1e4abd246	Give struct socket structures a ref counting interface similar to vnodes. This will hopefully serve as a base from which we can expand the MP code. We currently do not attempt to obtain any mutex or SX locks, but the door is open to add them when we nail down exactly how that part of it is going to work.	2001-11-17 03:07:11 +00:00
Matthew Dillon	b064d43d8f	remove holdfp() Replace uses of holdfp() with fget() or fgetvp() calls as appropriate introduce fget(), fget_read(), fget_write() - these functions will take a thread and file descriptor and return a file pointer with its ref count bumped. introduce fgetvp(), fgetvp_read(), fgetvp_write() - these functions will take a thread and file descriptor and return a vref()'d vnode. _read() requires that the file pointer be FREAD, _write that it be FWRITE. This continues the cleanup of struct filedesc and struct file access routines which, when we are all through with it, will allow us to then make the API calls MP safe and be able to move Giant down into the fo_* functions.	2001-11-14 06:30:36 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Matthew Dillon	df9987602f	Giant pushdown syscalls in kern/uipc_syscalls.c. Affected calls: recvmsg(), sendmsg(), recvfrom(), accept(), getpeername(), getsockname(), socket(), connect(), accept(), send(), recv(), bind(), setsockopt(), listen(), sendto(), shutdown(), socketpair(), sendfile()	2001-08-31 00:37:34 +00:00
Matthew Dillon	0cddd8f023	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
David Malone	db3cc2d09f	Don't dereference a NULL pointer if we fail to get a sendfilebuf.	2001-06-24 12:27:30 +00:00
John Baldwin	9d127f9ffb	Add vm locking to sendfile(2) and sf_buf_free(). Reported by: Tamiji Homma <thomma@BayNetworks.com> Tested by: Tamiji Homma <thomma@BayNetworks.com>	2001-05-25 19:23:04 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
Alfred Perlstein	06336fb26d	Sendfile is documented to return 0 on success; however, when a sf_hdtr is used to provide writev(2) style headers/trailers on the sent data, the return value is actually the result of writev(2) from the trailers, or from the headers if no trailers are specified. Fix sendfile to comply with the documentation, by returning 0 on success. Ok'd by: dg	2001-04-26 00:14:14 +00:00
Greg Lehey	d98dc34f52	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
Bosko Milekic	4bde2ac539	Fix a race condition similar to the one that existed in the mbuf code. When we go into an interruptible sleep and we increment a sleep count, we make sure that we are the thread that will decrement the count when we wakeup. Otherwise, what happens is that if we get interrupted (signal) and we have to wake up, but before we get our mutex, some thread that wants to wake us up detects that the count is non-zero and so enters wakeup_one(), but there's nothing on the sleep queue and so we don't get woken up. The thread will still decrement the sleep count, which is bad because we will also decrement it again later (as we got interrupted) and are already off the sleep queue.	2001-03-08 19:21:45 +00:00
David Malone	2239c07de9	Make the wait for sendfile buffers interruptible. Stops one process consuming them all and then getting stuck. Reviewed by: dg Reviewed by: bmilekic Observed by: Andreas Persson <pap@garen.net>	2001-03-08 16:28:10 +00:00
John Baldwin	19eb87d22a	Grab the process lock while calling psignal and before calling psignal.	2001-03-07 03:37:06 +00:00
Jonathan Lemon	2fd7d53d36	Return ECONNABORTED from accept if connection is closed while on the listen queue, as well as the current behavior of a zero-length sockaddr. Obtained from: KAME Reviewed by: -net	2001-02-14 02:09:11 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarly, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
Poul-Henning Kamp	1550c317bf	Fix the <sys/queue.h> abuse. Submitted by: Dima Dorfman <dima@unixfreak.org> Reviewed by: /sbin/md5	2001-01-02 11:51:55 +00:00
Poul-Henning Kamp	7f9cb01893	Add an XXX about a <sys/queue.h> transgression which needs cleaned up.	2001-01-02 10:34:09 +00:00
Bosko Milekic	2a0c503e7a	* Rename M_WAIT mbuf subsystem flag to M_TRYWAIT. This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation. M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl. M_WAIT is now deprecated but still defined for the next little while. * Fix a typo in a comment in mbuf.h * Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have became a big problem.	2000-12-21 21:44:31 +00:00
David Malone	7cc0979fd6	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
David Greenman	8f9a5273a3	Changed second argument in a call to sf_buf_free() to be NULL instead of PAGE_SIZE to match the prototype better. The argument is ignored, so this is just to silence the compile-time warning. Pointed out by: jhb	2000-12-03 01:35:46 +00:00
Bosko Milekic	794cd879fe	Make sure to free the sf_buf if we've allocated it but fail to allocate an mbuf (ENOBUFS) before returning so that we don't leak sf_bufs in the case where we're out of mbufs. Submitted by: David Greenman (dg)	2000-12-02 00:40:57 +00:00
Matthew Dillon	279d722604	This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>	2000-11-18 21:01:04 +00:00
David Greenman	866746b6a6	Fixed a certain panic on IO error in sendfile(): Page must be set PG_BUSY before calling vm_page_free() on it.	2000-11-12 14:51:15 +00:00
Bosko Milekic	e778918123	* Have m_pulldown() use the new M_WRITABLE() macro in order to determine whether the given ext_buf is shared. * Have the sf_bufs be setup with the mbuf subsystem using MEXTADD() with the two new arguments. Note: m_pulldown() is somewhat crotchy; the added comment explains the situation. Reviewed by: jlemon	2000-11-11 23:04:15 +00:00
Bosko Milekic	fe27eea9d1	Change the sf_bufs wakeups to be wakeup_one(), because we don't want to wakeup all of the sleeping threads when we free only one buffer. This avoids us having to needlessly try again (and fail, and go back to sleep) for all the threads sleeping. We will now only wakeup the thread we know will succeed. Reviewed by: green	2000-11-04 21:55:25 +00:00
Bosko Milekic	0eecc42758	Setup and put to use the mutex lock for sf_freelist, the sendfile(2) bufs freelist. Should now be thread-friendly, in part. Note: More work is needed in uipc_syscalls.c, but it will have to wait until the socket locking issues are at least 80% implemented and committed.	2000-11-04 07:16:08 +00:00
Boris Popov	9ff5ce6baf	Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT. They will be used by nullfs and other stacked filesystems to support full cache coherency. Reviewed in general by: mckusick, dillon	2000-09-12 09:49:08 +00:00
David Malone	a5c4836d39	Replace the mbuf external reference counting code with something that should be better. The old code counted references to mbuf clusters by using the offset of the cluster from the start of memory allocated for mbufs and clusters as an index into an array of chars, which did the reference counting. If the external storage was not a cluster then reference counting had to be done by the code using that external storage. NetBSD's system of linked lists of mbufs was considered, but Alfred felt it would have locking issues when the kernel was made more SMP friendly. The system implemented uses a pool of unions to track external storage. The union contains an int for counting the references and a pointer for forming a free list. The reference counts are incremented and decremented atomically and so should be SMP friendly. This system can track reference counts for any sort of external storage. Access to the reference counting stuff is now through macros defined in mbuf.h, so it should be easier to make changes to the system in the future. The possibility of storing the reference count in one of the referencing mbufs was considered, but was rejected 'cos it would often leave extra mbufs allocated. Storing the reference count in the cluster was also considered, but because the external storage may not be a cluster this isn't an option. The size of the pool of reference counters is available in the stats provided by "netstat -m". PR: 19866 Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: alfred (glanced at by others on -net)	2000-08-19 08:32:59 +00:00
Brian Feldman	42ebfbf227	Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio * instead of a struct iovec * array and int len. Get rid of stupidly trying to allocate all of the memory and copyin()ing the entire iovec[], and instead just do the proper VOP_WRITE() in ktrwrite() using a copy of the struct uio that the syscall originally used. This solves the DoS which could easily be performed; to work around the DoS, one could also remove "options KTRACE" from the kernel. This is a very strong MFC candidate for 4.1. Found by: art@OpenBSD.org	2000-07-02 08:08:09 +00:00
Alfred Perlstein	8757e5bbc5	unstatic getfp() so that other subsystems can use it. make sendfile() use it. Approved by: dg	2000-06-12 18:06:12 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Jonathan Lemon	cb679c385e	Introduce kqueue() and kevent(), a kernel event notification facility.	2000-04-16 18:53:38 +00:00
Brian Feldman	f48b807fc0	This is Bosko Milekic's mbuf allocation waiting code. Basically, this means that running out of mbuf space isn't a panic anymore, and code which runs out of network memory will sleep to wait for it. Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: green, wollman	1999-12-12 05:52:51 +00:00
Poul-Henning Kamp	9b962c56a4	General clean-up of socket.h and associated sources to synchronise up with NetBSD and the Single Unix Specification v2. This updates some structures with other, almost equivalent types and effort is under way to get the whole more consistent. Also removes a double definition of INET6 and some other clean-ups. Reviewed by: green, bde, phk Some part obtained from: NetBSD, SUSv2 specification	1999-11-24 20:49:04 +00:00
Poul-Henning Kamp	2e3c8fcbd0	This is a partial commit of the patch from PR 14914: A lot of the code in sys/kern directly accesses the Q_HEAD and Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914	1999-11-16 10:56:05 +00:00
Poul-Henning Kamp	923502ff91	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.	1999-10-29 18:09:36 +00:00
Brian Feldman	afce003453	Add a missing spl lowering. Submitted by: Ville-Pertti Keinonen <will@iki.fi>	1999-10-14 05:16:16 +00:00
Peter Wemm	d1f088dab5	Trim unused options (or #ifdef for undoc options). Submitted by: phk	1999-10-11 15:19:12 +00:00
Guido van Rooij	bdf7fdcb6f	Plug a potential filedescriptor leak. This will probably almost never be triggered. Reviewed by: David Greenman	1999-09-30 19:13:17 +00:00
Brian Feldman	13ccadd4b0	This is what was "fdfix2.patch," a fix for fd sharing. It's pretty far-reaching in fd-land, so you'll want to consult the code for changes. The biggest change is that now, you don't use fp->f_ops->fo_foo(fp, bar) but instead fo_foo(fp, bar), which increments and decrements the fp refcount upon entry and exit. Two new calls, fhold() and fdrop(), are provided. Each does what it seems like it should, and if fdrop() brings the refcount to zero, the fd is freed as well. Thanks to peter ("to hell with it, it looks ok to me.") for his review. Thanks to msmith for keeping me from putting locks everywhere :) Reviewed by: peter	1999-09-19 17:00:25 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Brian Feldman	e32c66c539	Fix fd race conditions (during shared fd table usage.) Badfileops is now used in f_ops in place of NULL, and modifications to the files are more carefully ordered. f_ops should also be set to &badfileops upon "close" of a file. This does not fix other problems mentioned in this PR than the first one. PR: 11629 Reviewed by: peter	1999-08-04 18:53:50 +00:00
Matthew Dillon	d254af07a1	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 21:50:00 +00:00
Bill Fenner	ec42cbfc24	Don't free the socket address if soaccept() / pru_accept() doesn't return one.	1999-01-25 16:53:53 +00:00
Matthew Dillon	257aefa704	Addendum: The original code that the last commit 'fixed' actually did not have a bug in it, but the last commit did make it more readable so we are keeping it.	1999-01-24 03:49:58 +00:00
Matthew Dillon	89600e8663	There was a situation where sendfile() might attempt to initiate I/O on a PG_BUSY page, due to a bug in its sequencing of a conditional.	1999-01-24 01:15:58 +00:00
Matthew Dillon	0069f505eb	Fixed a potential bug ( but maybe not ), where sendfile() clears PG_BUSY on a page without testing for waiters. Also collapsed busy wait into new vm_page_sleep_busy() inline ( see vm/vm_page.h )	1999-01-21 09:00:26 +00:00
Matthew Dillon	1c7c3c6a86	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>	1999-01-21 08:29:12 +00:00
Archie Cobbs	f1d19042b0	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
David Greenman	911e8dbc2a	Fixed broken code in sendfile(2) when using file offsets.	1998-12-03 12:35:47 +00:00
Don Lewis	9d2b090975	We can't call fsetown() from sonewconn() because sonewconn() is called from an interrupt context and fsetown() wants to peek at curproc, call malloc(..., M_WAITOK), and fiddle with various unprotected data structures. The fix is to move the code that duplicates the F_SETOWN/FIOSETOWN state of the original socket to the new socket from sonewconn() to accept1(), since accept1() runs in the correct context. Deferring this until the process calls accept() is harmless since the process can't do anything useful with SIGIO on the new socket until it has the descriptor for that socket. One could make the case for not bothering to duplicate the F_SETOWN/FIOSETOWN state and requiring the process to explicitly make the fcntl() or ioctl() call on the new socket, but this would be incompatible with the previous implementation and might break programs which rely on the old semantics. This bug was discovered by Andrew Gallatin <gallatin@cs.duke.edu>.	1998-11-23 00:45:39 +00:00
David Greenman	4f699173cb	Closed a very narrow and rare race condition that involved net interrupts, bio interrupts, and a truncated file that along with the precise alignment of the planets could result in a page being freed multiple times or a just-freed page being put onto the inactive queue.	1998-11-18 09:00:47 +00:00
David Greenman	efac52b4ab	In sendfile(2), check against sb_lowat when filling the socket buffer, rather than 0.	1998-11-15 16:55:09 +00:00
David Greenman	f2efb8e4c8	Fixed a couple of nits in sendfile(2): clear PG_ZERO before unbusying the page, and use passed-in "p" rather than curproc in uio struct.	1998-11-14 23:36:17 +00:00
David Greenman	bd81f199b5	Added support for non-blocking sockets to sendfile(2).	1998-11-06 19:16:30 +00:00
David Greenman	dd0b2081f4	Implemented zero-copy TCP/IP extensions via sendfile(2) - send a file to a stream socket. sendfile(2) is similar to implementations in HP-UX, Linux, and other systems, but the API is more extensive and addresses many of the complaints that the Apache Group and others have had with those other implementations. Thanks to Marc Slemko of the Apache Group for helping me work out the best API for this. Anyway, this has the "net" result of speeding up sends of files over TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU cycles) when compared to a traditional read/write loop.	1998-11-05 14:28:26 +00:00
Garrett Wollman	cfe8b629f1	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
Doug Rabson	2b605d0804	64bit fixes: don't cast p->p_retval to an int*.	1998-06-10 10:30:23 +00:00
Poul-Henning Kamp	115facb29d	Fix a minor mbuf leak created by the previous change. Reviewed by: phk Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1998-04-14 06:24:43 +00:00
Poul-Henning Kamp	aba558930b	setsockopt() transports user option data in an mbuf. if the user data is greater than MLEN, setsockopt is unable to pass it onto the protocol handler. Allocate a cluster in such case. PR: 2575 Reviewed by: phk Submitted by: Julian Assange proff@iq.org	1998-04-11 20:31:46 +00:00
Bruce Evans	08637435f2	Moved some #includes from <sys/param.h> nearer to where they are actually used.	1998-03-28 10:33:27 +00:00
Eivind Eklund	303b270b0a	Staticize.	1998-02-09 06:11:36 +00:00
Eivind Eklund	5591b823d1	Make COMPAT_43 and COMPAT_SUNOS new-style options.	1997-12-16 17:40:42 +00:00
Mike Smith	0bec68bf7c	Consult sa_len before trampling it with MSG_COMPAT set. PR: kern/5291 Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1997-12-15 02:29:11 +00:00
Mike Smith	5af7db2b73	As described by the submitter: ... fix a bug with orecvfrom() or recvfrom() called with the MSG_COMPAT flag on kernels compiled with the COMPAT_43 option. The symptom is that the fromaddr is not correctly returned. This affects the Linux emulator. Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1997-12-14 03:15:21 +00:00
Poul-Henning Kamp	cb226aaa62	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warning, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need recompiled.	1997-11-06 19:29:57 +00:00
Poul-Henning Kamp	a1c995b626	Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde	1997-10-12 20:26:33 +00:00
Bruce Evans	e4ba6a82b0	Removed unused #includes.	1997-09-02 20:06:59 +00:00
Garrett Wollman	fa5cde129b	Delete a bit of debugging code that mistakenly crept in, and as a consequence revert rev. 1.28's header file additions which are no longer needed.	1997-08-17 19:47:28 +00:00
Tor Egge	19c0663e5e	Use KERNBASE, not 0xf0000000.	1997-08-17 17:40:11 +00:00
Garrett Wollman	57bf258e3d	Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to not malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.	1997-08-16 19:16:27 +00:00
Garrett Wollman	a29f300e80	The long-awaited mega-massive-network-code-cleanup. Part I. This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in. 2) Certain protocol entry points are modified to take a process structure, so that they can easily tell whether or not it is possible to sleep, and also to access credentials. 3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed. 4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family. 5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem. As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.	1997-04-27 20:01:29 +00:00
Bruce Evans	9dd8309d56	Removed support for OLD_PIPE. <sys/stat.h> is now missing the hack that supported nameless pipes being indistinguishable from fifos. We're not going back.	1997-04-09 16:53:45 +00:00
David Greenman	a91b87211d	In accept1(), falloc() is called after the process has awoken, but prior to removing the connection from the queue. The problem here is that falloc() may block and this would allow another process to accept the connection instead. If this happens to leave the queue empty, then the system will panic with an "accept: nothing queued". Also changed a wakeup() to a wakeup_one() to avoid the "thundering herd" problem on new connections in Apache (or any other application that has multiple processes blocked in accept() for the same socket).	1997-03-31 12:30:01 +00:00
Bruce Evans	3ac4d1ef0c	Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.	1997-03-23 03:37:54 +00:00
Peter Wemm	6875d25465	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
Jordan K. Hubbard	1130b656e5	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
Garrett Wollman	67f7ea2d71	Preserve file flags in accept(2). Submitted by: fredriks@mcs.com in PR#1775 (this implementation is different)	1996-10-15 19:28:44 +00:00
Peter Wemm	b12e5e82b6	The socketpair() syscall is bogusly returning the fd numbers through the primary and secondary return codes, causing it to not behave as documented. This probably originates from the ancient BSD kernels that had pipe(2) implemented by socketpair(2), there are no binaries left that we can run that do this. Pointed out by: Robert Withrow <witr@rwwa.com>, PR#731	1996-08-24 03:35:13 +00:00
Garrett Wollman	2c37256e5a	Modify the kernel to use the new pr_usrreqs interface rather than the old pr_usrreq mechanism which was poorly designed and error-prone. This commit renames pr_usrreq to pr_ousrreq so that old code which depended on it would break in an obvious manner. This commit also implements the new interface for TCP, although the old function is left as an example (#ifdef'ed out). This commit ALSO fixes a longstanding bug in the TCP timer processing (introduced by davidg on 1995/04/12) which caused timer processing on a TCB to always stop after a single timer had expired (because it misinterpreted the return value from tcp_usrreq() to indicate that the TCB had been deleted). Finally, some code related to polling has been deleted from if.c because it is not relevant to -current and doesn't look at all like my current code.	1996-07-11 16:32:50 +00:00
Garrett Wollman	82dab6ce62	Make it possible to return more than one piece of control information (PR #1178). Define a new SO_TIMESTAMP socket option for datagram sockets to return packet-arrival timestamps as control information (PR #1179). Submitted by: Louis Mamakos <loiue@TransSys.com>	1996-05-09 20:15:26 +00:00
David Greenman	be24e9e8fa	Changed socket code to use 4.4BSD queue macros. This includes removing the obsolete soqinsque and soqremque functions as well as collapsing so_q0len and so_qlen into a single queue length of unaccepted connections. Now the queue of unaccepted & complete connections is checked directly for queued sockets. The new code should be functionally equivalent to the old while being substantially faster - especially in cases where large numbers of connections are often queued for accept (e.g. http).	1996-03-11 15:37:44 +00:00
Poul-Henning Kamp	09bb5f7589	Make getsockopt() capable of handling more than one mbuf worth of data. Use this to read rules out of ipfw. Add the lkm code to ipfw.c	1996-02-24 13:38:28 +00:00
Garrett Wollman	dc915e7cfc	Kill XNS. While we're at it, fix socreate() to take a process argument. (This was supposed to get committed days ago...)	1996-02-13 18:16:31 +00:00
John Dyson	f982721359	Enable the new fast pipe code. The old pipes can be used with the "OLD_PIPE" config option.	1996-01-28 23:41:40 +00:00
Garrett Wollman	db6a20e23e	Converted two options over to the new scheme: USER_LDT and KTRACE.	1996-01-03 21:42:35 +00:00
Peter Wemm	6ee78bf046	Make pipe() return a set of bidirectional pipe fd's rather than one-way only just like on SVR4. This has no effect on any current programs in our source, but makes the use of SVR4 code a little easier. There is no code or implementation cost in the kernel.. This two-line change merely sets the modes on the ends of the pipes to be bidirectional. There are no other changes.	1996-01-01 10:28:21 +00:00
Bruce Evans	47daf5d5d6	Nuked ambiguous sleep message strings: old: new: netcls[] = "netcls" "soclos" netcon[] = "netcon" "accept", "connec" netio[] = "netio" "sblock", "sbwait"	1995-12-14 22:51:13 +00:00
Bruce Evans	5fdb832498	Simplify the pseudo-argument removal changes by not optimizing for the !COMPAT_43 case - use a common function even when there is no `old' function. The diffs for this are large because of code motion to restore the function order to what it was before the pseudo-argument changes. Include <sys/sysproto.h> to get correct args structs and prototypes. The diffs for this are large because the declarations of the args structs were moved to become comments in the function headers. The comments may actually match the automatically generated declarations right now. Add prototypes.	1995-10-23 15:42:12 +00:00
Steven Wallace	88c94611b1	Remove the '1' from getpeername1 and getsockname1 when NOT COMPAT_OLDSOCK. Left it in there by mistake.	1995-10-11 06:09:45 +00:00
Steven Wallace	93c9414e49	Remove compat_43 pseudo-argument hack, and replace with a better hack. Instead of using a fake "compat" argument, pass a real compat int to function if COMPAT_43 is defined. Functions involved: wait4, accept, recvfrom, getsockname. With the compat pseudo-argument, this introduces an argument structure that can have two possible sizes depending on compat options. This makes life difficult for lkm modules like ibcs2, which would have to guess what size used in kernel when compiled. Also, the prototype generator for these structures cannot generate proper sizes. Now there is only one fixed structure and makes everybody happy. I recommend these changes be introduced to 2.1 so that ibcs2, linux lkm's generated for 2.2 can still run on a 2.1 kernel.	1995-10-07 23:47:26 +00:00
Rodney W. Grimes	9b2e535452	Remove trailing whitespace.	1995-05-30 08:16:23 +00:00
Bruce Evans	b5e8ce9f12	Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.	1995-03-16 18:17:34 +00:00
Poul-Henning Kamp	797f2d22f0	All of this is cosmetic. prototypes, #includes, printfs and so on. Makes GCC a lot more silent.	1994-10-02 17:35:40 +00:00
David Greenman	3c4dd3568f	Added $Id$	1994-08-02 07:55:43 +00:00
Rodney W. Grimes	26f9a76710	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman	1994-05-25 09:21:21 +00:00
Rodney W. Grimes	df8bae1de4	BSD 4.4 Lite Kernel Sources	1994-05-24 10:09:53 +00:00


366 Commits