freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-20 11:11:24 +00:00

Author	SHA1	Message	Date
Pawel Jakub Dawidek	57fd3d5572	When we do open, we should lock the vnode exclusively. This fixes few races: - fifo race, where two threads assign v_fifoinfo, - v_writecount modifications, - v_object modifications, - and probably more... Discussed with: kib, ups Approved by: re (rwatson)	2007-07-26 16:58:09 +00:00
Konstantin Belousov	de10ffa527	Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls destroy_dev() from d_close() cdev method would self-deadlock. devfs_close() bump device thread reference counter, and destroy_dev() sleeps, waiting for si_threadcount to reach zero for cdev without d_purge method. destroy_dev_sched() could be used instead from d_close(), to schedule execution of destroy_dev() in another context. The destroy_dev_sched_drain() function can be used to drain the scheduled calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains the events clone to make sure no lingering devices are left after dev_clone event handler deregistered. make_dev_credf(MAKEDEV_REF) function should be used from dev_clone event handlers instead of make_dev()/make_dev_cred() to ensure that created device has reference counter bumped before cdev mutex is dropped inside make_dev(). Reviewed by: tegge (early versions), njl (programming interface) Debugging help and testing by: Peter Holm Approved by: re (kensmith)	2007-07-03 17:42:37 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Robert Watson	305759909e	Rename macdevfsdirent() to macdevfs() to synchronize with SEDarwin, where similar data structures exist to support devfs and the MAC Framework, but are named differently. Obtained from: TrustedBSD Project Sponsored by: SPARTA, Inc.	2007-04-23 13:36:54 +00:00
Tom Rhodes	164554dec4	In some cases, like whenever devfs file times are zero, the fix(aa) will not be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish work around, but it does work and I have not experienced any negatives in my testing. Discussed with: bde "Ok with me: phk	2007-04-20 01:47:05 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Kris Kennaway	6455de0029	Annotate that this giant acqusition is dependent on tty locking.	2007-03-26 21:56:46 +00:00
Tor Egge	61b9d89ff0	Make insmntque() externally visibile and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib	2007-03-13 01:50:27 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Konstantin Belousov	16f50bcd80	Update the access and modification times for dev while still holding thread reference on it. Reviewed by: tegge Approved by: pjd (mentor)	2006-10-20 08:03:42 +00:00
Konstantin Belousov	1663075c64	Fix the race between devfs_fp_check and devfs_reclaim. Derefence the vnode' v_rdev and increment the dev threadcount , as well as clear it (in devfs_reclaim) under the dev_lock(). Reviewed by: tegge Approved by: pjd (mentor)	2006-10-20 07:59:50 +00:00
Konstantin Belousov	828d6d12da	Properly lock the vnode around vgone() calls. Unlock the vnode in devfs_close() while calling into the driver d_close() routine. devfs_revoke() changes by: ups Reviewed and bugfixes by: tegge Tested by: mbr, Peter Holm Approved by: pjd (mentor) MFC after: 1 week	2006-10-18 11:17:14 +00:00
Tor Egge	5da56ddb21	Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().	2006-09-26 04:12:49 +00:00
Konstantin Belousov	af72db7175	Fix the bug in rev. 1.134. In devfs_allocv_drop_refs(), when not_found == 2 and drop_dm_lock is true, no unlocking shall be attempted. The lock is already dropped and memory is freed. Found with: Coverity Prevent(tm) CID: 1536 Approved by: pjd (mentor)	2006-09-19 14:03:02 +00:00
Konstantin Belousov	e7f9b74438	Resolve the devfs deadlock caused by LOR between devfs_mount->dm_lock and vnode lock in devfs_allocv. Do this by temporary dropping dm_lock around vnode locking. For safe operation, add hold counters for both devfs_mount and devfs_dirent, and DE_DOOMED flag for devfs_dirent. The facilities allow to continue after dropping of the dm_lock, by making sure that referenced memory does not disappear. Reviewed by: tegge Tested by: kris Approved by: kan (mentor) PR: kern/102335	2006-09-18 13:23:08 +00:00
Poul-Henning Kamp	9c499ad92f	Remove the NDEVFSINO and NDEVFSOVERFLOW options which no longer exists in DEVFS. Remove the opt_devfs.h file now that it is empty.	2006-07-17 09:07:02 +00:00
Stephan Uphoff	56eeb277cb	Add vnode interlocking to devfs. This prevents race conditions that can cause pagefaults or devfs to use arbitrary vnodes. MFC after: 1 week	2006-07-12 20:25:35 +00:00
Robert Watson	2551d4f66e	Remove now unneeded opt_mac.h and mac.h includes. MFC after: 3 days	2006-07-06 13:24:22 +00:00
Robert Watson	83ff52a7f3	Use #include "", not #include <> for opt_foo.h. MFC after: 3 days	2006-07-06 13:22:08 +00:00
Pawel Jakub Dawidek	00a480ac5c	Remove unused prototypes.	2006-04-12 12:17:29 +00:00
Jeff Roberson	23b77994f2	- Add a bogus vhold/vdrop around vgone() in devfs_revoke. Without this the vnode is never recycled. It is bogus because the reference really should be associated with the devfs dirent.	2006-03-31 23:37:29 +00:00
Jeff Roberson	f50b03bfd6	- We must hold a reference to a vnode before calling vgone() otherwise it may not be removed from the freelist. MFC After: 1 week Found by: kris	2006-02-22 09:05:40 +00:00
Jeff Roberson	3b77d80cdd	- Remove a stale comment. This function was rewritten to be SMP safe some time ago. Sponsored by: Isilon Systems, Inc.	2006-01-30 08:24:14 +00:00
Robert Watson	8f0d99d790	When returning EIO from DEVFSIO_RADD ioctl, drop the exclusive rule lock. Otherwise the system comes to a rather sudden and grinding halt. MFC after: 1 week	2006-01-03 09:49:10 +00:00
Doug White	16e35dcc39	This is a workaround for a complicated issue involving VFS cookies and devfs. The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator. This issue affects HEAD and RELENG_6 only. PR: 88249 Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com> MFC after: 3 days	2005-11-09 22:03:50 +00:00
Poul-Henning Kamp	3b72f38b5e	Use correct cirteria for determining which directory entries we can purge right away and which we merely can hide. Beaten into my skull by: kris	2005-10-18 20:21:25 +00:00
Poul-Henning Kamp	e515ee7832	Make rule zero really magical, that way we don't have to do anything when we mount and get zero cost if no rules are used in a mountpoint. Add code to deref rules on unmount. Switch from SLIST to TAILQ. Drop SYSINIT, use SX_SYSINIT and static initializer of TAILQ instead. Drop goto, a break will do. Reduce double pointers to single pointers. Combine reaping and destroying rulesets. Avoid memory leaks in a some error cases.	2005-09-24 07:03:09 +00:00
Poul-Henning Kamp	e606a3c63e	Rewamp DEVFS internals pretty severely [1]. Give DEVFS a proper inode called struct cdev_priv. It is important to keep in mind that this "inode" is shared between all DEVFS mountpoints, therefore it is protected by the global device mutex. Link the cdev_priv's into a list, protected by the global device mutex. Keep track of each cdev_priv's state with a flag bit and of references from mountpoints with a dedicated usecount. Reap the benefits of much improved kernel memory allocator and the generally better defined device driver APIs to get rid of the tables of pointers + serial numbers, their overflow tables, the atomics to muck about in them and all the trouble that resulted in. This makes RAM the only limit on how many devices we can have. The cdev_priv is actually a super struct containing the normal cdev as the "public" part, and therefore allocation and freeing has moved to devfs_devs.c from kern_conf.c. The overall responsibility is (to be) split such that kern/kern_conf.c is the stuff that deals with drivers and struct cdev and fs/devfs handles filesystems and struct cdev_priv and their private liason exposed only in devfs_int.h. Move the inode number from cdev to cdev_priv and allocate inode numbers properly with unr. Local dirents in the mountpoints (directories, symlinks) allocate inodes from the same pool to guarantee against overlaps. Various other fields are going to migrate from cdev to cdev_priv in the future in order to hide them. A few fields may migrate from devfs_dirent to cdev_priv as well. Protect the DEVFS mountpoint with an sx lock instead of lockmgr, this lock also protects the directory tree of the mountpoint. Give each mountpoint a unique integer index, allocated with unr. Use it into an array of devfs_dirent pointers in each cdev_priv. Initially the array points to a single element also inside cdev_priv, but as more devfs instances are mounted, the array is extended with malloc(9) as necessary when the filesystem populates its directory tree. Retire the cdev alias lists, the cdev_priv now know about all the relevant devfs_dirents (and their vnodes) and devfs_revoke() will pick them up from there. We still spelunk into other mountpoints and fondle their data without 100% good locking. It may make better sense to vector the revoke event into the tty code and there do a destroy_dev/make_dev on the tty's devices, but that's for further study. Lots of shuffling of stuff and churn of bits for no good reason[2]. XXX: There is still nothing preventing the dev_clone EVENTHANDLER from being invoked at the same time in two devfs mountpoints. It is not obvious what the best course of action is here. XXX: comment out an if statement that lost its body, until I can find out what should go there so it doesn't do damage in the meantime. XXX: Leave in a few extra malloc types and KASSERTS to help track down any remaining issues. Much testing provided by: Kris Much confusion caused by (races in): md(4) [1] You are not supposed to understand anything past this point. [2] This line should simplify life for the peanut gallery.	2005-09-19 19:56:48 +00:00
Poul-Henning Kamp	59307b0dfe	Don't attempt to recurse lockmgr, it doesn't like it.	2005-09-15 21:16:43 +00:00
Poul-Henning Kamp	214c8ff0e4	Various minor polishing.	2005-09-15 10:28:19 +00:00
Poul-Henning Kamp	6556102dcb	Protect the devfs rule internal global lists with a sx lock, the per mount locks are not enough. Finer granularity (x)locking could be implemented, but I prefer to keep it simple for now.	2005-09-15 08:50:16 +00:00
Poul-Henning Kamp	ab32e95296	Absolve devfs_rule.c from locking responsibility and call it with all necessary locking held.	2005-09-15 08:36:37 +00:00
Poul-Henning Kamp	5e080af41f	Close a race which could result in unwarranted "ruleset %d already running" panics. Previously, recursion through the "include" feature was prevented by marking each ruleset as "running" when applied. This doesn't work for the case where two DEVFS instances try to apply the same ruleset at the same time. Instead introduce the sysctl vfs.devfs.rule_depth (default == 1) which limits how many levels of "include" we will traverse. Be aware that traversal of "include" is recursive and kernel stack size is limited. MFC: after 3 days	2005-09-15 06:57:28 +00:00
Poul-Henning Kamp	21806f30bc	Clean up prototypes.	2005-09-12 08:03:15 +00:00
Poul-Henning Kamp	80447bf701	Add a missing dev_relthread() call. Remove unused variable. Spotted by: Hans Petter Selasky <hselasky@c2i.net>	2005-08-29 11:14:18 +00:00
Poul-Henning Kamp	516ad423b1	Handle device drivers with D_NEEDGIANT in a way which does not penalize the 'good' drivers: Allocate a shadow cdevsw and populate it with wrapper functions which grab Giant	2005-08-17 08:19:52 +00:00
Poul-Henning Kamp	31cc57cdbd	Collect the devfs related sysctls in one place	2005-08-16 19:25:02 +00:00
Poul-Henning Kamp	9c0af1310c	Create a new internal .h file to communicate very private stuff from kern_conf.c to devfs. For now just two prototypes, more to come.	2005-08-16 19:08:01 +00:00
Poul-Henning Kamp	d785dfefa4	Eliminate effectively unused dm_basedir field from devfs_mount.	2005-08-15 19:40:53 +00:00
Robert Watson	6a113b3de7	Merge the dev_clone and dev_clone_cred event handlers into a single event handler, dev_clone, which accepts a credential argument. Implementors of the event can ignore it if they're not interested, and most do. This avoids having multiple event handler types and fall-back/precedence logic in devfs. This changes the kernel API for /dev cloning, and may affect third party packages containg cloning kernel modules. Requested by: phk MFC after: 3 days	2005-08-08 19:55:32 +00:00
Kris Kennaway	e29c976a58	devfs is not yet fully MPSAFE - for example, multiple concurrent devfs(8) processes can cause a panic when operating on rulesets. Approved by: phk	2005-07-29 23:00:56 +00:00
Simon L. B. Nielsen	02a4be3f74	Correct devfs ruleset bypass. Submitted by: csjp Reviewed by: phk Security: FreeBSD-SA-05:17.devfs Approved by: cperciva	2005-07-20 13:34:16 +00:00
Robert Watson	d26dd2d99e	When devfs cloning takes place, provide access to the credential of the process that caused the clone event to take place for the device driver creating the device. This allows cloned device drivers to adapt the device node based on security aspects of the process, such as the uid, gid, and MAC label. - Add a cred reference to struct cdev, so that when a device node is instantiated as a vnode, the cloning credential can be exposed to MAC. - Add make_dev_cred(), a version of make_dev() that additionally accepts the credential to stick in the struct cdev. Implement it and make_dev() in terms of a back-end make_dev_credv(). - Add a new event handler, dev_clone_cred, which can be registered to receive the credential instead of dev_clone, if desired. - Modify the MAC entry point mac_create_devfs_device() to accept an optional credential pointer (may be NULL), so that MAC policies can inspect and act on the label or other elements of the credential when initializing the skeleton device protections. - Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(), so that the pty clone credential is exposed to the MAC Framework. While currently primarily focussed on MAC policies, this change is also a prerequisite for changes to allow ptys to be instantiated with the UID of the process looking up the pty. This requires further changes to the pty driver -- in particular, to immediately recycle pty nodes on last close so that the credential-related state can be recreated on next lookup. Submitted by: Andrew Reisse <andrew.reisse@sparta.com> Obtained from: TrustedBSD Project Sponsored by: SPAWAR, SPARTA MFC after: 1 week MFC note: Merge to 6.x, but not 5.x for ABI reasons	2005-07-14 10:22:09 +00:00
Craig Rodrigues	fd225fe4a3	Do not declare a struct as extern, and then implement it as static in the same file. This is not legal C, and GCC 4.0 will issue an error. Reviewed by: phk Approved by: das (mentor)	2005-05-31 14:50:49 +00:00
Jeff Roberson	7b6b7657d2	- In devfs_open() and devfs_close() grab Giant if the driver sets NEEDGIANT. We still have to DROP_GIANT and PICKUP_GIANT when NEEDGIANT is not set because vfs is still sometime entered with Giant held.	2005-05-01 00:56:34 +00:00
Jeff Roberson	cd360e947b	- Mark devfs as MNTK_MPSAFE as I belive it does not require Giant. Sponsored by: Isilon Systems, Inc. Agreed in principle by: phk	2005-04-30 11:24:17 +00:00
Jeff Roberson	4585e3ac5a	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. Se vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
Poul-Henning Kamp	f4f6abcb4e	Explicitly hold a reference to the cdev we have just cloned. This closes the race where the cdev was reclaimed before it ever made it back to devfs lookup.	2005-03-31 12:19:44 +00:00

1 2 3 4 5

206 Commits