freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-17 10:26:15 +00:00

Author	SHA1	Message	Date
Poul-Henning Kamp	964ebefd8d	Use system wide no-op vfs_start function.	2004-11-25 09:11:27 +00:00
Jeff Roberson	b646893f0f	- Eliminate the acquisition and release of the bqlock in bremfree() by setting the B_REMFREE flag in the buf. This is done to prevent lock order reversals with code that must call bremfree() with a local lock held. This also reduces overhead by removing two lock operations per buf for fsync() and similar. - Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock has been acquired so that we may remove ourself from the free-list. - Provide a bremfreef() function to immediately remove a buf from a free-list for use only by NFS. This is done because the nfsclient code overloads the b_freelist queue for its own async. io queue. - Simplify the numfreebuffers accounting by removing a switch statement that executed the same code in every possible case. - getnewbuf() can encounter locked bufs on free-lists once Giant is removed. Remove a panic associated with this condition and delay asserts that inspect the buf until after it is locked. Reviewed by: phk Sponsored by: Isilon Systems, Inc.	2004-11-18 08:44:09 +00:00
Poul-Henning Kamp	51ac12ab28	Be prepared to accept NULL mountargs as part of root-mounting.	2004-11-13 13:04:31 +00:00
Poul-Henning Kamp	cf5e414960	Put back the vfs_object_create() calls, they do make a difference when my test-setup does what I want it to instead of what I ask it to. Pointed out by: tegge	2004-11-12 10:27:14 +00:00
Poul-Henning Kamp	40ce27cb57	fix some comments	2004-11-10 06:53:31 +00:00
Poul-Henning Kamp	2e6649198a	Use mount flags instead of NULL path to detect root filesystem mount.	2004-11-09 23:38:10 +00:00
Poul-Henning Kamp	5e2ccaff7a	Stop pretending to have a vm_object backing the underlying disk vnode: it isn't used for anything anywhere and the vnode_pager would explode if we attempted to.	2004-11-09 23:12:45 +00:00
Poul-Henning Kamp	40c340aa5d	Don't grab the exclusive bit on a root filesystem until we are willing to mount it. Doing so prevented fsck from being run after a refused mount.	2004-11-04 09:11:22 +00:00
Poul-Henning Kamp	4392001125	Move UFS from DEVFS backing to GEOM backing. This eliminates a bunch of vnode overhead (approx 1-2 % speed improvement) and gives us more control over the access to the storage device. Access counts on the underlying device are not correctly tracked and therefore it is possible to read-only mount the same disk device multiple times: syv# mount -p /dev/md0 /var ufs rw 2 2 /dev/ad0 /mnt ufs ro 1 1 /dev/ad0 /mnt2 ufs ro 1 1 /dev/ad0 /mnt3 ufs ro 1 1 Since UFS/FFS is not a synchronously consistent filesystem (ie: it caches things in RAM) this is not possible with read-write mounts, and the system will correctly reject this. Details: Add a geom consumer and a bufobj pointer to ufsmount. Eliminate the vnode argument from softdep_disk_prewrite(). Pick the vnode out of bp->b_vp for now. Eventually we should find it through bp->b_bufobj->b_private. In the mountcode, use g_vfs_open() once we have used VOP_ACCESS() to check permissions. When upgrading and downgrading between r/o and r/w do the right thing with GEOM access counts. Remove all the workarounds for not being able to do this with VOP_OPEN(). If we are the root mount, drop the exclusive access count until we upgrade to r/w. This allows fsck of the root filesystem and the MNT_RELOAD to work correctly. Set bo_private to the GEOM consumer on the device bufobj. Change the ffs_ops->strategy function to call g_vfs_strategy() In ufs_strategy() directly call the strategy on the disk bufobj. Same in rawread. In ffs_fsync() we will no longer see VCHR device nodes, so remove code which synced the filesystem mounted on it, in case we came there. I'm not sure this code made sense in the first place since we would have taken the specfs route on such a vnode. Redo the highly bogus readblock() function in the snapshot code to something slightly less bogus: Constructing an uio and using physio was really quite a detour. Instead just fill in a bio and ship it down.	2004-10-29 10:15:56 +00:00
Poul-Henning Kamp	570a7ddaa3	We only support backing UFS/FFS with disks.	2004-10-28 06:19:28 +00:00
Poul-Henning Kamp	a40a512387	Eliminate unnecessary KASSERTS.	2004-10-27 06:45:06 +00:00
Poul-Henning Kamp	93d244fb1a	KASSERT that we only get to prewrite() on writes.	2004-10-26 20:13:49 +00:00
Poul-Henning Kamp	8dd5650594	White space changes. Add missing static.	2004-10-26 20:13:21 +00:00
Poul-Henning Kamp	6e77a04170	The island council met and voted buf_prewrite() home. Give ffs its own bufobj->bo_ops vector and create a private strategy routine, (currently misnamed for forwards compatibility), which is just a copy of the generic bufstrategy routine except we call softdep_disk_prewrite() directly instead of through the buf_prewrite() indirection. Teach UFS about the need for softdep_disk_prewrite() and call the function directly in FFS. Remove buf_prewrite() from the default bufstrategy() and from the global bio_ops method vector.	2004-10-26 10:44:10 +00:00
Poul-Henning Kamp	58883a1fe5	Fix syntax errors introduced by last commit. Why isn't DIRECTIO in NOTES/LINT ?	2004-10-26 09:04:20 +00:00
Poul-Henning Kamp	5d9d81e7ea	Put the I/O block size in bufobj->bo_bsize. We keep si_bsize_phys around for now as that is the simplest way to pull the number out of disk device drivers in devfs_open(). The correct solution would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably moot when filesystems sit on GEOM, so don't bother for now.	2004-10-26 07:39:12 +00:00
Poul-Henning Kamp	fae974f156	Degeneralize the per cdev copyonwrite callback. The only possible value is ffs_copyonwrite() and the only place it can be called from is FFS which would never want to call another filesystems copyonwrite method, should one exist, so there is no reason why anything generic should know about this.	2004-10-26 06:25:56 +00:00
Poul-Henning Kamp	156cb26583	Lose the v_dirty* and v_clean* alias macros. Check the count field where we just want to know the full/empty state, rather than using TAILQ_EMPTY() or TAILQ_FIRST().	2004-10-25 09:14:03 +00:00
Poul-Henning Kamp	ee1d0eb330	Remove vnode->v_bsize. This was a dead-end.	2004-10-25 07:50:59 +00:00
Poul-Henning Kamp	b792bebeea	Move the buffer method vector (buf->b_op) to the bufobj. Extend it with a strategy method. Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().	2004-10-24 20:03:41 +00:00
Poul-Henning Kamp	494eb176e7	Add b_bufobj to struct buf which eventually will eliminate the need for b_vp. Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj) also VI_MTX() to BO_MTX(), Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.	2004-10-22 08:47:20 +00:00
Poul-Henning Kamp	a76d8f4ec9	Move the VI_BWAIT flag into the bo_flag element of bufobj and call it BO_WWAIT. Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup(). Use these functions all relevant places except in ffs_softdep.c where the use of interlocked_sleep() makes this impossible. Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.	2004-10-21 15:53:54 +00:00
Robert Watson	60c9762920	Explicitly break out NETA license from Berkeley license to clearly indicate license grant, as well as to indicate that NETA is asserting only two clauses, not four clauses. Requested by: imp	2004-10-20 08:05:02 +00:00
Nate Lawson	894d8d3c03	Fix fsbtodb() for UFS1. This fixes an overflow for file sizes >1 TB, allowing for sizes up to 4 TB. This doesn't affect UFS2 since b is already a 64 bit type, coincidental with daddr_t. Submitted by: bde	2004-10-09 20:16:06 +00:00
Pawel Jakub Dawidek	8d02a378aa	Back out changes which were introduced to delay mounting root file system. Those changes were made on gmirror needs, but now gmirror handles this by itself.	2004-10-05 11:26:43 +00:00
Poul-Henning Kamp	4f116178ba	Remove support for accessing device nodes in UFS/FFS. Device nodes can still be created and exported with NFS.	2004-09-28 13:30:58 +00:00
Poul-Henning Kamp	961da2716b	Give cluster_write() an explicit vnode argument. In the future a struct buf will not automatically point out a vnode for us.	2004-09-27 19:14:10 +00:00
Pawel Jakub Dawidek	5a19f8b0c4	Introduce a new /boot/loader.conf variable: root_mount_delay. It can be used to delay mounting the root partition to give GEOM providers a chance to show up. Now, when the needed provider is missing, the vfs_rootmount() function will look for it every second, and if it can't be found in the defined time, it'll ask for the root device name (before this change it was done immediately). This will allow booting from a gmirror device in degraded mode.	2004-09-23 10:13:18 +00:00
Poul-Henning Kamp	d705e025d0	The getpages VOP was a good stab at getting scatter/gather I/O without too much kernel copying, but it is not the right way to do it, and it is in the way for straightening out the buffer cache. The right way is to pass the VM page array down through the struct bio to the disk device driver and DMA directly in to/out of the physical memory. Once the VM/buf thing is sorted out it is next on the list. Retire most of the vnode method ffs_getpages(). It is not clear if what is left shouldn't be in the default implementation which we now fall back to. Retire specfs_getpages() as well, as it has no users now.	2004-09-19 08:14:55 +00:00
Poul-Henning Kamp	b08c753baa	Do not traverse list of snapshots if there isn't one. Found by: scottl	2004-09-16 17:28:56 +00:00
Poul-Henning Kamp	b85e29f007	Missed a place where snapshots were allocated in my last commit to this file.	2004-09-16 15:58:18 +00:00
Poul-Henning Kamp	67673e6677	Create struct snapdata which contains the snapshot fields from cdev and the previously malloc'ed snapshot lock. Malloc struct snapdata instead of just the lock. Replace snapshot fields in cdev with pointer to snapdata (saves 16 bytes). While here, give the private readblock() function a vnode argument in preparation for moving UFS to access GEOM directly.	2004-09-13 07:29:45 +00:00
Poul-Henning Kamp	883d3c0c07	Remove the buffercache/vnode side of BIO_DELETE processing in preparation for integration of p4::phk_bufwork. In the future, local filesystems will talk to GEOM directly and they will consequently be able to issue BIO_DELETE directly. Since the removal of the fla driver, BIO_DELETE has effectively been a no-op anyway.	2004-09-13 06:50:42 +00:00
John Baldwin	b72ea57f3b	Generalize the UFS bad magic value used to determine when a filesystem has only been partly initialized via newfs(8) so that it applies to both UFS1 and UFS2. Submitted by: "Xin LI" delphij at frontfree dot net MFC: maybe?	2004-08-19 11:09:13 +00:00
John-Mark Gurney	ad3b9257c2	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
Poul-Henning Kamp	7ac439fec4	use bufdone() not biodone().	2004-08-08 13:23:05 +00:00
Poul-Henning Kamp	5e8c582ac2	Put a version element in the VFS filesystem configuration structure and refuse initializing filesystems with a wrong version. This will aid maintenance activities on the 5-stable branch. s/vfs_mount/vfs_omount/ s/vfs_nmount/vfs_mount/ Name our filesystems mount function consistently. Eliminate the nameidata argument to both vfs_mount and vfs_omount. It was originally there to save stack space. A few places abused it to get hold of some credentials to pass around. Effectively it is unused. Reorganize the root filesystem selection code.	2004-07-30 22:08:52 +00:00
Poul-Henning Kamp	d634f69316	Remove global variable rootdevs and rootvp, they are unused as such. Add local rootvp variables as needed. Remove checks for miniroot's in the swappartition. We never did that and most of the filesystems could never be used for that, but it had still been copy&pasted all over the place.	2004-07-28 20:21:04 +00:00
Alexander Kabaev	b403319b8d	Avoid using casts as lvalues. Introduce DIP_SET macro which sets proper inode field based on UFS version. Use DIP to read values and DIP_SET to modify them throughout FFS code base.	2004-07-28 06:41:27 +00:00
Colin Percival	56f21b9d74	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
Poul-Henning Kamp	d8d3d4158b	Make sure to update the mnt_stats before UFS1 extattr tried to do I/O on the device. Otherwise the blocksize is undefined in the buffer cache.	2004-07-14 14:19:32 +00:00
Alfred Perlstein	f257b7a54b	Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.	2004-07-12 08:14:09 +00:00
Marcel Moolenaar	f65de26bf6	Update for the KDB debugger framework: o Make debugging code conditional upon KDB. o Use kdb_backtrace() instead of backtrace(). o Remove inclusion of opt_ddb.h.	2004-07-10 20:45:47 +00:00
Poul-Henning Kamp	c94cd5fc8c	Explicitly initialize vp->v_bsize.	2004-07-07 20:04:06 +00:00
Poul-Henning Kamp	e3c5a7a4dd	When we traverse the vnodes on a mountpoint we need to look out for our cached 'next vnode' being removed from this mountpoint. If we find that it was recycled, we restart our traversal from the start of the list. Code to do that is in all local disk filesystems (and a few other places) and looks roughly like this: MNT_ILOCK(mp); loop: for (vp = TAILQ_FIRST(&mp...); (vp = nvp) != NULL; nvp = TAILQ_NEXT(vp,...)) { if (vp->v_mount != mp) goto loop; MNT_IUNLOCK(mp); ... MNT_ILOCK(mp); } MNT_IUNLOCK(mp); The code which takes vnodes off a mountpoint looks like this: MNT_ILOCK(vp->v_mount); ... TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes); ... MNT_IUNLOCK(vp->v_mount); ... vp->v_mount = something; (Take a moment and try to spot the locking error before you read on.) On a SMP system, one CPU could have removed nvp from our mountlist but not yet gotten to assign a new value to vp->v_mount while another CPU simultaneously gets to the top of the traversal loop where it finds that (vp->v_mount != mp) is not true despite the fact that the vnode has indeed been removed from our mountpoint. Fix: Introduce the macro MNT_VNODE_FOREACH() to traverse the list of vnodes on a mountpoint while taking into account that vnodes may be removed from the list as we go. This saves approx 65 lines of duplicated code. Split the insmntque() which potentially moves a vnode from one mount point to another into delmntque() and insmntque() which do just what the names say. Fix delmntque() to set vp->v_mount to NULL while holding the mountpoint lock.	2004-07-04 08:52:35 +00:00
Jun Kuriyama	86030e4a00	Avoid deadlock which is caused by locking VDIR of parent and VREG of snapshot itself in wrong order. We can skip unlink check of that directory because it must have snapshot in it. Reviewed by: mckusick and current@	2004-06-18 14:35:17 +00:00
Poul-Henning Kamp	89c9c53da0	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
Julian Elischer	fa88511615	Nice is a property of a process as a whole. I mistakenly moved it to the ksegroup when breaking up the process structure. Put it back in the proc structure.	2004-06-16 00:26:31 +00:00
Stefan Farfeleder	1a5ff9285a	Avoid assignments to cast expressions. Reviewed by: md5 Approved by: das (mentor)	2004-06-08 13:08:19 +00:00
Tim J. Robbins	fa2a4d0595	Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid having to acquire sched_lock when manipulating it in lockmgr(), uiomove(), and uiomove_fromphys(). Reviewed by: jhb	2004-06-03 01:47:37 +00:00


709 Commits