1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-15 10:17:20 +00:00
Commit Graph

192 Commits

Author SHA1 Message Date
John Dyson
e85c1afb7c Allow vfs_ioopt to be enabled with a (temporary) config option. 1998-03-16 02:13:03 +00:00
John Dyson
bef608bd7e Some VM improvements, including elimination of alot of Sig-11
problems.  Tor Egge and others have helped with various VM bugs
lately, but don't blame him -- blame me!!!

pmap.c:
1)	Create an object for kernel page table allocations.  This
	fixes a bogus allocation method previously used for such, by
	grabbing pages from the kernel object, using bogus pindexes.
	(This was a code cleanup, and perhaps a minor system stability
	 issue.)

pmap.c:
2)	Pre-set the modify and accessed bits when prudent.  This will
	decrease bus traffic under certain circumstances.

vfs_bio.c, vfs_cluster.c:
3)	Rather than calculating the beginning virtual byte offset
	multiple times, stick the offset into the buffer header, so
	that the calculated offset can be reused.  (Long long multiplies
	are often expensive, and this is a probably unmeasurable performance
	improvement, and code cleanup.)

vfs_bio.c:
4)	Handle write recursion more intelligently (but not perfectly) so
	that it is less likely to cause a system panic, and is also
	much more robust.

vfs_bio.c:
5)	getblk incorrectly wrote out blocks that are incorrectly sized.
	The problem is fixed, and writes blocks out ONLY when B_DELWRI
	is true.

vfs_bio.c:
6)	Check that already constituted buffers have fully valid pages.  If
	not, then make sure that the B_CACHE bit is not set. (This was
	a major source of Sig-11 type problems.)

vfs_bio.c:
7)	Fix a potential system deadlock due to an incorrectly specified
	sleep priority while waiting for a buffer write operation.  The
	change that I made opens the system up to serious problems, and
	we need to examine the issue of process sleep priorities.

vfs_cluster.c, vfs_bio.c:
8)	Make clustered reads work more correctly (and more completely)
	when buffers are already constituted, but not fully valid.
	(This was another system reliability issue.)

vfs_subr.c, ffs_inode.c:
9)	Create a vtruncbuf function, which is used by filesystems that
	can truncate files.  The vinvalbuf forced a file sync type operation,
	while vtruncbuf only invalidates the buffers past the new end of file,
	and also invalidates the appropriate pages.  (This was a system reliabiliy
	and performance issue.)

10)	Modify FFS to use vtruncbuf.

vm_object.c:
11)	Make the object rundown mechanism for OBJT_VNODE type objects work
	more correctly.  Included in that fix, create pager entries for
	the OBJT_DEAD pager type, so that paging requests that might slip
	in during race conditions are properly handled.  (This was a system
	reliability issue.)

vm_page.c:
12)	Make some of the page validation routines be a little less picky
	about arguments passed to them.  Also, support page invalidation
	change the object generation count so that we handle generation
	counts a little more robustly.

vm_pageout.c:
13)	Further reduce pageout daemon activity when the system doesn't
	need help from it.  There should be no additional performance
	decrease even when the pageout daemon is running.  (This was
	a significant performance issue.)

vnode_pager.c:
14)	Teach the vnode pager to handle race conditions during vnode
	deallocations.
1998-03-16 01:56:03 +00:00
John Dyson
26300b34f1 Disable the vfs.ioopt option for now, so that we don't get gratuitious
bugreports.  I might not be able to fix the problems before 3.0, due
to other, more important things.
1998-03-14 19:50:36 +00:00
Tor Egge
8293f20aee Don't misuse vnode interlocks in routines that can be called from interrupts.
PR:		5893
1998-03-14 02:55:01 +00:00
Julian Elischer
b1897c197c Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman)
Submitted by:	Kirk McKusick (mcKusick@mckusick.com)
Obtained from:  WHistle development tree
1998-03-08 09:59:44 +00:00
John Dyson
8f9110f6a1 This mega-commit is meant to fix numerous interrelated problems. There
has been some bitrot and incorrect assumptions in the vfs_bio code.  These
problems have manifest themselves worse on NFS type filesystems, but can
still affect local filesystems under certain circumstances.  Most of
the problems have involved mmap consistancy, and as a side-effect broke
the vfs.ioopt code.  This code might have been committed seperately, but
almost everything is interrelated.

1)	Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that
	are fully valid.
2)	Rather than deactivating erroneously read initial (header) pages in
	kern_exec, we now free them.
3)	Fix the rundown of non-VMIO buffers that are in an inconsistent
	(missing vp) state.
4)	Fix the disassociation of pages from buffers in brelse.  The previous
	code had rotted and was faulty in a couple of important circumstances.
5)	Remove a gratuitious buffer wakeup in vfs_vmio_release.
6)	Remove a crufty and currently unused cluster mechanism for VBLK
	files in vfs_bio_awrite.  When the code is functional, I'll add back
	a cleaner version.
7)	The page busy count wakeups assocated with the buffer cache usage were
	incorrectly cleaned up in a previous commit by me.  Revert to the
	original, correct version, but with a cleaner implementation.
8)	The cluster read code now tries to keep data associated with buffers
	more aggressively (without breaking the heuristics) when it is presumed
	that the read data (buffers) will be soon needed.
9)	Change to filesystem lockmgr locks so that they use LK_NOPAUSE.  The
	delay loop waiting is not useful for filesystem locks, due to the
	length of the time intervals.
10)	Correct and clean-up spec_getpages.
11)	Implement a fully functional nfs_getpages, nfs_putpages.
12)	Fix nfs_write so that modifications are coherent with the NFS data on
	the server disk (at least as well as NFS seems to allow.)
13)	Properly support MS_INVALIDATE on NFS.
14)	Properly pass down MS_INVALIDATE to lower levels of the VM code from
	vm_map_clean.
15)	Better support the notion of pages being busy but valid, so that
	fewer in-transit waits occur.  (use p->busy more for pageouts instead
	of PG_BUSY.)  Since the page is fully valid, it is still usable for
	reads.
16)	It is possible (in error) for cached pages to be busy.  Make the
	page allocation code handle that case correctly.  (It should probably
	be a printf or panic, but I want the system to handle coding errors
	robustly.  I'll probably add a printf.)
17)	Correct the design and usage of vm_page_sleep.  It didn't handle
	consistancy problems very well, so make the design a little less
	lofty.  After vm_page_sleep, if it ever blocked, it is still important
	to relookup the page (if the object generation count changed), and
	verify it's status (always.)
18)	In vm_pageout.c, vm_pageout_clean had rotted, so clean that up.
19)	Push the page busy for writes and VM_PROT_READ into vm_pageout_flush.
20)	Fix vm_pager_put_pages and it's descendents to support an int flag
	instead of a boolean, so that we can pass down the invalidate bit.
1998-03-07 21:37:31 +00:00
John Dyson
59228495d7 Change vfs.ioopt default back to '0'. 1998-03-01 23:07:45 +00:00
John Dyson
ffc82b0a70 1) Use a more consistent page wait methodology.
2)	Do not unnecessarily force page blocking when paging
	pages out.
3)	Further improve swap pager performance and correctness,
	including fixing the paging in progress deadlock (except
	in severe I/O error conditions.)
4)	Enable vfs_ioopt=1 as a default.
5)	Fix and enable the page prezeroing in SMP mode.

All in all, SMP systems especially should show a significant
improvement in "snappyness."
1998-03-01 04:18:54 +00:00
John Dyson
64d3c7e32d Clean-up the vget mechanism by permanently attaching VM objects to
vnodes, therefore vget doesn't need to do so anymore.  Other minor
improvements include the temp free vnode queue obeying the VAGE
flag and a printf that warns of to-be-removed code being executed.
1998-02-23 06:59:52 +00:00
KATO Takenori
1b11919b2b Fixed vnode interlock handling.
Reviewed by:	Bruce Evans <bde@zeta.org.au>
            	Tor Egge <Tor.Egge@idi.ntnu.no>
1998-02-10 02:54:24 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
KATO Takenori
16e3b0b67a When the vp is lcoked, vget() calls vfs_object_create() with
waslocked = TRUE.  This change may fix lockmgr panic in umapfs/nullfs.

PR:		5634
Reviewed by:	"John S. Dyson" <toor@dyson.iquest.net>
Suggested by:	Bruce Evans <bde@zeta.org.au>
1998-02-07 08:44:31 +00:00
Eivind Eklund
0b08f5f737 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
John Dyson
95461b450d 1) Start using a cleaner and more consistant page allocator instead
of the various ad-hoc schemes.
2)	When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3)	When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
	processor errata, and to minimize redundant processor updating of page
	tables.
4)	Modify pmap_protect so that it can only remove permissions (as it
	originally supported.)  The additional capability is not needed.
5)	Streamline read-only to read-write page mappings.
6)	For pmap_copy_page, don't enable write mapping for source page.
7)	Correct and clean-up pmap_incore.
8)	Cluster initial kern_exec pagin.
9)	Removal of some minor lint from kern_malloc.
10)	Correct some ioopt code.
11)	Remove some dead code from the MI swapout routine.
12)	Correct vm_object_deallocate (to remove backing_object ref.)
13)	Fix dead object handling, that had problems under heavy memory load.
14)	Add minor vm_page_lookup improvements.
15)	Some pages are not in objects, and make sure that the vm_page.c can
	properly support such pages.
16)	Add some more page deficit handling.
17)	Some minor code readability improvements.
1998-02-05 03:32:49 +00:00
Eivind Eklund
47cfdb166d Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
Tor Egge
d09a16d804 Update freevnodes when adding a vnode to the head of the free list. 1998-01-31 01:17:58 +00:00
John Dyson
50ce7ff499 Add better support for larger I/O clusters, including larger physical
I/O.  The support is not mature yet, and some of the underlying implementation
needs help.  However, support does exist for IDE devices now.
1998-01-24 02:01:46 +00:00
John Dyson
2d8acc0f4a VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneded collpase operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
John Dyson
4722175765 Tie up some loose ends in vnode/object management. Remove an unneeded
config option in pmap.  Fix a problem with faulting in pages.  Clean-up
some loose ends in swap pager memory management.

The system should be much more stable, but all subtile bugs aren't fixed yet.
1998-01-17 09:17:02 +00:00
John Dyson
53f6f08545 Fix another vnode leak. 1998-01-12 03:15:01 +00:00
John Dyson
925a3a419a Fix some vnode management problems, and better mgmt of vnode free list.
Fix the UIO optimization code.
Fix an assumption in vm_map_insert regarding allocation of swap pagers.
Fix an spl problem in the collapse handling in vm_object_deallocate.
When pages are freed from vnode objects, and the criteria for putting
the associated vnode onto the free list is reached, either put the
vnode onto the list, or put it onto an interrupt safe version of the
list, for further transfer onto the actual free list.
Some minor syntax changes changing pre-decs, pre-incs to post versions.
Remove a bogus timeout (that I added for debugging) from vn_lock.

PHK will likely still have problems with the vnode list management, and
so do I, but it is better than it was.
1998-01-12 01:46:33 +00:00
John Dyson
857d737ed6 Disable io optimizations again, minor bug found, and will be fixed in
a few days.
1998-01-07 09:26:29 +00:00
John Dyson
95e5e988e0 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitiously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

When vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon a reasonably possible.
1998-01-06 05:26:17 +00:00
John Dyson
483140ead1 Add the vnode interlock back around vref. 1997-12-29 16:54:03 +00:00
John Dyson
60f8d46448 Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem
with the object cache removal.
1997-12-29 01:03:55 +00:00
John Dyson
2be70f79f6 Lots of improvements, including restructring the caching and management
of vnodes and objects.  There are some metadata performance improvements
that come along with this.  There are also a few prototypes added when
the need is noticed.  Changes include:

1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.

Be gentle, and please give me feedback asap.
1997-12-29 00:25:11 +00:00
John Dyson
1efb74fbcc Some performance improvements, and code cleanups (including changing our
expensive OFF_TO_IDX to btoc whenever possible.)
1997-12-19 09:03:37 +00:00
Garrett Wollman
1cbbd625cc Add support for poll(2) on files. vop_nopoll() now returns POLLNVAL
if one of the new poll types is requested; hopefully this will not break
any existing code.  (This is done so that programs have a dependable
way of determining whether a filesystem supports the extended poll types
or not.)

The new poll types added are:

	POLLWRITE - file contents may have been modified
	POLLNLINK - file was linked, unlinked, or renamed
	POLLATTRIB - file's attributes may have been changed
	POLLEXTEND - file was extended

Note that the internal operation of poll() means that it is impossible
for two processes to reliably poll for the same event (this could
be fixed but may not be worth it), so it is not possible to rewrite
`tail -f' to use poll at this time.
1997-12-15 03:09:59 +00:00
Bruce Evans
cb451ebdbd Staticized. 1997-11-22 08:35:46 +00:00
Julian Elischer
b1f4a44b03 Reviewed by: various.
Ever since I first say the way the mount flags were used I've hated the
fact that modes, and events, internal and exported, and short-term
and long term flags are all thrown together. Finally it's annoyed me enough..
This patch to the entire FreeBSD tree adds a second mount flag word
to the mount struct. it is not exported to userspace. I have moved
some of the non exported flags over to this word. this means that we now
have 8 free bits in the mount flags. There are another two that might
well move over, but which I'm not sure about.
The only user visible change would have been in pstat -v, except
that davidg has disabled it anyhow.
I'd still like to move the state flags and the 'command' flags
apart from each other.. e.g. MNT_FORCE really doesn't have the
same semantics as MNT_RDONLY, but that's left  for another day.
1997-11-12 05:42:33 +00:00
Poul-Henning Kamp
4a11ca4e29 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
Poul-Henning Kamp
dba3870c10 VFS interior redecoration.
Rename vn_default_error to vop_defaultop all over the place.
Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite.
Use vop_null instead of nullop.
Move vop_nopoll from vfs_subr.c to vfs_default.c
Move vop_sharedlock from vfs_subr.c to vfs_default.c
Move vop_nolock from vfs_subr.c to vfs_default.c
Move vop_nounlock from vfs_subr.c to vfs_default.c
Move vop_noislocked from vfs_subr.c to vfs_default.c
Use vop_ebadf instead of *_ebadf.
Add vop_defaultop for getpages on master vnode in MFS.
1997-10-26 20:55:39 +00:00
Poul-Henning Kamp
a1c995b626 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
Poul-Henning Kamp
55166637cd Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
Poul-Henning Kamp
f7891f9adb Dike out a weird warning. 1997-10-11 07:34:27 +00:00
Poul-Henning Kamp
d047b580c6 I lost a bit of my change in the last commit, this is more like it.
Noticed by:	bde
1997-09-26 08:08:58 +00:00
Poul-Henning Kamp
87b1940afa Reduce the target number of vnodes on the freelist from desiredvnodes
(usually a couple of thousand) to 25.  The measured impact on cache-hits
doesn't justify spending memory this way:

Target number of free vnodes versus namecache hit rate in % during a
make world:
          10    98.5316
         200    98.5479
         500    98.5546
        1000    98.5709
        3000    98.6006
        4000    98.6126
1997-09-25 16:17:57 +00:00
Poul-Henning Kamp
0054419366 A couple of handles to tweak, more statistics. 1997-09-24 07:46:54 +00:00
Bruce Evans
514ede0953 Fixed gratuitous ANSIisms. 1997-09-16 11:44:05 +00:00
Peter Wemm
7fab77996c Provide a 'return true' poll vnode op rather than duplicating the
'do nothing' case all over the various filesystems.
1997-09-14 02:49:06 +00:00
Peter Wemm
557fe2c5e1 print correct function name in a panic (vop_nolock -> vop_sharedlock) 1997-09-13 15:02:28 +00:00
Bruce Evans
41fadeeb28 Removed yet more vestiges of config-time swap configuration and/or
cleaned up nearby cruft.
1997-09-07 16:21:11 +00:00
Bruce Evans
a910e75cdc Removed vestiges of config-time "argument processing" configuration. 1997-09-07 13:49:56 +00:00
Poul-Henning Kamp
fd9d9ff13e Hmm, this is hopefully better. 1997-09-03 13:29:41 +00:00
Poul-Henning Kamp
7cb22688e9 Revert the v_usecount handling in relation to VOP_INACTIVE. 1997-09-03 09:18:48 +00:00
Bruce Evans
e4ba6a82b0 Removed unused #includes. 1997-09-02 20:06:59 +00:00
Poul-Henning Kamp
a051452ae2 Change the 0xdeadb hack to a flag called VDOOMED.
Introduce VFREE which indicates that vnode is on freelist.
Rename vholdrele() to vdrop().
Create vfree() and vbusy() to add/delete vnode from freelist.
Add vfree()/vbusy() to keep (v_holdcnt != 0 || v_usecount != 0)
  vnodes off the freelist.
Generalize vhold()/v_holdcnt to mean "do not recycle".
Fix reassignbuf()s lack of use of vhold().
Use vhold() instead of checking v_cache_src list.
Remove vtouch(), the vnodes are always vget'ed soon enough
  after for it to have any measuable effect.
Add sysctl debug.freevnodes to keep track of things.
Move cache_purge() up in getnewvnodes to avoid race.
Decrement v_usecount after VOP_INACTIVE(), put a vhold() on
  it during VOP_INACTIVE()
Unmacroize vhold()/vdrop()
Print out VDOOMED and VFREE flags (XXX: should use %b)

Reviewed by:		dyson
1997-08-31 07:32:39 +00:00
Bruce Evans
d3114049c1 Restored rev.1.92 which was clobbered by the previous commit. 1997-08-26 11:59:20 +00:00
John Dyson
a5db4bf475 Back out some incorrect changes that was worse than the original bug. 1997-08-26 04:36:27 +00:00
John Dyson
89721f6f1a This is a trial improvement for the vnode reference count while on the vnode
free list problem.  Also, the vnode age flag is no longer used by the
vnode pager.  (It is actually incorrect to use then.)  Constructive
feedback welcome -- just be kind.
1997-08-22 03:56:37 +00:00
Bruce Evans
b1037dcd53 #include <machine/limits.h> explicitly in the few places that it is required. 1997-08-21 20:33:42 +00:00
Garrett Wollman
57bf258e3d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
John Dyson
23be6be853 Fix a problem with the vfs vnode caching that it doesn't grow quickly
enough and can cause some strange performance problems.  Specifically, at
or near startup time is when the problem is worst.  To reproduce
the problem, run "lat_syscall stat" from the alpha lmbench code right
after bootup.  A positive side effect of this mod is that the name
cache can be set to grow again by sysctl.  A noticable positive
performance impact is realized due to a larger namecache being available
as needed (or tuned.)
1997-08-04 07:43:28 +00:00
Doug Rabson
f6b4c28555 Merge WebNFS support from NetBSD
Obtained from:	NetBSD
1997-07-17 07:17:33 +00:00
John Dyson
3c631446d3 Remove a window during running down a file vnode. Also, the OBJ_DEAD
flag wasn't being respected during vref(), et. al.  Note that this
isn't the eventual fix for the locking problem.  Fine grained SMP
in the VM and VFS code will require (lots) more work.
1997-06-22 03:00:24 +00:00
David Greenman
2e58c0f892 Disabled the kern.vnode sysctl variable. It's causing system crashes on
large systems and needs to be re-thinked or removed wholesale.
1997-06-10 02:48:08 +00:00
Poul-Henning Kamp
8670684a2f Fix a race condition that did, after all, exist.
Reviewed by:	phk
Submitted by:	dfr
1997-05-06 15:19:38 +00:00
Poul-Henning Kamp
b15a966ec6 1. Add a {pointer, v_id} pair to the vnode to store the reference to the
".." vnode.  This is cheaper storagewise than keeping it in the
    namecache, and it makes more sense since it's a 1:1 mapping.

2.  Also handle the case of "." more intelligently rather than stuff
    the namecache with pointless entries.

3.  Add two lists to the vnode and hang namecache entries which go from
    or to this vnode.  When cleaning a vnode, delete all namecache
    entries it invalidates.

4.  Never reuse namecache enties, malloc new ones when we need it, free
    old ones when they die.  No longer a hard limit on how many we can
    have.

5.  Remove the upper limit on namelength of namecache entries.

6.  Make a global list for negative namecache entries, limit their number
    to a sysctl'able (debug.ncnegfactor) fraction of the total namecache.
    Currently the default fraction is 1/16th.  (Suggestions for better
    default wanted!)

7.  Assign v_id correctly in the face of 32bit rollover.

8.  Remove the LRU list for namecache entries, not needed.  Remove the
    #ifdef NCH_STATISTICS stuff, it's not needed either.

9.  Use the vnode freelist as a true LRU list, also for namecache accesses.

10. Reuse vnodes more aggresively but also more selectively, if we can't
    reuse, malloc a new one.  There is no longer a hard limit on their
    number, they grow to the point where we don't reuse potentially
    usable vnodes.  A vnode will not get recycled if still has pages in
    core or if it is the source of namecache entries (Yes, this does
    indeed work :-)  "." and ".." are not namecache entries any longer...)

11. Do not overload the v_id field in namecache entries with whiteout
    information, use a char sized flags field instead, so we can get
    rid of the vpid and v_id fields from the namecache struct.  Since
    we're linked to the vnodes and purged when they're cleaned, we don't
    have to check the v_id any more.

12. NFS knew about the limitation on name length in the namecache, it
    shouldn't and doesn't now.

Bugs:
        The namecache statistics no longer includes the hits for ".."
        and "." hits.

Performance impact:
        Generally in the +/- 0.5% for "normal" workstations, but
        I hope this will allow the system to be selftuning over a
        bigger range of "special" applications.  The case where
        RAM is available but unused for cache because we don't have
        any vnodes should be gone.

Future work:
        Straighten out the namecache statistics.

        "desiredvnodes" is still used to (bogusly ?) size hash
        tables in the filesystems.

        I have still to find a way to safely free unused vnodes
        back so their number can shrink when not needed.

        There is a few uses of the v_id field left in the filesystems,
        scheduled for demolition at a later time.

        Maybe a one slot cache for unused namecache entries should
        be implemented to decrease the malloc/free frequency.
1997-05-04 09:17:38 +00:00
John Dyson
82b8e119b4 Staticize an unnecessarily global function: vputrele.
Submitted by:	 Michael Hancock <michaelh@cet.co.jp>
1997-04-30 03:09:15 +00:00
Peter Wemm
5f61c81d66 copyin the export network mask to the correct variable.
Submitted by: Mike Hibler <mike@marker.cs.utah.edu>, PR#3380
1997-04-25 06:47:12 +00:00
Doug Rabson
de15ef6aef Add a function vop_sharedlock which a copy of vop_nolock without the
implementation #ifdef out.  This can be used for now by NFS.  As soon
as all the other filesystems' locking is fixed, this can go away.

Print the vnode address in vprint for easier debugging.
1997-04-04 17:46:21 +00:00
Bruce Evans
0f1adf65ab Use OID_AUTO instead of magic number for the Lite2 sysctl debug.busyprt.
Removed declaration of vfs_unmountroot() again.

Staticized vgonel().
1997-04-01 13:05:34 +00:00
David Greenman
2f2160da3b Fixed splbio problems in vinvalbuf. Closes PR#2875, although fixed
differently by me.
1997-03-05 04:54:54 +00:00
Bruce Evans
4a8b966013 Attach vfs_sysctl() one level lower so that only the levels below
VFS_GENERIC aren't done in the FreeBSD way.  The previous commit
broke the nfs sysctls.
1997-03-04 18:31:56 +00:00
Bruce Evans
3a76a5949b Merged Lite2's vfs_sysctl(). It doesn't fit very well into FreeBSD's
(phk's) sysctl framework, and I needed special code to disambiguate
the VFS_GENERIC node from the VFS_VFSCONF leaf, so I only converted
the leaves to the FreeBSD framework.  The error handling isn't quite
right.  CSRGS's sysctls seem to return ENOTDIR too much and FreeBSD's
sysctls don't agree with the man page.
1997-03-03 12:58:20 +00:00
Bruce Evans
dc91a89e83 Restored some pre-Lite2-merge source-level compatibility to the mount()
and getvfsbyname() interfaces.  The new interfaces are now hidden from
applications unless _NEW_VFSCONF is defined.  The new vfsconf interfaces
don't work yet.
1997-03-02 17:53:37 +00:00
Bruce Evans
a896f0256c Moved vfs sysctls to where Lite2 put them. No code changes yet. 1997-03-02 11:06:22 +00:00
Bruce Evans
b98afd0d00 Fixed Lite2 merge of spechash simplelocking. It was misplaced in
checkalias() and missing in vfinddev() and vcount().
1997-02-27 16:08:43 +00:00
John Dyson
fd7f690f94 Fix the previous simple_lock fix breakage in the combined
vput/vrele routine.  Fix a panic message.  Fix the vop_nounlock
routine so that "special" filesystems that use it work correctly.
1997-02-27 05:28:58 +00:00
John Dyson
0d955f71a1 Fix the simple_lock problem with the physical I/O buffer code, and
also fix the missing simple_unlock in vrele, and improve vrele/vput
by merging them into one routine.  BDE pointed these problems out.
1997-02-27 02:57:03 +00:00
Bruce Evans
7c1557c4af Fixed unmounting of the root fs. vfs_unmountroot() wasn't fully updated
to do Lite2 locking and vfs_unmountall() wasn't as simple as the Lite2
version.
1997-02-26 15:35:42 +00:00
Bruce Evans
c35e283a80 Merged some missing locking from Lite2:
- getnewvnode() and vref() were missing one simple_unlock() each.
- the Lite2 locking changes weren't merged at all in
  printlockedvnodes() or sysctl_vnode().  Merging these undid
  some KNF style regressions.
1997-02-25 19:33:23 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
John Dyson
996c772f58 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00
Bruce Evans
5131d64e0c Removed option EXTRAVNODES. All versions of FreeBSD-2.x have a sysctl
variable `kern.maxvnodes' which gives much better control over vnode
allocation than EXTRAVNODES (except in -current between 1995/10/28 and
1996/11/12, kern.maxvnodes was read-only and thus useless).
1997-01-16 13:16:10 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
John Dyson
8b612c4b4a This commit is the embodiment of some VFS read clustering improvements.
Firstly, now our read-ahead clustering is on a file descriptor basis and not
on a per-vnode basis.  This will allow multiple processes reading the
same file to take advantage of read-ahead clustering.  Secondly, there
previously was a problem with large reads still using the ramp-up
algorithm.  Of course, that was bogus, and now we read the entire
"chunk" off of the disk in one operation.   The read-ahead clustering
algorithm should use less CPU than the previous also (I hope :-)).

NOTE:  THAT LKMS MUST BE REBUILT!!!
1996-12-29 02:45:28 +00:00
Bruce Evans
b83ddf9c86 Restored writability of kern.maxvnodes. It was broken a year ago in
rev.1.29 of kern_sysctl.c.

Should be in 2.2.
1996-11-12 09:24:31 +00:00
Poul-Henning Kamp
19060a3ad9 init_main.c: pass -d to init if DEVFS_ROOT
kern_conf.c:	gd driver is a disk.
vfs_subr.c:	include opt_devfs.h
1996-10-28 11:34:57 +00:00
Jordan K. Hubbard
0082fb4657 I'm not sure why, but Netcon's TFS filesystem code doesn't want to
add free vnodes back to the freelist.  They must do their own vnode
management.  Anyway, this change is *only* activated with their filesystem
and doesn't affect anyone else.  Whoops, forgot the submitted-by lines
in my previous commits too.. :-(
Submitted-By: Tony Ardolino <tony@netcon.com>
1996-10-17 17:56:07 +00:00
John Dyson
ad98052216 Clean up the rundown of the object backing a vnode. This should fix
NFS problems associated with forcible dismounts.
1996-10-17 02:49:35 +00:00
John Dyson
a8f42fa9a6 Correct vget by removing a window where a vnode can potentially go away. 1996-09-28 03:36:07 +00:00
Nate Williams
030e2e9ebb In sys/time.h, struct timespec is defined as:
/*
         * Structure defined by POSIX.4 to be like a timeval.
         */
        struct timespec {
                time_t  ts_sec;         /* seconds */
                long    ts_nsec;        /* and nanoseconds */
        };

        The correct names of the fields are tv_sec and tv_nsec.

Reminded by:	James Drobina <jdrobina@infinet.com>
1996-09-19 18:21:32 +00:00
John Dyson
6476c0d204 Even though this looks like it, this is not a complex code change.
The interface into the "VMIO" system has changed to be more consistant
and robust.  Essentially, it is now no longer necessary to call vn_open
to get merged VM/Buffer cache operation, and exceptional conditions
such as merged operation of VBLK devices is simpler and more correct.

This code corrects a potentially large set of problems including the
problems with ktrace output and loaded systems, file create/deletes,
etc.

Most of the changes to NFS are cosmetic and name changes, eliminating
a layer of subroutine calls.  The direct calls to vput/vrele have
been re-instituted for better cross platform compatibility.

Reviewed by: davidg
1996-08-21 21:56:23 +00:00
John Dyson
619594e898 Certain vnode buffer list operations were not being spl protected,
and they needed to be.  Brelse for example can be called at interrupt
level, and the buffer list operations were not being protected from it.
1996-08-15 06:45:01 +00:00
Bruce Evans
8c2ff39670 Only use the special bdevvp() for DEVFS if DEVFS_ROOT is defined. This
makes option DEVFS safe to use again (although mounting devfs is unsafe).
1996-07-30 18:00:32 +00:00
Poul-Henning Kamp
e83cf165d6 DEVFS needs a special bdevvp(). 1996-07-24 21:21:43 +00:00
Bruce Evans
cba2a7c614 Staticized a few variables.
Fixed warnings about unused variables.
1996-07-12 07:41:34 +00:00
Peter Wemm
114a8cff43 Add an option "EXTRA_VNODES" to cause an extra number of vnode structures
to be allocated at boot time.  This is an expensive option, as they
consume physical ram and are not pageable etc.  In certain situations,
this kind of option is quite useful, especially for news servers that
access a large number of directories at random and torture the name cache.
Defining 5000 or 10000 extra vnodes should cut down the amount of vnode
recycling somewhat, which should allow better name and directory caching
etc.

This is a "your mileage may vary" option, with no real indication of
what works best for your machine except trial and error.  Too many will
cost you ram that you could otherwise use for disk buffers etc.

This is based on something John Dyson mentioned to me a while ago.
1996-05-31 00:20:34 +00:00
John Dyson
e5fadd05f2 Put the "free vnode isn't" check back in the right place. 1996-03-09 06:43:19 +00:00
John Dyson
bd7e5f992e Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
	overhead for merged cache.
Efficiency improvement for vfs_cluster.  It used to do alot of redundant
	calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files.  Additionally,
	fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources.  The pageout code
	will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
	page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
	thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
	that happens every 30seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
	case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
	buffers.
1996-01-19 04:00:31 +00:00
Garrett Wollman
0e41ee3037 Convert DDB to new-style option. 1996-01-04 21:13:23 +00:00
David Greenman
864ef7d17e Moved the #ifdef DIAGNOSTIC in vrele() so that the check for negative
v_usecount is always performed and only the call to vprint is conditional.
1996-01-02 18:13:20 +00:00
Poul-Henning Kamp
27a0b398a7 Staticize.
Unstaticize a function in scsi/scsi_base that was used, with an undocumented
option.
My last count on the LINT kernel shows:
Total symbols:  3647
unref symbols:   463
undef symbols:     4
1 ref symbols:  1751
2 ref symbols:   485
Approaching the pain threshold now.
1995-12-17 21:23:44 +00:00
John Dyson
a316d390bd Changes to support 1Tb filesizes. Pages are now named by an
(object,index) pair instead of (object,offset) pair.
1995-12-11 04:58:34 +00:00
David Greenman
efeaf95a41 Untangled the vm.h include file spaghetti. 1995-12-07 12:48:31 +00:00
Poul-Henning Kamp
65d0bc1387 A couple of minor tweaks to the sysctl stuff. 1995-12-06 13:27:39 +00:00
Bruce Evans
98d938220c Completed function declarations and/or added prototypes. 1995-12-02 18:58:56 +00:00
Poul-Henning Kamp
2d0b1d708c A test was backwards.
Noticed by:	Cheng, Hsiao-Yang <sycheng@cis.ufl.edu>
1995-11-29 11:28:00 +00:00
Poul-Henning Kamp
4b2af45f4b Mega commit for sysctl.
Convert the remaining sysctl stuff to the new way of doing things.
the devconf stuff is the reason for the large number of files.
Cleaned up some compiler warnings while I were there.
1995-11-20 12:42:39 +00:00