1
0
mirror of https://git.FreeBSD.org/src.git synced 2025-01-12 14:29:28 +00:00
Commit Graph

2115 Commits

Author SHA1 Message Date
Alan Cox
98fe9a0ddf Eliminate another unnecessary call to vm_page_busy(). (See revision 1.333
for a detailed explanation.)
2004-12-17 18:54:51 +00:00
Alan Cox
06c98c5dcc Enable debug.mpsafevm by default on alpha. 2004-12-17 17:17:36 +00:00
Alan Cox
85f5b24573 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
Alan Cox
90688d137c With the removal of kern/uipc_jumbo.c and sys/jumbo.h,
vm_object_allocate_wait() is not used.  Remove it.
2004-12-08 05:01:47 +00:00
Alan Cox
2ad036b657 Almost nine years ago, when support for 1TB files was introduced in
revision 1.55, the address parameter to vnode_pager_addr() was changed
from an unsigned 32-bit quantity to a signed 64-bit quantity.  However,
an out-of-range check on the address was not updated.  Consequently,
memory-mapped I/O on files greater than 2GB could cause a kernel panic.
Since the address is now a signed 64-bit quantity, the problem resolution
is simply to remove a cast.

Reviewed by: bde@ and tegge@
PR: 73010
MFC after: 1 week
2004-12-07 22:05:38 +00:00
Alan Cox
d8fed1d050 Correct a sanity check in vnode_pager_generic_putpages(). The cast used
to implement the sanity check should have been changed when we converted
the implementation of vm_pindex_t from 32 to 64 bits.  (Thus, RELENG_4 is
not affected.)  The consequence of this error would be a legimate write to
an extremely large file being treated as an errant attempt to write meta-
data.

Discussed with: tegge@
2004-12-05 21:48:11 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
Olivier Houchard
6fc96493ac Remove useless casts. 2004-11-26 15:04:26 +00:00
Xin LI
8e33bced3c Try to close a potential, but serious race in our VM subsystem.
Historically, our contigmalloc1() and contigmalloc2() assumes
that a page in PQ_CACHE can be unconditionally reused by busying
and freeing it.  Unfortunatelly, when object happens to be not
NULL, the code will set m->object to NULL and disregard the fact
that the page is actually in the VM page bucket, resulting in
page bucket hash table corruption and finally, a filesystem
corruption, or a 'page not in hash' panic.

This commit has borrowed the idea taken from DragonFlyBSD's fix
to the VM fix by Matthew Dillon[1].  This version of patch will
do the following checks:

	- When scanning pages in PQ_CACHE, check hold_count and
	  skip over pages that are held temporarily.
	- For pages in PQ_CACHE and selected as candidate of being
	  freed, check if it is busy at that time.

Note:  It seems that this is might be unrelated to kern/72539.

Obtained from:	DragonFlyBSD, sys/vm/vm_contig.c,v 1.11 and 1.12 [1]
Reminded by:	Matt Dillon
Reworked by:	alc
MFC After:	1 week
2004-11-24 18:56:13 +00:00
David Schultz
9799b417d5 Disable U area swapping and remove the routines that create, destroy,
copy, and swap U areas.

Reviewed by:	arch@
2004-11-20 02:29:00 +00:00
Poul-Henning Kamp
9c83534dd8 Make VOP_BMAP return a struct bufobj for the underlying storage device
instead of a vnode for it.

The vnode_pager does not and should not have any interest in what
the filesystem uses for backend.

(vfs_cluster doesn't use the backing store argument.)
2004-11-15 09:18:27 +00:00
Poul-Henning Kamp
5c6e573ffb Add pbgetbo()/pbrelbo() lighter weight versions of pbgetvp()/pbrelvp(). 2004-11-15 08:47:18 +00:00
Poul-Henning Kamp
287013d287 More kasserts. 2004-11-15 08:33:09 +00:00
Poul-Henning Kamp
d7fe1f51ad style polishing. 2004-11-15 08:22:38 +00:00
Poul-Henning Kamp
a752aa8f17 Move pbgetvp() and pbrelvp() to vm_pager.c with the rest of the pbuf stuff. 2004-11-15 08:12:50 +00:00
Poul-Henning Kamp
e8a7bef39e expect the caller to have called pbrelvp() if necessary. 2004-11-15 08:07:26 +00:00
Poul-Henning Kamp
676f3ee26c Explicitly call pbrelvp() 2004-11-15 08:06:05 +00:00
Poul-Henning Kamp
d20b2f76cc Improve readability with a bunch of typedefs for the pager ops.
These can also be used for prototypes in the pagers.
2004-11-09 13:43:20 +00:00
Dag-Erling Smørgrav
7419d1e25f #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former
includes the latter, but also declares variables which are defined
in kern/subr_param.c).

Change som VM parameters from quad_t to unsigned long.  They refer to
quantities (size limits for text, heap and stack segments) which must
necessarily be smaller than the size of the address space, so long is
adequate on all platforms.

MFC after:	1 week
2004-11-08 18:20:02 +00:00
Alan Cox
dad740e967 Eliminate an unnecessary atomic operation. Articulate the rationale in
a comment.
2004-11-06 21:48:45 +00:00
Robert Watson
dc2c7965c0 Abstract the logic to look up the uma_bucket_zone given a desired
number of entries into bucket_zone_lookup(), which helps make more
clear the logic of consumers of bucket zones.

Annotate the behavior of bucket_init() with a comment indicating
how the various data structures, including the bucket lookup tables,
are initialized.
2004-11-06 11:43:30 +00:00
Poul-Henning Kamp
a7f06e2bd4 Remove dangling variable 2004-11-06 11:33:11 +00:00
Robert Watson
f9d27e7524 Annotate what bucket_size[] array does; staticize since it's used only
in uma_core.c.
2004-11-06 11:24:40 +00:00
David Schultz
8bc61209d4 Fix the last known race in swapoff(), which could lead to a spurious panic:
swapoff: failed to locate %d swap blocks

The race occurred because putpages() can block between the time it
allocates swap space and the time it updates the swap metadata to
associate that space with a vm_object, so swapoff() would complain
about the temporary inconsistency.  I hoped to fix this by making
swp_pager_getswapspace() and swp_pager_meta_build() a single atomic
operation, but that proved to be inconvenient.  With this change,
swapoff() simply doesn't attempt to be so clever about detecting when
all the pageout activity to the target device should have drained.
2004-11-06 07:17:50 +00:00
Alan Cox
19187819b7 Move a call to wakeup() from vm_object_terminate() to vnode_pager_dealloc()
because this call is only needed to wake threads that slept when they
discovered a dead object connected to a vnode.  To eliminate unnecessary
calls to wakeup() by vnode_pager_dealloc(), introduce a new flag,
OBJ_DISCONNECTWNT.

Reviewed by: tegge@
2004-11-06 05:33:02 +00:00
John Baldwin
57ea1265bd - Set the priority of the page zeroing thread using sched_prio() when the
thread is created rather than adjusting the priority in the main
  function.  (kthread_create() should probably take the initial priority
  as an argument.)
- Only yield the CPU in the !PREEMPTION case if there are any other
  runnable threads.  Yielding when there isn't anything else better to do
  just wastes time in pointless context switches (albeit while the system
  is idle.)
2004-11-05 19:14:02 +00:00
Alan Cox
34d9e6fdae During traversal of the inactive queue, try locking the page's containing
object before accessing the page's flags or the object's reference count.
2004-11-05 06:24:05 +00:00
Alan Cox
b546ac5490 Eliminate another unnecessary call to vm_page_busy() that immediately
precedes a call to vm_page_rename().  (See the previous revision for a
detailed explanation.)
2004-11-05 05:40:45 +00:00
David Schultz
b3fed13e9d Close a race in swapoff(). Here are the gory details:
In order to avoid livelock, swapoff() skips over objects with a
  nonzero pip count and makes another pass if necessary.  Since it is
  impossible to know which objects we care about, it would choose an
  arbitrary object with a nonzero pip count and wait for it before
  making another pass, the theory being that this object would finish
  paging about as quickly as the ones we care about.  Unfortunately,
  we may have slept since we acquired a reference to this object.
  Hack around this problem by tsleep()ing on the pointer anyway, but
  timeout after a fixed interval.  More elegant solutions are possible,
  but the ones I considered unnecessarily complicate this rare case.

Also, kill some nits that seem to have crept into the swapoff() code
in the last 75 revisions or so:

- Don't pass both sp and sp->sw_used to swap_pager_swapoff(), since
  the latter can be derived from the former.

- Replace swp_pager_find_dev() with something simpler.  There's no
  need to iterate over the entire list of swap devices just to determine
  if a given block is assigned to the one we're interested in.

- Expand the scope of the swhash_mtx in a couple of places so that it
  isn't released and reacquired once for every hash bucket.

- Don't drop the swhash_mtx while holding a reference to an object.
  We need to lock the object first.  Unfortunately, doing so would
  violate the established lock order, so use VM_OBJECT_TRYLOCK() and
  try again on a subsequent pass if the object is already locked.

- Refactor swp_pager_force_pagein() and swap_pager_swapoff() a bit.
2004-11-05 05:36:56 +00:00
Poul-Henning Kamp
6e67e2a710 Retire b_magic now, we have the bufobj containing the same hint. 2004-11-04 09:48:18 +00:00
Poul-Henning Kamp
c5d3d25e4f De-couple our I/O bio request from the embedded bio in buf by explicitly
copying the fields.
2004-11-04 08:38:07 +00:00
Poul-Henning Kamp
c569065139 Remove buf->b_dev field. 2004-11-04 07:59:57 +00:00
Alan Cox
d19ef81437 The synchronization provided by vm object locking has eliminated the
need for most calls to vm_page_busy().  Specifically, most calls to
vm_page_busy() occur immediately prior to a call to vm_page_remove().
In such cases, the containing vm object is locked across both calls.
Consequently, the setting of the vm page's PG_BUSY flag is not even
visible to other threads that are following the synchronization
protocol.

This change (1) eliminates the calls to vm_page_busy() that
immediately precede a call to vm_page_remove() or functions, such as
vm_page_free() and vm_page_rename(), that call it and (2) relaxes the
requirement in vm_page_remove() that the vm page's PG_BUSY flag is
set.  Now, the vm page's PG_BUSY flag is set only when the vm object
lock is released while the vm page is still in transition.  Typically,
this is when it is undergoing I/O.
2004-11-03 20:17:31 +00:00
Alan Cox
f790de29b9 Introduce a Boolean variable wakeup_needed to avoid repeated, unnecessary
calls to wakeup() by vm_page_zero_idle_wakeup().
2004-10-31 19:32:57 +00:00
Alan Cox
b86e6ec007 During traversal of the active queue by vm_pageout_page_stats(), try
locking the page's containing object before accessing the page's flags.
2004-10-30 23:30:53 +00:00
Alan Cox
af117d1278 Eliminate an unused but initialized variable. 2004-10-30 20:11:23 +00:00
Alan Cox
4b8a5c4095 Add an assignment statement that I omitted from the previous revision. 2004-10-30 07:09:46 +00:00
Alan Cox
f4d49654ae Assert that the containing vm object is locked in vm_page_cache() and
vm_page_try_to_cache().
2004-10-28 05:26:21 +00:00
Bosko Milekic
a5a262c6db Fix a INVARIANTS-only bug introduced in Revision 1.104:
IF INVARIANTS is defined, and in the rare case that we have
allocated some objects from the slab and at least one initializer
on at least one of those objects failed, and we need to fail the
allocation and push the uninitialized items back into the slab
caches -- in that scenario, we would fail to [re]set the
bucket cache's ub_bucket item references to NULL, which would
eventually trigger a KASSERT.
2004-10-27 21:19:35 +00:00
Alan Cox
b08abf6cc0 During traversal of the active queue, try locking the page's containing
object before accessing the page's flags or the object's reference count.
If the trylock fails, handle the page as though it is busy.
2004-10-27 18:29:17 +00:00
Poul-Henning Kamp
6229cc508b Also check that the sectormask is bigger than zero.
Wrap this overly long KASSERT and remove newline.
2004-10-26 19:51:57 +00:00
Poul-Henning Kamp
5d9d81e7ea Put the I/O block size in bufobj->bo_bsize.
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open().  The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
2004-10-26 07:39:12 +00:00
Poul-Henning Kamp
ff4782b57d Don't clear flags we just checked were not set. 2004-10-26 05:57:29 +00:00
Alan Cox
63bb7041cc Assert that the containing vm object is locked in vm_page_flash(). 2004-10-25 19:52:44 +00:00
Alan Cox
75d0533847 Assert that the containing vm object is locked in vm_page_busy() and
vm_page_wakeup().
2004-10-24 23:53:47 +00:00
Poul-Henning Kamp
b792bebeea Move the buffer method vector (buf->b_op) to the bufobj.
Extend it with a strategy method.

Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.

Rename ibwrite to bufwrite().

Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.

Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().

Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
2004-10-24 20:03:41 +00:00
Alan Cox
e5526e6aa5 Acquire the vm object lock before rather than after calling
vm_page_sleep_if_busy().  (The motivation being to transition
synchronization of the vm_page's PG_BUSY flag from the global page queues
lock to the per-object lock.)
2004-10-24 19:32:19 +00:00
Alan Cox
ddf4bb37c8 Use VM_ALLOC_NOBUSY instead of calling vm_page_wakeup(). 2004-10-24 18:46:32 +00:00
Alan Cox
0f9f9bcb53 Introduce VM_ALLOC_NOBUSY, an option to vm_page_alloc() and vm_page_grab()
that indicates that the caller does not want a page with its busy flag set.
In many places, the global page queues lock is acquired and released just
to clear the busy flag on a just allocated page.  Both the allocation of
the page and the clearing of the busy flag occur while the containing vm
object is locked.  So, the busy flag might as well never be set.
2004-10-24 06:15:36 +00:00
Poul-Henning Kamp
494eb176e7 Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.
Initialize b_bufobj for all buffers.

Make incore() and gbincore() take a bufobj instead of a vnode.

Make inmem() local to vfs_bio.c

Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj)
also VI_MTX() to BO_MTX(),

Make buf_vlist_add() take a bufobj instead of a vnode.

Eliminate other uses of bp->b_vp where bp->b_bufobj will do.

Various minor polishing: remove "register", turn panic into KASSERT,
use new function declarations, TAILQ_FOREACH_SAFE() etc.
2004-10-22 08:47:20 +00:00
Poul-Henning Kamp
a76d8f4ec9 Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAIT
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write
count on a bufobj.  Bufobj_wdrop() replaces vwakeup().

Use these functions all relevant places except in ffs_softdep.c where
the use if interlocked_sleep() makes this impossible.

Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
2004-10-21 15:53:54 +00:00
Alan Cox
1e96d2a217 Correct two errors in PG_BUSY management by vm_page_cowfault(). Both
errors are in rarely executed paths.
1. Each time the retry_alloc path is taken, the PG_BUSY must be set again.
   Otherwise vm_page_remove() panics.
2. There is no need to set PG_BUSY on the newly allocated page before
   freeing it.  The page already has PG_BUSY set by vm_page_alloc().
   Setting it again could cause an assertion failure.

MFC after: 2 weeks
2004-10-18 08:11:59 +00:00
Alan Cox
36aeb90e34 Assert that the containing object is locked in vm_page_io_start() and
vm_page_io_finish().  The motivation being to transition synchronization of
the vm_page's busy field from the global page queues lock to the per-object
lock.
2004-10-17 22:33:40 +00:00
Alan Cox
950d5f7a99 Remove unnecessary check for curthread == NULL. 2004-10-17 20:29:28 +00:00
Peter Wemm
a7bc3102c4 Put on my peril sensitive sunglasses and add a flags field to the internal
sysctl routines and state.  Add some code to use it for signalling the need
to downconvert a data structure to 32 bits on a 64 bit OS when requested by
a 32 bit app.

I tried to do this in a generic abi wrapper that intercepted the sysctl
oid's, or looked up the format string etc, but it was a real can of worms
that turned into a fragile mess before I even got it partially working.

With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have
it not abort.  Things like netstat, ps, etc have a long way to go.

This also fixes a bug in the kern.ps_strings and kern.usrstack hacks.
These do matter very much because they are used by libc_r and other things.
2004-10-11 22:04:16 +00:00
Brian Feldman
55fc8c1146 In the previous revision, I did not intend to change the default value
of "nosleepwithlocks."

Submitted by:	ru
2004-10-09 18:51:32 +00:00
Brian Feldman
ab14a3f7aa Fix critical stability problems that can cause UMA mbuf cluster
state management corruption, mbuf leaks, general mbuf corruption,
and at least on i386 a first level splash damage radius that
encompasses up to about half a megabyte of the memory after
an mbuf cluster's allocation slab.  In short, this has caused
instability nightmares anywhere the right kind of network traffic
is present.

When the polymorphic refcount slabs were added to UMA, the new types
were not used pervasively.  In particular, the slab management
structure was turned into one for refcounts, and one for non-refcounts
(supposed to be mostly like the old slab management structure),
but the latter was almost always used through out.  In general, every
access to zones with UMA_ZONE_REFCNT turned on corrupted the
"next free" slab offset offset and the refcount with each other and
with other allocations (on i386, 2 mbuf clusters per 4096 byte slab).

Fix things so that the right type is used to access refcounted zones
where it was not before.  There are additional errors in gross
overestimation of padding, it seems, that would cause a large kegs
(nee zones) to be allocated when small ones would do.  Unless I have
analyzed this incorrectly, it is not directly harmful.
2004-10-08 20:19:29 +00:00
David Schultz
f6bcadc4fc Don't look for swap blocks in objects that aren't swap-backed.
I expect that this will fix the following panic, reported by Jun:
	swap_pager_isswapped: failed to locate all swap meta blocks

MT5 candidate
2004-09-24 16:04:20 +00:00
Poul-Henning Kamp
891822a853 XXX mark two places where we do not hold a threadcount on the dev when
frobbing the cdevsw.

In both cases we examine only the cdevsw and it is a good question if we
weren't better off copying those properties into the cdev in the first
place.  This question will be revisited.
2004-09-24 08:32:36 +00:00
Poul-Henning Kamp
751fdd08fe Use dev_re[fl]thread() to maintain a ref on the device driver while
we call the ->d_mmap function.
2004-09-24 05:59:11 +00:00
David Schultz
8daa8c602a The zone from which proc structures are allocated is marked
UMA_ZONE_NOFREE to guarantee type stability, so proc_fini() should
never be called.  Move an assertion from proc_fini() to proc_dtor()
and garbage-collect the rest of the unreachable code.  I have retained
vm_proc_dispose(), since I consider its disuse a bug.
2004-09-19 18:34:17 +00:00
Poul-Henning Kamp
7ce1979be6 Add new a function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
Alan Cox
5e4bdb57cb System maps are prohibited from mapping vnode-backed objects. Take
advantage of this restriction to avoid acquiring and releasing Giant when
wiring pages within a system map.

In collaboration with: tegge@
2004-09-11 18:49:59 +00:00
Poul-Henning Kamp
1a31a6c3b2 add KASSERTS 2004-09-07 07:32:40 +00:00
Alan Cox
b102e653ad Enable debug.mpsafevm by default on amd64 and i386. This enables copy-on-
write and zero-fill faults to run without holding Giant.  It is still
possible to disable Giant-free operation by setting debug.mpsafevm to 0 in
loader.conf.
2004-09-04 05:51:54 +00:00
Alan Cox
94ddc7076d Push Giant deep into vm_forkproc(), acquiring it only if the process has
mapped System V shared memory segments (see shmfork_myhook()) or requires
the allocation of an ldt (see vm_fault_wire()).
2004-09-03 05:11:32 +00:00
Scott Long
9923b511ed Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
Alan Cox
c2296e999c Remove dead code. 2004-09-01 19:58:37 +00:00
Alan Cox
1a95d74419 In vm_fault_unwire() eliminate the acquisition and release of Giant in the
case of non-kernel pmaps.
2004-09-01 19:18:59 +00:00
Julian Elischer
2630e4c90c Give setrunqueue() and sched_add() more of a clue as to
where they are coming from and what is expected from them.

MFC after:	2 days
2004-09-01 02:11:28 +00:00
Alan Cox
9b98b79683 Move the acquisition and release of the lock on the object at the head of
the shadow chain outside of the loop in vm_object_madvise(), reducing the
number of times that this lock is acquired and released.
2004-08-29 20:14:10 +00:00
Ian Dowse
632ca524c3 Prevent vm_page_zero_idle_wakeup() from attempting to wake up the
page zeroing thread before it has been created. It was possible for
calls to free() very early in the boot process to panic here because
the sleep queues were not yet initialised. Specifically, sysinit_add()
running at SI_SUB_KLD would trigger this if the array of pointers
became big enough to require uma_large_alloc() allocations.

Submitted by:	peter
2004-08-29 01:02:33 +00:00
Marcel Moolenaar
2bbb534b56 Move the cow field between wire_count and hold_count. This is the
position that is 64-bit aligned and makes sure that the valid and
dirty fields are also 64-bit aligned. This means that if PAGE_SIZE
is 32K, the size of the vm_page structure is only increased by 8
bytes instead of 16 bytes. More importantly, the vm_page structure
is either 120 or 128 bytes on ia64. These are "interesting" sizes.
2004-08-22 20:52:23 +00:00
Alan Cox
3268a1bf75 In the previous revision, I failed to condition an early release of Giant
in vm_fault() on debug_mpsafevm.  If debug_mpsafevm was not set, the result
was an assertion failure early in the boot process.

Reported by: green@
2004-08-22 00:08:43 +00:00
Alan Cox
b99e61353f Further reduce the use of Giant by vm_fault(): Giant is held only when
manipulating a vnode, e.g., calling vput().  This reduces contention for
Giant during many copy-on-write faults, resulting in some additional
speedup on SMPs.

Note: debug_mpsafevm must be enabled for this optimization to take effect.
2004-08-21 19:20:21 +00:00
Alan Cox
0cb507cb20 Acquire and release Giant around a call to VOP_BMAP(). (This is a
prerequisite to any further reduction in Giant's use by vm_fault().)
2004-08-19 02:37:12 +00:00
Alan Cox
c1fbc251cd - Introduce and use a new tunable "debug.mpsafevm". At present, setting
"debug.mpsafevm" results in (almost) Giant-free execution of zero-fill
   page faults.  (Giant is held only briefly, just long enough to determine
   if there is a vnode backing the faulting address.)

   Also, condition the acquisition and release of Giant around calls to
   pmap_remove() on "debug.mpsafevm".

   The effect on performance is significant.  On my dual Opteron, I see a
   3.6% reduction in "buildworld" time.

 - Use atomic operations to update several counters in vm_fault().
2004-08-16 06:16:12 +00:00
Brian Feldman
7c938963a4 Rather than bringing back all of the changes to make VM map deletion
wait for system wires to disappear, do so (much more trivially) by
instead only checking for system wires of user maps and not kernel maps.

Alternative by:	tor
Reviewed by:	alc
2004-08-16 03:11:09 +00:00
Alan Cox
6eaee3fee4 Remove spl calls. 2004-08-14 18:57:41 +00:00
Alan Cox
0164e05781 Replace the linear search in vm_map_findspace() with an O(log n)
algorithm built into the map entry splay tree.  This replaces the
first_free hint in struct vm_map with two fields in vm_map_entry:
adj_free, the amount of free space following a map entry, and
max_free, the maximum amount of free space in the entry's subtree.
These fields make it possible to find a first-fit free region of a
given size in one pass down the tree, so O(log n) amortized using
splay trees.

This significantly reduces the overhead in vm_map_findspace() for
applications that mmap() many hundreds or thousands of regions, and
has a negligible slowdown (0.1%) on buildworld.  See, for example, the
discussion of a micro-benchmark titled "Some mmap observations
compared to Linux 2.6/OpenBSD" on -hackers in late October 2003.

OpenBSD adopted this approach in March 2002, and NetBSD added it in
November 2003, both with Red-Black trees.

Submitted by: Mark W. Krentel
2004-08-13 08:06:34 +00:00
Tor Egge
19dc560756 The vm map lock is needed in vm_fault() after the page has been found,
to avoid later changes before pmap_enter() and vm_fault_prefault()
has completed.

Simplify deadlock avoidance by not blocking on vm map relookup.

In collaboration with: alc
2004-08-12 20:14:49 +00:00
Brian Feldman
c5f60ffccf Re-delete the comment from r1.352. 2004-08-12 17:22:28 +00:00
Brian Feldman
0ada205ee6 Back out all behavioral chnages. 2004-08-10 14:42:48 +00:00
Brian Feldman
9689d5e5ee Revamp VM map wiring.
* Allow no-fault wiring/unwiring to succeed for consistency;
  however, the wired count remains at zero, so it's a special case.

* Fix issues inside vm_map_wire() and vm_map_unwire() where the
  exact state of user wiring (one or zero) and system wiring
  (zero or more) could be confused; for example, system unwiring
  could succeed in removing a user wire, instead of being an
  error.

* Require all mappings to be unwired before they are deleted.
  When VM space is still wired upon deletion, it will be waited
  upon for the following unwire.  This makes vslock(9) work
  rather than allowing kernel-locked memory to be deleted
  out from underneath of its consumer as it would before.
2004-08-09 19:52:29 +00:00
Alan Cox
8673599662 Make two changes to vm_fault().
1. Move a comment to its proper place, updating it.  (Except for white-
   space, this comment had been unchanged since revision 1.1!)
2. Remove spl calls.
2004-08-09 18:46:39 +00:00
Alan Cox
91491c3566 Remove a stale comment from vm_map_lookup() that pertains to share maps.
(The last vestiges of the share map code were removed in revisions 1.153
and 1.159.)
2004-08-09 18:15:46 +00:00
Alan Cox
eebf3286a6 Make two changes to vm_fault().
1. Retain the map lock until after the calls to pmap_enter() and
   vm_fault_prefault().
2. Remove a stale comment.  Submitted by: tegge@
2004-08-09 06:01:46 +00:00
Poul-Henning Kamp
5721c9c76a Tag all geom classes in the tree with a version number. 2004-08-08 07:57:53 +00:00
Alan Cox
e0b47a134b Remove dead code. A vm_map's first_free is never NULL (even if the map is
full).

(This is preparation for an O(log n) implementation of vm_map_findspace().)

Submitted by:	Mark W. Krentel
2004-08-07 05:58:31 +00:00
Robert Watson
3659f747f1 Generate KTR trace records for uma_zalloc_arg() and uma_zfree_arg().
This doesn't trace every event of interest in UMA, but provides
enough basic information to explain lock traces and sleep patterns.
2004-08-06 21:52:38 +00:00
Brian Feldman
28775a6130 Turn on the new contigmalloc(9) by default. There should not actually
be a reason to use the old contigmalloc(9), but if desired, it the
vm.old_contigmalloc setting can be tuned/sysctld back to 0 for now.
2004-08-05 21:54:11 +00:00
Poul-Henning Kamp
23fc1a90ea Remove a product specific workaround for wrong modes when mmap(2)'ing
devices.  They have had plenty of time to adjust now.
2004-08-05 07:04:33 +00:00
Alan Cox
684a62b7bf - Push down the acquisition and release of Giant into pmap_enter_quick()
on those architectures without pmap locking.
 - Eliminate the acquisition and release of Giant in vm_map_pmap_enter().
2004-08-04 22:03:16 +00:00
Doug Rabson
c413d99c4e In dev_pager_updatefake, m->valid is typically 0 on entry. It
should be set to VM_PAGE_BITS_ALL before returning, to ensure that
neither vm_pager_get_pages nor vm_fault calls vm_page_zero_invalid
after dev_pager_getpages has returned.

Submitted by: tegge
2004-08-04 08:58:58 +00:00
Alan Cox
21c125453c Eliminate the acquisition and release of Giant around the call to
pmap_mincore() in mincore(2).  Either pmap locking exists (alpha, amd64,
i386, ia64) or pmap_mincore() is unimplemented (arm, powerpc, sparc64).
2004-08-02 03:31:05 +00:00
Brian Feldman
b23f72e98a * Add a "how" argument to uma_zone constructors and initialization functions
so that they know whether the allocation is supposed to be able to sleep
  or not.
* Allow uma_zone constructors and initialation functions to return either
  success or error.  Almost all of the ones in the tree currently return
  success unconditionally, but mbuf is a notable exception: the packet
  zone constructor wants to be able to fail if it cannot suballocate an
  mbuf cluster, and the mbuf allocators want to be able to fail in general
  in a MAC kernel if the MAC mbuf initializer fails.  This fixes the
  panics people are seeing when they run out of memory for mbuf clusters.
* Allow debug.nosleepwithlocks on WITNESS to be disabled, without changing
  the default.

Both bmilekic and jeff have reviewed the changes made to make failable
zone allocations work.
2004-08-02 00:18:36 +00:00
Alan Cox
9bb0e06861 - Push down the acquisition and release of Giant into pmap_protect() on
those architectures without pmap locking.
 - Eliminate the acquisition and release of Giant from vm_map_protect().

(Translation: mprotect(2) runs to completion without touching Giant on
alpha, amd64, i386 and ia64.)
2004-07-30 20:38:30 +00:00
Alan Cox
9be60284a6 Giant is no longer required by vm_waitproc() and vmspace_exitfree().
Eliminate it acquisition and release around vm_waitproc() in kern_wait().
2004-07-30 20:31:02 +00:00
Doug Rabson
92bab635d3 Fix a memory leak in the device pager which is exposed by the NVIDIA
OpenGL driver.

Submitted by: nvidia (possibly also tegge)
2004-07-30 11:09:18 +00:00
Doug Rabson
874f013517 Fix handling of msync(2) for character special files.
Submitted by: nvidia
2004-07-30 11:08:02 +00:00