freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-29 12:03:03 +00:00

Author	SHA1	Message	Date
Mark Johnston	08cfa56ea3	Extend uma_reclaim() to permit different reclamation targets. The page daemon periodically invokes uma_reclaim() to reclaim cached items from each zone when the system is under memory pressure. This is important since the size of these caches is unbounded by default. However it also results in bursts of high latency when allocating from heavily used zones as threads miss in the per-CPU caches and must access the keg in order to allocate new items. With r340405 we maintain an estimate of each zone's usage of its (per-NUMA domain) cache of full buckets. Start making use of this estimate to avoid reclaiming the entire cache when under memory pressure. In particular, introduce TRIM, DRAIN and DRAIN_CPU verbs for uma_reclaim() and uma_zone_reclaim(). When trimming, only items in excess of the estimate are reclaimed. Draining a zone reclaims all of the cached full buckets (the previous behaviour of uma_reclaim()), and may further drain the per-CPU caches in extreme cases. Now, when under memory pressure, the page daemon will trim zones rather than draining them. As a result, heavily used zones do not incur bursts of bucket cache misses following reclamation, but large, unused caches will be reclaimed as before. Reviewed by: jeff Tested by: pho (an earlier version) MFC after: 2 months Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D16667	2019-09-01 22:22:43 +00:00
Jeff Roberson	c168508655	Add two new kernel options to control memory locality on NUMA hardware. - UMA_XDOMAIN enables an additional per-cpu bucket for freed memory that was freed on a different domain from where it was allocated. This is only used for UMA_ZONE_NUMA (first-touch) zones. - UMA_FIRSTTOUCH sets the default UMA policy to be first-touch for all zones. This tries to maintain locality for kernel memory. Reviewed by: gallatin, alc, kib Tested by: pho, gallatin Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20929	2019-08-06 21:50:34 +00:00
Gleb Smirnoff	bec2d7e9a2	The KVM code also needs a fix similar to r344269. Reported by: pho	2019-05-29 03:14:46 +00:00
Gleb Smirnoff	aa43298309	With r343051 UMA switched from atomic counts to counter(9), and the kernel now reports snapshot counts of how much a zone allocated and how much it freed. The snapshot values may not match, e.g. alloced - freed < 0. Work around that in the memstat library. Reported by: pho	2019-02-18 21:27:13 +00:00
Gleb Smirnoff	cf64f5197b	This was missed in r343051: make uz_allocs, uz_frees and uz_fails counter(9).	2019-01-15 18:47:19 +00:00
Gleb Smirnoff	bb15d1c778	o Move zone limit from keg level up to zone level. This means that now two zones sharing a keg may have different limits. Now this is going to work: zone = uma_zcreate(); uma_zone_set_max(zone, limit); zone2 = uma_zsecond_create(zone); uma_zone_set_max(zone2, limit2); Kegs no longer have uk_maxpages field, but zones have uz_items. When set, it may be rounded up to minimum possible CPU bucket cache size. For small limits bucket cache can also be reconfigured to be smaller. Counter uz_items is updated whenever items transition from keg to a bucket cache or directly to a consumer. If zone has uz_maxitems set and it is reached, then we are going to sleep. o Since new limits don't play well with multi-keg zones, remove them. The idea of multi-keg zones was introduced exactly 10 years ago, and never have had a practical usage. In discussion with Jeff we came to a wild agreement that if we ever want to reintroduce the idea of a smart allocator that would be able to choose between two (or more) totally different backing stores, that choice should be made one level higher than UMA, e.g. in malloc(9) or in mget(), or whatever and choice should be controlled by the caller. o Sleeping code is improved to account number of sleepers and wake them one by one, to avoid thundering herd problem. o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache() KPI added. Having no bucket cache basically means setting maxcache to 0. o Now with many fields added and many removed (no multi-keg zones!) make sure that struct uma_zone is perfectly aligned. Reviewed by: markj, jeff Tested by: pho Differential Revision: https://reviews.freebsd.org/D17773	2019-01-15 00:02:06 +00:00
Mateusz Guzik	f38828cbc7	libmemstat: adjust for per-cpu stats after r338899 Reported by: yuripv Reviewed by: kib, markj Approved by: re (gjb) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D17490	2018-10-11 23:25:14 +00:00
Dag-Erling Smørgrav	6bff85ff9a	Reduce <sys/queue.h> pollution. While <sys/sysctl.h> includes <sys/queue.h> unconditionally, it is only actually used in code which is conditional on _KERNEL. Make the #include itself conditional as well, and fix userland code that uses <sys/queue.h> for other purposes but relied on <sys/sysctl.h> to bring it in. MFC after: 1 week	2018-05-11 00:01:43 +00:00
Jeff Roberson	ab3185d15e	Implement NUMA support in uma(9) and malloc(9). Allocations from specific domains can be done by the _domain() API variants. UMA also supports a first-touch policy via the NUMA zone flag. The slab layer is now segregated by VM domains and is precise. It handles iteration for round-robin directly. The per-cpu cache layer remains a mix of domains according to where memory is allocated and freed. Well behaved clients can achieve perfect locality with no performance penalty. The direct domain allocation functions have to visit the slab layer and so require per-zone locks which come at some expense. Reviewed by: Attilio (a slightly older version) Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon	2018-01-12 23:25:05 +00:00
Pedro F. Giffuni	5e53a4f90f	lib: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using mis-identified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-26 02:00:33 +00:00
Bryan Drewery	ea825d0274	DIRDEPS_BUILD: Update dependencies. Sponsored by: Dell EMC Isilon	2017-10-31 00:07:04 +00:00
Justin Hibbits	4026b44790	Fix buildworld for powerpc. vmpage requires struct pmap to exist and contain a pm_stats field. As of r308817, either AIM or BOOKE is required to be set in order to get their respective pmap structs. Rather than expose them both, or try to unify them unnecessarily, add a third option which contains only a pm_stats field, and change the two existing pmap structures to place the common fields at the beginning of the struct. This actually fixes the stats collection by libkvm on AIM hardware, because before it was accessing a possibly different offset, which would cause it to read garbage. Bump __FreeBSD_version to denote this ABI change, so that ports which depend on libkvm can be rebuilt.	2016-11-20 06:10:12 +00:00
Glen Barber	30922917c8	MFH Sponsored by: The FreeBSD Foundation	2016-02-10 04:20:39 +00:00
Gleb Smirnoff	b28cc462ad	Include sys/_task.h into uma_int.h, so that taskqueue.h isn't a requirement for uma_int.h. Suggested by: jhb	2016-02-09 20:22:35 +00:00
Glen Barber	bbb51924bb	MFH Sponsored by: The FreeBSD Foundation	2016-02-08 12:16:01 +00:00
Glen Barber	a70cba9582	First pass through library packaging. Sponsored by: The FreeBSD Foundation	2016-02-04 21:16:35 +00:00
Gleb Smirnoff	9508a0e1fe	Fix build.	2016-02-04 00:23:21 +00:00
Bryan Drewery	7b3ea376a2	META MODE: Prefer INSTALL=tools/install.sh to lessen the need for xinstall.host. This both avoids some dependencies on xinstall.host and allows bootstrapping on older releases to work due to lack of at least 'install -l' support. Sponsored by: EMC / Isilon Storage Division	2015-11-25 19:10:28 +00:00
Simon J. Gerraty	ccfb965433	Add META_MODE support. Off by default, build behaves normally. WITH_META_MODE we get auto objdir creation, the ability to start build from anywhere in the tree. Still need to add real targets under targets/ to build packages. Differential Revision: D2796 Reviewed by: brooks imp	2015-06-13 19:20:56 +00:00
Simon J. Gerraty	44d314f704	dirdeps.mk now sets DEP_RELDIR	2015-06-08 23:35:17 +00:00
Simon J. Gerraty	98e0ffaefb	Merge sync of head	2015-05-27 01:19:58 +00:00
Baptiste Daroussin	6b129086dc	Convert libraries to use LIBADD While here reduce a bit overlinking	2014-11-25 11:07:26 +00:00
Simon J. Gerraty	ee7b0571c2	Merge head from 7/28	2014-08-19 06:50:54 +00:00
Baptiste Daroussin	2b7af31cf5	use .Mt to mark up email addresses consistently (part3) PR: 191174 Submitted by: Franco Fichtner <franco at lastsummer.de>	2014-06-23 08:23:05 +00:00
Simon J. Gerraty	fae50821ae	Updated dependencies	2014-05-16 14:09:51 +00:00
Simon J. Gerraty	76b28ad6ab	Updated dependencies	2014-05-10 05:16:28 +00:00
Simon J. Gerraty	9d2ab4a62d	Merge head	2014-04-27 08:13:43 +00:00
Gleb Smirnoff	345e3f4dd7	Expose real size of UMA allocations via libmemstat(3). Sponsored by: Nginx, Inc.	2014-02-10 20:09:10 +00:00
Simon J. Gerraty	d1d0158641	Merge from head	2013-09-05 20:18:59 +00:00
Jeff Roberson	fc03d22b17	Refine UMA bucket allocation to reduce space consumption and improve performance. - Always free to the alloc bucket if there is space. This gives LIFO allocation order to improve hot-cache performance. This also allows for zones with a single bucket per-cpu rather than a pair if the entire working set fits in one bucket. - Enable per-cpu caches of buckets. To prevent recursive bucket allocation one bucket zone still has per-cpu caches disabled. - Pick the initial bucket size based on a table driven maximum size per-bucket rather than the number of items per-page. This gives more sane initial sizes. - Only grow the bucket size when we face contention on the zone lock, this causes bucket sizes to grow more slowly. - Adjust the number of items per-bucket to account for the header space. This packs the buckets more efficiently per-page while making them not quite powers of two. - Eliminate the per-zone free bucket list. Always return buckets back to the bucket zone. This ensures that as zones grow into larger bucket sizes they eventually discard the smaller sizes. It persists fewer buckets in the system. The locking is slightly trickier. - Only switch buckets in zalloc, not zfree, this eliminates pathological cases where we ping-pong between two buckets. - Ensure that the thread that fills a new bucket gets to allocate from it to give a better upper bound on allocation time. Sponsored by: EMC / Isilon Storage Division	2013-06-18 04:50:20 +00:00
Simon J. Gerraty	7cf3a1c6b2	Updated dependencies	2013-03-11 17:21:52 +00:00
Simon J. Gerraty	f5f7c05209	Updated dependencies	2013-02-16 01:23:54 +00:00
Simon J. Gerraty	7cd2dcf076	Updated/new Makefile.depend	2012-11-08 21:24:17 +00:00
Simon J. Gerraty	23090366f7	Sync from head	2012-11-04 02:52:03 +00:00
Matthew D Fleming	bb196eb480	Const-ify the zone name argument to uma_zcreate(9). MFC after: 3 days	2012-10-26 17:51:05 +00:00
Marcel Moolenaar	7750ad47a9	Sync FreeBSD's bmake branch with Juniper's internal bmake branch. Requested by: Simon Gerraty <sjg@juniper.net>	2012-08-22 19:25:57 +00:00
Glen Barber	3102cfe2e2	Fix various typos in manual pages. Submitted by: amdmi3 PR: 165431 MFC after: 1 week	2012-02-25 14:31:25 +00:00
Sergey Kandaurov	cfc9e655ba	Cosmetic cleanup: remove #define LIBMEMSTAT used to prevent a nested include of opt_vmpage.h from vm/vm_page.h. opt_vmpage.h was retired before 7.0 together with options PQ_NOOPT. Approved by: re (kib) MFC after: 3 days	2011-09-02 14:10:42 +00:00
Sergey Kandaurov	1882360b9b	Get rid of MAXCPU knowledge used for internal needs only. Switch to dynamic memory allocation to hold per-CPU memory types data (sized to mp_maxid for UMA, and to mp_maxcpus for malloc to match the kernel). That fixes libmemstat with arbitrary large MAXCPU values and therefore eliminates MEMSTAT_ERROR_TOOMANYCPUS error type. Reviewed by: jhb Approved by: re (kib)	2011-08-01 09:43:35 +00:00
Attilio Rao	1de471dfee	Revert r222363, as bde@ pointed out the initial solution was far more correct.	2011-05-31 20:59:53 +00:00
Attilio Rao	d361ed4b1c	Style fix: cast to size_t rather than u_long when comparing to sizeof() rets. Requested by: kib	2011-05-27 16:01:51 +00:00
Attilio Rao	d40d5f64a2	Sync with -CURRENT	2011-05-10 18:01:53 +00:00
Attilio Rao	be720a4061	Fix a mismerge.	2011-05-08 14:45:53 +00:00
Attilio Rao	34e4a6f408	Revert MAXCPU introduction. In userland it is always 1. Noted by: marcel	2011-05-08 14:29:25 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t, on the other hand, is implemented as an array of longs, and is easily extensible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Attilio Rao	1d221389b2	Remove the redefinition of MEMSTAT_MAXCPU and just use MAXCPU for that. Reviewed by: sbruno	2011-05-02 17:13:40 +00:00
Attilio Rao	77bdc950f4	MFC @ r221286	2011-05-01 00:48:03 +00:00
Joel Dahl	799162a628	Spelling fixes.	2010-08-03 17:40:09 +00:00
Sean Bruno	bf96595915	Add a new column to the output of vmstat -z to indicate the number of times the system was forced to sleep when requesting a new allocation. Expand the debugger hook, db_show_uma, to display these results as well. This has proven to be very useful in out of memory situations when it is not known why systems have become sluggish or fail in odd ways. Reviewed by: rwatson alc Approved by: scottl (mentor) peter Obtained from: Yahoo Inc.	2010-06-15 19:28:37 +00:00
Ulrich Spörlein	aa12cea2cc	mdoc: order prologue macros consistently by Dd/Dt/Os Although groff_mdoc(7) gives another impression, this is the ordering most widely used and also required by mdocml/mandoc. Reviewed by: ru Approved by: philip, ed (mentors)	2010-04-14 19:08:06 +00:00


86 Commits