freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-24 11:29:10 +00:00

Attilio Rao 3d7acbbabf Fix several callout migration races:

- Problem1:
  Hypothesis: thread1 is doing a callout_reset_on() within its callout handler, intending to implicitly or explicitly migrate the callout. thread2 is draining the callout.
  Thesis:
  * thread1 calls callout_lock() and locks the old callout cpu
  * thread1 performs the checks in the first path of callout_reset_on()
  * thread1 hits this piece of code:

        /*
         * If the lock must migrate we have to check the state again as
         * we can't hold both the new and old locks simultaneously.
         */
        if (c->c_cpu != cpu) {
                c->c_cpu = cpu;
                CC_UNLOCK(cc);
                goto retry;
        }

    which means it will drop the lock and 'retry'
  * thread2 calls callout_lock() and locks the new callout cpu. thread1 spins on the new lock and makes no progress for the moment.
  * thread2 checks that the callout is not pending (as the callout is currently running) and that it is not on cc->cc_curr (because cc now refers to the new callout cpu while the callout is running on the old one), thus it thinks it is done and returns.
  * thread1 now acquires the lock and adds the callout to the new callout cpu queue
  This is an obvious race: callout_stop() falsely reports the callout stopped or, worse, callout_drain() falsely returns while the callout is still in use.
- Solution1:
  Fixing this problem would require, in general, locking both callout cpus at once while switching the c_cpu field, while avoiding cyclic deadlocks between callout cpu locks. The concept of CPUBLOCK is therefore introduced (working more or less like the blocked_lock for the thread_lock() function), meaning: "in callout_lock(), spin for as long as c->c_cpu is CPUBLOCK". That way the "original" callout cpu, the one in the code snippet above, remains blocked until the lock handover is over, so the critical path remains covered.
- Problem2:
  Having the callout currently executing on one callout cpu and simultaneously pending on another callout cpu (as can happen with the current code) breaks, at least, the assumption that callout_drain() returns only once the callout can no longer be referenced.
- Solution2:
  Callout migration is deferred if the callout is already executing. The best place to do that is in softclock(), and new members are added to the callout cpu structure in order to record that a pending migration is requested. That is necessary because the callout cannot be trusted to still exist (not freed) 100% of the time after the callout handler has run. In the "deferred migration" case, CPUBLOCK prevents the callout from being freed, stopping any possible callout_stop() and callout_drain() activity until the migration is actually performed.
- Problem3:
  There is a further race in callout_drain(). In order to avoid a race between the sleepqueue lock and the callout cpu spinlock, in _callout_stop_safe() the callout cpu lock is dropped, the sleepqueue lock is acquired and a new callout cpu lookup is performed. Note that the channel used for locking the sleepqueue is obtained from the "current" callout cpu (&cc->cc_waiting). If the callout migrated in the meantime, callout_drain() will end up using the wrong wchan for the sleepqueue (the locked one will be the older one, while the new one will not really be locked), leading to a lock leak and racy access to the sleepqueue.
- Solution3:
  It is enough to check whether a migration happened between acquiring the sleepqueue lock and acquiring the new callout cpu lock and, if so, unwind both and try again.

These problems can lead to deadly races on moderate (4-way) SMP environments, easily leading to panics or deadlocks. The reporter's 24-way machine could easily panic under a completely normal workload, almost daily.
gianni@ kindly wrote the following proof-of-concept, which can panic a FreeBSD machine in less than one hour on smaller SMP systems:
http://www.freebsd.org/~attilio/callout/test.c

Reported by: Nicholas Esborn <nick at desert dot net>, DesertNet
In collaboration with: gianni, pho, Nicholas Esborn
Reviewed by: jhb
MFC after: 1 week (*)

(*) Usually, I would aim for a larger MFC timeout, but I really want this in before 8.2-RELEASE, thus re@ accepted a shorter timeout as a special case for this patch

2010-12-29 18:17:36 +00:00
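
The CPUBLOCK handover described in Solution1 can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code from the commit: the names `callout_model`, `cc_locks`, `cc_lock()`, `cc_unlock()`, `callout_model_init()`, `callout_lock_model()` and `callout_cpu_switch_model()` are hypothetical, the value chosen for `CPUBLOCK` is arbitrary, and C11 `atomic_flag` spinlocks stand in for the callout cpu locks.

```c
#include <assert.h>
#include <stdatomic.h>

#define NCPU     2
#define CPUBLOCK (-1)  /* hypothetical sentinel value: "migration in progress" */

/* One spinlock per modelled callout cpu (stand-in for the callout cpu lock). */
static atomic_flag cc_locks[NCPU] = { ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT };

struct callout_model {
    atomic_int c_cpu;  /* owning cpu index, or CPUBLOCK during a handover */
};

static void cc_lock(int cpu)
{
    while (atomic_flag_test_and_set(&cc_locks[cpu]))
        ;  /* spin until the flag was previously clear */
}

static void cc_unlock(int cpu)
{
    atomic_flag_clear(&cc_locks[cpu]);
}

static void callout_model_init(struct callout_model *c, int cpu)
{
    atomic_init(&c->c_cpu, cpu);
}

/*
 * Lock the cpu that currently owns the callout and return its index.
 * While c_cpu is CPUBLOCK a migration is in flight, so keep spinning
 * instead of picking a lock that may be the wrong one.
 */
static int callout_lock_model(struct callout_model *c)
{
    for (;;) {
        int cpu = atomic_load(&c->c_cpu);
        if (cpu == CPUBLOCK)
            continue;              /* handover in progress: spin */
        cc_lock(cpu);
        if (cpu == atomic_load(&c->c_cpu))
            return cpu;            /* still the owner: lock is valid */
        cc_unlock(cpu);            /* owner changed meanwhile: retry */
    }
}

/*
 * Migrate the callout from oldcpu to newcpu. The caller holds the
 * oldcpu lock; on return the newcpu lock is held instead. The two
 * locks are never held together, and c_cpu reads CPUBLOCK in between.
 */
static int callout_cpu_switch_model(struct callout_model *c, int oldcpu,
    int newcpu)
{
    assert(newcpu >= 0 && newcpu < NCPU);
    atomic_store(&c->c_cpu, CPUBLOCK);
    cc_unlock(oldcpu);
    cc_lock(newcpu);
    atomic_store(&c->c_cpu, newcpu);
    return newcpu;
}
```

In this model a caller would take the lock with `callout_lock_model()`, migrate with `callout_cpu_switch_model()`, and release with `cc_unlock()` on the returned index. A concurrent `callout_lock_model()` caller cannot conclude that the callout is idle on the new cpu while the handover is still in progress, which is the window Problem1 describes.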
..
amd64	Increase size of pcb_flags to four bytes.	2010-12-22 19:57:03 +00:00
arm	IXP4XX_GPIO_{,UN}LOCK() don't take args. Remove the sc here to make	2010-12-23 19:28:50 +00:00
boot	Give a bit of a hint of the failure (read != expected) but don't make	2010-11-25 03:16:31 +00:00
bsm
cam	Fix a few issues related to the XPT_GDEV_ADVINFO CCB.	2010-12-10 21:38:51 +00:00
cddl	cyclic xcall: use smp_no_rendevous_barrier as setup function parameter	2010-12-17 18:22:50 +00:00
compat	Merge amd64 and i386 bus.h and move the resulting header to x86. Replace	2010-12-20 16:39:43 +00:00
conf	MIPS has lots of flavors as well	2010-12-28 22:49:28 +00:00
contrib	Update firmware for wpi(4) from version 2.14.4 to 15.32.2.9.	2010-12-19 11:37:44 +00:00
crypto	Remove DEBUG sections.	2010-11-27 15:41:44 +00:00
ddb
dev	Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4).	2010-12-29 12:11:07 +00:00
fs	Delete the nfsvno_localconflict() function in the experimental	2010-12-28 23:50:13 +00:00
gdb	there must be only one SYSINIT with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY order	2010-09-30 17:05:23 +00:00
geom	Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4).	2010-12-29 12:11:07 +00:00
gnu	Remove prtactive variable and related printf()s in the vop_inactive	2010-11-19 21:17:34 +00:00
i386	Revert r216777, per jhb@	2010-12-28 22:45:29 +00:00
ia64	Revert r216134. This checkin broke platforms where bus_space are macros:	2010-12-03 07:09:23 +00:00
isa	bus_add_child: change type of order parameter to u_int	2010-09-10 11:19:03 +00:00
kern	Fix several callout migration races:	2010-12-29 18:17:36 +00:00
kgssapi
libkern	Add support for asterisk characters when filling in the GELI password	2010-11-14 14:12:43 +00:00
mips	When allocating memory from bootmem for the kernel to use, try to leave about	2010-12-28 20:11:54 +00:00
modules	Update firmware for wpi(4) from version 2.14.4 to 15.32.2.9.	2010-12-19 11:37:44 +00:00
net	Introduce and use a new VM interface for temporarily pinning pages. This	2010-12-25 21:26:56 +00:00
net80211	The meshid element is memcpy()'ed into se_meshid if included in either	2010-11-22 19:01:47 +00:00
netatalk
netgraph	Simplify ng_pipe locking model by relying on the netgraph framework	2010-11-24 16:02:58 +00:00
netinet	Add a comment for the ccv member of struct tcpcb.	2010-12-28 12:37:57 +00:00
netinet6	Improve plausibility check in sctp_handle_sack().	2010-12-22 17:59:38 +00:00
netipsec	After some off-list discussion, revert a number of changes to the	2010-11-22 19:32:54 +00:00
netipx
netnatm
netncp
netsmb
nfs	Fix the type of the 3rd argument for nm_getinfo so that it works	2010-10-19 11:55:58 +00:00
nfsclient	Remove prtactive variable and related printf()s in the vop_inactive	2010-11-19 21:17:34 +00:00
nfsserver	ZFS might not return monotonically increasing directory offset cookies,	2010-12-28 21:12:15 +00:00
nlm	Modify the NFS clients and the NLM so that the NLM can be used	2010-10-19 00:20:00 +00:00
opencrypto	Let cryptosoft(4) add its pseudo-device with a specific unit number and its	2010-11-14 13:09:32 +00:00
pc98	Merge amd64 and i386 bus.h and move the resulting header to x86. Replace	2010-12-20 16:39:43 +00:00
pci	Remove standard PCI configuration space register definitions.	2010-11-08 22:10:51 +00:00
powerpc	Only keep track of PTE validity statistics for pages not locked in the	2010-12-28 17:02:15 +00:00
rpc	Fix the krpc so that it can handle NFSv3,UDP mounts with a read/write	2010-10-13 00:57:14 +00:00
security	Fix typos.	2010-11-09 10:59:09 +00:00
sparc64	On UltraSPARC-III+ and greater take advantage of ASI_ATOMIC_QUAD_LDD_PHYS,	2010-12-29 16:59:33 +00:00
sun4v	Revert r216134. This checkin broke platforms where bus_space are macros:	2010-12-03 07:09:23 +00:00
sys	- Follow r216313, the sched_unlend_user_prio is no longer needed, always	2010-12-29 09:26:46 +00:00
teken	Use proper bounds checking on VPA.	2010-12-05 10:15:23 +00:00
tools	Add an extra comment to the SDT probes definition. This allows us to get	2010-08-22 11:18:57 +00:00
ufs	Add kernel side support for BIO_DELETE/TRIM on UFS.	2010-12-29 12:25:28 +00:00
vm	Move the increment of vm object generation count into	2010-12-29 12:53:53 +00:00
x86	Drop the icu_lock spinlock while pausing briefly after masking the	2010-12-23 15:17:28 +00:00
xdr
xen	Fix a typo in a comment.	2010-12-14 20:57:40 +00:00
Makefile	Add lex and yacc sources to things cscope'd.	2010-11-21 03:58:11 +00:00