== SIG_DFL or SIG_IGN. Sloppy code does not fully initialize struct
sigaction for such cases, and being too demanding in the case of
default handler does not catch anything.
Reported and tested by: Alex Tutubalin <lexa@lexa.ru>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Close a migration race where callout_reset() failed to set the
CALLOUT_ACTIVE flag.
- Callout callback functions are now allowed to be protected by
spinlocks.
- Switching the callout CPU number cannot always be done on a
per-callout basis. See the updated timeout(9) manual page for more
information.
- The timeout(9) manual page has been updated to reflect how all the
functions inside the callout API are working. The manual page has
been made function oriented to make it easier to deduce how each of
the functions making up the callout API are working without having
to first read the whole manual page. Group all functions into a
handful of sections which should give a quick top-level overview
when the different functions should be used.
- The CALLOUT_SHAREDLOCK flag and its functionality has been removed
to reduce the complexity in the callout code and to avoid problems
about atomically stopping callouts via callout_stop(). If someone
needs it, it can be re-added. From my quick grep there are no
CALLOUT_SHAREDLOCK clients in the kernel.
- A new callout API function named "callout_drain_async()" has been
added. See the updated timeout(9) manual page for a complete
description.
- Update the callout clients in the "kern/" folder to use the callout
API properly, like cv_timedwait(). Previously there was some custom
sleepqueue code in the callout subsystem, which has been removed,
because we now allow callouts to be protected by spinlocks. This
allows us to tear down the callout like done with regular mutexes,
and a "td_slpmutex" has been added to "struct thread" to atomically
teardown the "td_slpcallout". Further the "TDF_TIMOFAIL" and
"SWT_SLEEPQTIMO" states can now be completely removed. Currently
they are marked as available and will be cleaned up in a follow up
commit.
- Bump the __FreeBSD_version to indicate kernel modules need
recompilation.
- There has been several reports that this patch "seems to squash a
serious bug leading to a callout timeout and panic".
Kernel build testing: all architectures were built
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D1438
Sponsored by: Mellanox Technologies
Reviewed by: jhb, adrian, sbruno and emaste
more generally make it easier to extend 'struct mbuf in the future', make
a number of changes to the data structure:
- As we anticipate embedding mbufs headers within variable-size regions of
memory in the future, change the definitions of byte arrays embedded in
mbufs to be of size [0] rather than [MLEN] and [MHLEN]. In fact, the
cxgbe driver already uses 'struct mbuf' on the front of other storage
sizes, but we would like the global mbuf allocator do be able to do this
as well.
- Fold 'struct m_hdr' into 'struct mbuf' itself, eliminating a set of
macros that aliased 'mh_foo' field names to 'm_foo' names such as
'm_next'. These present a particular problem as we would like to add
new mbuf-header fields -- e.g., 'm_size' -- that, if similarly named via
macros, would introduce collisions with many other variable names in the
kernel.
- Rename 'struct m_ext' to 'struct struct_m_ext' so that we can add
compile-time assertions without bumping into the still-extant 'm_ext'
macro.
- Remove the MSIZE compile-time assertion for 'struct mbuf', but add new
assertions for alignment of embedded data arrays (64-bit alignment even
on 32-bit platforms), and for the sizes the mbuf header, packet header,
and m_ext structure.
- Document that these assertions exist in comments in mbuf.h.
This change is not intended to cause (non-trivial) behavioural
differences, but is a precursor to further mbuf-allocator work.
Differential Revision: https://reviews.freebsd.org/D1483
Reviewed by: bz, gnn, np, glebius ("go ahead, I trust you")
Sponsored by: EMC / Isilon Storage Division
"delist_dev()" function. Make sure the character device structure
doesn't go away until the end of the "destroy_dev()" function due to
concurrently running cleanup code inside "devfs_populate()".
MFC after: 1 week
Reported by: dchagin@
in FreeBSD 7 that has not been used since. It contains a number
of unresolved bugs including an inverted bcopy() and incorrect
handling of read-only mbufs using internal storage. Removing this
unused code is substantially essier than fixing it in order to
update it to the coming mbuf world order -- but it can always be
restored from revision history if it turns out to prove useful for
future work.
Pointed out by: jmallett
Sponsored by: EMC / Isilon Storage Division
by reinitializing the 'freestate' pointer after freeing the memory.
Obtained from: HardenedBSD (71fab80c5dd3034b71a29a61064625018671bbeb)
PR: 194525
Submitted by: Oliver Pinter <oliver.pinter@hardenedbsd.org>
MFC after: 2 weeks
decade: m_pulldown() is willing to consider ordinary mbufs writable.
Retain another, related, and also outdated comment, but with a caveat
that it is partially stale. Do not, for now, address the problem that
it raises (that only EXT_CLUSTER external storage is considered
writable, regardless of the results of M_WRITABLE() on the mbuf).
MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
EOPNOTSUPP. The current behavior can mask real quiesce errors since
devclass_quiesce_driver() stops iterating over drivers as soon as it
gets an error (incluiding EOPNOTSUPP), but the caller it returns the
error to explicitly ignores EOPNOTSUPP.
Reviewed by: imp
kernel via the global cpuset_domain[] array. To export these to userland,
add a CPU_WHICH_DOMAIN level that can be used to fetch the mask for a
specific domain. Add a -d flag to cpuset(1) that can be used to fetch
the mask for a given domain.
Differential Revision: https://reviews.freebsd.org/D1232
Submitted by: jeff (kernel bits)
Reviewed by: adrian, jeff
with calls to the centralised macros, reducing direct use of MLEN and
MHLEN.
Differential Revision: https://reviews.freebsd.org/D1444
Reviewed by: bz
Sponsored by: EMC / Isilon Storage Division
code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly
identical and simply redefine a number of constants and helper subroutines;
a generic implementation will make it easier to implement features around
kernel core dumps. This change does not alter any minidump code and should
have no functional impact.
PR: 193873
Differential Revision: https://reviews.freebsd.org/D904
Submitted by: Conrad Meyer <conrad.meyer@isilon.com>
Reviewed by: jhibbits (earlier version)
Sponsored by: EMC / Isilon Storage Division
may perform a blocking memory allocation, which is unsafe when holding a
mutex.
Differential Revision: https://reviews.freebsd.org/D1443
Reviewed by: rwatson
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
may also halt in C2 and not just C3 (it seems that in some cases the BIOS
advertises its C3 state as a C2 state in _CST). Just play it safe and
disable both C2 and C3 states if a user forces the use of the TSC as the
timecounter on such CPUs.
PR: 192316
Differential Revision: https://reviews.freebsd.org/D1441
No objection from: jkim
MFC after: 1 week
the knowledge of mbuf layout, and in particular constants such as M_EXT,
MLEN, MHLEN, and so on, in mbuf consumers by unifying various alignment
utility functions (M_ALIGN(), MH_ALIGN(), MEXT_ALIGN() in a single
M_ALIGN() macro, implemented by a now-inlined m_align() function:
- Move m_align() from uipc_mbuf.c to mbuf.h; mark as __inline.
- Reimplement M_ALIGN(), MH_ALIGN(), and MEXT_ALIGN() using m_align().
- Update consumers around the tree to simply use M_ALIGN().
This change eliminates a number of cases where mbuf consumers must be aware
of whether or not mbufs returned by the allocator use external storage, but
also assumptions about the size of the returned mbuf. This will make it
easier to introduce changes in how we use external storage, as well as
features such as variable-size mbufs.
Differential Revision: https://reviews.freebsd.org/D1436
Reviewed by: glebius, trasz, gnn, bz
Sponsored by: EMC / Isilon Storage Division
Phabric: https://reviews.freebsd.org/D1247
Reviewed by: jhb, avg
Sponsored by: Spectra Logic Corporation
sys/kern_subr_taskqueue.c:
Modify taskqueue_drain_all() processing to use a temporary
"barrier task", rather than rely on a user task that may
be destroyed during taskqueue_drain_all()'s execution. The
barrier task is queued behind all previously queued tasks
and then has its priority elevated so that future tasks
cannot pass it in the queue.
Use a similar barrier scheme to drain threads processing
current tasks. This requires taskqueue_run_locked() to
insert and remove the taskqueue_busy object for the running
thread for every task processed.
share/man/man9/taskqueue.9:
Remove warning about live-lock issues with taskqueue_drain_all()
and indicate that it does not wait for tasks queued after
it begins processing.
in r276564, change path type to char * (pathnames are always char *).
And remove bogus casts of malloc().
kern___getcwd() internally doesn't actually use or support u_char *
paths, except to copy them to a normal char * path.
These changes are not visible to libc as libc/gen/getcwd.c misdeclares
__getcwd() as taking a plain char * path.
While here remove _SYS_SYSPROTO_H_ for __getcwd() syscall as
we always have sysproto.h.
Pointed out by: bde
MFC after: 1 week
clients, hence they might not handle it very well. This change allows
debugging mutex problems with kernel console drivers when
"debug.witness.skipspin=0" is set in the boot environment.
MFC after: 1 week
witness printouts in the console driver clients can cause this mutex
to recurse by calls to "printf()" from witness for example. In
particular this can happen if "debug.witness.skipspin=0" is set in the
boot environment.
MFC after: 1 week
locks/unlocks the vnode and does a VOP_GETATTR()
for the SEEK_END case. This is safe to do, since
lf_advlock{async}() only uses the size argument
for the SEEK_END case.
The NFSv4 server needs this when
vfs.nfsd.enable_locallocks!=0 since locking the
vnode results in a LOR that can cause a deadlock
for the nfsd threads.
Reviewed by: kib
MFC after: 1 week
the created file name was cached. Use the flag for core dumps.
Requested by: rpaulo
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
(UTC) rather than the archaic (GMT) in comments. Except where the
comments are making fun of people doing this (and pedants who insist
on the new terms).
why there could appear a zero-sized mbufs in socket buffers.
A proper fix would be to divorce record socket buffers and stream
socket buffers, and divorce pru_send that accepts normal data from
pru_send that accepts control data.
in sbappend_locked() and sbappendrecord_locked().
This is a quick fix to the panic introduced by r274712.
A proper solution should be to make sosend_generic() avoid calling
pru_send() with NULL mbuf for the protocols that do not understand
control messages. Those protocols that understand control messages,
should be able to receive NULL mbuf, if control is non-NULL.
into namecache, to avoid cache trashing when doing large operations.
E.g., tar archive extraction is not usually followed by access to many
of the files created.
Right now, each VOP_LOOKUP() implementation explicitely knowns about
this quirk and tests for both MAKEENTRY flag presence and op != CREATE
to make the call to cache_enter(). Centralize the handling of the
quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP.
VFS now sets NOCACHE flag for CREATE namei() calls.
Note that the change in semantic is backward-compatible and could be
merged to the stable branch, and is compatible with non-changed
third-party filesystems which correctly handle MAKEENTRY.
Suggested by: Chris Torek <torek@pi-coral.com>
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
the orphaned descendants. Base of the API is modelled after the same
feature from the DragonFlyBSD.
Requested by: bapt
Reviewed by: jilles (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
A _NEW flag passed to _init_flags() to avoid check for double-init.
Differential Revision: https://reviews.freebsd.org/D1208
Reviewed by: jhb, wblock
MFC after: 1 Month
feature is to quisce the system before suspend.
Stop is implemented by reusing the thread_single(9) with the special
mode SINGLE_ALLPROC. SINGLE_ALLPROC differs from the existing
single-threading modes by allowing (requiring) caller to operate on
other process. Interruptible sleeps for !TDF_SBDRY threads are
suspended like SIGSTOP does it, instead of aborting the sleep, like
SINGLE_NO_EXIT, to avoid spurious EINTRs on resume.
Provide debugging sysctl debug.stop_all_proc, which causes total stop
and suspends syncer, while waiting for variable reset for resume. It
is used for debugging; should be removed after the real use of the
interface is added.
In collaboration with: pho
Discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
filesystem specified VFCF_SBDRY flag, i.e. for NFS.
There are two issues with the sleeps. First, applications may get
unexpected EINTR from the disk i/o syscalls. Second, interruptible
sleep allows the stop of the process, and since mount point is
referenced while thread sleeps, unmount cannot free mount point
structure' memory, blocking unmount indefinitely.
Even for NFS, it is probably only reasonable to enable PCATCH for intr
mounts, but this information is currently not available at VFS level.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
creating delayed write buffers belonging to the reclaimed vnode. Put
the buffer cleanup code after inactivation.
Add asserts that ensure that buffer queues are empty and add BO_DEAD
flag for bufobj to check that no buffers are added after the cleanup.
BO_DEAD is only used by INVARIANTS-enabled kernels.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Since VFS does not/cannot stop writes, sync might run indefinitely, or
be a wrong thing to do at all. E. g. NFS ignores VFS_SYNC() for
forced unmounts, since non-responding server does not allow sync to
finish. On the other hand, filesystems can and do stop writes using
fs-specific facilities, and should already fully flush caches in
VFS_UNMOUNT() due to the race.
Adjust msdosfs tp sync in unmount for forced call, to accomodate the
new behaviour. Note that it is still racy, since writes are not
stopped.
Discussed with: avg, bjk, mckusick
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks