When a da or ada device dissappears, outstanding IOs fail with
ENXIO, not EIO. The check for EIO was probably copied from Illumos,
where that is indeed the correct errno.
Without this change, pulling a busy drive from a zpool would usually
turn it into UNAVAIL, even though pulling an idle drive would turn
it into REMOVED. With this change, it is REMOVED every time.
Also, vdev_geom_io_intr shouldn't do zfs_post_remove, because that
results in devd getting two resource.fs.zfs.removed events. The
comment said that the event had to be sent directly instead of
through the async removal thread because "the DE engine is using
this information to discard prevoius I/O errors". However, the fact
that vdev_geom_io_intr was never actually sending the events until
now, and that vdev_geom_orphan never sent them at all, and that
vdev_geom_orphan usually gets called about 2 seconds after the
actual removal, means that FreeBSD's userland can cope with a late
event just fine.
Approved by: ken (mentor)
Sponsored by: Spectra Logic Corporation
MFC after: 4 weeks
In case of 4K allocation quantum that means for allocations up to 128K.
With growth of memory fragmentation these lists may grow to quite a large
sizes (tenths and hundreds of thousands items). Having in one list items
of different sizes in worst case may require full linear list traversal,
that may be very expensive. Having lists for items of single size means
that unless user specify some alignment or border requirements (that are
very rare cases) first item found on the list should satisfy the request.
While running SPEC NFS benchmark on top of ZFS on 24-core machine with
84GB RAM this change reduces CPU time spent in vmem_xalloc() from 8%
and lock congestion spinning around it from 20% to invisible levels.
And that all is by the cost of just 26 more pointers per vmem instance.
If at some point our kernel will start to actively use KVA allocations
with odd sizes above 128K, something may need to be done to bigger lists
also.
all the structures. While here, move a helper struct only used in
the kernel parser out of this header since it is not part of the MP
specification itself.
completely full, we'd not complete any of the mbufs due to the fence
post error (this creates a large leak). When this is fixed, we still
leak, but at a much smaller rate due to a race between ateintr and
atestart_locked as well as an asymmetry where atestart_locked is
called from elsewhere. Ensure that we free in-flight packets that
have completed there as well. Also remove needless check for NULL on
mb, checked earlier in the loop and simplify a redundant if.
MFC after: 3 days
This change is a proof of concept on how to easily integrate existing
tests from the tools/regression/ hierarchy into the /usr/tests/ test
suite and on how to adapt them to the new layout for src.
To achieve these goals, this change:
- Moves tests from tools/regression/bin/<tool>/ to bin/<tool>/tests/.
- Renames the previous regress.sh files to legacy_test.sh.
- Adds Makefiles to build and install the tests and all their supporting
data files into /usr/tests/bin/.
- Plugs the legacy_test test programs into the test suite using the new
TAP backend for Kyua (appearing in 0.8) so that the code of the test
programs does not have to change.
- Registers the new directories in the BSD.test.dist mtree file.
Reviewed by: freebsd-testing
Approved by: rpaulo (mentor)
This change fixes some cases where bsd.progs.mk would fail to handle
directories with SCRIPTS but no PROGS. In particular, "install" did
not handle such scripts nor dependent files when bsd.subdir.mk was
added to the mix.
This is "make tinderbox" clean.
Reviewed by: freebsd-testing
Approved by: rpaulo (mentor)
This file provides support to build test programs that comply with the
Test Anything Protocol. Its main goal is to support the painless
integration of existing tests from tools/regression/ into the Kyua-based
test suite.
Approved by: rpaulo (mentor)
When the guest is bringing up the APs in the x2APIC mode a write to the
ICR register will now trigger a return to userspace with an exitcode of
VM_EXITCODE_SPINUP_AP. This gets SMP guests working again with x2APIC.
Change the vlapic timer lock to be a spinlock because the vlapic can be
accessed from within a critical section (vm run loop) when guest is using
x2apic mode.
Reviewed by: grehan@
sys/cdefs.h. In particular, in case of COMPAT_43, param.h includes
sys/types.h, which includes sys/select.h, which includes
sys/_sigset.h. The _sigset.h customizes the provided definions based
on COMPAT_43, eliminating osigset_t if symbol is not defined. The
sys/proc.h is included after opt_compat.h and needs osigset_t.
Move opt_compat.h inclusion into the right place.
Sponsored by: The FreeBSD Foundation
the excess code in g_io_check(), bio_resid is also truncated by
g_io_deliver(). As result, bufdonebio() assigns truncated value to
the buffer b_resid field.
Use the residual bio_completed to calculate buffer b_resid from
b_bcount in bufdonebio(), instead of bio_resid, calculated from
bio_length in g_io_deliver().
The issue is seemingly caused by the code rearrange into g_io_check(),
which is not present in stable/10. The change still looks as the
useful change to have in 10 nevertheless.
Reported by: Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Tested by: pho, Stefan Hegnauer <stefan.hegnauer@gmx.ch>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
the bio is unmapped, so we must map the bio pages into pbuf. This
works around the geom classes which do not follow the MAXPHYS limit on
the i/o size, since such classes do not know about unmapped bios
either.
Reported by: Paolo Pinto <paolo.pinto@netasq.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Intel manual says: "If a transition is already in progress, transition to
a new value will subsequently take effect. Reads of IA32_PERF_CTL determine
the last targeted operating point." So seems it should be fine to just
trigger wanted transition and go. Linux does the same.
MFC after: 1 month
is actually sent by the remote node).
Otherwise it generated confusing "Negotiated protocol version 1" debug
messages when processing the second connection.
MFC after: 2 weeks
request back from the receive queue -- it might already be processed
by remote_recv_thread, which lead to crashes like below:
(primary) Unable to receive reply header: Connection reset by peer.
(primary) Unable to send request (Connection reset by peer):
WRITE(954662912, 131072).
(primary) Disconnected from kopusha:7772.
(primary) Increasing localcnt to 1.
(primary) Assertion failed: (old > 0), function refcnt_release,
file refcnt.h, line 62.
Taking the request back was not necessary (it would properly be
processed by the remote_recv_thread) and only complicated things.
MFC after: 2 weeks
indication when a request can be moved to done queue, but also for
detecting the current state of memsync request.
This approach has problems, e.g. leaking a request if memsynk ack from
the secondary failed, or racy usage of write_complete, which should be
called only once per write request, but for memsync can be entered by
local_send_thread and ggate_send_thread simultaneously.
So the following approach is implemented instead:
1) Use hio_countdown only for counting components we waiting to
complete, i.e. initially it is always 2 for any replication mode.
2) To distinguish between "memsync ack" and "memsync fin" responses
from the secondary, add and use hio_memsyncacked field.
3) write_complete() in component threads is called only before
releasing hio_countdown (i.e. before the hio may be returned to the
done queue).
4) Add and use hio_writecount refcounter to detect when
write_complete() can be called in memsync case.
Reported by: Pete French petefrench ingresso.co.uk
Tested by: Pete French petefrench ingresso.co.uk
MFC after: 2 weeks
would either exit on assertion, or, if assertions are not enabled,
fail to authenticate the target.
MFC after: 2 days
Sponsored by: The FreeBSD Foundation
(64MB). Even if we would find one somehow, ZFS kernel code rejects such
devices. It is funny to look on attempts to read 4 256K vdev labels from
1.44MB floppy, though it is not very practical and quite slow.
The An macros is used for authors while the Ar macro is used for arguments.
AFAIK mcast-addr and ifname are not authors.
PR: docs/184649
Submitted by: cnst++
MFC After: 3 days
Prior to the addition of cpp support into calendar itself
#include </usr/share/calendar/calendar.all>
was a legal construction in a calendar file.
Permit this again
after attempting to install to encrypted ZFS root (caused by a typo in a
variable name -- ZFSBOOT_BOOT_FSNAME -> ZFSBOOT_BOOTFS_NAME).
MFC after: 3 days