... instead of deferring the action until first open.
Unlike upstream this has no benefit on FreeBSD.
We know that as soon as the provider is created it is going to be tasted
and thus opened. Initial mediasize of zero causes tasting failure
and subsequent retasting because of the size change.
MFC after: 14 days
This should allow to mount a dataset as a root filesystem even if
it belongs to a pool that is not described in zpool.cache.
This adds some overhead to the boot process though.
If the root filesystem's pool is found in zpool.cache, the by default
its cached configuration will be used for import.
vfs.zfs.rootpool.prefer_cached_config could be set to zero to force
the config to be retasted.
Discussed with: gibbs, pjd, des
MFC after: 25 days
doesn't exist on a dataset we are starting from. For example if we
have the following configuration:
tank
tank/foo
tank/foo@snap
tank/bar
tank/bar@snap
We can execute:
# zfs destroy -t tank@snap
eventhough tank@snap doesn't exit.
Unfortunately it is not possible to do the same with recursive rename:
# zfs rename -r tank@snap tank@pans
cannot open 'tank@snap': dataset does not exist
...until now. This change allows to recursively rename snapshots even if
snapshot doesn't exist on the starting dataset.
Sponsored by: rsync.net
MFC after: 2 weeks
The code builds a map of regions that were freed. On every write the
code consults the map and eventually removes ranges that were freed
before, but are now overwritten.
Freed blocks are not TRIMed immediately. There is a tunable that defines
how many txg we should wait with TRIMming freed blocks (64 by default).
There is a low priority thread that TRIMs ranges when the time comes.
During TRIM we keep in-flight ranges on a list to detect colliding
writes - we have to delay writes that collide with in-flight TRIMs in
case something will be reordered and write will reached the disk before
the TRIM. We don't have to do the same for in-flight writes, as
colliding writes just remove ranges to TRIM.
Sponsored by: multiplay.co.uk
This work includes some important fixes and some improvements obtained
from the zfsonlinux project, including TRIMming entire vdevs on pool
create/add/attach and on pool import for spare and cache vdevs.
Obtained from: zfsonlinux
Submitted by: Etienne Dechamps <etienne.dechamps@ovh.net>
Do this by checking if spa_namespace_lock is already held and not taking
it again in that case.
Add a comment explaining why that is done and why it is safe.
Reviewed by: pjd
MFC after: 24 days
Since all attribute values start at 8-byte aligned boundary, we would
previously incorrectly calculate dn_bonuslen if any attribute but the
last had a variable-length value with length not multiple of 8.
Reported by: Nicolas Rachinsky <fbsd-mas-0@ml.turing-complete.org>
Tested by: Nicolas Rachinsky <fbsd-mas-0@ml.turing-complete.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com> (for upstream)
MFC after: 2 weeks
- skip length_idx index for a replaced variable-sized attribute
- skip length_idx index for a removed variable-sized attribute
- also re-arranged code to make sure that length_idx is always
incremented for variable-sized attributes
- additionally add an assertion that the number of actually produced
attributes is the same as the expected number of resulting
attributes
In cooperation with: Matthew Ahrens <mahrens@delphix.com>
Tested by: Trent Nelson <trent@snakebite.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com> (for upstream)
To do: get this upstreamed
MFC after: 2 weeks
When calling a revoke(2) on a dtrace device, dtrace_close() could be
called, even if threads are still stuck in the device. Defer the actual
deallocation of datastructures to the cdevpriv destructor.
While there, remove the unneeded D_TRACKCLOSE and D_NEEDMINOR flags. For
the helper device, we never need it. For the regular dtrace devices, we
only need these flags on FreeBSD pre-8.
MFC after: 1 month
1) It is not useful to call "devfs_clear_cdevpriv()" from
"d_close" callbacks, hence for example read, write, ioctl and
so on might be sleeping at the time of "d_close" being called
and then then freed private data can still be accessed.
Examples: dtrace, linux_compat, ksyms (all fixed by this patch)
2) In sys/dev/drm* there are some cases in which memory will
be freed twice, if open fails, first by code in the open
routine, secondly by the cdevpriv destructor. Move registration
of the cdevpriv to the end of the drm open routines.
3) devfs_clear_cdevpriv() is not called if the "d_open" callback
registered cdevpriv data and the "d_open" callback function
returned an error. Fix this.
Discussed with: phk
MFC after: 2 weeks
Update DTrace disassembler accordingly. The code to treat the prefixes
as null prefixes was already in place.
Although in practice compilers seem to generate only cs-prefix for use
in long NOPs, the same treatment is applied to all of cs, ds, es, ss for
consistency.
Reported by: emaste
Tested by: emaste
Obtained from: Illumos commit 13442:4adbe6de60c8 (+ local changes)
MFC after: 5 days
According to the AMD manual the whole range from 0x09 to 0x1f are NOPs.
Intel manual mentions only 0x1f. Use only Intel one for now, it seems
to be the one actually generated by compilers.
Use gdb mnemonic for the operation: "nopw".
[1] AMD64 Architecture Programmer's Manual
Volume 3: General-Purpose and System Instructions
[2] Software Optimization Guide for AMD Family 10h Processors
[3] Intel(R) 64 and IA-32 Architectures Software Developer’s Manual
Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z
Tested by: Fabian Keil <freebsd-listen@fabiankeil.de> (earlier version)
MFC after: 3 days
Bryan Cantrill implemented the equivalent of semi-log graph
paper for Dtrace so llquantize will use one logarithmic and
one linear scale.
Special thanks to Mark Peek for providing fix to an
assertion and to Fabian Keill for testing the port.
Illumos Revision: 13355:15b74a2a9a9d
Reference:
https://www.illumos/issues/905
Obtained from: Illumos
Tested by: Fabian Keill, mp
MFC after: 4 days
differentiate between an incremental and full stream.
Be sure not to generate guid equal to 0.
Reported by: someone who saw 0 being generated as 64bit random guid
MFC after: 3 days
The skew calculation here is exactly backwards. We were able to repro
it on a multi-package ESX server running a FreeBSD VM, where the TSCs
can be pretty evil.
MFC after: 1 week
Submitted by: Jeff Ford <jeffrey.ford2@isilon.com>
Reviewed by: avg, gnn
1948 zpool list should show more detailed pool information
Display per-vdev information with "zpool list -v".
The added expandsize property has currently no value on FreeBSD.
This changeset allows adding expansion support to individual vdevs
in the future.
References:
https://www.illumos.org/issues/1948
Obtained from: illumos (issue #1948)
MFC after: 2 weeks
vn_rlimit_fsize takes uio->uio_offset and uio->uio_resid into account
when determining whether given write would exceed RLIMIT_FSIZE.
When APPEND flag is specified, ZFS updates uio->uio_offset to point to the
end of file.
But this happens after a call to vn_rlimit_fsize, so vn_rlimit_fsize check
can be rendered ineffective by thread that opens some file with O_APPEND
and lseeks below RLIMIT_FSIZE before calling write.
Submitted by: Mateusz Guzik <mjguzik at gmail dot com>
MFC after: 2 weeks
2703 add mechanism to report ZFS send progress
If the zfs send command is used with the -v flag, the amount of bytes
transmitted is reported in per second updates.
References:
https://www.illumos.org/issues/2703
Obtained from: illumos (issue #2703)
MFC after: 2 weeks
to follow the example of OpenSolaris and its descendants, which implemented
cpu as an inline that took a value out of curthread. At certain points in
the FreeBSD scheduler curthread->td_oncpu will no longer be valid (in
particukar, just before the thread gets descheduled) so instead I have
implemented this as its own built-in variable.
Sponsored by: Sandvine Inc.
MFC after: 1 week
accesses of the cache member of vm_object objects.
- Use novel vm_page_is_cached() for checks outside of the vm subsystem.
Reviewed by: alc
MFC after: 2 weeks
X-MFC: r234039