Some implementations of Xen don't expose the XENMEM_memory_map hypercall.
Swallow the error from XENMEM_memory_map in xen_arch_init_physmem() and just
return 0. This will fall back to using the non-arch specific mechanism for
allocating scratch space.
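A minimal sketch of the pattern, not the actual change: the names
fetch_memory_map() and arch_init_physmem() are hypothetical stand-ins for
the hypercall wrapper and the arch hook. The hook ignores the failure and
returns 0 so that the caller proceeds with the generic mechanism.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hypercall wrapper; not the real Xen API. */
static int
fetch_memory_map(bool supported)
{
	return (supported ? 0 : -ENOSYS);
}

/*
 * Hypothetical arch hook: an unavailable memory map is not treated as
 * fatal, so 0 is returned either way.
 */
static int
arch_init_physmem(bool supported)
{
	int error;

	error = fetch_memory_map(supported);
	if (error != 0) {
		/* Ignore the error; the caller uses the generic path. */
		return (0);
	}
	/* The arch-specific region setup would go here. */
	return (0);
}
```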
Reported by: cperciva
Reviewed by: Elliott Mitchell
Fixes: 69c47485b5 ('x86/xen: use UNUSABLE e820 regions for external mappings')
Sponsored by: Cloud Software Group
Differential revision: https://reviews.freebsd.org/D46205
Except for elements whose value is zero, the elements of pagesizes[] are
always sorted in increasing order, so once a loop starting from the end
of the array has found a non-zero element, it has found the largest
valued element and can stop iterating.
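As an illustration only, the early exit can be sketched as below. The
function name largest_pagesize() and the MAXPAGESIZES value are assumptions
made for this example, not the code touched by the change.

```c
#include <stddef.h>

#define	MAXPAGESIZES	3	/* illustrative value for this sketch */

/*
 * Return the largest entry of a pagesizes[]-style array whose non-zero
 * entries are in increasing order, or 0 if every entry is zero.
 */
static size_t
largest_pagesize(const size_t sizes[MAXPAGESIZES])
{
	int i;

	for (i = MAXPAGESIZES - 1; i >= 0; i--) {
		/* The first non-zero entry seen from the end is the maximum. */
		if (sizes[i] != 0)
			return (sizes[i]);
	}
	return (0);
}
```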
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D46215
Attempting to reduce vm_pageout_page_count at startup when the machine
has less than 8MB of physical memory is pointless, since we haven't run
on machines with so little memory in ages.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D46206
netmap's generic mode tries to improve performance by minimizing mbuf
allocations. In service of this goal, it maintains an extra reference
to the mbuf and polls the counter to see if the driver has released its
reference by calling m_freem(). As a result, the extref destructor is
not called when expected by the netfront driver, and mbuf tags are not
freed.
Modify the tx path to release its mbuf tags promptly when reclaiming tx
descriptors. They are drawn from a fixed-size pool, so otherwise are
quickly exhausted when a netfront interface is in netmap generic mode.
Co-authored by: royger
MFC after: 2 weeks
Fixes: dabb3db7a8 ("xen/netfront: deal with mbuf data crossing a page boundary")
Sponsored by: Cloud Software Group
Sponsored by: Klara, Inc.
Sponsored by: Zenarmor
These syscall muxes are under COMPAT7 or earlier and AFAICT they were
only ever used in libc. The 'which' arguments seem never to have had a
published API and it was a mistake that they were exported or declared.
Reviewed by: kib, jhb
Differential Revision: https://reviews.freebsd.org/D46209
sndstat(4) falsely reports "hardware" as the starting point of
recording, and ending point of playback VCHANs. Recording VCHANs get
their input from the primary recording channel, and playback VCHANs send
their input to the primary playback channel.
Sponsored by: The FreeBSD Foundation
MFC after: 2 days
Reviewed by: dev_submerge.ch, markj
Differential Revision: https://reviews.freebsd.org/D46177
The Xen PVH entry point needs to modify the environment provided by the boot
loader, so that the ACPI RSDP is rewritten to use the Xen-generated RSDP
instead of the native one.
The current logic in the PVH entry point reserves a single page (4K) in order
to copy the contents of the environment passed from the boot loader, so that
the bootloader provided "acpi.rsdp" is dropped and a Xen specific one is added
afterwards.
This however doesn't scale well, as it's possible for the environment to be
bigger than 4K. Bumping the buffer, or attempting to peek at the size of the
metadata all seem to just add more complexity to a sensitive path. Instead
introduce a new ACPI hook that allows setting the RSDP address directly, and
use it from the PVH entry point to set the position of the Xen generated RSDP.
This allows reducing the logic in the PVH metadata processing, as there's no
need to parse and filter the bootloader provided environment.
Note that modifying the environment blob in place is unlikely to work. The
RSDP address is provided as a string, so if the new RSDP location is higher
than the current one, the string with the new location could overrun the
space used by the previous one.
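The overrun concern can be illustrated with a small sketch. The helper
rsdp_fits_in_place() is hypothetical, and the "0x..." hexadecimal text
format is an assumption made for the example.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical check: would the textual form of new_addr fit in the space
 * occupied by the existing value string?  The "0x%" PRIx64 format is an
 * assumption made for this sketch.
 */
static bool
rsdp_fits_in_place(const char *old_value, uint64_t new_addr)
{
	char buf[32];
	int len;

	len = snprintf(buf, sizeof(buf), "0x%" PRIx64, new_addr);
	return (len >= 0 && (size_t)len <= strlen(old_value));
}
```

With an existing value of "0xf0000", a new address of 0xfc000000 needs three
more characters than are available, so an in-place rewrite would not fit.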
Sponsored by: Cloud Software Group
PR: 277200
MFC: 3 days
Reviewed by: markj kib
Differential revision: https://reviews.freebsd.org/D46089
The array passed to vm_pageout_flush, and constructed in a middle-out
fashion, can never use array element zero. Shrink the array by one,
and reduce indices by one, to save that bit of stack space. In the
vm_object version, make the accounting look more like the pageout
version.
Reported by: alc
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D46208
dmu_buf_will_clone() calls arc_buf_destroy() if there is an associated
ARC buffer with the dbuf. However, this can only be done conditionally.
If the previous dirty record's dr_data is pointed at db_buf then
destroying it can lead to a NULL pointer dereference when syncing out the
previous dirty record.
This updates dmu_buf_will_clone() to only call arc_buf_destroy() if the
previous dirty record's dr_data is not pointing to db_buf. The block
clone will still set the dbuf's db_buf and db_data to NULL, but this will
not cause any issues as any previous dirty record dr_data will still be
pointing at the ARC buffer.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #16337
Add missing flags to veriexec(8) as well as some examples to
help explain usage.
Also add veriexec.4
Sponsored by: Juniper Networks, Inc.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D46207
PXE needs low memory for packets and other driver state, so the largest
loader that works reliably for PXE boot is about 500k. Reduce the size
from 560k to 500k so we don't accidentally break PXE in the future.
Add a comment for people with special needs. If you control the
hardware, it can be safe to have boot loaders as large as 580k or 600k
in some cases. Since the BIOS loader is becoming more and more of a
legacy item, the build variable LOADERSIZE isn't documented. This change
doesn't change that: there's been little demand for this documentation
and in general, users shouldn't change it lightly.
PR: 257018
Sponsored by: Netflix
The definitions in _stdint.h have some complications around visibility
that _limits.h does not have. Switch to __SIZE_T_MAX to avoid those.
This fixes the devel/gperf, devel/glib20 and math/mpfr builds with
_FORTIFY_SOURCE enabled, which unblocks a large fraction of the ports
tree.
Reported by: Shawn Webb (HardenedBSD)
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
In the zfree() case, the explicit_bzero() calls zero all the allocation,
including the redzone which malloc() has marked as invalid. So calling
kasan_mark() before those is in fact necessary.
This fixes a crash at boot when 'ldconfig' is run and tries to get
random bytes through getrandom() (relevant part of the stack is
read_random_uio() -> zfree() -> explicit_bzero()) for kernels with KASAN
compiled in.
Approved by: markj (mentor)
Fixes: 4fab5f0054 ("kern_malloc: fold free and zfree together into one __always_inline func")
MFC after: 10 days
MFC with: 4fab5f0054
Sponsored by: The FreeBSD Foundation
Fix multiple build errors on FreeBSD.
The main cause is that the variable 'dxattr_obj' is used
uninitialized at the start of the 'out' label.
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Change 2d39824195 switched the net.add_addr_allfibs default to 0. The
warning message is for potential users of the feature. Since all
supported releases now have 0 as the default, those potential users have
likely already seen the notification. Emitting this WARNING every time
the number of fibs is increased is no longer useful and is confusing to
other users, so suppress it.
PR: 280097
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D45971
A new instance of using ld with -T to bring in the kernel ld script
crept into the tree after I originally did the refactoring. It too needs
-L ${SYSDIR}/conf added.
Fixes: 37d6d682af
Sponsored by: Netflix
Add new symbols defined in dwarf 4 and dwarf 5.
Submitted by: Matt Macy (in D17982, done differently)
Sponsored by: Netflix
Reviewed by: kib, markj, emaste
Differential Revision: https://reviews.freebsd.org/D44072
Move a copy of amd64's debug code into debug.ldscript. Make all the
kernels use this. This has the effect of modernizing the STABS for
powerpc as the others were almost already in sync. For the ones that
weren't this adds the DWARF 3 debug symbols from i386/amd64.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D44071
sa_add_projid() gets called via zfs_setattr() to set the project ID
on old files/dirs that were created before upgrading to the project quota
feature. This function looks up all possible SAs and updates them
all together, along with the project ID at the needed fixed offset. But it
is missing the lookup and update of SA_ZPL_DXATTR, so it effectively loses
SA_ZPL_DXATTR.
Closes #16287
Signed-off-by: Jitendra Patidar <jitendra.patidar@nutanix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Make sure log records don't stray beyond the valid memory region.
There is a lack of verification of the space occupied by fixed members
of lr_t in zil_parse().
We can create a crafted image to trigger an out of bounds read by
following these steps:
1) Do some file operations and reboot to simulate abnormal exit
without umount
2) zil_chain.zc_nused: 0x1000
3) First lr_t
lr_t.lrc_txtype: 0x0
lr_t.lrc_reclen: 0x1000-0xb8-0x1
lr_t.lrc_txg: 0x0
lr_t.lrc_seq: 0x1
4) Update checksum in zil_chain.zc_eck
Fix:
Add some checks to make sure the remaining bytes are large enough to
hold a log record.
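A rough sketch of this kind of check, under stated assumptions: struct
rec_hdr is a simplified stand-in for the fixed part of a log record (not
the real lr_t layout), and record_in_bounds() is a hypothetical helper, not
the code added to zil_parse().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the fixed part of a log record; not the real lr_t. */
struct rec_hdr {
	uint64_t txtype;
	uint64_t reclen;	/* total record length, header included */
	uint64_t txg;
	uint64_t seq;
};

/*
 * Return true if a complete record starts at offset `off` of a buffer
 * holding `nused` valid bytes.  The header is only read once it is known
 * to lie entirely inside the buffer.
 */
static bool
record_in_bounds(const uint8_t *buf, size_t nused, size_t off)
{
	struct rec_hdr hdr;

	if (off > nused || nused - off < sizeof(hdr))
		return (false);	/* no room for the fixed members */
	memcpy(&hdr, buf + off, sizeof(hdr));
	if (hdr.reclen < sizeof(hdr) || hdr.reclen > nused - off)
		return (false);	/* record would run past valid data */
	return (true);
}
```

The first test matters for the crafted image described above: a record
whose declared length leaves fewer bytes than a header at the tail of the
block must be rejected before the next header is read.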
Signed-off-by: XDTG <click1799@163.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
- In vmexit_smccc(), copy an assertion from amd64.
- In fbsdrun_addcpu(), make sure that our vm_suspend_cpu() call is
successful.
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D46197
As on amd64, APs will repeatedly exit until they are brought online, so this
hack helps avoid burning CPU time during guest bootup.
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D46195
This fixes a number of clang 19 warnings:
sys/dev/qat/qat_api/common/compression/dc_session.c:154:15: error: comparison of different enumeration types ('enum _CpaBoolean' and 'icp_qat_hw_compression_delayed_match_t') [-Werror,-Wenum-compare]
154 | if (CPA_TRUE == pService->comp_device_data.enableDmm) {
| ~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sys/dev/qat/qat_api/common/compression/dc_session.c:285:17: error: comparison of different enumeration types ('enum _CpaBoolean' and 'icp_qat_hw_compression_delayed_match_t') [-Werror,-Wenum-compare]
285 | (CPA_TRUE == pService->comp_device_data.enableDmm) ?
| ~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The `enableDmm` field of variable `comp_device_data` is of type
`icp_qat_hw_compression_delayed_match_t`, not `CpaBoolean`. In this
case, we can seamlessly replace the value with
`ICP_QAT_HW_COMPRESSION_DELAYED_MATCH_ENABLED`, which is numerically
equal to `CPA_TRUE`.
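The shape of the fix can be shown with a reduced example. The enum and
struct names below are hypothetical and only mirror the situation; they are
not the QAT definitions.

```c
#include <stdbool.h>

/* Hypothetical enums mirroring the situation; not the QAT definitions. */
enum generic_bool { GEN_FALSE = 0, GEN_TRUE = 1 };
enum delayed_match { DM_DISABLED = 0, DM_ENABLED = 1 };

struct device_data {
	enum delayed_match enable_dmm;
};

static bool
dmm_enabled(const struct device_data *d)
{
	/*
	 * Comparing against GEN_TRUE would mix two enumeration types,
	 * which is what -Wenum-compare reports.  Use the enumerator that
	 * belongs to the field's own type instead.
	 */
	return (d->enable_dmm == DM_ENABLED);
}
```

Because both enumerators have the same numeric value, the behavior is
unchanged; only the type mismatch goes away.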
MFC after: 3 days
This fixes a clang 19 warning:
sys/dev/isci/scil/scif_sas_smp_remote_device.c:197:26: error: comparison of different enumeration types ('SCI_IO_STATUS' (aka 'enum _SCI_IO_STATUS') and 'enum _SCI_STATUS') [-Werror,-Wenum-compare]
197 | if (completion_status == SCI_FAILURE_RETRY_REQUIRED)
| ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
The `completion_status` variable is of type `SCI_IO_STATUS`, not
`SCI_STATUS`. In this case, we can seamlessly replace the value with
`SCI_IO_FAILURE_RETRY_REQUIRED`, which is numerically equal to
`SCI_FAILURE_RETRY_REQUIRED`.
MFC after: 3 days
This fixes a clang 19 warning:
sys/dev/iavf/iavf_lib.c:514:39: error: comparison of different enumeration types ('enum virtchnl_vsi_type' and 'enum iavf_vsi_type') [-Werror,-Wenum-compare]
514 | if (sc->vf_res->vsi_res[i].vsi_type == IAVF_VSI_SRIOV)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
The `vsi_type` field of `struct virtchnl_vsi_resource` is of type `enum
virtchnl_vsi_type`, not `enum iavf_vsi_type`. In this case, we can
seamlessly replace the value with `VIRTCHNL_VSI_SRIOV`, which is
numerically equal to `IAVF_VSI_SRIOV`.
MFC after: 3 days
In commit 8e53cd7099 the intent was to add sys/dts/include to the
compiler include path, but this was spelled incorrectly, leading to an
error with clang 19:
cc: error: no such include directory: '$/dts/include' [-Werror,-Wmissing-include-dirs]
Use the spelling -I$S/dts/include instead.
MFC after: 3 days
This fixes a clang 19 warning:
usr.sbin/keyserv/crypt_server.c:237:53: error: comparison of different enumeration types ('des_mode' (aka 'enum des_mode') and 'enum desmode') [-Werror,-Wenum-compare]
237 | if (_my_crypt != &_arcfour_crypt && argp->des_mode == CBC) {
| ~~~~~~~~~~~~~~ ^ ~~~
The type of `argp->des_mode` (aka `desargs::des_mode`) is `enum
des_mode` from `/usr/include/rpcsvc/crypt.h`, not `enum desmode` from
`/usr/include/rpc/des.h` (which is used in `struct desparams`).
Luckily the integer values of `enum desmode`'s `CBC` and `ECB` are
identical to those of `enum des_mode`'s `CBC_DES` and `ECB_DES`, so
replace both values.
MFC after: 3 days
An earlier set of mixer(8) tests were removed leading to this entry,
but now the entry is removing the new tests.
Fixes: 94a86f3f69 mixer(8): Add tests
The Svpbmt extension provides specification of "Page-Based Memory
Types", or memory attributes (e.g. cacheability constraints).
Extend the pmap code to apply memory attributes when creating/updating
PTEs. This is done in a way which has no effect on CPUs lacking Svpbmt
support, and is non-hostile to alternate encodings of memory attributes
-- a future change will enable this for T-HEAD CPUs, which implement
this PTE feature in a different (incompatible) way.
Reviewed by: jhb
Tested by: br
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45471
For use in pmap_change_attr_locked(), where we might need to demote L1
pages in the DMAP.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45628
Don't pass writes of the command register through to the
physical device. These registers do not need to be in sync, and in
some cases (e.g. when the guest is sizing the BAR and temporarily
disables decoding), the states need to diverge.
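One way to picture the separation is a shadow copy of the register, sketched
below. struct pt_dev and the two accessors are hypothetical and are not the
bhyve passthrough code.

```c
#include <stdint.h>

/* Hypothetical passthrough device state for this sketch. */
struct pt_dev {
	uint16_t virt_cmd;	/* value the guest sees */
	uint16_t phys_cmd;	/* value programmed into the hardware */
};

/* Guest write: update only the virtual copy; the hardware is untouched. */
static void
guest_write_cmd(struct pt_dev *d, uint16_t val)
{
	d->virt_cmd = val;
}

/* Guest read: served from the virtual copy. */
static uint16_t
guest_read_cmd(const struct pt_dev *d)
{
	return (d->virt_cmd);
}
```

With this split, a guest that clears the decode bits while sizing a BAR
changes only its own view, and the physical device keeps decoding.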
PR: 205549
Reviewed by: corvink
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D46179
This fixes a clang 19 warning:
crypto/heimdal/lib/krb5/deprecated.c:75:17: error: comparison of different enumeration types ('krb5_keytype' (aka 'enum ENCTYPE') and 'enum krb5_keytype_old') [-Werror,-Wenum-compare]
75 | if (keytype != KEYTYPE_DES || context->etypes_des == NULL)
| ~~~~~~~ ^ ~~~~~~~~~~~
In https://github.com/heimdal/heimdal/commit/3bebbe5323 this was solved
by adding a cast. That commit is rather large, so I'm only applying the
one-liner here.
MFC after: 3 days
Linux provides SLAB_RECLAIM_ACCOUNT and __GFP_RECLAIMABLE flags to
mark memory allocations that can be freed via shrinker calls. This
should allow the kernel to tune and group such allocations for lower
memory fragmentation and better reclamation under pressure.
This patch marks as reclaimable most of ARC memory, directly
evictable via ZFS shrinker, plus also dnode/znode/sa memory,
indirectly evictable via kernel's superblock shrinker.
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Pass VM_NOFREE_IMPORT_ORDER to vm_phys_alloc_pages instead of
VM_LEVEL_0_ORDER when allocating a higher-order page for
the NOFREE page allocator.
Reported by: alc
Fixes: a8693e8
This mirrors ppoll's visibility in sys/poll.h and fixes a build issue
with some _POSIX_C_SOURCE requests due to missing the sigset_t typedef.
Reported by: eduardo
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
This patch modifies pmap_growkernel in all pmaps to use VM_ALLOC_NOFREE
when allocating new pagetable pages. This should help reduce long-term
fragmentation as these pages are never released after
they are allocated.
Differential Revision: https://reviews.freebsd.org/D45998
Reviewed by: alc, markj, kib, mhorne
Tested by: alc
This patch adds a new KVA arena for separating M_NEVERFREED allocations.
Separating KVAs for pages that are never freed should facilitate
superpage promotion in the kernel.
Differential Revision: https://reviews.freebsd.org/D45997
Reviewed by: alc, kib, markj
Tested by: alc