mirror of https://git.FreeBSD.org/src.git synced 2024-12-01 08:27:59 +00:00
9077 Commits

Author SHA1 Message Date
Mitchell Horne
a89262079e Consistently provide ffs/fls using builtins
Use of compiler builtin ffs/ctz functions will result in optimized
instruction sequences when possible, and fall back to calling a function
provided by the compiler run-time library. We have slowly shifted our
platforms to take advantage of these builtins in 60645781d6 (arm64),
1c76d3a9fb (arm), 9e319462a0 (powerpc, partial).

Some platforms still rely on the libkern implementations of these
functions, namely riscv, powerpc (ffs*, flsll), and
i386 (ffsll and flsll). These routines are slow, as they perform a
linear search for the bit in question. Even on platforms lacking
dedicated bit-search instructions, such as riscv, the compiler library
will provide better-optimized routines, e.g. by using binary search.

Consolidate all definitions of these functions (whether currently using
builtins or not) to libkern.h. This should result in equivalent or
better performing routines in all cases.

One wart in all of this is the existing HAVE_INLINE_F*** macros, which
we use in a few places to conditionally avoid the slow libkern routines.
These aren't easily removed in one commit. For now, provide these
defines unconditionally, but marked for removal after subsequent
cleanup.

Removal of the now unused libkern routines will follow in the next
commit.

Reviewed by:	dougm, imp (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D40698
2023-07-06 14:46:41 -03:00
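The builtin-based approach described in a89262079e can be sketched roughly as follows. This is a minimal userland illustration, not the contents of libkern.h: the sketch_* names are hypothetical, and it assumes a GCC- or Clang-compatible compiler that provides __builtin_ffs, __builtin_clz and __builtin_clzll. A linear search is included only for comparison with the slow routines the message mentions.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical names; the real kernel definitions live in libkern.h. */

/* Find first set bit, 1-based; returns 0 when mask is 0. */
static inline int
sketch_ffs(int mask)
{
	return (__builtin_ffs(mask));
}

/* Find last set bit, 1-based; returns 0 when mask is 0. */
static inline int
sketch_fls(int mask)
{
	/* __builtin_clz(0) is undefined, so handle zero explicitly. */
	if (mask == 0)
		return (0);
	return ((int)(sizeof(mask) * CHAR_BIT) -
	    __builtin_clz((unsigned int)mask));
}

/* Same as sketch_fls(), for long long. */
static inline int
sketch_flsll(long long mask)
{
	if (mask == 0)
		return (0);
	return ((int)(sizeof(mask) * CHAR_BIT) -
	    __builtin_clzll((unsigned long long)mask));
}

/* Linear search, shown only to contrast with the builtin version. */
static int
sketch_ffs_linear(int mask)
{
	unsigned int m = (unsigned int)mask;
	int bit;

	if (m == 0)
		return (0);
	for (bit = 1; (m & 1) == 0; bit++)
		m >>= 1;
	return (bit);
}

/* Both versions must agree on the result; only the cost differs. */
static void
sketch_ffs_check(int mask)
{
	assert(sketch_ffs(mask) == sketch_ffs_linear(mask));
}
```

On targets with a bit-scan instruction the builtins typically compile to a few instructions; elsewhere the compiler falls back to its run-time library.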
Mark O'Donovan
b0d3d44dfe qlnxe: add driver to amd64 NOTES
Signed-off-by: Mark O'Donovan <shiftee@posteo.net>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/779
2023-07-01 11:06:59 -06:00
Alan Cox
0d2f98c2f0 amd64 pmap: Tidy up pmap_promote_pde() calls
Since pmap_ps_enabled() is true by default, check it inside of
pmap_promote_pde() instead of at every call site.

Modify pmap_promote_pde() to return true if the promotion succeeded and
false otherwise.  Use this return value in a couple places.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D40744
2023-06-24 13:09:04 -05:00
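The refactoring pattern in 0d2f98c2f0 (move the enable check into the callee and return a success flag) can be illustrated with a toy model. Everything below is hypothetical: toy_ps_enabled stands in for pmap_ps_enabled(), toy_promote() for pmap_promote_pde(), and the region structure has nothing to do with the real pmap data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are the real pmap structures. */
static bool toy_ps_enabled = true;	/* stands in for pmap_ps_enabled() */

struct toy_region {
	int	mapped_pages;	/* small pages currently mapped */
	int	pages_needed;	/* small pages in a full superpage */
	bool	promoted;
};

/* Returns true if the promotion succeeded and false otherwise. */
static bool
toy_promote(struct toy_region *r)
{
	/* The enable check lives here, not at every call site. */
	if (!toy_ps_enabled)
		return (false);
	if (r->mapped_pages != r->pages_needed)
		return (false);
	r->promoted = true;
	return (true);
}

/* A call site: no enable check, and the return value drives a counter. */
static int
toy_enter(struct toy_region *r, int *promotions)
{
	assert(r->mapped_pages < r->pages_needed);
	r->mapped_pages++;
	if (toy_promote(r))
		(*promotions)++;
	return (r->mapped_pages);
}
```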
Alan Cox
34eeabff5a amd64/arm64 pmap: Stop requiring the accessed bit for superpage promotion
Stop requiring all of the PTEs to have the accessed bit set for superpage
promotion to occur.  Given that change, add support for promotion to
pmap_enter_quick(), which does not set the accessed bit in the PTE that
it creates.

Since the final mapping within a superpage-aligned and sized region of a
memory-mapped file is typically created by a call to pmap_enter_quick(),
we now achieve promotions in circumstances where they did not occur
before, for example, the X server's read-only mapping of libLLVM-15.so.

See also https://www.usenix.org/system/files/atc20-zhu-weixi_0.pdf

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D40478
2023-06-12 13:40:57 -05:00
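The eligibility change in 34eeabff5a can be pictured with a simplified check over 512 page table entries. This is a sketch under stated assumptions, not the pmap code: the TOY_* bit layout is invented, only validity, alignment and physical contiguity are tested, and the rule that the resulting superpage carries the accessed bit only when every PTE had it is an assumption of this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NPTE	512			/* 4KB PTEs per 2MB superpage */
#define TOY_V		UINT64_C(0x1)		/* valid (invented layout) */
#define TOY_A		UINT64_C(0x2)		/* accessed (invented layout) */
#define TOY_FRAME(pte)	((pte) >> 12)

/*
 * Decide whether TOY_NPTE entries could be replaced by one superpage
 * entry.  With require_accessed set, any PTE lacking TOY_A blocks the
 * promotion (the old rule).  Without it, promotion proceeds and
 * *accessed reports whether every PTE had TOY_A set.
 */
static bool
toy_can_promote(const uint64_t pte[TOY_NPTE], bool require_accessed,
    bool *accessed)
{
	bool all_a = true;
	int i;

	assert(pte && accessed);
	/* The first frame must be superpage-aligned. */
	if ((pte[0] & TOY_V) == 0 || TOY_FRAME(pte[0]) % TOY_NPTE != 0)
		return (false);
	for (i = 0; i < TOY_NPTE; i++) {
		if ((pte[i] & TOY_V) == 0)
			return (false);
		/* Frames must be physically contiguous. */
		if (TOY_FRAME(pte[i]) != TOY_FRAME(pte[0]) + (uint64_t)i)
			return (false);
		if ((pte[i] & TOY_A) == 0) {
			if (require_accessed)
				return (false);
			all_a = false;
		}
	}
	*accessed = all_a;
	return (true);
}
```

A final mapping created without the accessed bit, as pmap_enter_quick() does, fails the old rule but passes the new one.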
Warner Losh
9121945d70 Regenerate sysent stuff after $FreeBSD$ removal
Sponsored by:		Netflix
2023-06-09 07:28:27 -06:00
Dmitry Chagin
cbbac56091 linux(4): Preserve fpu xsave state across signal delivery on amd64
PR:			270247
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D40444
MFC after:		2 weeks
2023-06-09 01:33:26 +03:00
Dmitry Chagin
920184ed6e linux(4): In preparation for xsave, refactor fxsave code on amd64
Because the fxsave area is OS-independent, reimplement the handmade
fxsave code by copying the whole area.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D40443
MFC after:		2 weeks
2023-06-09 01:32:46 +03:00
Dmitry Chagin
84617f6fcc linux(4) rt_sendsig: Remove the use of caddr_t
Replace caddr_t by more appropriate char *.

MFC after:		2 weeks
2023-06-06 23:01:39 +03:00
Colin Percival
9d6ae1e3c2 Revert "Revert "tslog: Annotate some early boot functions""
Now that <sys/tslog.h> is wrapped in #ifdef _KERNEL, it's safe to have
tslog annotations in files which might be built from userland (i.e. in
subr_boot.c, which is built as part of the boot loader).

This reverts commit 59588a546f.
2023-06-04 22:49:38 -07:00
Xin LI
4d779448ad gve: Fix build on i386 and enable LINT builds.
Reviewed-by:	imp
Differential Revision: https://reviews.freebsd.org/D40419
2023-06-04 16:35:00 -07:00
Colin Percival
59588a546f Revert "tslog: Annotate some early boot functions"
The change to subr_boot.c broke the libsa build because the TSLOG
macros have their own definitions for the boot loader -- I didn't
realize that the loader code used subr_boot.c.

I'm currently testing a fix and I'll revert this revert once I'm
satisfied that everything works, but I don't want to leave the
tree broken for too long.

This reverts commit 469cfa3c30.
2023-06-04 11:39:45 -07:00
Colin Percival
45cc8519f5 tslog: Annotate parts of SYSINIT cpu
Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM,
SYSINIT cpu takes roughly 2770 us:
* 2280 us in vm_ksubmap_init
  * 535 us in kmem_malloc
    * 450 us in pmap_zero_page
  * 1720 us in pmap_growkernel
    * 1620 us in pmap_zero_page
* 80 us in bufinit
* 480 us in cpu_setregs
  * 430 us in cpu_setregs calling load_cr0

Much of this is hypervisor overhead: load_cr0 is slow because it traps
to the hypervisor, and 99% of the time in pmap_zero_page is spent when
we first touch the page, presumably due to the host Linux kernel
faulting in backing pages one by one.

Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D40327
2023-06-04 10:16:35 -07:00
Colin Percival
2404380aac tslog: Optionally instrument pmap_zero_page
Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM,
pmap_zero_page is responsible for 4.6 ms of the 25.0 ms of boot time.
This is not in fact time spent zeroing pages though; almost all of
that time is spent in a first-touch penalty, presumably due to the
host Linux kernel faulting in backing pages one by one.

There's probably a way to improve that by teaching Firecracker to
fault in all the VM's pages from the start rather than having them
faulted in one at a time, but that's outside of FreeBSD's control.

This commit adds a TSLOG_PAGEZERO option which enables TSLOG on the
amd64 pmap_zero_page function; it's a separate option (turned off
by default even if TSLOG is enabled) since zeroing pages happens
enough that it can easily fill the TSLOG buffer and prevent other
timing information from being recorded.

Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D40326
2023-06-04 10:16:31 -07:00
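The reason 2404380aac uses a separate option can be shown with a toy trace buffer. This is only a sketch: TOY_TSLOG, TOY_TSLOG_PAGEZERO and the toy_* functions are hypothetical stand-ins, the buffer size is arbitrary, and the real TSLOG macros and kernel option handling are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_TSLOG		1	/* main tracing option: on */
/* #define TOY_TSLOG_PAGEZERO	1 */	/* separate option: off by default */

#define TOY_TSLOG_BUFSZ	16		/* arbitrary small trace buffer */

static int toy_tslog_records;		/* entries stored in the buffer */
static int toy_tslog_dropped;		/* entries lost once it is full */

static void
toy_tslog_record(void)
{
	if (toy_tslog_records < TOY_TSLOG_BUFSZ)
		toy_tslog_records++;
	else
		toy_tslog_dropped++;
}

#ifdef TOY_TSLOG
#define TOY_TSENTER()	toy_tslog_record()
#define TOY_TSEXIT()	toy_tslog_record()
#else
#define TOY_TSENTER()	((void)0)
#define TOY_TSEXIT()	((void)0)
#endif

/* Page zeroing is traced only when both options are defined. */
#if defined(TOY_TSLOG) && defined(TOY_TSLOG_PAGEZERO)
#define TOY_PZ_TSENTER()	TOY_TSENTER()
#define TOY_PZ_TSEXIT()		TOY_TSEXIT()
#else
#define TOY_PZ_TSENTER()	((void)0)
#define TOY_PZ_TSEXIT()		((void)0)
#endif

/* Called very often; tracing it unconditionally would flood the buffer. */
static void
toy_zero_page(void *page, size_t len)
{
	assert(page != NULL);
	TOY_PZ_TSENTER();
	memset(page, 0, len);
	TOY_PZ_TSEXIT();
}

/* Called rarely; always traced when TOY_TSLOG is defined. */
static void
toy_boot_step(void)
{
	TOY_TSENTER();
	TOY_TSEXIT();
}
```

With the second option left undefined, a thousand calls to toy_zero_page() leave the buffer free for the records written by toy_boot_step().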
Colin Percival
469cfa3c30 tslog: Annotate some early boot functions
Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM,
hammer_time takes roughly 2740 us:
* 55 us in xen_pvh_parse_preload_data
  * 20 us in boot_parse_cmdline_delim
  * 20 us in boot_env_to_howto
* 15 us in identify_hypervisor
* 1320 us in link_elf_reloc
  * 1310 us in relocate_file1 handling ef->rela
* 25 us in init_param1
* 30 us in dpcpu_init
* 355 us in initializecpu
  * 255 us in initializecpu calling load_cr4
* 425 us in getmemsize
  * 280 us in pmap_bootstrap
    * 205 us in create_pagetables
* 10 us in init_param2
* 25 us in pci_early_quirks
* 60 us in cninit
* 90 us in kdb_init
* 105 us in msgbufinit
* 20 us in fpuinit
* 205 us elsewhere in hammer_time

Some of these are unavoidable (e.g. identify_hypervisor uses CPUID and
load_cr4 loads the CR4 register, both of which trap to the hypervisor)
but others may deserve attention.

Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D40325
2023-06-04 10:16:22 -07:00
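As a quick check on the breakdown in 469cfa3c30, the top-level entries can be summed to confirm that they account for the stated total of roughly 2740 us. The figures come from the commit message; the function and array names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Top-level hammer_time entries from the commit message, in microseconds. */
static const int toy_hammer_time_us[] = {
	55,	/* xen_pvh_parse_preload_data */
	15,	/* identify_hypervisor */
	1320,	/* link_elf_reloc */
	25,	/* init_param1 */
	30,	/* dpcpu_init */
	355,	/* initializecpu */
	425,	/* getmemsize */
	10,	/* init_param2 */
	25,	/* pci_early_quirks */
	60,	/* cninit */
	90,	/* kdb_init */
	105,	/* msgbufinit */
	20,	/* fpuinit */
	205,	/* elsewhere in hammer_time */
};

/* Sum n entries. */
static int
toy_total_us(const int *us, size_t n)
{
	int total = 0;
	size_t i;

	assert(us != NULL);
	for (i = 0; i < n; i++)
		total += us[i];
	return (total);
}

static int
toy_hammer_time_total(void)
{
	return (toy_total_us(toy_hammer_time_us,
	    sizeof(toy_hammer_time_us) / sizeof(toy_hammer_time_us[0])));
}

/* Integer percentage of part within whole, rounded down. */
static int
toy_percent(int part, int whole)
{
	assert(whole > 0);
	return (part * 100 / whole);
}
```

By this tally, link_elf_reloc alone accounts for a little under half of hammer_time.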
Mark Johnston
18282c4772 sysarch: Add includes required for ktrcapfail() calls to be compiled
Reported by:	jfree
MFC after:	1 week
2023-06-01 17:18:23 -04:00
Mateusz Guzik
6217c2473d amd64: zero-pad register dumps on panic
de gustibus and so on

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-05-30 13:15:56 +00:00
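Zero-padding, as in 6217c2473d, amounts to a format-string change. The sketch below contrasts the two styles in userland; the helper names and the "name: value" layout are assumptions made for illustration, not the kernel's actual panic output.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Unpadded: the width varies with the value, so columns do not line up. */
static int
toy_format_reg_plain(char *buf, size_t len, const char *name, uint64_t val)
{
	assert(buf != NULL && name != NULL);
	return (snprintf(buf, len, "%s: %" PRIx64, name, val));
}

/* Zero-padded to 16 hex digits, the full width of a 64-bit register. */
static int
toy_format_reg_padded(char *buf, size_t len, const char *name, uint64_t val)
{
	assert(buf != NULL && name != NULL);
	return (snprintf(buf, len, "%s: %016" PRIx64, name, val));
}
```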
Dmitry Chagin
eb98f77910 linux(4): Regen for linux_execve
MFC after:		2 month
2023-05-29 12:18:30 +03:00
Dmitry Chagin
8340b03425 linux(4): Add a dedicated linux_exec_copyin_args()
Linux allows executing binaries with an argc of 0.

Reviewed by:		brooks
Differential Revision:	https://reviews.freebsd.org/D40148
MFC after:		2 month
2023-05-29 12:18:16 +03:00
Dmitry Chagin
d706d02edb sysentvec: Retire sv_imgact_try as no longer needed
The sysentvec sv_imgact_try was used by kern_exec() to allow a
non-native ABI to fix up the shell path according to the ABI root
directory. Since the non-native ABI can now specify its root directory
directly to namei() via a pwd_altroot() call, this facility is no longer
needed.

Differential Revision:	https://reviews.freebsd.org/D40092
MFC after:		2 month
2023-05-29 11:18:11 +03:00
Dmitry Chagin
57578deac7 Brandinfo: Retire emul_path as no longer needed
The Brandinfo emul_path was used by the ELF image activator to fix up
the interpreter file name according to the ABI root directory. Since the
non-native ABI can now specify its root directory directly to namei()
via a pwd_altroot() call, this facility is no longer needed.

Differential Revision:	https://reviews.freebsd.org/D40091
MFC after:		2 month
2023-05-29 11:17:28 +03:00
Dmitry Chagin
fd745e1db6 linux(4): Use pwd_altroot() to tell namei() about ABI root path
PR:			72920
Differential Revision:	https://reviews.freebsd.org/D40090
MFC after:		2 month
2023-05-29 11:16:46 +03:00
Dmitry Chagin
78c2e58fa5 linux(4): Fix stack unwinding across signal frame on x86_64
Stop using register numbers that are undefined in libunwind
on x86_64.

Differential Revision:	https://reviews.freebsd.org/D40156
MFC after:		1 month
2023-05-28 17:07:28 +03:00
Dmitry Chagin
037b60fb0f linux(4): Preserve %rcx (return address) like Linux does
Perhaps this does not make much sense, as destroying %rcx is declared by
the x86_64 Linux syscall ABI. However:
a) if we get a signal while we are in the kernel, we should restore
   tf_rcx when preparing the machine context for signal handlers.
b) the Linux world is strange; someone can depend on the %rcx value
   after a syscall, Go for example.

Differential Revision:	https://reviews.freebsd.org/D40155
MFC after:		1 month
2023-05-28 17:06:47 +03:00
Dmitry Chagin
185bd9fa30 linux(4): Simplify %r10 restoring on amd64
Restore %r10 at system call entry to avoid doing this multiple times.

Differential Revision:	https://reviews.freebsd.org/D40154
MFC after:		1 month
2023-05-28 17:06:23 +03:00
Dmitry Chagin
a463dd8108 linux(4): Add a comment explaining registers at syscall entry point on amd64
Differential Revision:	https://reviews.freebsd.org/D40153
MFC after:		1 month
2023-05-28 17:06:05 +03:00
Dmitry Chagin
a99b890ecd linux(4): Drop a weird comment from linux_set_syscall_retval on amd64
I agree, it would be great to avoid PCB_FULL_IRET, however we should
follow Linux system call ABI.

Reviewed by:		emaste
Differential Revision:	https://reviews.freebsd.org/D40152
MFC after:		1 month
2023-05-28 17:05:44 +03:00
Mark Johnston
9fb6718d1b smp: Dynamically allocate the stoppcbs array
This avoids bloating the kernel image when MAXCPU is large.

A follow-up patch for kgdb and other kernel debuggers is needed since
the stoppcbs symbol is now a pointer.  Bump __FreeBSD_version so that
debuggers can use osreldate to figure out how to handle stoppcbs.

PR:		269572
MFC after:	never
Reviewed by:	mjg, emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39806
2023-05-25 18:09:55 -04:00
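The size trade-off behind 9fb6718d1b can be sketched in userland. All names here are hypothetical: toy_pcb stands in for the saved-register structure, TOY_MAXCPU for MAXCPU, and calloc() for the kernel allocator. The point is only that a fixed array costs space for every possible CPU, while a pointer filled in at boot costs space only for the CPUs actually present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAXCPU	1024		/* hypothetical compile-time limit */

struct toy_pcb {
	unsigned long	regs[32];	/* placeholder for saved registers */
};

/*
 * Before: a fixed array sized for the worst case, stored in the image:
 *	static struct toy_pcb toy_stoppcbs[TOY_MAXCPU];
 * After: a pointer, sized once the CPU count is known.
 */
static struct toy_pcb *toy_stoppcbs;

/* Allocate one zeroed entry per CPU actually present. */
static int
toy_stoppcbs_init(unsigned int ncpus)
{
	assert(ncpus >= 1 && ncpus <= TOY_MAXCPU);
	toy_stoppcbs = calloc(ncpus, sizeof(*toy_stoppcbs));
	return (toy_stoppcbs != NULL ? 0 : -1);
}

/* Bytes the fixed array would occupy. */
static size_t
toy_static_bytes(void)
{
	return (sizeof(struct toy_pcb) * TOY_MAXCPU);
}

/* Bytes the dynamic allocation occupies for ncpus CPUs. */
static size_t
toy_dynamic_bytes(unsigned int ncpus)
{
	return (sizeof(struct toy_pcb) * ncpus);
}
```

Because the symbol changes from an array to a pointer, a tool that reads it from a core file has to dereference it first, which is why the message mentions debuggers.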
Mark Johnston
e17eca3276 vmm: Avoid embedding cpuset_t ioctl ABIs
Commit 0bda8d3e9f ("vmm: permit some IPIs to be handled by userspace")
embedded cpuset_t into the vmm(4) ioctl ABI.  This was a mistake since
we otherwise have some leeway to change the cpuset_t for the whole
system, but we want to keep the vmm ioctl ABI stable.

Rework IPI reporting to avoid this problem.  Along the way, make VM_RUN
a bit more efficient:
- Split vmexit metadata out of the main VM_RUN structure.  This data is
  only written by the kernel.
- Have userspace pass a cpuset_t pointer and cpusetsize in the VM_RUN
  structure, as is done for cpuset syscalls.
- Have the destination CPU mask for VM_EXITCODE_IPIs live outside the
  vmexit info structure, and make VM_RUN copy it out separately.  Zero
  out any extra bytes in the CPU mask, like cpuset syscalls do.
- Modify the vmexit handler prototype to take a full VM_RUN structure.

PR:		271330
Reviewed by:	corvink, jhb (previous versions)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D40113
2023-05-23 21:15:59 -04:00
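The pointer-plus-size convention that e17eca3276 adopts for the CPU mask can be sketched as a small copy helper. This is an illustration with hypothetical names, using memcpy() in place of a real kernel copyout; rejecting a too-small buffer with ERANGE is an assumption borrowed from how cpuset_getaffinity(2) is documented, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a kernel-side CPU mask of ksize bytes into a caller-supplied
 * buffer of usize bytes.  The caller's idea of the mask size may be
 * larger than the kernel's, so any extra bytes are zeroed.
 */
static int
toy_copyout_mask(void *ubuf, size_t usize, const void *kmask, size_t ksize)
{
	assert(ubuf != NULL && kmask != NULL);
	if (usize < ksize)
		return (ERANGE);	/* assumption: mirror cpuset syscalls */
	memcpy(ubuf, kmask, ksize);
	memset((char *)ubuf + ksize, 0, usize - ksize);
	return (0);
}
```

Since the structure passed through the ioctl holds only a pointer and a length, its layout does not change if the system-wide mask type later grows.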
Dmitry Chagin
1d76741520 linux(4): Implement ptrace_pokeusr for x86_64
Differential Revision:	https://reviews.freebsd.org/D40097
MFC after:		1 week
2023-05-18 20:02:35 +03:00
Dmitry Chagin
3d0addcd35 linux(4): Make ptrace_pokeusr machine dependent
Differential Revision:	https://reviews.freebsd.org/D40096
MFC after:		1 week
2023-05-18 20:01:12 +03:00
Dmitry Chagin
dd2a6cd701 linux(4): Make ptrace_peekusr machine dependent
And partially implement it for x86_64.

Differential Revision:	https://reviews.freebsd.org/D40095
MFC after:		1 week
2023-05-18 20:00:12 +03:00
Piotr Pawel Stefaniak
411942a70e GENERIC: remove a stray space character
2023-05-13 21:31:49 +02:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Bjoern A. Zeeb
721b44ba5f amd64: pmap.h put a guard around a pcpu.h function
pmap_get_pcid() calls zpcpu_get() which is defined in pcpu.h.
It is unclear why we do not include that header, but, as is done right
above the change, add another guard around pmap_get_pcid().
This allows some LinuxKPI headers to compile again.

Suggested by:	markj
MFC after:	10 days
2023-05-12 11:14:54 +00:00
Warner Losh
062a7b918f twe: Remove driver
Sponsored by:		Netflix
2023-05-10 22:24:12 -06:00
Konstantin Belousov
bf864c3ed5 amd64 MINIMAL: SysV IPC syscalls are loadable
Reviewed by:	emaste, imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39990
2023-05-09 18:30:07 +03:00
Konstantin Belousov
0c1c5e36eb amd64 MINIMAL: remove UFS from compiled-in list
Reviewed by:	emaste, imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39990
2023-05-09 18:30:07 +03:00
Konstantin Belousov
bba6249ae9 amd64 MINIMAL config: remove statements about UFS module
All UFS options work for ufs.ko.

Reviewed by:	emaste, imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39990
2023-05-09 18:30:07 +03:00
Vitaliy Gusev
c543e09f1f bhyve: save/restore pir_desc
Failing to preserve pir_desc can result in pending interrupts being lost
on resume leading to a hung VM.

Reviewed by:		corvink, jhb
MFC after:		1 week
Sponsored by:		vStack
Differential Revision:	https://reviews.freebsd.org/D35447
2023-05-09 10:31:27 +02:00
Bojan Novković
fefac54359 bhyve: fix vCPU single-stepping on VMX
This patch fixes virtual machine single stepping on VMX hosts.

Currently, when using bhyve's gdb stub, each attempt at single-stepping
a vCPU lands in a timer interrupt. The current single-stepping mechanism
uses the Monitor Trap Flag feature to cause VMEXIT after a single
instruction is executed. Unfortunately, the SDM states that MTF causes
VMEXITs for the next instruction that gets executed, which is often not
what the person using the debugger expects. [1]

This patch adds a new VM capability that masks interrupts on a vCPU by
blocking interrupt injection and modifies the gdb stub to use the newly
added capability while single-stepping a vCPU.

[1] Intel SDM 26.5.2 Vol. 3C

Reviewed by:		corvink, jhb
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D39949
2023-05-09 10:04:55 +02:00
Mitchell Horne
aba91805aa hwpmc: use kstack_contains()
This existing helper function is preferable to the hand-rolled
calculation of the kstack bounds.

Make some small style improvements while here. Notably, rename every
instance of "r", the return address, to "ra". Tidy the includes in the
affected files.

Reviewed by:	jkoshy
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39909
2023-05-06 14:49:19 -03:00
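The kind of bounds check that aba91805aa delegates to kstack_contains() can be sketched as below. This is a userland approximation with a hypothetical toy_thread type; it is not the kernel helper itself, only an example of the range test such a helper has to perform, including the guard against address wrap-around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the thread fields that describe its stack. */
struct toy_thread {
	uintptr_t	kstack;		/* lowest address of the stack */
	size_t		kstack_size;	/* size of the stack in bytes */
};

/*
 * Return true if [va, va + len) lies entirely within the thread's
 * stack.  The middle test rejects ranges whose end wraps around.
 */
static bool
toy_kstack_contains(const struct toy_thread *td, uintptr_t va, size_t len)
{
	assert(td != NULL);
	return (va >= td->kstack && va + len >= va &&
	    va + len <= td->kstack + td->kstack_size);
}
```

A stack unwinder would run such a check on each frame pointer before dereferencing it, stopping the walk at the first address that falls outside the stack.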
Konstantin Belousov
38843fe0f2 amd64: add MINIMALUP config
This is the MINIMAL config with SMP/NUMA options turned off.
Useful to ensure that the UP configuration still builds, until it is
finally removed.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-05-06 14:24:07 +03:00
Konstantin Belousov
3a8c69c1ff amd64 MINIMAL config: remove sentence about acpi
On amd64 ACPI is required to boot, it cannot work as a module, and we
have not built the ACPI module for a long time.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-05-06 14:24:07 +03:00
Konstantin Belousov
7c8e66ed8d amd64: convert UP code to dynamically allocated pmap->pm_pcid
Reported by:	peterj
Sponsored by:	The FreeBSD Foundation
2023-05-06 14:24:07 +03:00
Corvin Köhne
b10e100d16 vmm: don't free unallocated memory
If vmx or svm is disabled in BIOS or the device isn't supported by vmm,
modinit won't allocate these state save areas. As kmem_free panics when
a NULL pointer is passed to it, loading the vmm kernel module causes a
panic too.

PR:			271251
Reviewed by:		markj
Fixes:			74ac712f72 ("vmm: Dynamically allocate a couple of per-CPU state save areas")
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D39974
2023-05-05 15:34:00 +02:00
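The failure mode fixed by b10e100d16 can be modelled in a few lines. The names are hypothetical: toy_free_strict() imitates a free routine that must never see NULL (here via assert() rather than a kernel panic), and toy_modinit() imitates an initialization path that can fail before anything has been allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static void *toy_save_area;	/* stays NULL if init bails out early */
static int toy_frees;		/* number of real frees performed */

/* Imitates a free routine that treats NULL as a fatal error. */
static void
toy_free_strict(void *p)
{
	assert(p != NULL);
	free(p);
	toy_frees++;
}

/* Allocation happens only when the hardware feature is usable. */
static int
toy_modinit(bool hw_enabled)
{
	if (!hw_enabled)
		return (-1);
	toy_save_area = malloc(4096);
	return (toy_save_area != NULL ? 0 : -1);
}

/* Cleanup runs on every path, so it must tolerate a missing area. */
static void
toy_cleanup(void)
{
	if (toy_save_area != NULL) {
		toy_free_strict(toy_save_area);
		toy_save_area = NULL;
	}
}
```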
Igor Ostapenko
0167b5a793 sys/amd64/conf/FIRECRACKER: typo (compatiblity)
https://bugs.freebsd.org/269753

PR:                      269753
Reported by:             Igor Ostapenko
Approved by:             doc, src (delphij, imp, zlei)
Differential revision:   https://reviews.freebsd.org/D38741
2023-05-05 01:23:08 +01:00
John Baldwin
4961faaacc pmap_{un}map_io_transient: Use bool instead of boolean_t.
Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D39920
2023-05-04 12:29:48 -07:00
John Baldwin
407f675718 imgact_elf: Change header_supported to return bool instead of boolean_t.
Reviewed by:	imp, kib, emaste
Differential Revision:	https://reviews.freebsd.org/D39919
2023-05-04 12:29:29 -07:00
Konstantin Belousov
3582acbad3 amd64 mp_machdep.c: remove useless comment
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D39945
2023-05-04 18:39:22 +03:00
Konstantin Belousov
af1c6d3f30 amd64: do not leak pcpu pages
Do not preallocate pcpu area backing pages on early startup, only
allocate enough KVA for pcpu[MAXCPU] and the page for the BSP.  Other
pages are allocated after we know the number of cpus and their
assignments to the domains.

PCPUs are not accessed until they are initialized, which happens on AP
startup.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D39945
2023-05-04 18:39:22 +03:00