We have been incorrectly choosing the "hlt" idle method on modern AMD
EPYC servers for C1 idle. This is because AMD also uses the Functional
Fixed Hardware interface. Due to not parsing the table properly for
AMD, and due to a weird quirk where the mwait latency for C1 is
mis-interpreted as the latency for hlt, we wind up choosing hlt for
c1, which has a far higher wake up latency (similar to IO) of roughly
400us on my test system (AMD 7502P).
This patch fixes this by:
- Looking for AMD in addition to Intel in the FFH
(Note the vendor id of "2" for AMD is not publically documented, but
AMD has confirmed they are using "2" and has promised to document it.)
- Using mwait on AMD when specified in the table, and when CPUid says
its supported
- Fixing a weird issue where we copy the contents of cx_ptr for C1 and
when moving to C2, we do not reinitialize cx_ptr. This leads to
mwait being selected, and ignoring the specified i/o halt method
unless we clear mwait before looking at the table for C2.
Differential Revision: https://reviews.freebsd.org/D47444
Reviewed by: dab, kib, vangyzen
Sponsored by: Netflix
which returns apic device_t by apic_id, if there exists the pci representer
Sponsored by: Advanced Micro Devices (AMD)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
With the new 32-bit UEFI loader, it's convenient to have a sysctl to
figure out how we booted. Can be accessed at machdep.efi_arch
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1098
Added (de)registration functions for memory controller driver to be
notified when ECC errors occur. This allows for decoding of the
specific error by the driver.
Submitted by: Lakshman Likith Nudurupati <lnlakshman@juniper.net>
Sponsored by: Juniper Networks, Inc.
Obtained from: Semihalf
Xen PVH entry point requires to modify the environment provided by the boot
loader, so that the ACPI RSDP is re-written to use the Xen generated RSDP
instead of the native one.
The current logic in the PVH entry point reserves a single page (4K) in order
to copy the contents of the environment passed from the boot loader, so that
the bootloader provided "acpi.rsdp" is dropped and a Xen specific one is added
afterwards.
This however doesn't scale well, as it's possible for the environment to be
bigger than 4K. Bumping the buffer, or attempting to peek at the size of the
metadata all seem to just add more complexity to a sensitive path. Instead
introduce a new ACPI hook that allows setting the RSDP address directly, and
use it from the PVH entry point to set the position of the Xen generated RSDP.
This allows to reduce the logic in the PVH metadata processing, as there's no
need to parse and filter the bootloader provided environment.
Note that modifying the environment blob in-place is likely to not work. The
RSDP address is provided as a string, it's possible the new RSDP location is
higher than the current one, and the string with the new location would overrun
the space used by the previous one.
Sponsored by: Cloud Software Group
PR: 277200
MFC: 3 days
Reviewed by: markj kib
Differential revision: https://reviews.freebsd.org/D46089
As of 9e6544dd6e contigfree(9) is no longer
needed and should not be used anymore. We leave a wrapper for 3rd party
code in at least 15.x but remove (almost) all other cases from the tree.
This leaves one use of contigfree(9) untouched; that was the original
trigger for 9e6544dd6e and is handled in D45813 (to be committed
seperately later).
Sponsored by: The FreeBSD Foundation
Reviewed by: markj, kib
Tested by: pho (10h stress test run)
Differential Revision: https://reviews.freebsd.org/D46099
All architectures enable NEW_PCIB in DEFAULTS (arm being the most recent
to do so in 121be55599 (arm: Set NEW_PCIB in DEFAULTS rather than a
subset of kernel configs")), so it's time we removed the legacy code
that no longer sees much testing and has a significant maintenance
burden.
Reviewed by: jhb, andrew, emaste
Differential Revision: https://reviews.freebsd.org/D32954
related to the page tables page allocation and mapping.
Sponsored by: The FreeBSD Foundation
Sponsored by: Advanced Micro Devices (AMD)
MFC after: 1 week
A test system provided by AMD panicked with "madt_parse_apics:
I/O APIC ID 255 too high". I/O APIC ID 255 is acceptable, so increase
the limit.
Reviewed by: jhb, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45157
Several x86 interrupt core functions were already operating on intsrc
structures. Now switch the remaining 3 to intsrc for consistency.
Swap the order of intr_add_handler()'s first two arguments. This
matches INTRNG and is more consistent with other functions in this
interface.
Differential Revision: https://reviews.freebsd.org/D35386
Reviewed by: imp, markj (previous version)
Pull Request: https://github.com/freebsd/freebsd-src/pull/1126
Convert existing FreeBSD vmware_hvcall function to take a channel
and parameter arguments.
Added vmware_guestrpc_cmd() to send GuestRPC commands to the VMware
hypervisor. The sbuf argument is used for both the command to send
and to store the data to return to the caller.
The following KPIs can be used to get and set FreeBSD-specific guest
information in key/value pairs:
* vmware_guestrpc_set_guestinfo
- set a value into the guestinfo.fbsd.<keyword> key
* vmware_guestrpc_get_guestinfo
- get the value stored in the guestinfo.fbsd.<keyword> key
Add VMware devices to x86 NOTES
Reviewed by: jhb
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44528
The public bus_release_resource() API still accepts both forms, but
the internal kobj method no longer passes the arguments.
Implementations which need the rid or type now use rman_get_rid() or
rman_get_type() to fetch the value from the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44131
The public bus_activate/deactivate_resource() API still accepts both
forms, but the internal kobj methods no longer pass the arguments.
Implementations which need the rid or type now use rman_get_rid() or
rman_get_type() to fetch the value from the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44130
The public bus_adjust_resource() API still accepts both forms, but the
internal kobj method no longer passes the argument. Implementations
which need the type now use rman_get_type() to fetch the value from
the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44128
Unify the HVM and PVH early setup, byt making both rely on the hypervisor
initialization hook part of identify_hypervisor().
The current initialization takes care of the hypercall page, the sahred info
page and does any fixup necessary to metadata video console information if
FreeBSD is booted as the initial domain (so the video console is handed from
Xen into FreeBSD).
Note this has the nice side effect of also allowing to use the Xen console on
HVM guests, which allows to get rid of the QEMU emulated uart and still get
a nice text console.
Sponsored by: Cloud Software Group
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D43764
Start by moving the hyeprcall setup to such function.
The aim is to have a function that does all the required Xen early
initialization for both HVM and PVH, instead of having it scattered across
different paths.
Sponsored by: Cloud Software Group
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D43932
Alas, ARM declared xen_ulong_t to be 64-bits long, unlike i386 where
it matches the word size. As a result, compatibility wrappers are
needed for Xen atomic operations.
Reviewed by: royger
After removing filter functionality, the naming doesn't clearly
represent what the function does, so try to address this. Include some
code clarity and style improvements.
Create a common version in subr_busdma_bounce.c, used by most
implementations. powerpc still needs its own version of the function,
due to its dmat->iommu == NULL check.
No functional change intended.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42896
Without filter functions, we do not need to keep track of tag ancestry.
All inheritance of the parent tag's parameters occurs when creating the
new child tag.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42895
Address filter functions are unused, unsupported, and now rejected.
Simplify some busdma code by removing filter functionality completely.
Note that the chains of parent tags become useless, and will be cleaned
up in the next commit.
No functional change intended.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42894
In particular, this enables support for PCI config access for domains
(segments) other than 0.
Reported by: cperciva
Tested by: cperciva (m7i.metal-48xl AWS instance)
Reviewed by: imp
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D42828
This commit changes the API of pci_cfgreg(read|write) to add a domain
argument (referred to as a segment in ACPI parlance) (note that this
is not the same as a NUMA domain, but something PCI-specific). This
does not yet enable access to domains other than 0, but updates the
API to support domains.
Places that use hard-coded bus/slot/function addresses have been
updated to hardcode a domain of 0. A few places that have the PCI
domain (segment) available such as the acpi_pcib_acpi.c Host-PCI
bridge driver pass the PCI domain.
The hpt27xx(4) and hptnr(4) drivers fail to attach to a device not on
domain 0 since they provide APIs to their binary blobs that only
permit bus/slot/function addressing.
The x86 non-ACPI PCI bus drivers all hardcode a domain of 0 as they do
not support multiple domains.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D42827
Commit 27c36a12f1 is an x86-only feature. As such xen_evtchn_needs_ack
should only exist on x86.
Differential Revision: https://reviews.freebsd.org/D29913
Reviewed by: royger
[royger]: adjust comment.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.
Sponsored by: Netflix
Applies only to bare-metal Zen2 processors. The system currently
automatically applies it to all of them.
Tunable/sysctl 'machdep.mitigations.zenbleed.enable' can be used to
forcibly enable or disable the mitigation at boot or run-time. Possible
values are:
0: Mitigation disabled
1: Mitigation enabled
2: Run the automatic determination.
Currently, value 2 is the default and has identical effect as value 1.
This might change in the future if we choose to take into account
microcode revisions in the automatic determination process.
The tunable/sysctl value is simply ignored on non-applicable CPU models,
which is useful to apply the same configuration on a set of machines
that do not all have Zen2 processors. Trying to set it to any integer
value not listed above is silently equivalent to setting it to value 2
(automatic determination).
The current mitigation state can be queried through sysctl
'machdep.mitigations.zenbleed.state', which returns "Not applicable",
"Mitigation enabled" or "Mitigation disabled". Note that this state is
not guaranteed to be accurate in case of intervening modifications of
the corresponding chicken bit directly via cpuctl(4) (this includes the
cpucontrol(8) utility). Resetting the desired policy through
'machdep.mitigations.zenbleed.enable' (possibly to its current value)
will reset the hardware state and ensure that the reported state is
again coherent with it.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41817
They are a bit more informative than raw hexadecimal values.
While here, sort existing defines of bits for AMD MSRs to match the address
order.
Reviewed by: kib, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41816
On 64bit, there is a 4-byte hole in struct vdso_timekeep32 after
tk_current, if the structure is not packed. This is due to the MD
th_x86_pvc_last_systime being 64bit.
Change amd64 VDSO_TIMEHANDS_MD32 to not use uint64_t, replace it with
pair of uint32_t, as it is done for all other members.
PR: 273085
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
`intr_bind(u_int vector, u_char cpu);` looked suspicious since
everywhere else "cpu" is a u_int and >256 processors isn't unreasonable
now. `intr_bind()` is not used anywhere in FreeBSD (now, after commit
bf42f3738087). Time to remove.
Relnotes: Yes
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D36901
This avoids bloating the kernel image when MAXCPU is large.
A follow-up patch for kgdb and other kernel debuggers is needed since
the stoppcbs symbol is now a pointer. Bump __FreeBSD_version so that
debuggers can use osreldate to figure out how to handle stoppcbs.
PR: 269572
MFC after: never
Reviewed by: mjg, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39806
The SPDX folks have obsoleted the BSD-2-Clause-NetBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix