This simplifies checking for default resource range for bus_alloc_resource(),
and improves readability.
This is part of, and related to, the migration of rman_res_t from u_long to
uintmax_t.
Discussed with: jhb
Suggested by: marcel
Provide bus_get_bus_tag() for sparc64, powerpc, arm, arm64 and mips
nexus and its children in order to return a platform specific default tag.
This is required to ensure generic correctness of the bus_space tag.
It is especially needed for arches where child bus tag does not match
the parent bus tag. This solves the problem with ppc architecture
where the PCI bus tag differs from parent bus tag which is big-endian.
This commit is a part of the following patch:
https://reviews.freebsd.org/D4879
Submitted by: Marcin Mazurek <mma@semihalf.com>
Obtained from: Semihalf
Sponsored by: Annapurna Labs
Reviewed by: jhibbits, mmel
Differential Revision: https://reviews.freebsd.org/D4879
Summary:
Migrate to using the semi-opaque type rman_res_t to specify rman resources. For
now, this is still compatible with u_long.
This is step one in migrating rman to use uintmax_t for resources instead of
u_long.
Going forward, this could feasibly be used to specify architecture-specific
definitions of resource ranges, rather than baking a specific integer type into
the API.
This change has been broken out to facilitate MFC'ing drivers back to 10 without
breaking ABI.
Reviewed By: jhb
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5075
* GENERAL
- Update copyright.
- Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set
neither to ON, which means we want Fortuna
- If there is no 'device random' in the kernel, there will be NO
random(4) device in the kernel, and the KERN_ARND sysctl will
return nothing. With RANDOM_DUMMY there will be a random(4) that
always blocks.
- Repair kern.arandom (KERN_ARND sysctl). The old version went
through arc4random(9) and was a bit weird.
- Adjust arc4random stirring a bit - the existing code looks a little
suspect.
- Fix the nasty pre- and post-read overloading by providing explictit
functions to do these tasks.
- Redo read_random(9) so as to duplicate random(4)'s read internals.
This makes it a first-class citizen rather than a hack.
- Move stuff out of locked regions when it does not need to be
there.
- Trim RANDOM_DEBUG printfs. Some are excess to requirement, some
behind boot verbose.
- Use SYSINIT to sequence the startup.
- Fix init/deinit sysctl stuff.
- Make relevant sysctls also tunables.
- Add different harvesting "styles" to allow for different requirements
(direct, queue, fast).
- Add harvesting of FFS atime events. This needs to be checked for
weighing down the FS code.
- Add harvesting of slab allocator events. This needs to be checked for
weighing down the allocator code.
- Fix the random(9) manpage.
- Loadable modules are not present for now. These will be re-engineered
when the dust settles.
- Use macros for locks.
- Fix comments.
* src/share/man/...
- Update the man pages.
* src/etc/...
- The startup/shutdown work is done in D2924.
* src/UPDATING
- Add UPDATING announcement.
* src/sys/dev/random/build.sh
- Add copyright.
- Add libz for unit tests.
* src/sys/dev/random/dummy.c
- Remove; no longer needed. Functionality incorporated into randomdev.*.
* live_entropy_sources.c live_entropy_sources.h
- Remove; content moved.
- move content to randomdev.[ch] and optimise.
* src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h
- Remove; plugability is no longer used. Compile-time algorithm
selection is the way to go.
* src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h
- Add early (re)boot-time randomness caching.
* src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h
- Remove; no longer needed.
* src/sys/dev/random/uint128.h
- Provide a fake uint128_t; if a real one ever arrived, we can use
that instead. All that is needed here is N=0, N++, N==0, and some
localised trickery is used to manufacture a 128-bit 0ULLL.
* src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h
- Improve unit tests; previously the testing human needed clairvoyance;
now the test will do a basic check of compressibility. Clairvoyant
talent is still a good idea.
- This is still a long way off a proper unit test.
* src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h
- Improve messy union to just uint128_t.
- Remove unneeded 'static struct fortuna_start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])
* src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h
- Improve messy union to just uint128_t.
- Remove unneeded 'staic struct start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])
- Fix some magic numbers elsewhere used as FAST and SLOW.
Differential Revision: https://reviews.freebsd.org/D2025
Reviewed by: vsevolod,delphij,rwatson,trasz,jmg
Approved by: so (delphij)
Device probe value of BUS_PROBE_NOWILDCARD should be treated specially only
if the device has a fixed devclass. Otherwise it should be interpreted just
as if the driver doesn't want to claim the device.
Prior to this change a device that was not claimed explicitly by its driver
would remain "attached" to the driver that returned BUS_PROBE_NOWILDCARD.
This would bump up the reference on 'driver->refs' and its 'dev->ops' would
point to the 'driver->ops'. When the driver is subsequently unloaded the
'dev->ops->cls' is left pointing to freed memory.
This fixes an easily reproducible #GP fault caused by loading and unloading
vmm.ko multiple times.
Differential Revision: https://reviews.freebsd.org/D2294
Reviewed by: imp, jhb
Discussed with: rstone
Reported by: Leon Dang (ldang@nahannisys.com)
MFC after: 2 weeks
A late change to the SR-IOV infrastructure broke passthrough of
VFs. device_set_devclass() was being used to try to force the
ppt driver to attach to the device, but this didn't work because
the DF_FIXEDCLASS flag wasn't being set on the device, so the
ppt driver probe routine would not match when it returned
BUS_NOWILDCARD. Fix this by adding a new device function that
both sets the devclass and sets the DF_FIXEDCLASS flag, and use
that to force the ppt driver to attach to VFs.
Differential Revision: https://reviews.freebsd.org/D2041
Reviewed by: jhb
MFC after: 3 weeks
allows the user to request administrative changes to individual devices
such as attach or detaching drivers or disabling and re-enabling devices.
- Add a new /dev/devctl2 character device which uses ioctls for device
requests. The ioctls use a common 'struct devreq' which is somewhat
similar to 'struct ifreq'.
- The ioctls identify the device to operate on via a string. This
string can either by the device's name, or it can be a bus-specific
address. (For unattached devices, a bus address is the only way to
locate a device.) Bus drivers register an eventhandler to claim
unrecognized device names that the driver recognizes as a valid address.
Two buses currently support addresses: ACPI recognizes any device
in the ACPI namespace via its full path starting with "\" and
the PCI bus driver recognizes an address specification of
'pci[<domain>:]<bus>:<slot>:<func>' (identical to the PCI selector
strings supported by pciconf).
- To make it easier to cut and paste, change the PnP location string
in the PCI bus driver to output a full PCI selector string rather
than 'slot=<slot> function=<func>'.
- Add a devctl(3) interface in libdevctl which provides a wrapper around
the ioctls and is the preferred interface for other userland code.
- Add a devctl(8) program which is a simple wrapper around the requests
supported by devctl(3).
- Add a device_is_suspended() function to check DF_SUSPENDED.
- Add a resource_unset_value() function that can be used to remove a
hint from the kernel environment. This is used to clear a
hint.<driver>.<unit>.disabled hint when re-enabling a boot-time
disabled device.
Reviewed by: imp (parts)
Requested by: imp (changing PCI location string)
Relnotes: yes
flag value is already exposed via dv_flags, just not the meaning of the
flags themselves. Use these constants to annotate devices that are
disabled or suspended in devinfo output.
This allows one to make a kernel module to tune the
number of queues before the driver loads.
This is needed so that a module at SI_SUB_CPU can set
tunables for these drivers to take. Otherwise getenv
is called too early by the TUNABLE macros.
Reviewed by: smh
Phabric: https://reviews.freebsd.org/D1149
a new per-device '%domain' sysctl node that returns the NUMA domain a
device is associated with if it is associated with one.
Note that this API is still a WIP and might change before 11.0 actually
ships.
Differential Revision: https://reviews.freebsd.org/D930
Reviewed by: kib, adrian
This code has had an extensive rewrite and a good series of reviews, both by the author and other parties. This means a lot of code has been simplified. Pluggable structures for high-rate entropy generators are available, and it is most definitely not the case that /dev/random can be driven by only a hardware souce any more. This has been designed out of the device. Hardware sources are stirred into the CSPRNG (Yarrow, Fortuna) like any other entropy source. Pluggable modules may be written by third parties for additional sources.
The harvesting structures and consequently the locking have been simplified. Entropy harvesting is done in a more general way (the documentation for this will follow). There is some GREAT entropy to be had in the UMA allocator, but it is disabled for now as messing with that is likely to annoy many people.
The venerable (but effective) Yarrow algorithm, which is no longer supported by its authors now has an alternative, Fortuna. For now, Yarrow is retained as the default algorithm, but this may be changed using a kernel option. It is intended to make Fortuna the default algorithm for 11.0. Interested parties are encouraged to read ISBN 978-0-470-47424-2 "Cryptography Engineering" By Ferguson, Schneier and Kohno for Fortuna's gory details. Heck, read it anyway.
Many thanks to Arthur Mesh who did early grunt work, and who got caught in the crossfire rather more than he deserved to.
My thanks also to folks who helped me thresh this out on whiteboards and in the odd "Hallway track", or otherwise.
My Nomex pants are on. Let the feedback commence!
Reviewed by: trasz,des(partial),imp(partial?),rwatson(partial?)
Approved by: so(des)
* Add a bus_if.m method - get_domain() - returning the VM domain or
ENOENT if the device isn't in a VM domain;
* Add bus methods to print out the domain of the device if appropriate;
* Add code in srat.c to save the PXM -> VM domain mapping that's done and
expose a function to translate VM domain -> PXM;
* Add ACPI and ACPI PCI methods to check if the bus has a _PXM attribute
and if so map it to the VM domain;
* (.. yes, this works recursively.)
* Have the pci bus glue print out the device VM domain if present.
Note: this is just the plumbing to start enumerating information -
it doesn't at all modify behaviour.
Differential Revision: D906
Reviewed by: jhb
Sponsored by: Norse Corp
fail the allocation request. Allocations of "reserved" resources such as
PCI BARs already fail the request instead of panic'ing in this case.
MFC after: 1 week
Summary:
Add the beginnings of multipass suspend/resume, by introducing
BUS_SUSPEND_CHILD/BUS_RESUME_CHILD, and move the PCI driver to this.
Reviewers: jhb
Reviewed By: jhb
Differential Revision: https://reviews.freebsd.org/D590
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
old devd's and thus hosts that get IP addresses from DHCP was too much
of a POLA violation.
The sysctl may be removed again after r263758 has been merged to at
least stable/9 and stable/10, and releases have been cut from those
branches.
Discussed with: mjg
Reported by: theraven, rwatson
Add a SYSINIT that forces a reseed during proc0 setup, which happens
fairly late in the boot process.
Add a RANDOM_DEBUG option which enables some debugging printf()s.
Add a new RANDOM_ATTACH entropy source which harvests entropy from the
get_cyclecount() delta across each call to a device attach method.
match devices where the driver class was fixed but the unit number was
wildcarded. This better matches the documented behaviour in
DEVICE_PROBE(9).
Reviewed by: imp
Submitted by: adrian, zec
Fix multiple kernel panics when VIMAGE is enabled in the kernel.
These fixes are based on patches submitted by Adrian Chadd and Marko Zec.
(1) Set curthread->td_vnet to vnet0 in device_probe_and_attach() just before calling
device_attach(). This fixes multiple VIMAGE related kernel panics
when trying to attach Bluetooth or USB Ethernet devices because
curthread->td_vnet is NULL.
(2) Set curthread->td_vnet in if_detach(). This fixes kernel panics when detaching networking
interfaces, especially USB Ethernet devices.
(3) Use VNET_DOMAIN_SET() in ng_btsocket.c
(4) In ng_unref_node() set curthread->td_vnet. This fixes kernel panics
when detaching Netgraph nodes.
bug where a PCI device would be powered down if it failed to probe, but
not when its driver was detached (e.g. via kldunload).
- Add a new helper method resource_list_release_active() which forcefully
releases any active resources of a specified type from a resource list.
- Add a bus_child_detached method for the PCI bus driver which forces any
active resources to be released (and whines to the console if it finds
any) and then powers the device down.
- Call pci_child_detached() if we fail to probe a device when a driver
is kldloaded. This isn't perfect but can avoid leaking resources
from a probe() routine in the kldload case.
Reviewed by: imp, brooks
MFC after: 1 month
"device_free_softc()" and "device_claim_softc()",
to allow USB serial drivers refcounting the softc.
These functions are used to grab the softc from
auto-free and to free the softc back to the correct
malloc type, respectivly.
Discussed with: jhb
MFC after: 2 weeks
allocated softc structure which is returned by
device_get_softc(). This method can be used to
easily implement softc refcounting. This can be
desirable when the softc has memory references
which are controlled by userspace handles for
example.
This solves the problem of blocking the caller
of device_detach() for a non-deterministic time.
Discussed with: kib, ed
MFC after: 2 weeks
This combination doesn't make sense, unit numbers should be hardwired
only in context of a known driver. The wildcard devices should have
wildcard unit numbers.
Reviewed by: jhb
MFC after: 2 weeks
being attached. This is implemented by adding a new DS_ATTACHING state
while a device's DEVICE_ATTACH() method is being invoked. A driver is
required to not fail an attach of a busy device. The device's state will
be promoted to DS_BUSY rather than DS_ACTIVE() if the device was marked
busy during DEVICE_ATTACH().
Reviewed by: kib
MFC after: 1 week
Descriptions are specific to drivers and we don't change drivers on attached
devices. This fixes a few places where we were not clearing the description
when detaching a driver (e.g. with device_attach() failed). While here, fix
a few other nits:
- Remove spurious call to remove a device's driver from
devclass_driver_deleted(). device_detach() removes it already.
- Fix a typo.
having dereferenced it. We either should generally check the device_t's
supplied to bus functions before using them (which we seem to virtually
never do) or just assume that they are not NULL.
While at it make this code fit 78 columns.
Found with: Coverity Prevent(tm)
CID: 4230
when actually setting a driver as especially ENOMEM is fatal in these
cases.
- Annotate other calls to device_set_devclass(9) and device_set_driver(9)
without the return value being checked and that are okay to fail.
Reviewed by: yongari (slightly earlier version)
to kern/subr_bus.c. Simplify this function so that it no longer
depends on malloc() to execute. Identify a few other places where
it makes sense to use device_delete_all_children().
MFC after: 1 week