Add tso_tcp_flags_mask_first_segment, tso_tcp_flags_mask_middle_segment,
and tso_tcp_flags_mask_last_segment sysctl-variables to control the
handling of TCP flags during TSO.
This allows changing the masks appropriate for classical ECN and
configuring appropriate masks for accurate ECN.
MFC after: 3 days
Sponsored by: Netflix
Add tso_tcp_flags_mask_first_segment, tso_tcp_flags_mask_middle_segment,
and tso_tcp_flags_mask_last_segment sysctl-variables to control the
handling of TCP flags during TSO.
This allows changing the masks appropriate for classical ECN and
configuring appropriate masks for accurate ECN.
Reviewed by: rrs
MFC after: 3 days
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D44259
I found I was getting constant device timeouts when doing anything
more complicated than a single SSH on a laptop with an RTL8811AU.
After digging into it, I found a variety of fun situations, including
traffic stalls that would recover w/ a shorter (1 second) USB transfer
timeout. However, the big one is a straight up hang of any TX endpoint
until the NIC was reset. The RX side kept going just fine; only the
TX endpoints would hang.
Reproducing it was easy - just start up a couple of traffic streams
on different WME ACs - e.g. a best effort + bulk transfer, like
browsing the web and doing an ssh clone - throw in a ping -i 0.1
to your gateway, and it would very quickly hit device timeouts every
couple of seconds.
I put everything into a single TX EP and the hangs went away.
Well, mostly.
So after some MORE digging, I found that this driver isn't checking
if the transfers are going into the correct EPs for the packet
WME access category / 802.11 TID; and would frequently be able
to schedule multiple transfers into the same endpoint.
Then there's a second problem - there's an array of endpoints
used for setting up the USB device, with .endpoint = UE_ADDR_ANY,
however they're also being set up with the same endpoint configured
in multiple transfer configs. E.g., a NIC with 3 or 4 bulk TX endpoints
will configure the BK and BE endpoints with the same physical endpoint
ID. This also leads to timed out transfers.
My /guess/ was that the firmware isn't happy with one or both of the
above, and so I solved both.
* drop the USB transfer timeout to 1 second, not 5 seconds -
that way we'll either get a 1 second traffic pause and USB transfer
failure, or a 5 second device timeout. Having both the TX timeout
and the USB transfer timeout made recovery from a USB transfer
timeout (without a NIC reset) almost impossible.
* enforce one transfer per endpoint;
* separate pending/active buffer tracking per endpoint;
* each endpoint now has its own TX callback to make sure the queue /
end point ID is known;
* and only frames from a given endpoint's pending queue go
into the active queue and into that endpoint.
* Finally, create a local wme2qid array and populate it with the
endpoint mapping that ensures unique physical endpoint use.
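The idea behind the wme2qid mapping can be sketched as below. This is
only an illustration: build_wme2qid(), the AC_* names and the specific
AC-to-queue policy are assumptions made for the example and are not
taken from the driver. The point it shows is that several access
categories may share one queue, while each queue index stands for
exactly one physical TX endpoint.

```c
#include <stdint.h>

/* Access categories, in the usual WME order (names are illustrative). */
enum { AC_BE, AC_BK, AC_VI, AC_VO, AC_COUNT };

/*
 * Hypothetical helper: fill wme2qid[] so that every AC maps to a queue
 * index in [0, ntx), where each queue index is backed by one distinct
 * physical bulk TX endpoint.  The policy chosen here (voice first,
 * background last) is an assumption, not the driver's actual table.
 */
void
build_wme2qid(unsigned ntx, uint8_t wme2qid[AC_COUNT])
{
	switch (ntx) {
	case 0:
	case 1:
		/* One endpoint: everything shares queue 0. */
		wme2qid[AC_VO] = wme2qid[AC_VI] = 0;
		wme2qid[AC_BE] = wme2qid[AC_BK] = 0;
		break;
	case 2:
		/* Two endpoints: high priority vs. low priority. */
		wme2qid[AC_VO] = wme2qid[AC_VI] = 0;
		wme2qid[AC_BE] = wme2qid[AC_BK] = 1;
		break;
	case 3:
		/* Three endpoints: BE and BK share the last queue. */
		wme2qid[AC_VO] = 0;
		wme2qid[AC_VI] = 1;
		wme2qid[AC_BE] = wme2qid[AC_BK] = 2;
		break;
	default:
		/* Four or more endpoints: one queue per AC. */
		wme2qid[AC_VO] = 0;
		wme2qid[AC_VI] = 1;
		wme2qid[AC_BE] = 2;
		wme2qid[AC_BK] = 3;
		break;
	}
}
```

With a table like this, the TX path looks up the queue for a frame's AC
and only ever submits that frame to the one endpoint backing the queue.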
Locally tested:
* rtl8812AU, 11n STA mode
* rtl8192EU, 11n STA mode (with diffs to fix the channel config / power
timeouts.)
Differential Revision: https://reviews.freebsd.org/D47522
Add tso_tcp_flags_mask_first_segment, tso_tcp_flags_mask_middle_segment,
and tso_tcp_flags_mask_last_segment sysctl-variables to control the
handling of TCP flags during TSO.
This allows fixing the masks appropriate for classical ECN and
configuring appropriate masks for accurate ECN.
Michael notes empirically that the 82599 has an unexpected middle mask:
  Chip   First  Middle  Last
  82599  0xFF6  0xFF6   0xF7F
which should be fixed up to 0xF76 (RFC 3168) in a future commit.
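As a rough model of what the three masks do, the sketch below ANDs the
TCP flags of the original large packet with a per-position mask for
each segment produced by TSO. The function name tso_segment_flags() and
the macro names are made up for this example; the mask values are the
RFC 3168 style ones quoted above (0xFF6, 0xF76, 0xF7F). Real hardware
does this internally, so this is only an approximation of the effect.

```c
#include <stdint.h>

/* TCP flag bits in the low 12 bits of the flags field. */
#define	TCPF_FIN	0x001
#define	TCPF_PSH	0x008
#define	TCPF_ACK	0x010
#define	TCPF_CWR	0x080

/* Classical ECN masks: values quoted in the commit message. */
#define	TSO_MASK_FIRST	0xFF6	/* drop FIN and PSH, keep CWR */
#define	TSO_MASK_MIDDLE	0xF76	/* drop FIN, PSH and CWR */
#define	TSO_MASK_LAST	0xF7F	/* drop CWR, keep FIN and PSH */

/*
 * Hypothetical model: flags carried by segment idx out of nsegs.
 * Assumes nsegs >= 2; a single-segment send is treated as "first"
 * here purely as a simplification.
 */
uint16_t
tso_segment_flags(uint16_t flags, unsigned idx, unsigned nsegs)
{
	uint16_t mask;

	if (idx == 0)
		mask = TSO_MASK_FIRST;
	else if (idx == nsegs - 1)
		mask = TSO_MASK_LAST;
	else
		mask = TSO_MASK_MIDDLE;
	return (flags & mask);
}
```

For a packet with FIN|PSH|ACK|CWR set, this model sends CWR only on the
first segment and FIN/PSH only on the last one, while ACK appears on
every segment.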
Reviewed by: rrs, rscheff
MFC after: 3 days
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D44258
Use hc_ prefix instead of rmx_. The latter stands for "route metrics" and
is an artifact from the 90s, when TCP caching was embedded into the
routing table. The rename should have happened back in 97d8d152c2.
No functional change. Done with sed(1) command:
s/rmx_(mtu|ssthresh|rtt|rttvar|cwnd|sendpipe|recvpipe|granularity|expire|q|hits|updates)/hc_\1/g
These were removed in a40ecb6f74 because they do not apply to igc
hardware which uses EITR for interval timing.
MFC after: 3 days
Sponsored by: BBOX.io
The bsdlabel utility is deprecated, gpart should be used instead:
- Offset the first 16 sectors, just like bsdlabel did (used for
metadata)
- Use a freebsd-ufs partition type (regardless of bsdlabel creating a
'!0')
Reviewed by: emaste, imp
Approved by: emaste (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D47653
Once we set that we're doing the inversion workaround, there's no sense
continuing to search for the inversion workaround.
Sponsored by: Netflix
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D47686
X/Open originally had _XOPEN_SOURCE defined to signify conformance with
the Single Unix Specification, starting with its third iteration. There
it specified that defining _XOPEN_SOURCE means the same thing as
_POSIX_C_SOURCE=2, though the different versions of the spec had slight
variances as to what's defined and whether or not _XOPEN_SOURCE_EXTENSION
needed to be defined. Document that we don't do anything in this case.
It turns out that enabling the proper strict environment breaks at least
some old software, so for the moment it's a nop until that can be sorted
out (though that is a very low priority task).
Sponsored by: Netflix
This block has a lot of nesting, not helped by two adjacent nested
blocks involving _POSIX_C_SOURCE, with only the inner one commented,
looking like it's the end of the outer one. Comment the outer one as
well so it's not quite so hard to figure out.
MFC after: 1 week
Nothing uses it anymore, so drop it from the 'safe' list. Also, move
stand/efi/loader/main.c to using machine/_inttypes.h which is all it
really needs.
Sponsored by: Netflix
Our implementation currently diverges from POSIX 2024 in a couple of
ways, as now noted in the BUGS section.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47589
The previous change committed a preliminary version of the change to
use iterators to free page sequences. This updates to what was
intended to be the final version.
Reviewed by: markj (previous version)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D46724
Use pctrie iterators for removing some page sequences from radix
trees, to avoid repeated searches from the tree root.
Rename vm_page_object_remove to vm_page_remove_radixdone, and remove
from it the responsibility for removing a page from its radix tree,
and pass that responsibility on to its callers.
For one of those callers, vm_page_rename, pass a pages pctrie_iter,
rather than a page, and use the iterator to remove the page from its
radix tree.
Define functions vm_page_iter_remove() and vm_page_iter_free() that
are like vm_page_remove() and vm_page_free(), respectively, except
that they take an iterator as parameter rather than a page, and use
the iterator to remove the page from the radix tree instead of
searching the radix tree. Function vm_page_iter_free() assumes that
the page is associated with an object, and calls
vm_page_free_object_prep to do the part of vm_page_free_prep that is
object-related.
In functions vm_object_split and vm_object_collapse_scan, use a
pctrie_iter to walk over the pages of the object, and use
vm_page_rename and vm_radix_iter_remove to modify the radix tree without
searching for pages. In vm_object_page_remove and _kmem_unback, use a
pctrie_iter and vm_page_iter_free to remove the page from the radix
tree.
Reviewed by: markj (previous version)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D46724
The return value is not required to be the difference between the
differing bytes, only less than zero, zero, or greater than zero.
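A small sketch of what this means for callers, using memcmp(3) as a
representative comparison function (the commit text does not name the
function, so that choice is an assumption, as are the helper names
cmp_sign() and bytes_order()). Portable code should look only at the
sign of the result, never at its magnitude.

```c
#include <stddef.h>
#include <string.h>

/* Reduce any comparison result to -1, 0 or 1. */
int
cmp_sign(int r)
{
	return ((r > 0) - (r < 0));
}

/*
 * Hypothetical wrapper: order two byte ranges using only the sign of
 * memcmp()'s return value, which is all the interface promises.
 */
int
bytes_order(const void *a, const void *b, size_t n)
{
	return (cmp_sign(memcmp(a, b, n)));
}
```

Code that needs the actual byte difference has to compute it itself
rather than rely on what a particular libc happens to return.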
Reviewed by: fuz
Event: Kitchener-Waterloo Hackathon 202406
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47683
If during ip_forward() we find a blackhole (or reject) route we should stop
processing and count this in the 'cantforward' counter, just like we already do
for IPv6.
Blackhole routes are set to use the loopback interface, so we don't actually
incorrectly forward traffic, but we do fail to count it as unroutable.
Test this, both for IPv4 and IPv6.
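The intended accounting can be modelled in userland roughly as follows.
Everything here is a stand-in: the RT_* flag values, struct fwd_stats
and forward_decision() are invented for the example and do not mirror
the kernel's data structures. It only shows the decision described
above: a blackhole or reject route stops processing and bumps the
'cantforward' counter instead of being treated as a normal forward.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the route flags; values are arbitrary. */
#define	RT_REJECT	0x1
#define	RT_BLACKHOLE	0x2

/* Hypothetical counters for the model. */
struct fwd_stats {
	uint64_t forwarded;
	uint64_t cantforward;
};

/*
 * Return true if the packet would be forwarded.  Blackhole and reject
 * routes are counted as unroutable and processing stops.
 */
bool
forward_decision(uint32_t route_flags, struct fwd_stats *st)
{
	if ((route_flags & (RT_REJECT | RT_BLACKHOLE)) != 0) {
		st->cantforward++;
		return (false);
	}
	st->forwarded++;
	return (true);
}
```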
Reviewed by: melifaro
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D47529
Add a menu to the installer to run fwget(8) inside the newly installed
system to install firmware known to be needed.
This requires working networking.
This is needed at least for wireless currently for when we entirely
stop shipping new firmware in src.git to have working networking on
the installed system (we already do need this for at least rtw89).
Sponsored by: The FreeBSD Foundation
Tested with: 4 different iwlwifi chipsets in a system (earlier version)
Suggested improvements by: jrtc27
MFC after: 8 hours
Differential Revision: https://reviews.freebsd.org/D47491
Wireless driver firmware is no longer added to the src tree.
In order to have wireless support in the installer for the new drivers
we install the firmware packages onto disc1 (and memstick) and dvd
if built on FreeBSD and NOPKG is not defined (to not break cross-builds
from Linux or OSX and to allow people to opt-out).
Sponsored by: The FreeBSD Foundation
Submitted by: cperciva (the orig. commands and where to place them)
Reviewed by: jrtc27
MFC after: 6 hours
Differential Revision: https://reviews.freebsd.org/D47407
-l is required by LSB for login shells, and all other shells (bash, zsh,
oksh, mksh, ...) implement it.
With -l, sh acts as a login shell and reads the profile.
MFC After: 1 week
Obtained From: dash (3b7c8442bfe7c2fd0a6b0415df6ddf66a399fd55)
Reviewed by: kib, lme
Differential Revision: https://reviews.freebsd.org/D47681
Disable the IP/IP6/ICMP/... counter probe points by default.
They are kept enabled in debug builds, and can be enabled with
'options KDTRACE_MIB_SDT'.
Requested by: glebius
Reviewed by: glebius
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D47657
unistd.h includes too much extra stuff for the boot loader. This creates
a fair amount of namespace pollution and it's best to just make it an
alias for stand.h like the other include files already are.
Sponsored by: Netflix
We only need to provide sig_atomic_t in emulation. However, including
machine/signal.h brings in too much namespace pollution related to
signals. Instead, define sig_atomic_t as long. Setting long is async
atomic on all platforms (though powerpc64 defines it to an int), though
that doesn't matter since the boot loader doesn't use signals.
Sponsored by: Netflix
struct cdev has members of type struct timespec. Include sys/_timespec.h
so we don't need to rely on namespace pollution to define it.
Sponsored by: Netflix
We don't in general support running newer libc on an older kernel, but
have occasionally added support for specific functionality on a case-by-
case basis. When we do this it is usually done as an aid for developers
to get across a change that introduced new functionality, as for 64-bit
inodes and the introduction of the getrandom syscall.
The getrandom syscall was added in commit e9ac27430c ("Implement
getrandom(2) and getentropy(3)") in 2018, and exists in all supported
FreeBSD versions. The ECAPMODE special case applied to a few months
worth of kernel versions also in 2018 -- fixed as of commit ed1fa01ac4
("Regen after r337998.").
The backwards-compatibility support is no longer needed, so remove it.
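For reference, a caller-side sketch of using getrandom(2) directly with
no fallback path. The helper name fill_random() is made up for this
example; it simply loops until the requested number of bytes has been
returned and retries on EINTR. This is not the libc code touched by the
change, just an illustration of relying on the syscall being present.

```c
#include <sys/types.h>
#include <sys/random.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: fill buf with len random bytes.
 * Returns 0 on success, -1 on error with errno set by getrandom(2).
 */
int
fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = getrandom(p, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	return (0);
}
```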
Relnotes: Yes
Reviewed by: brooks, cem, delphij
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47636
The previous maximum value for the upper watermark was based on the old
value of MAXPHYS. Raise it to allow more parallel I/O on large systems.
This is still a rather flawed mechanism since it's applied without
regard to the number of filesystems or block devices between which this
mechanism sits, but we might as well bump the limits at this point, as
they haven't been revised in quite a long time.
Reviewed by: imp, kib
MFC after: 2 weeks
Fixes: cd85379104 ("Make MAXPHYS tunable. Bump MAXPHYS to 1M.")
Differential Revision: https://reviews.freebsd.org/D47398
Right now, sockaddr_nl parameters are not displayed in kdump output,
however they are present in the structure in ktrace.out file when
ktrace is run.
Reviewed by: melifaro, markj, glebius
Differential Revision: https://reviews.freebsd.org/D47411
This should have been done with commit d9fe718287 but was missed.
Fixes: d9fe718287 ("makefs: Remove the warning printed when makefs -t zfs is used")
MFC after: 3 days
vnet.interface and zfs.dataset can be used to specify multiple
interfaces/datasets in jail.conf, but not on the command-line, which is
a bit surprising. Extend the handling of ip(4|6).addr to those
parameters, update the description of vnet.interface in jail.8, and add
a rudimentary regression test.
Reviewed by: zlei, jamie
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D47651
atomic(9) primitives are documented as operating on unsigned types.
Here, we need a cast to avoid a tautological comparison.
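The class of bug can be illustrated in userland with C11 atomics. The
counter semantics below (negative values meaning "writers not allowed")
and the name writecount_try_inc() are assumptions for the example, not
the VFS code. Because the fetch-and-add returns an unsigned value, a
test such as "old < 0" is always false; the cast restores the signed
interpretation. The cast relies on the usual two's complement
conversion, which holds on the platforms of interest.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical model: try to take a write reference on a counter
 * stored as unsigned.  A value that is negative when viewed as int
 * means write access is denied.
 */
bool
writecount_try_inc(atomic_uint *cnt)
{
	unsigned old = atomic_fetch_add(cnt, 1);

	/* Without the (int) cast this comparison is tautologically false. */
	if ((int)old < 0) {
		atomic_fetch_sub(cnt, 1);	/* undo the increment */
		return (false);
	}
	return (true);
}
```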
Add a regression test for access(2), which was affected by the bug.
Reported by: NetApp
Reviewed by: kib
Fixes: e511bd1406 ("vfs: fully lockless v_writecount adjustment")
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47672
This is somewhat hard to test reliably, but we'll give it a shot. Start up
a sleep(1) daemon with a hefty restart delay. In refactoring of daemon(8),
we inadvertently started dropping SIGTERMs that came in while we were
waiting to restart the child, so we employ the strategy:
- Pop the child sleep(1) first
- Wait for sleep(1) to exit (pid file truncated)
- Pop the daemon(8) with a SIGTERM
- Wait for daemon(8) to exit
The pidfile is specifically truncated outside of the event loop so that we
don't have a kqueue to catch it in the current model.
PR: 277959
Reviewed by: des, markj
Differential Revision: https://reviews.freebsd.org/D47005
We populate the kqueue with all of four kevents: three signal handlers and
one for read of the child pipe. Every time we start the child, we rebuild
this kqueue from scratch for the child and tear it down before we exit and
check if we need to restart the child. As a consequence, we effectively
drop any of the signals we're interested in between restarts.
Push the kqueue out into the daemon state to avoid losing any signal events
in the process, and reimplement the restart timer in terms of kqueue timers.
The pipe read event will be automatically deleted upon last close, which
leaves us with only the signal events that really get retained between
restarts of the child.
PR: 277959
Reviewed by: des, markj
Differential Revision: https://reviews.freebsd.org/D47004