Use the hc_ prefix instead of rmx_. The latter stands for "route metrics" and
is an artifact from the 1990s, when TCP caching was embedded into the
routing table. The rename should have happened back in 97d8d152c2.
No functional change. Done with the following sed(1) command:
s/rmx_(mtu|ssthresh|rtt|rttvar|cwnd|sendpipe|recvpipe|granularity|expire|q|hits|updates)/hc_\1/g
If during ip_forward() we find a blackhole (or reject) route we should stop
processing and count this in the 'cantforward' counter, just like we already do
for IPv6.
Blackhole routes are set to use the loopback interface, so we don't actually
incorrectly forward traffic, but we do fail to count it as unroutable.
Test this, both for IPv4 and IPv6.
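The counting rule described above can be sketched in userland C. This is
only an illustration, not the ip_forward() code: the SK_RTF_* flags, the
sk_fwd_stats structure and sk_forward_check() are hypothetical stand-ins
for the kernel's route flags and the 'cantforward' counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's reject/blackhole route flags. */
#define SK_RTF_REJECT		0x1u
#define SK_RTF_BLACKHOLE	0x2u

/* Hypothetical counters; only 'cantforward' is named in the text above. */
struct sk_fwd_stats {
	uint64_t cantforward;
	uint64_t forwarded;
};

/*
 * Decide whether a packet may be forwarded over a route with the given
 * flags. A reject or blackhole route stops processing and is counted
 * as unroutable.
 */
static bool
sk_forward_check(uint32_t route_flags, struct sk_fwd_stats *st)
{
	assert(st != NULL);
	if ((route_flags & (SK_RTF_REJECT | SK_RTF_BLACKHOLE)) != 0) {
		st->cantforward++;
		return (false);
	}
	st->forwarded++;
	return (true);
}
```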
Reviewed by: melifaro
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D47529
Disable the IP/IP6/ICMP/... counter probe points by default.
They are kept enabled in debug builds, and can be enabled with
'options KDTRACE_MIB_SDT'.
Requested by: glebius
Reviewed by: glebius
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D47657
The socket buffer locking used to be standardized on SOCKBUF_LOCK/UNLOCK. But we are now
moving to the more elegant SOCK_SENDBUF_LOCK/UNLOCK and SOCK_RECVBUF_LOCK/UNLOCK.
Let's get BBR and Rack to use these updated macros.
Reviewed by: glebius, tuexen, rscheff
Differential Revision: https://reviews.freebsd.org/D47542
This was useful when TCP timers were not able to reliably stop. Note that
in_pcbinfo_destroy() called later asserts that V_tcbinfo.ipi_count is 0.
This reverts 806929d514, b54e08e11a.
Avoid sending small segments by making sure that cwnd is usually
calculated in full (data) segment sizes, especially during loss
recovery and retransmission scenarios.
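A minimal sketch of keeping a window at whole-segment granularity,
assuming a simple round-down with a floor of one segment. The helper
sk_cwnd_round() and its floor are assumptions made for illustration and
are not taken from the change itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round cwnd down to a multiple of maxseg so that the window does not
 * end in a partial (small) segment. Never return less than one full
 * segment (an assumption of this sketch).
 */
static uint32_t
sk_cwnd_round(uint32_t cwnd, uint32_t maxseg)
{
	assert(maxseg > 0);
	if (cwnd < maxseg)
		return (maxseg);
	return (cwnd - (cwnd % maxseg));
}
```

With maxseg = 1448, a cwnd of 3000 bytes becomes 2896 (two full
segments) instead of leaving a 104-byte remainder to be sent as a small
segment.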
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47474
Properly calculate the expected flight size (cwnd) during
limited transmit. Exclude the SACK scoreboard from
consideration when still in limited transmit.
PR: 282605
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47541
Over IP networks, forward and return path largely
act independently from each other. Do not disable LRO
on the TX side, when reordering/loss is happening
on the RX half-connection.
Reviewed By: rrs, #transport, peter.lei_ieee.org
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47056
- Use the local var "laddr" instead of sin->sin_addr in one block.
- Use in_nullhost() instead of explicit comparisons with INADDR_ANY.
- Combine multiple socket options checks into one.
- Fix indentation.
- Remove some unhelpful comments.
This is in preparation for some simplification and bug-fixing.
No functional change intended.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D47451
Change the older LOCK related macros over to the
dedicated send/recv buffer macros in the base tcp stack.
No functional change intended.
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47567
sys/netinet and sys/netipsec are both part of the 'blessed' netstack, so
can access struct ifnet directly. With this structure becoming private
very soon, the necessary files need to get direct access.
Sponsored by: Juniper Networks, Inc.
in_pcblookup_hash_wild_* looks up unconnected inpcbs, so there is no
point in passing the foreign address and port, and indeed those
parameters are not used. So, remove them.
No functional change intended.
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D47385
According to RFC 3390 the CWND should be set to one MSS if the
SYN or SYN-ACK has been retransmitted. This is handled in the
code by setting CWND to 1 and cc_conn_init() translates this
to MSS. Unfortunately, cc_cong_signal() was overwriting the
special value of 1 in case of a lost SYN, and therefore the
initial CWND was not as it was supposed to be.
Fix this by not overwriting the special value of 1.
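The interaction can be sketched as follows. The function names and the
value written on a timeout are hypothetical; only the special value 1
and its translation to one MSS come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Special marker: a SYN or SYN-ACK was retransmitted. */
#define SK_CWND_SYN_RETRANSMITTED	1u

/*
 * Hypothetical congestion signal on a retransmission timeout. It
 * normally rewrites cwnd (to one segment in this sketch), but must
 * leave the special marker untouched.
 */
static uint32_t
sk_cong_signal_rto(uint32_t cwnd, uint32_t maxseg)
{
	assert(maxseg > 0);
	if (cwnd == SK_CWND_SYN_RETRANSMITTED)
		return (cwnd);
	return (maxseg);
}

/*
 * Hypothetical connection init: translate the marker into one MSS,
 * otherwise use the regular initial window.
 */
static uint32_t
sk_conn_init(uint32_t cwnd, uint32_t maxseg, uint32_t initial_window)
{
	if (cwnd == SK_CWND_SYN_RETRANSMITTED)
		return (maxseg);
	return (initial_window);
}
```

If the signal handler overwrote the marker, sk_conn_init() would no
longer see it and would fall back to the regular initial window, which
is the behavior the fix avoids.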
Reviewed by: cc, rscheff
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D47439
Identify interfaces consistently by the pair of the ifn pointer
and the index.
This avoids a use after free when the ifn and/or the index was reused.
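A sketch of the matching rule, assuming a stored reference that records
both values. The structure and function names are hypothetical; the
point is that a match requires both the pointer and the index to agree,
so a reused pointer or a reused index alone is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stored reference to an interface. */
struct sk_ifn_ref {
	const void *ifn;	/* interface pointer, may be reused after free */
	uint32_t index;		/* interface index, may also be reused */
};

/* Match only when both the pointer and the index agree. */
static bool
sk_ifn_match(const struct sk_ifn_ref *ref, const void *ifn, uint32_t index)
{
	assert(ref != NULL);
	return (ref->ifn == ifn && ref->index == index);
}
```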
Reported by: bz, pho, and others
MFC after: 3 days
Found while auditing calls to M_WRITABLE to see if M_EXTPG could be
removed from its checks.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D46785
Refactor cwnd handling and move the adjustment for SACKed data into
tcp_output(), so that cwnd tracks the maximum extent starting at
snd_una. This allows both SACK loss recovery and SACK transmissions
after an RTO during slow start and, if allowed, the use of TSO while in
loss recovery.
Reviewed By: tuexen, cc, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D43470
This allows removing the lock drop in tcp_timer_enter(), which closes
a sophisticated but possible race that involves three threads. If we
have a callout executing and two threads trying to close the connection,
e.g. an interrupt and a syscall, then lock yielding in tcp_timer_enter()
may transfer the lock from one closing thread to the other closing thread,
instead of the callout.
Reviewed by: jtl
Differential Revision: https://reviews.freebsd.org/D45747
When snd_nxt doesn't track snd_max, partial SACK ACKs may elicit
unexpected duplicate retransmissions. This is usually masked by
LRO not necessarily ACKing every individual segment, and prior
to RFC6675 SACK loss recovery, harder to trigger even when an
RTO happens while SACK loss recovery is ongoing.
Address this by improving the logic for when to start SACK loss recovery
and how to deal with an RTO, and by improving the adjusted
congestion window during transmission selection.
Reviewed By: tuexen, cc, #transport
Sponsored by: NetApp, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43355
They were all experimental and some comments refer to internal Netflix
versions. There is no reason to leak that into the header. Style the
unused options so that their available values are aligned with the
values actually in use.
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D46779
When the sysctl-variable net.inet.ip.accept_sourceroute is non-zero,
an mbuf would be leaked when processing a SYN-segment containing an
IPv4 strict or loose source routing option, when the on-stack
syncache entry is used or there is an error related to processing
TCP MD5 options.
Fix this by freeing the mbuf whenever an error occurred or the
on-stack syncache entry is used.
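The ownership rule behind this fix can be sketched in userland C: the
function that receives the options buffer consumes it on every path. All
names are hypothetical, and malloc()/free() stand in for mbuf
allocation; this is not the syncache code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter so callers can observe how often we free. */
static int sk_frees;

struct sk_entry {
	void *ipopts;	/* owned by the entry once stored */
};

/*
 * Consume 'opts' on every path. 'persistent' is NULL when a temporary
 * on-stack entry is used, so nothing can keep the buffer alive.
 * Returns 0 on success and -1 on a (simulated) signature error.
 */
static int
sk_add_entry(struct sk_entry *persistent, void *opts, bool sig_error)
{
	assert(opts != NULL);
	if (sig_error || persistent == NULL) {
		/* Error or on-stack entry: free instead of leaking. */
		free(opts);
		sk_frees++;
		return (sig_error ? -1 : 0);
	}
	persistent->ipopts = opts;	/* ownership moves to the entry */
	return (0);
}
```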
Reviewed by: markj, rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D46839
Don't leak a reference count for so->so_cred when processing an
incoming SYN segment with an on-stack syncache entry and the
sysctl variable net.inet.tcp.syncache.see_other is false.
Reviewed by: cc, markj, rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Pull Request: https://reviews.freebsd.org/D46793
Don't leak a maclabel when processing a SYN segment results
in an error due to MD5 signature handling.
Tweak the #ifdef MAC to allow additional upcoming changes.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D46766
These IPPROTO_TCP-level socket option names correspond to socket
options, which are not implemented. So remove them.
Thanks to Peter Lei for suggesting this change.
Reviewed by: rscheff, thj
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D46623