mirror of https://git.FreeBSD.org/src.git synced 2024-11-23 07:31:31 +00:00

8100 Commits

Author SHA1 Message Date
Gleb Smirnoff
3789810845 tcp: avoid bcopy() in tcp_mss_update() 2024-11-20 16:37:24 -08:00
Gleb Smirnoff
2944a888ea tcp: remove so != NULL check
In the modern FreeBSD network stack a socket outlives its tcpcb.
2024-11-20 16:37:18 -08:00
Gleb Smirnoff
b80c06cc0a tcp: use const argument in the TCP hostcache KPI
The hostcache can't modify tcpcb, inpcb or connection info.
2024-11-20 16:30:42 -08:00
Gleb Smirnoff
09000cc133 tcp: mechanically rename hostcache metrics structure fields
Use hc_ prefix instead of rmx_.  The latter stands for "route metrics" and
is an artifact from the 1990s, when TCP caching was embedded into the
routing table.  The rename should have happened back in 97d8d152c2.

No functional change. Done with sed(1) command:

s/rmx_(mtu|ssthresh|rtt|rttvar|cwnd|sendpipe|recvpipe|granularity|expire|q|hits|updates)/hc_\1/g
2024-11-20 16:29:00 -08:00
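The sed(1) expression in the commit above can be mirrored in a few lines of Python to see which identifiers it touches. This is an illustrative sketch only: the function name `rename_hostcache_fields` and the sample input line are hypothetical, and the pattern is copied from the commit message.

```python
import re

# Pattern copied from the commit message; only these rmx_ fields are renamed.
RMX_FIELD = re.compile(
    r"rmx_(mtu|ssthresh|rtt|rttvar|cwnd|sendpipe|recvpipe|"
    r"granularity|expire|q|hits|updates)"
)


def rename_hostcache_fields(source: str) -> str:
    """Replace the rmx_ prefix with hc_ on the listed field names."""
    return RMX_FIELD.sub(r"hc_\1", source)


# Hypothetical input line, for illustration only.
print(rename_hostcache_fields("hc_entry->rmx_rttvar = hcml->rmx_rtt;"))
```

Identifiers outside the listed set keep their rmx_ prefix, which is what makes the rename mechanical.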
Kristof Provost
e27970ae8f netinet: handle blackhole routes
If during ip_forward() we find a blackhole (or reject) route we should stop
processing and count this in the 'cantforward' counter, just like we already do
for IPv6.
Blackhole routes are set to use the loopback interface, so we don't actually
incorrectly forward traffic, but we do fail to count it as unroutable.

Test this, both for IPv4 and IPv6.

Reviewed by:	melifaro
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D47529
2024-11-20 16:52:41 +01:00
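The accounting change described in the commit above can be pictured with a small model. This is a hedged sketch, not the kernel code: the `Route` class, the flag constants, and the `stats` dictionary are hypothetical stand-ins, and only the 'cantforward' counter name comes from the commit message.

```python
from dataclasses import dataclass

# Hypothetical flag values for illustration; not the kernel's constants.
FLAG_BLACKHOLE = 0x1
FLAG_REJECT = 0x2


@dataclass
class Route:
    flags: int = 0


def forward(route: Route, stats: dict) -> bool:
    """Return True if the packet would be forwarded, False if it is dropped."""
    if route.flags & (FLAG_BLACKHOLE | FLAG_REJECT):
        # Stop processing and account for the packet as unroutable.
        stats["cantforward"] = stats.get("cantforward", 0) + 1
        return False
    return True


stats = {}
forward(Route(FLAG_BLACKHOLE), stats)  # dropped and counted
forward(Route(), stats)                # forwarded, counter untouched
print(stats)
```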
Kristof Provost
438ca68cef netinet: default mib counter probe points off
Disable the IP/IP6/ICMP/... counter probe points by default.
They are kept enabled in debug builds, and can be enabled with
'options KDTRACE_MIB_SDT'.

Requested by:	glebius
Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D47657
2024-11-20 09:52:48 +01:00
Michael Tuexen
8caa2f5351 tcp: define tcp_lro_log() only when TCP_BLACKBOX is defined
Reviewed by:		rrs, Peter Lei
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D47401
2024-11-17 19:21:01 +01:00
Randall Stewart
12fc79619a Change the SOCKBUF_LOCK calls to use the more refined SOCK_XXXBUF_LOCK/UNLOCK.
Socket buffer locking used to go through SOCKBUF_LOCK/UNLOCK. We are now
moving to the more specific SOCK_SENDBUF_LOCK/UNLOCK and SOCK_RECVBUF_LOCK/UNLOCK.
Let's get BBR and Rack to use these updated macros.

Reviewed by:	glebius, tuexen, rscheff
Differential Revision:	https://reviews.freebsd.org/D47542
2024-11-15 12:37:05 -05:00
Gleb Smirnoff
0b4539ee54 inpcb: gc unused argument of in_pcbconnect() 2024-11-14 11:39:13 -08:00
Gleb Smirnoff
fb7c1ac5ac tcp: remove the looping on pcb count in tcp_destroy()
This was useful when TCP timers were not able to reliably stop. Note that
in_pcbinfo_destroy() called later asserts that V_tcbinfo.ipi_count is 0.

This reverts 806929d514, b54e08e11a.
2024-11-14 11:39:12 -08:00
Gleb Smirnoff
81f08f3038 siftr: remove pointless assertion
The assertion is correct, but isn't useful.  Also it contradicts
its own comment.
2024-11-14 11:39:12 -08:00
Richard Scheffenegger
22dcc81293 tcp: Use segment size excluding tcp options for all cwnd calculations
Avoid sending small segments by making sure that cwnd is usually
calculated in full (data) segment sizes. Especially during loss
recovery and retransmission scenarios.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47474
2024-11-14 10:16:57 +01:00
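A small arithmetic example shows why the unit used for cwnd matters. The numbers are assumptions chosen for illustration: a 1460-byte MSS, 12 bytes of TCP options per segment, and a window of ten segments. The helper name `split_window` is hypothetical.

```python
def split_window(cwnd: int, payload: int) -> tuple[int, int]:
    """Return (full segments, leftover bytes) that fit into cwnd."""
    return divmod(cwnd, payload)


MSS = 1460               # assumed segment size including option space
OPTIONS = 12             # assumed TCP option bytes per segment
PAYLOAD = MSS - OPTIONS  # data bytes actually carried per segment

# Window sized in MSS units: a short trailing segment is left over.
print(split_window(10 * MSS, PAYLOAD))      # (10, 120)
# Window sized in payload units: only full segments.
print(split_window(10 * PAYLOAD, PAYLOAD))  # (10, 0)
```

With the window expressed in data-segment units, the 120-byte leftover that would otherwise go out as a small segment disappears.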
Richard Scheffenegger
8f5a2e216f tcp: fix cwnd recalculation during limited transmit
Properly calculate the expected flight size (cwnd) during
limited transmit. Exclude the SACK scoreboard from
consideration when still in limited transmit.

PR: 282605
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47541
2024-11-14 09:19:49 +01:00
Richard Scheffenegger
c9047eb7b3 tcp: allow TSO even while RX path is unordered
Over IP networks, the forward and return paths largely
act independently of each other. Do not disable TSO
on the TX side when reordering or loss is happening
on the RX half-connection.

Reviewed By: rrs, #transport, peter.lei_ieee.org
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47056
2024-11-14 09:15:53 +01:00
Mark Johnston
45a77bf23f inpcb: Make some cosmetic improvements to in_pcbbind()
- Use the local var "laddr" instead of sin->sin_addr in one block.
- Use in_nullhost() instead of explicit comparisons with INADDR_ANY.
- Combine multiple socket options checks into one.
- Fix indentation.
- Remove some unhelpful comments.

This is in preparation for some simplification and bug-fixing.

No functional change intended.

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D47451
2024-11-14 16:05:27 +00:00
Richard Scheffenegger
dded4e9e52 tcp: change SOCKBUF_* macros to SOCK_[RECV|SEND]BUF_* macros
Change the older LOCK related macros over to the
dedicated send/recv buffer macros in the base tcp stack.

No functional change intended.

Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D47567
2024-11-14 02:08:12 +01:00
Justin Hibbits
4d0c95384f net: Include private header in more needed places
sys/netinet and sys/netipsec are both part of the 'blessed' netstack, so
can access struct ifnet directly.  With this structure becoming private
very soon, the necessary files need to get direct access.

Sponsored by:	Juniper Networks, Inc.
2024-11-13 14:30:59 -05:00
Mark Johnston
21d7ac8c79 inpcb: Remove some unused parameters in internal hash lookup functions
in_pcblookup_hash_wild_* looks up unconnected inpcbs, so there is no
point in passing the foreign address and port, and indeed those
parameters are not used.  So, remove them.

No functional change intended.

MFC after:	1 week
Sponsored by:	Klara, Inc.
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D47385
2024-11-08 14:25:19 +00:00
Michael Tuexen
625835c8b5 tcp: fix the initial CWND when a SYN retransmission happened
According to RFC 3390 the CWND should be set to one MSS if the
SYN or SYN-ACK has been retransmitted. This is handled in the
code by setting CWND to 1 and cc_conn_init() translates this
to MSS. Unfortunately, cc_cong_signal() was overwriting the
special value of 1 in case of a lost SYN, and therefore the
initial CWND was not as it was supposed to be.
Fix this by not overwriting the special value of 1.

Reviewed by:		cc, rscheff
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D47439
2024-11-05 09:52:42 +01:00
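The sentinel handling described in the commit above can be modeled as follows. This is a simplified sketch under stated assumptions: the function names are hypothetical rather than the kernel's, the normal initial window uses the RFC 3390 formula, and the value written in the pre-fix path is assumed to be one MSS purely for illustration.

```python
SYN_RETRANSMITTED = 1  # special cwnd value meaning "start with one MSS"


def rfc3390_initial_window(mss: int) -> int:
    """Upper bound on the initial window from RFC 3390."""
    return min(4 * mss, max(2 * mss, 4380))


def conn_init(cwnd: int, mss: int) -> int:
    """Translate the sentinel to one MSS, otherwise use the normal window."""
    if cwnd == SYN_RETRANSMITTED:
        return mss
    return rfc3390_initial_window(mss)


def signal_loss(cwnd: int, mss: int, keep_sentinel: bool) -> int:
    """Model of the congestion signal raised for a lost SYN."""
    if keep_sentinel and cwnd == SYN_RETRANSMITTED:
        return cwnd  # fixed behavior: leave the special value alone
    return mss       # assumed overwrite, which hides the sentinel


mss = 1460
print(conn_init(signal_loss(1, mss, keep_sentinel=False), mss))  # before the fix
print(conn_init(signal_loss(1, mss, keep_sentinel=True), mss))   # after the fix
```

In this model the pre-fix path ends up with the full initial window, while the fixed path starts with a single MSS as the commit message describes.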
Michael Tuexen
518a1163d0 sctp: fix debug message
MFC after:	3 days
2024-11-03 11:20:54 +01:00
Michael Tuexen
523913c943 sctp: improve handling of address changes
Identify interfaces consistently by the pair of the ifn pointer
and the index.
This avoids a use-after-free when the ifn and/or the index was reused.

Reported by:	bz, pho, and others
MFC after:	3 days
2024-11-03 10:20:08 +01:00
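Why the pair matters can be seen in a toy model. This is an illustrative sketch with hypothetical names: integer values stand in for the ifn pointer, and a dictionary stands in for the stack's list of known interfaces.

```python
def interface_key(ifn_ptr: int, ifn_index: int) -> tuple[int, int]:
    """Identify an interface by both its pointer and its index."""
    return (ifn_ptr, ifn_index)


known = {interface_key(0x1000, 3): "old interface"}

# The old interface goes away and its memory is reused by a new one
# that received a different index.
reused = interface_key(0x1000, 7)

print(reused in known)                         # pair lookup: no stale match
print(any(ptr == 0x1000 for ptr, _ in known))  # pointer alone would match
```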
Michael Tuexen
470a63cde4 sctp: garbage collect two unused functions
MFC after:	3 days
2024-11-02 17:58:09 +01:00
Michael Tuexen
bf11fdaf0d sctp: don't consider the interface name when removing an address
Checking the interface name cannot be done consistently, so
don't do it.

MFC after:	3 days
2024-11-02 16:33:02 +01:00
Michael Tuexen
d839cf2fbb sctp: editorial cleanup
Improve consistency, no functional change intended.

MFC after: 	3 days
2024-11-02 16:02:52 +01:00
John Baldwin
28aafeb83c netinet*: Add assertions for some places that don't support M_EXTPG mbufs
Found while auditing calls to M_WRITABLE to see if M_EXTPG could be
removed from its checks.

Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D46785
2024-10-31 16:32:32 -04:00
Richard Scheffenegger
7dc78150c7 tcp: refactor cwnd during SACK transmissions to allow TSO
Refactoring of cwnd and moving the adjustment for SACKed data into
tcp_output() - cwnd tracking the maximum extent starting at snd_una -
allows both SACK loss recovery as well as SACK transmissions after
RTO during slow start and if allowed, the use of TSO while in loss
recovery.

Reviewed By:		tuexen, cc, #transport
Sponsored by:		NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D43470
2024-10-29 19:04:12 +01:00
Michael Tuexen
d08713dcdb sctp: another cleanup
No functional change intended.

MFC after:	3 days
2024-10-27 15:01:45 +01:00
Michael Tuexen
a05620b0f6 sctp: cleanup the addition of addresses which are already known
No functional change intended.

MFC after: 	3 days
2024-10-25 14:11:09 +02:00
Michael Tuexen
02478e6591 sctp: further cleanup
MFC after:	3 days
2024-10-25 13:47:43 +02:00
Michael Tuexen
ce5b5361d4 sctp: garbage collect sctp_update_ifn_mtu
MFC after:	3 days
2024-10-24 22:00:59 +02:00
Gleb Smirnoff
d021d3b3c6 tcp: get rid of TDP_INTCPCALLOUT
With CALLOUT_TRYLOCK we don't need this special flag.

Reviewed by:		jtl
Differential Revision:	https://reviews.freebsd.org/D45748
2024-10-24 10:14:03 -07:00
Gleb Smirnoff
bffebc336f tcp: use CALLOUT_TRYLOCK for the TCP callout
This allows removing the lock drop in tcp_timer_enter(), which closes
a sophisticated but possible race that involves three threads.  If a
callout is executing while two threads are trying to close the connection,
e.g. an interrupt and a syscall, then lock yielding in tcp_timer_enter()
may transfer the lock from one closing thread to the other closing thread,
instead of to the callout.

Reviewed by:		jtl
Differential Revision:	https://reviews.freebsd.org/D45747
2024-10-24 10:14:03 -07:00
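The general try-lock idea behind the commit above can be sketched in user space. This is not the callout(9) implementation: `run_timer` is a hypothetical helper, and what the caller does on failure (for example, retrying later) is left as an assumption.

```python
import threading


def run_timer(lock: threading.Lock, handler) -> bool:
    """Run handler only if the lock can be taken without waiting."""
    if not lock.acquire(blocking=False):
        # Someone else owns the lock; do not wait or yield, report failure.
        return False
    try:
        handler()
    finally:
        lock.release()
    return True


lock = threading.Lock()
fired = []

print(run_timer(lock, lambda: fired.append("timer")))  # lock free: handler runs

lock.acquire()                                         # simulate a closing thread
print(run_timer(lock, lambda: fired.append("timer")))  # lock busy: handler skipped
lock.release()
```

Because the timer path never blocks on the lock, there is no point at which ownership has to be handed over between waiting threads.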
Zhenlei Huang
2f395cfda8 tcp cc: Remove a stray semicolon
MFC after:	1 week
2024-10-24 23:04:50 +08:00
Michael Tuexen
e4ac0183a1 sctp: cleanup
No functional change intended.

MFC after:	3 days
2024-10-24 13:24:49 +02:00
Michael Tuexen
ce20b48a60 sctp: improve debug output
MFC after:	3 days
2024-10-24 13:19:14 +02:00
Gleb Smirnoff
994a82a019 tcp: garbage collect unused macros
Fixes:	d40c0d47cd
2024-10-18 10:35:38 -07:00
Richard Scheffenegger
440f4ba18e tcp: fix duplicate retransmissions when RTO happens during SACK loss recovery
When snd_nxt doesn't track snd_max, partial SACK ACKs may elicit
unexpected duplicate retransmissions. This is usually masked by
LRO not necessarily ACKing every individual segment, and prior
to RFC6675 SACK loss recovery, harder to trigger even when an
RTO happens while SACK loss recovery is ongoing.

Address this by improving the logic that decides when to start a SACK
loss recovery and how to deal with an RTO, and by improving the adjusted
congestion window during transmission selection.

Reviewed By:	tuexen, cc, #transport
Sponsored by:	NetApp, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D43355
2024-10-10 13:02:47 +02:00
Michael Tuexen
4466a97e83 sctp: check locking requirements
Actually assert the locking instead of describing it in a comment.
No functional change intended.

MFC after:	3 days
2024-10-10 15:50:41 +02:00
Michael Tuexen
e1a09d1e9d sctp: make sctp_free_ifn() static
It is not used outside of the file.
No functional change intended.

MFC after:	3 days
2024-10-10 10:43:32 +02:00
Michael Tuexen
2e9761eb80 sctp: cleanup sctp_delete_ifn
The address lock is always held, so no need for the second
parameter.
No functional change intended.

MFC after:	3 days
2024-10-10 10:36:00 +02:00
Ed Maste
91a9e4e01d sctp: propagate cap rights on sctp_peeloff
PR:		201052
Reviewed by:	oshogbo, tuexen
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D46884
2024-10-08 20:36:50 -04:00
John Baldwin
519981e3c0 tcp_output: Clear FIN if tcp_m_copym truncates output length
Reviewed by:	rscheff, tuexen, gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D46824
2024-10-02 15:12:37 -04:00
Michael Tuexen
2eacb0841c tcp: small cleanup
No functional change intended.

Reviewed by:		cc, glebius, markj, rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46850
2024-10-01 17:34:35 +02:00
Gleb Smirnoff
57671d5ccc tcp: further cleanup old options
They were all experimental, and some comments refer to internal Netflix
versions.  There is no reason to leak that into the header.  Style unused
options so that their available values line up with the values really in
use.

Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D46779
2024-09-30 12:11:37 -07:00
Michael Tuexen
01eb635d12 tcp: improve mbuf handling when processing SYN segments
When the sysctl-variable net.inet.ip.accept_sourceroute is non-zero,
an mbuf would be leaked when processing a SYN-segment containing an
IPv4 strict or loose source routing option, when the on-stack
syncache entry is used or there is an error related to processing
TCP MD5 options.
Fix this by freeing the mbuf whenever an error occurred or the
on-stack syncache entry is used.

Reviewed by:		markj, rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46839
2024-09-30 20:00:04 +02:00
Michael Tuexen
a2e4f45480 tcp: whitespace cleanup
No functional change intended.

Reported by:	markj
MFC after:	1 week
Sponsored by:	Netflix, Inc.
2024-09-30 19:53:57 +02:00
Michael Tuexen
cbc9438f05 tcp: improve ref count handling when processing SYN
Don't leak a reference count for so->so_cred when processing an
incoming SYN segment with an on-stack syncache entry and the
sysctl variable net.inet.tcp.syncache.see_other is false.

Reviewed by:		cc, markj, rscheff
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Pull Request:		https://reviews.freebsd.org/D46793
2024-09-28 22:06:41 +02:00
Michael Tuexen
78e1b031d2 tcp: improve MAC error handling for SYN segments
Don't leak a maclabel when SYN segments are processed which results
in an error due to MD5 signature handling.
Tweak the #ifdef MAC to allow additional upcoming changes.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46766
2024-09-26 08:10:01 +02:00
Gleb Smirnoff
a00c3a94bf tcp: remove remnants of 20+ year old disabled code from d912c694ee
Fixes:	90ad2dc287
2024-09-24 14:36:10 -07:00
Michael Tuexen
87fbd9fc7f tcp: remove unused socket option names
These IPPROTO_TCP-level socket option names correspond to socket
options, which are not implemented. So remove them.
Thanks to Peter Lei for suggesting this change.

Reviewed by:		rscheff, thj
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46623
2024-09-20 13:03:53 +02:00