When we handle a packet via route-to (i.e. pf_route6()) we still need to
verify the MTU. However, we only run that check in the forwarding case.
Set the PFIL_FWD tag when running the pf_test6(PF_OUT) check from
pf_route6(). We are in fact forwarding, so should call the test function
as such. This will cause us to run the MTU check, and generate an ICMP6
packet-too-big error when required.
See also: 54c62e3e5d
See also: f1c0030bb0
See also: https://redmine.pfsense.org/issues/14290
Sponsored by: Rubicon Communications, LLC ("Netgate")
When knob is zero, intent is that no SIGSYS signals are delivered.
Comparing zero to zero does not test much, we should compare the count
of delivered SIGSYSs to zero.
Reviewed by: dchagin, imp
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44077
The test needs to be performed in a new process that was forked with
RFCFDG flag. The will guarantee that the table will start to grow from 20
file descriptors, no matter what kyua(1) or a bare shell was doing before
executing this test. This should fix repetitive test runs from a shell
as well as failures with kyua(1) in some environments.
This reverts commit fa6a02f50e.
It makes the test less probable to fail, but it doesn't fix the
root issue - that on entry the parent process may have already
a large file descriptor table.
This should fix the test failing on some machines/conditions/runs. This
won't fix failures in standalone run, but should fix kyua(1) runs.
Currently with standalone run it will usually fail because the 40-sized
allocation is skipped (see details below).
This matches what forking test does: open 128 files in the parent and 128
in the child. There should actually be no difference where and when the
files are open, but let's mimic the forking test, and open more files in
the spawned thread. Also opening from two different contexts adds a bit
more entropy to the test.
What the test does it checks that fdgrowtable() has been called at least
three tmes for the test process, and the old tables are still on the free
list as long as other execution contexts exist. Under kyua(1) control the
first call grows the table from 20 to 40, but the original table of 20 is
an embedded one, thus is not put on the free list. Passing 40 open files
the table grows to 128 and first old table lands on the free list. Passing
128 open file the table grows to 256 and a second old table lands on the
free list. After that the test would pass. The threaded test was one
open file off before this fix sometimes.
This test runs several scenarios when sleep(9) on a listen(2)ing socket is
interrupted by shutdown(2) or by close(2). What should happen in that
case is not specified, neither is documented. However, there is certain
behavior that we have and this test makes sure it is preserved. There is
software that relies on it, see bug 227259. This test is based on
submission with this bug, bugzilla attachment 192260.
The test checks TCP and unix(4) stream socket behavior and SCTP can be
added easily if needed.
The test passes on FreeBSD 11 to 15. It won't pass on FreeBSD 10,
although the wakeup behavior of shutdown(2) is the same, but it doesn't
return error.
PR: 227259
Capability rights passed to cap_rights_* are not simple bitmaks and
cannot be ORed together in general (although it will work for certain
subsets of rights).
PR: 277057
Fixes: e5e1d9c7b7 ("path_test: Add a test case for...")
Sponsored by: The FreeBSD Foundation
If socket has data interleaved with control it would never allow to read
two pieces of data, neither two pieces of control with one recvmsg(2). In
other words, presence of control makes a SOCK_STREAM socket behave like
SOCK_SEQPACKET, where control marks the records. This is not a documented
or specified behavior, but this is how it worked always for BSD sockets.
If you look closer at it, this actually makes a lot of sense, as if it
were the opposite both the kernel code and an application code would
become way more complex.
The change made recvfd_payload() to return received length and requires
caller to do ATF_REQUIRE() itself. This required a small change to
existing test rights_creds_payload. It also refactors a bit f28532a0f3,
pushing two identical calls out of TEST_PROTO ifdef.
Reviwed by: markj
Differential Revision: https://reviews.freebsd.org/D43724
If a file system's on-disk format does not support st_birthtime, it
isn't clear what value it should return in stat(2). Neither our man
page nor the OpenGroup specifies. But our convention for UFS and
msdosfs is to return { .tv_sec = -1, .tv_nsec = 0 }. fusefs is
different. It returns { .tv_sec = -1, .tv_nsec = -1 }. It's done that
ever since the initial import in SVN r241519.
Most software apparently handles this just fine. It must, because we've
had no complaints. But the Rust standard library will panic when
reading such a timestamp during std::fs::metadata, even if the caller
doesn't care about that particular value. That's a separate bug, and
should be fixed.
Change our invalid value to match msdosfs and ufs, pacifying the Rust
standard library.
PR: 276602
MFC after: 1 week
Sponsored by: Axcient
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D43590
If we apply a route-to to an inbound packet pf_route() may hand that
packet over to dummynet. Dummynet may then delay the packet, and later
re-inject it. This re-injection (in dummynet_send()) needs to know
if the packet was inbound or outbound, to call the correct path for
continued processing.
That's done based on the pf_pdesc we pass along (through
pf_dummynet_route() and pf_pdesc_to_dnflow()). In the case of pf_route()
on inbound packets that may be wrong, because we're called in the input
path, and didn't update pf_pdesc->dir.
This can manifest in issues with fragmented packets. For example, a
fragmented packet will be re-fragmented in pf_route(), and if dummynet
makes different decisions for some of the fragments (that is, it delays
some and allows others to pass through directly) this will break.
The packets that pass through dummynet without delay will be transmitted
correctly (through the ifp->if_output() call in pf_route()), but
the delayed packets will be re-injected in the input path (and not
the output path, as they should be). These packets will pass through
pf_test(PF_IN) as they're tagged PF_MTAG_FLAG_DUMMYNET. However,
this tag is then removed and the packet will be routed and enter
pf_test(PF_OUT) where pf_reassemble() will hold them indefinitely
(as some fragments have been transmitted directly, and will never hit
pf_test(PF_OUT)).
The fix is simple: we must update pf_pfdesc->dir to PF_OUT before we
pass the packet to dummynet.
See also: https://redmine.pfsense.org/issues/15156
Reviewed by: rcm
Sponsored by: Rubicon Communications, LLC ("Netgate")
The TCP implied connect is an artifact left after T/TCP. To my surprise
it still works, hence the existence of this test. Please read this email
first:
https://lists.freebsd.org/pipermail/freebsd-net/2010-August/026311.html
An interesting fact that this test takes 220 - 240 milliseconds to
execute on my Threadripper PRO. Flipping the '#if 0' to '#if 1' in the
test, thus bringing it back to normal connect(2), would speed the test up
a hundred times and I guess all this time is fork+exec of the test.
When we route-to the state should be bound to the route-to interface,
not the default route interface. However, we should only do so for
outbound traffic, because inbound traffic should bind on the arriving
interface, not the one we eventually transmit on.
Explicitly check for this in BOUND_IFACE().
We must also extend pf_find_state(), because subsequent packets within
the established state will attempt to match the original interface, not
the route-to interface.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43589
Otherwise we get spurious test failures when running tests in parallel.
The intent here was to name jails after the tests, but this was done
incorrectly in a couple of places.
MFC after: 1 week
While there are no inherent limits to the number of exporters we're
likely to scale rather badly to very large numbers. There's also no
obvious use case for more than a handful. Limit to 128 exporters to
prevent foot-shooting.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Ensure we print it as such, rather than as a signed integer, as that
would lead to confusion.
Reported by: Jim Pingle <jimp@netgate.com>
Sponsored by: Rubicon Communications, LLC ("Netgate")
pflow(4) now also exports NAT session creation/destruction information.
Test that this works as expected.
While here improve the parsing of ipfix (i.e. pflowproto 10) a bit, and
check more information for the existing state information exports.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43117
Test that we can send netflow information over IPv6.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43115
Test that we actually send netflow messages when configured to do so.
We do not yet inspect the generated netflow messages.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43111
Basic creation, validation and cleanup test for the new pflow interface.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43109
This would previously return 1 if the slave side of the pts was closed
to force an application to read() from it and observe the EOF, but it's
not clear why and this is inconsistent both with how we handle devices
with similar mechanics (like pipes) and also with other kernels, such as
OpenBSD/NetBSD and Linux.
PR: 239604
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D43457
If a copy_file_range operation tries to read from a page that was
previously written via mmap, that page must be flushed first.
MFC after: 2 weeks
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D43451
Ensure that we do the right thing when we reassemble fragmented packet
and send it through a dummynet pipe.
Sponsored by: Rubicon Communications, LLC ("Netgate")
The bug isn't fusefs-specific, but this is the easiest way to reproduce
it.
PR: 276191
MFC after: 1 week
MFC with: bdb46c21a3
Differential Revision: https://reviews.freebsd.org/D43446
Reviewed by: kib
Change sequence of syscalls: instead of "add, delete, check, check"
run sequence "add, check, delete, check". Seems to make more sense.
Do minimal parsing of incoming messages: find the IPv4 address there
and compare it to the original.
Allocate 9000 bytes to match the largest requsted size. Add a check to
prevent the list of sizes and buffer size from getting out of sync
again.
Reviewed by: markj
Found with: CheriBSD
Differential Revision: https://reviews.freebsd.org/D43340
You can't ever safely map a single page and then map a superpage sized
mapping over it with MAP_FIXED. Even in a single-threaded program, ASLR
might mean you land too close to another mapping and on CheriBSD we
don't allow the initial reservation to grow because doing so requires
program changes that are hard to automate.
To avoid this, map the entire region we want to use upfront.
Reviewed by: markj
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D43282
Implement Netlink socket receive buffer as a simple TAILQ of nl_buf's,
same part of struct sockbuf that is used for send buffer already.
This shaves a lot of code and a lot of extra processing. The pcb rids
of the I/O queues as the socket buffer is exactly the queue. The
message writer is simplified a lot, as we now always deal with linear
buf. Notion of different buffer types goes away as way as different
kinds of writers. The only things remaining are: a socket writer and
a group writer.
The impact on the network stack is that we no longer use mbufs, so
a workaround from d187154750 disappears.
Note on message throttling. Now the taskqueue throttling mechanism
needs to look at both socket buffers protected by their respective
locks and on flags in the pcb that are protected by the pcb lock.
There is definitely some room for optimization, but this changes tries
to preserve as much as possible.
Note on new nl_soreceive(). It emulates soreceive_generic(). It
must undergo further optimization, see large comment put in there.
Note on tests/sys/netlink/test_netlink_message_writer.py. This test
boiled down almost to nothing with mbufs removed. However, I left
it with minimal functionality (it basically checks that allocating N
bytes we get N bytes) as it is one of not so many examples of ktest
framework that allows to test KPIs with python.
Note on Linux support. It got much simplier: Netlink message writer
loses notion of Linux support lifetime, it is same regardless of
process ABI. On socket write from Linux process we perform
conversion immediately in nl_receive_message() and on an output
conversion to Linux happens in in nl_send_one(). XXX: both
conversions use M_NOWAIT allocation, which used to be the case
before this change, too.
Reviewed by: melifaro
Differential Revision: https://reviews.freebsd.org/D42524
With upcoming protocol specific socket buffer for Netlink we need some
additional tests that cover basic socket operations, w/o much of actual
Netlink knowledge. Following tests are performed:
1) Overflow. If an application keeps sending messages to the kernel,
but doesn't read out the replies, then first the receive buffer shall
fill and after that further messages from applications will be queued
on the send buffer until it is filled. After that socket operations
should block. However, reading from the receive buffer some data should
wake up the taskqueue and the send buffer should start draining again.
2) Peek & trunc. Check that socket correctly reports amount of readable
data with MSG_PEEK & MSG_TRUNC. This is typical pattern of Netlink apps.
3) Sizes. Check that zero size read doesn't affect the socket, undersize
read will return one truncated message and the message is removed from
the buffer. Check that large buffer will be filled in one read, without
any boundaries imposed by internal representation of the buffer. Check
that any meaningful read is amended with control data if requested so.
Reviewed by: melifaro
Differential Revision: https://reviews.freebsd.org/D42525
After ad874544d9, interface name
validation has been removed, resulting in two unit tests failures.
Drop the failing tests since they no longer apply.
Reported by: markj
Building tests/sys/fs/fusefs with clang 18 results the following
warning:
tests/sys/fs/fusefs/cache.cc:145:14: error: variable length arrays in C++ are a Clang extension [-Werror,-Wvla-cxx-extension]
145 | uint8_t buf[bufsize];
| ^~~~~~~
Because we do not particularly care that this is a clang extension,
suppress the warning.
MFC after: 3 days
When this functionality was moved to libifconfig in 3dfbda3401,
the end of list calculation was modified for unknown reasons, practically
limiting the number of bridge member returned to (about) 102.
This patch changes the calculation back to what it was originally and
adds a unit test to verify it works as expected.
Reported by: Patrick M. Hausen (via ML)
Reviewed by: kp
Approved by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43135
Have a simple Gilbert-Elliott channel model in
dummynet to mimick correlated loss behavior of
realistic environments. This allows simpler testing
of burst-loss environments.
Reviewed By: tuexen, kp, pauamma_gundo.com, #manpages
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D42980
Let the accept functions provide stack memory for protocols to fill it in.
Generic code should provide sockaddr_storage, specialized code may provide
smaller structure.
While rewriting accept(2) make 'addrlen' a true in/out parameter, reporting
required length in case if provided length was insufficient. Our manual
page accept(2) and POSIX don't explicitly require that, but one can read
the text as they do. Linux also does that. Update tests accordingly.
Reviewed by: rscheff, tuexen, zlei, dchagin
Differential Revision: https://reviews.freebsd.org/D42635
If ZFS reports that a disk had at least 8 I/O operations over 60s that
were each delayed by at least 30s (implying a queue depth > 4 or I/O
aggregation, obviously), fault that disk. Disks that respond this
slowly can degrade the entire system's performance.
MFC after: 2 weeks
Sponsored by: Axcient
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D42825
Add tests for adding a route using an interface only (without an IP
address).
Reviewed by: rcm
Approved by: kp (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41436
The ng_socket(4) node already writes more than declared size of the
struct at least in the in ng_getsockaddr(). Make size match size of
a node name. The value is pasted instead of including ng_message.h
into ng_socket.h. This is external API and we want to keep it stable
even if NG_NODESIZ is redefined in a kernel build.
Reviewed by: afedorov
Differential Revision: https://reviews.freebsd.org/D42690
Shell limitation is that a classic function call via $() is a subshell
and atf-sh(3) commands won't work as epxected there. Subsequently,
atf_skip inside a function won't skip a test. The test will fail later.
A working approach is to pass desired variable name as argument to
a function and don't run subshell.
Reviewed by: ngie
Differential Revision: https://reviews.freebsd.org/D42646
Fixes: ea82362219
We require an ANSI-C compiler to build the base system. It's required
that __func__ work. Remove this define since the only known problem
compilers are ancient history (gcc 2.6 from 1994, almost pre-dating the
project). 3rd party code that used this define will now need to provide
it via some other means when using non-ansi-c compilers.
PR: 275221 (exp-run)
Sponsored by: Netflix
Remove __CC_SUPPORTS_INLINE, __CC_SUPPORTS___INLINE__,
__CC_SUPPORTS___FUNC__, __CC_SUPPORTS_WARNING,
__CC_SUPPORTS_VARADIC_XXX, __CC_SUPPORTS_DYNAMIC_ARRAY_INIT: they are
unused. Also remove them from the generated cryptodevh.py script.
Retain, for the moment, __CC_SUPPORTS___INLINE, since it's used in this
file.
PR: 275221 (exp-run)
Sponsored by: Netflix
Remove __GNUCLIKE_BUILTIN_NEXT_ARG, __GNUCLIKE_MATH_BUILTIN_RELOPS,
__GNUCLIKE_BUILTIN_MEMCPY: they are unused. Also remove them from the
generated cryptodevh.py script.
PR: 275221 (exp-run)
Sponsored by: Netflix
Remove __GNUCLIKE_BUILTIN_VARARGS, __GNUCLIKE_BUILTIN_STDARG,
__GNUCLIKE_BUILTIN_VAALIST, __GNUC_VA_LIST_COMPATIBILITY: they are
unused. Also remove them from the generated cryptodevh.py script.
PR: 275221 (exp-run)
Sponsored by: Netflix
__GNUCLIKE_CTOR_SECTION_HANDLING is unused, remove it. Also remove it
from the generated cryptodevh.py script.
PR: 275221 (exp-run)
Sponsored by: Netflix
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.
Sponsored by: Netflix
Replace int with either size_t or ssize_t (depending on context) in
order to support bit strings up to SSIZE_MAX bits in length. Since
some of the arguments that need to change type are pointers, we must
resort to light preprocessor trickery to avoid breaking existing code.
MFC after: 3 weeks
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D42698
When we create a new state for multihomed sctp connections (i.e.
based on INIT/INIT_ACK or ASCONF parameters) we cannot know what
interfaces we'll be seeing that traffic on. Make those states floating,
irrespective of state policy.
MFC after: 1 week
Sponsored by: Orange Business Services
Dummynet re-injects an mbuf with MTAG_IPFW_RULE added, and the same mtag
is used by divert(4) as parameters for packet diversion.
If according to pf rule set a packet should go through dummynet first
and through ipdivert after then mentioned mtag must be removed after
dummynet not to make ipdivert think that this is its input parameters.
At the very beginning ipfw consumes this mtag what means the same
behavior with tag clearing after dummynet.
And after fabf705f4b pf passes parameters to ipdivert using its
personal MTAG_PF_DIVERT mtag.
PR: 274850
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D42609
In my test suite runs I occasionally see shutdown(2) fail with
ECONNRESET rather than ENOTCONN. soshutdown(2) will return ENOTCONN if
the socket has been disconnected (synchronized by the socket lock), and
tcp_usr_shutdown() will return ECONNRESET if the inpcb has been dropped
(synchronized by the inpcb lock). I think it's possible to pass the
first check in soshutdown() but fail the second check in
tcp_usr_shutdown(), so modify the KTLS tests to permit this.
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42277
The initial multihome implementation was a little simplistic, and failed
to create all of the required states. Given a client with IP 1 and 2 and
a server with IP 3 and 4 we end up creating states for 1 - 3 and 2 - 3,
as well as 3 - 1 and 4 - 1, but not for 2 - 4.
Check for this.
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D42362
Don't drop fragmented packets when reassembly is disabled, they can be
matched by rules with "fragment" keyword. Ensure that presence of scrub
rules forces old behaviour.
Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D42355
Add option to send fragmented packets and to properly sniff them by
reassembling them by the sniffer itself.
Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D42354
Resolved conflict between ipfw and pf if both are used and pf wants to
do divert(4) by having separate mtags for pf and ipfw.
Also fix the incorrect 'rulenum' check, which caused the reported loop.
While here add a few test cases to ensure that divert-to works as
expected, even if ipfw is loaded.
divert(4)
PR: 272770
MFC after: 3 weeks
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D42142