Summary:
Add the following accessors needed by infiniband drivers:
* if_getaddrlen()
* if_setbroadcastaddr()
* if_resolvemulti()
With these accessors, and additional changes on the drivers' side, an
amd64 kernel can be compiled with `struct ifnet` completely hidden.
Reviewed by: melifaro
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38488
Summary:
Add 2 new APIs for supporting recent mbuf changes:
* 36e0a362ac added the m_snd_tag_alloc() wrapper around
if_snd_tag_alloc(). Push this down to the ifnet level.
* 4d7a1361ef adds the m_rcvif_serialize()/m_rcvif_restore() KPIs to
serialize and restore an ifnet pointer. Add the necessary wrapper to
get the index generation for this.
Reviewed By: jhb
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38340
Summary:
Sometimes it's useful to iterate over all interfaces in the current
VNET, as the linuxulator does in several places.
Unlike other iterators in the IfAPI this propagates any error received
up to the caller, instead of returning a count.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius, melifaro
Differential Revision: https://reviews.freebsd.org/D38348
Summary:
The only user of the ALTQ_IS_ENABLED() in a driver checks against the
ifnet queue. Abstract that all out and present the interface to check
if ALTQ is enabled on the interface.
Sponsored by: Juniper Networks, Inc.
Reviewed By: glebius
Differential Revision: https://reviews.freebsd.org/D38204
Summary:
Firewire is the only device driver that accesses the l2com member, all
other accesses are handled within the netstack itself.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius, melifaro
Differential Revision: https://reviews.freebsd.org/D38203
Summary:
* if_setreassignfn for wireguard.
* if_getinputfn() and if_getstartfn() for various drivers. Use the
function descriptor typedefs for these and the setters.
* vlantrunk accessor. This is used by VLAN_CAPABILITIES() used by
several drivers, as well as directly by mxge(4).
* if_pcp member accessor, used by cxgbe.
* accessors for netmap adapter.
Sponsored by: Juniper Networks, Inc.
Reviewed By: glebius
Differential Revision: https://reviews.freebsd.org/D38202
Summary:
Keep TOEDEV() macro for backwards compatibility, and add a SETTOEDEV()
macro to complement with the new accessors.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38199
Summary:
Port the MAC modules to use the IfAPI APIs as part of this.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38197
As part of the effort to hide the internals of the ifnet struct, convert
the DEBUGNET_SET() macro to use an accessor instead of directly touching
the methods member.
Reviewed by: glebius (older version)
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38105
Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail. Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.
Reviewed by: glebius
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046
Some drivers, like iflib drivers, take a 'context' argument instead of a
ifnet argument, as a single interface may have multiple contexts.
Follow this scheme by passing the context argument down. Most drivers
will likely pass 'ifp' as the context.
Reviewed by: glebius
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38102
Summary:
A common pattern has been to:
if (foo)
caps = IFCAP_FOO;
ifp->if_capenable &= ~IFCAP_FOO;
ifp->if_capenable |= caps;
which in the new order of things would be:
if (foo)
caps = IF_FOO;
if_setcapenablebits(ifp, 0, IFCAP_FOO);
if_setcapenablebits(ifp, caps, 0);
This change streamlines this into:
if (foo)
caps = IF_FOO;
if_setcapenablebits(ifp, caps, IFCAP_FOO);
Reviewed by: melifaro
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37993
Summary:
Add the following accessors to hide some more netstack details:
* if_get/setcapabilities2 and *bits analogue
* if_setdname
* if_getxname
* if_transmit - wrapper for call to ifp->if_transmit()
- This required changing the existing if_transmit to
if_transmit_default, since that's its purpose.
* if_getalloctype
* if_getindex
* if_foreach_addr_type - Like if_foreach_lladdr() but for any address
family type. Used by some drivers to iterate over all AF_INET
addresses.
* if_init() - wrapper for ifp->if_init() call
* if_setinputfn
* if_setsndtagallocfn
* if_togglehwassist
Reviewers: #transport, #network, glebius, melifaro
Reviewed by: #network, melifaro
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37664
The detach of the interface and group were leaving pfi_ifnet memory
behind. Check if the kif still has references, and clean it up if it
doesn't
On interface detach, the group deletion was notified first and then a
change notification was sent. This would recreate the group in the kif
layer. Reorder the change to before the delete.
PR: 257218
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D37569
* Add link-state change notifications by subscribing to ifnet_link_event.
In the Linux netlink model, link state is reported in 2 places: first is
the IFLA_OPERSTATE, which stores state per RFC2863.
The second is an IFF_LOWER_UP interface flag. As many applications rely
on the latter, reserve 1 bit from if_flags, named as IFF_NETLINK_1.
This flag is mapped to IFF_LOWER_UP in the netlink headers. This is done
to avoid making applications think this flag is actually
supported / presented in non-netlink outputs.
* Add flag change notifications, by hooking into rt_ifmsg().
In the netlink model, notification should include the bitmask for the
change flags. Update rt_ifmsg() to include such bitmask.
Differential Revision: https://reviews.freebsd.org/D37597
Add methods for setting and removing the description from the interface,
so the external users can manage it without using ioctl API.
MFC after: 2 weeks
o Assert that every protosw has pr_attach. Now this structure is
only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw. This was suggested
in 1996 by wollman@ (see 7b187005d1), and later reiterated
in 2006 by rwatson@ (see 6fbb9cf860).
o Make struct domain hold a variable sized array of protosw pointers.
For most protocols these pointers are initialized statically.
Those domains that may have loadable protocols have spacers. IPv4
and IPv6 have 8 spacers each (andre@ dff3237ee5).
o For inetsw and inet6sw leave a comment noting that many protosw
entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36232
The old mechanism of getting them via domains/protocols control input
is a relict from the previous century, when nothing like EVENTHANDLER(9)
existed yet. Retire PRC_IFDOWN/PRC_IFUP as netinet was the only one
to use them.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36116
We could insert proxy NDP entries by the ndp command, but the host
with proxy ndp entries had not responded to Neighbor Solicitations.
Change the following points for proxy NDP to work as expected:
* join solicited-node multicast addresses for proxy NDP entries
in order to receive Neighbor Solicitations.
* look up proxy NDP entries not on the routing table but on the
link-level address table when receiving Neighbor Solicitations.
Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35307
MFC after: 2 weeks
When we destroy an interface while the jail containing it is being
destroyed we risk seeing a race between if_vmove() and the destruction
code, which results in us trying to move a destroyed interface.
Protect against this by using the ifnet_detach_sxlock to also covert
if_vmove() (and not just detach).
PR: 262829
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D34704
Supplement ifindex table with generation count and use it to
serialize & restore an ifnet pointer.
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33266
Fun note: git show e6abef0918
(cherry picked from commit e1882428dc)
Now that ifindex is static to if.c we can unvirtualize it. For lifetime
of an ifnet its index never changes. To avoid leaking foreign interfaces
the net.link.generic.system.ifcount sysctl and the ifnet_byindex() KPI
filter their returned value on curvnet. Since if_vmove() no longer
changes the if_index, inline ifindex_alloc() and ifindex_free() into
if_alloc() and if_free() respectively.
API wise the only change is that now minimum interface index can be
greater than 1. The holes in interface indexes were always allowed.
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33672
(cherry picked from commit 91f44749c6)
This reverts commit 91f44749c6.
Devirtualization of V_if_index and V_ifindex_table was rushed into
the tree lacking proper context, discussion, and declaration of intent,
so I'm backing it out as harmful to VNET on the following grounds:
1) The change repurposed the decades-old and stable if_index KBI for
new, unclear goals which were omitted from the commit note.
2) The change opened up a new resource exhaustion vector where any vnet
could starve the system of ifnet indices, including vnet0.
3) To circumvent the newly introduced problem of separating ifnets
belonging to different vnets from the globalized ifindex_table, the
author introduced sysctl_ifcount() which does a linear traversal over
the (potentially huge) global ifnet list just to return a simple upper
bound on existing ifnet indices.
4) The change effectively led to nonuniform ifnet index allocation
among vnets.
5) The commit note clearly stated that the patch changed the implicit
if_index ABI contract where ifnet indices were assumed to be starting
from one. The commit note also included a correct observation that
holes in interface indices were always allowed, but failed to declare
that the userland-observable ifindex tables could now include huge
empty spans even under modest operating conditions.
6) The author had an earlier proposal in the works which did not
affect per-vnet ifnet lists (D33265) but which he abandoned without
providing the rationale behind his decision to do so, at the expense
of sacrificing the vnet isolation contract and if_index ABI / KBI.
Furthermore, the author agreed to back out his changes himself and
to follow up with a proposal for a less intrusive alternative, but
later silently declined to act. Therefore, I decided to resolve the
status-quo by backing this out myself. This in no way precludes a
future proposal aiming to mitigate ifnet-removal related system
crashes or panics to be accepted, provided it would not unnecessarily
compromise the goal of as strict as possible isolation between vnets.
Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
This reverts commit 703e533da5.
Revert "ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif"
This reverts commit e1882428dc.
Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
Supplement ifindex table with generation count and use it to
serialize & restore an ifnet pointer.
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33266
Fun note: git show e6abef0918
Now that ifindex is static to if.c we can unvirtualize it. For lifetime
of an ifnet its index never changes. To avoid leaking foreign interfaces
the net.link.generic.system.ifcount sysctl and the ifnet_byindex() KPI
filter their returned value on curvnet. Since if_vmove() no longer
changes the if_index, inline ifindex_alloc() and ifindex_free() into
if_alloc() and if_free() respectively.
API wise the only change is that now minimum interface index can be
greater than 1. The holes in interface indexes were always allowed.
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33672
* Do a single call into if_clone.c instead of two. The cloner
can't disappear since the interface sits on its list.
* Make restoration smarter - check that cloner with same name
exists in the new vnet.
Differential revision: https://reviews.freebsd.org/D33941
In 4f6c66cc9c, ifa_ifwithnet() was changed to no longer
ifa_ref() the returned ifaddr, and instead the caller was required
to stay in the net_epoch for as long as they wanted the ifaddr
to remain valid. However, this missed the case where an AF_LINK
lookup would call ifaddr_byindex(), which still does ifa_ref()
the ifaddr. This would cause a refcount leak.
Fix this by inlining the relevant parts of ifaddr_byindex() here,
with the ifa_ref() call removed. This also avoids an unnecessary
entry and exit from the net_epoch for this case.
I've audited all in-tree consumers of ifa_ifwithnet() that could
possibly perform an AF_LINK lookup and confirmed that none of them
will expect the ifaddr to have a reference that they need to
release.
MFC after: 2 months
Sponsored by: Dell Inc
Differential Revision: https://reviews.freebsd.org/D28705
Reviewed by: melifaro
This requires moving net.link.generic sysctl declaration from if_mib.c
to if.c. Ideally if_mib.c needs just to be merged to if.c, but they
have different license texts.
Differential revision: https://reviews.freebsd.org/D33263
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them
in epoch. Most of the code touched remains unsafe, as the returned
pointer is being used after epoch exit. Mark that with a comment.
Validate the index argument inside the function, reducing argument
validation requirement from the callers and making V_if_index
private to if.c.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D33263
Now it is possible to just merge all this complexity into single
linear function. Note that IFNET_WLOCK() is a sleepable lock, so
we can M_WAITOK and epoch_wait_preempt().
Reviewed by: melifaro, bz, kp
Differential revision: https://reviews.freebsd.org/D33262
So let's just call malloc() directly. This also avoids hidden
doubling of default V_if_indexlim.
Reviewed by: melifaro, bz, kp
Differential revision: https://reviews.freebsd.org/D33261
Now that if_alloc_domain() never fails and actually doesn't
expose ifnet to outside we can eliminate IFNET_HOLD and two
step index allocation.
Reviewed by: kp
Differential revision: https://reviews.freebsd.org/D33259
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.
A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
The last consumer of if_com_alloc() is firewire. It never fails
to allocate. Most likely the if_com_alloc() KPI will go away
together with if_fwip(), less likely new consumers of if_com_alloc()
will be added, but they would need to follow the no fail KPI.