Special thanks to Pavlin Radoslavov <pavlin@icir.org> for testing and
fixing numerous problems.
Sponsored by: FreeBSD Foundation
Reviewed by: Pavlin Radoslavov <pavlin@icir.org>
of bw_meter entries were processed up to one second ahead.
After an unappropriate rescheduling of some of the bw_meter
entries, the upcalls weren't delivered.
* pim_register_prepare() uses the appropriate sw_csum flag to
call ip_fragment() so the IP checksum is computed properly.
* Modify pim_register_prepare() to take care of IP packets that
don't need fragmentation.
* Add-back in_delayed_cksum() to encap_send(), because it seems it
should be there.
Submitted by: Pavlin Radoslavov <pavlin@icir.org>
Disabled by default. To enable it, the new "options PIM" must be
added to the kernel configuration file (in addition to MROUTING):
options MROUTING # Multicast routing
options PIM # Protocol Independent Multicast
2. Add support for advanced multicast API setup/configuration and
extensibility.
3. Add support for kernel-level PIM Register encapsulation.
Disabled by default. Can be enabled by the advanced multicast API.
4. Implement a mechanism for "multicast bandwidth monitoring and upcalls".
Submitted by: Pavlin Radoslavov <pavlin@icir.org>
sysctl:
- sysctlbyname("net.inet.ip.mfctable", ...)
- sysctlbyname("net.inet.ip.viftable", ...)
This change is needed so netstat can use sysctlbyname() to read
the data from those tables.
Otherwise, in some cases "netstat -g" may fail to report the
multicast forwarding information (e.g., if we run a multicast
router on PicoBSD).
* Bug fix: when sending IGMPMSG_WRONGVIF upcall to the multicast
routing daemon, set properly "im->im_vif" to the receiving
incoming interface of the packet that triggered that upcall
rather than to the expected incoming interface of that packet.
* Bug fix: add missing increment of counter "mrtstat.mrts_upcalls"
* Few formatting nits (e.g., replace extra spaces with TABs)
Submitted by: Pavlin Radoslavov <pavlin@icir.org>
of asserting that an mbuf has a packet header. Use it instead of hand-
rolled versions wherever applicable.
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
drain routines are done by swi_net, which allows for better queue control
at some future point. Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.
Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
No functional changes, but:
+ the mrouting module now should behave the same as the compiled-in
version (it did not before, some of the rsvp code was not loaded
properly);
+ netinet/ip_mroute.c is now truly optional;
+ removed some redundant/unused code;
+ changed many instances of '0' to NULL and INADDR_ANY as appropriate;
+ removed several static variables to make the code more SMP-friendly;
+ fixed some minor bugs in the mrouting code (mostly, incorrect return
values from functions).
This commit is also a prerequisite to the addition of support for PIM,
which i would like to put in before DP2 (it does not change any of
the existing APIs, anyways).
Note, in the process we found out that some device drivers fail to
properly handle changes in IFF_ALLMULTI, leading to interesting
behaviour when a multicast router is started. This bug is not
corrected by this commit, and will be fixed with a separate commit.
Detailed changes:
--------------------
netinet/ip_mroute.c all the above.
conf/files make ip_mroute.c optional
net/route.c fix mrt_ioctl hook
netinet/ip_input.c fix ip_mforward hook, move rsvp_input() here
together with other rsvp code, and a couple
of indentation fixes.
netinet/ip_output.c fix ip_mforward and ip_mcast_src hooks
netinet/ip_var.h rsvp function hooks
netinet/raw_ip.c hooks for mrouting and rsvp functions, plus
interface cleanup.
netinet/ip_mroute.h remove an unused and optional field from a struct
Most of the code is from Pavlin Radoslavov and the XORP project
Reviewed by: sam
MFC after: 1 week
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
inpcb parameter to ip_output and ip6_output to allow the IPsec code to
locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version
Reviewed by: julian, luigi (silent), -arch, -net, darren
Approved by: julian, silence from everyone else
Obtained from: openbsd (mostly)
MFC after: 1 month
packets in addition to IPPROTO_IPV4 and IPPROTO_IPV6, explicitly specify
IPPROTO_IPV4 or IPPROTO_IPV6 instead of -1 when calling encap_attach().
MFC after: 28 days
(along with other if_gre changes)
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a
socket buffer. The mutex in the receive buffer also protects the data
in struct socket.
o Determine the lock strategy for each members in struct socket.
o Lock down the following members:
- so_count
- so_options
- so_linger
- so_state
o Remove *_locked() socket APIs. Make the following socket APIs
touching the members above now require a locked socket:
- sodisconnect()
- soisconnected()
- soisconnecting()
- soisdisconnected()
- soisdisconnecting()
- sofree()
- soref()
- sorele()
- sorwakeup()
- sotryfree()
- sowakeup()
- sowwakeup()
Reviewed by: alfred
Requested by: bde
Since locking sigio_lock is usually followed by calling pgsigio(),
move the declaration of sigio_lock and the definitions of SIGIO_*() to
sys/signalvar.h.
While I am here, sort include files alphabetically, where possible.
pointer which will then result in the allocated route's reference
count never being decremented. Just flood ping the localhost and
watch refcnt of the 127.0.0.1 route with netstat(1).
Submitted by: jayanth
Back out ip_output.c,v 1.143 and ip_mroute.c,v 1.69 that allowed
ip_output() to be called with a NULL route pointer. The previous
paragraph shows why this was a bad idea in the first place.
MFC after: 0 days
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386
Reviewed by: bde, jake, tmm
- Use sysctl to export stats
- Use ip_encap.c's encapsulation support
- Update lkm to kld (is 6 years a record for a broken module?)
- Remove some unused cruft
This closes a minor information leak which allows a remote observer to
determine the rate at which the machine is generating packets, since the
default behaviour is to increment a counter for each packet sent.
Reviewed by: -net
Obtained from: OpenBSD
before adding/removing packets from the queue. Also, the if_obytes and
if_omcasts fields should only be manipulated under protection of the mutex.
IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on
the queue. An IF_LOCK macro is provided, as well as the old (mutex-less)
versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which
needs them, but their use is discouraged.
Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF,
which takes care of locking/enqueue, and also statistics updating/start
if necessary.
in favor of the new-style per-vif socket.
this does not affect the behavior of the ISI rsvpd but allows
another rsvp implementation (e.g., KOM rsvp) to take advantage
of the new style for particular sockets while using the old style
for others.
in the future, rsvp supporn should be replaced by more generic
router-alert support.
PR: kern/20984
Submitted by: Martin Karsten <Martin.Karsten@KOM.tu-darmstadt.de>
Reviewed by: kjc
fields between host and network byte order. The details:
o icmp_error() now does not add IP header length. This fixes the problem
when icmp_error() is called from ip_forward(). In this case the ip_len
of the original IP datagram returned with ICMP error was wrong.
o icmp_error() expects all three fields, ip_len, ip_id and ip_off in host
byte order, so DTRT and convert these fields back to network byte order
before sending a message. This fixes the problem described in PR 16240
and PR 20877 (ip_id field was returned in host byte order).
o ip_ttl decrement operation in ip_forward() was moved down to make sure
that it does not corrupt the copy of original IP datagram passed later
to icmp_error().
o A copy of original IP datagram in ip_forward() was made a read-write,
independent copy. This fixes the problem I first reported to Garrett
Wollman and Bill Fenner and later put in audit trail of PR 16240:
ip_output() (not always) converts fields of original datagram to network
byte order, but because copy (mcopy) and its original (m) most likely
share the same mbuf cluster, ip_output()'s manipulations on original
also corrupted the copy.
o ip_output() now expects all three fields, ip_len, ip_off and (what is
significant) ip_id in host byte order. It was a headache for years that
ip_id was handled differently. The only compatibility issue here is the
raw IP socket interface with IP_HDRINCL socket option set and a non-zero
ip_id field, but ip.4 manual page was unclear on whether in this case
ip_id field should be in host or network byte order.
pr_input() routines prototype is also changed to support IPSEC and IPV6
chained protocol headers.
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
state.
Note: this requires a recompilation of netstat (but netstat has been
broken since rev 1.52 of ip_mroute.c anyway)
Obtained from: Significantly based on Steve McCanne's
<mccanne@cs.berkeley.edu> work for BSD/OS
another specialized mbuf type in the process. Also clean up some
of the cruft surrounding IPFW, multicast routing, RSVP, and other
ill-explored corners.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
previous hackery involving struct in_ifaddr and arpcom. Get rid of the
abominable multi_kludge. Update all network interfaces to use the
new machanism. Distressingly few Ethernet drivers program the multicast
filter properly (assuming the hardware has one, which it usually does).
The rest of the code was treating it as a header mbuf, but it was
allocated as a normal mbuf.
This fixes the panic: ip_output no HDR when you have a multicast
tunnel configured.
pulled up already. This bug can cause the first packet from a source
to a group to be corrupted when it is delivered to a process listening
on the mrouter.
Move ipip_input() and rsvp_input() prototypes to ip_var.h
Remove unused prototype for rip_ip_input() from ip_var.h
Remove unused variable *opts from rip_output()
Make a copy of the header of a packet that gets queued due to
lack of forwarding cache entry, so that nobody else can step
on it. Thanks to Mike Karels <karels@bsdi.com> for pointing
this one out.
Garrett,
Here are some patches for the rate limiting code. It should be faster,
and in particular it doesn't leak malloc'd memory any more when rate_limit'ing
a phyint.
It now uses an mbuf chain at each vif, instead of the static queue array.
This means that the MAXQSIZE is now variable per vif (although there is no
interface to change it other than a debugger); this is an area for more
experimentation.
Bill
Submitted by: Bill Fenner <fenner@parc.xerox.com>
case, multicast options are not passed to ip_mforward().) The previous
version had a wrong test, thus causing RSVP mrouters to forward RSVP messages
in violation of the spec.
submitting them as context diffs for the following files:
sys/netinet/ip_mroute.c
sys/netinet/ip_var.h
sys/netinet/raw_ip.c
usr.sbin/mrouted/igmp.c
usr.sbin/mrouted/prune.c
The routine rip_ip_input in raw_ip.c is suggested by Mark Tinguely
(tinguely@plains.nodak.edu). I have been running mrouted with these patches
for over a week and nothing has seemed seriously wrong. It is being run in
two places on our network as a tunnel on one and a subnet querier on the
other. The only problem I have run into is that mrouted on the tunnel must
start up last or the pruning isn't done correctly and multicast packets
flood your subnets.
Submitted by: Soochon Radee <slr@mitre.org>
to something more recent than the ancient 1.2 release contained in
4.4. This code has the following advantages as compared to
previous versions (culled from the README file for the SunOS release):
- True multicast delivery
- Configurable rate-limiting of forwarded multicast traffic on each
physical interface or tunnel, using a token-bucket limiter.
- Simplistic classification of packets for prioritized dropping.
- Administrative scoping of multicast address ranges.
- Faster detection of hosts leaving groups.
- Support for multicast traceroute (code not yet available).
- Support for RSVP, the Resource Reservation Protocol.
What still needs to be done:
- The multicast forwarder needs testing.
- The multicast routing daemon needs to be ported.
- Network interface drivers need to have the `#ifdef MULTICAST' goop ripped
out of them.
- The IGMP code should probably be bogon-tested.
Some notes about the porting process:
In some cases, the Berkeley people decided to incorporate functionality from
later releases of the multicast code, but then had to do things differently.
As a result, if you look at Deering's patches, and then look at
our code, it is not always obvious whether the patch even applies. Let
the reader beware.
I ran ip_mroute.c through several passes of `unifdef' to get rid of
useless grot, and to permanently enable the RSVP support, which we will
include as standard.
Ported by: Garrett Wollman
Submitted by: Steve Deering and Ajit Thyagarajan (among others)