The most significant changes are:
- Use UMA zone instead of own chunk of memory.
- Lock each hash entry separately.
- Expire items "actively" - interrupt method can expire flows
from hash slot, when it searches through it.
- Remove global tailqueue. Make callout thread search through
every hash slot.
- Export datagram is detached from private data and filled. If
it is incomplete, it is attached back. Another thread will
continue working with it.
Lesser, but also important speedups:
- Flows in hash slot are stored in tailqueue. Whenever a flow is
hit, it is moved to the begging, so it can be located quicker.
- When callout thread works with hash slot it bails out if
slot mutex is contested.
to the mbuf. Offset cannot exceed MHLEN bytes. This is currently used to
fix Ethernet header alignment problem on alpha and sparc64. Also change all
users of m_uiotombuf to pass proper offset.
Reviewed by: jmg, sam
Tested by: Sten Spans "sten AT blinkenlights DOT nl"
MFC after: 1 week
protocol. RFCOMM is a SOCK_STREAM protocol not SOCK_SEQPACKET. This was a
serious bug caused by cut-and-paste. I'm surprised it did not bite me before.
Dunce hat goes to me.
MFC after: 3 days
EA bit is set in hdr->length (16-bit length). This currently has no effect
on the rest of the code. It just fixes the debug message.
MFC After: 3 weeks
Functional changes:
- Cut struct source_hookinfo. Just use hook_p pointer.
- Remove "start_now" command. "start" command now requires number of
packets to send as argument. "start" command actually starts sending.
Move the code that actually starts sending from ng_source_rcvmsg()
to ng_source_start().
- Remove check for NG_SOURCE_ACTIVE in ng_source_stop(). We can be called
with flag cleared (see begin of ng_source_intr()).
- If NG_SEND_DATA_ONLY() use log(LOG_DEBUG) instead of printf(). Otherwise
we will *flood* console.
- Add ng_connect_t method, which sends NGM_ETHER_GET_IFNAME command
to "output" hook. Cut ng_source_request_output_ifp(). Refactor
ng_source_store_output_ifp() to use ifunit() and don't muck through
interface list.
- Add "setiface" command, which gives ability to configure interface
in case when ng_source_connect() failed. This happens, when we are not
connected directly to ng_ether(4) node.
- Remove KASSERTs, which can never fire.
- Don't check for M_PKTHDR in rcvdata method. netgraph(4) does this
for us.
Style:
- Assign sc_p = NG_NODE_PRIVATE(node) in declaration, to be
consistent with style of other nodes.
- Sort variables.
- u_intXX -> uintXX.
- Dots at ends of comments.
Sponsored by: Rambler
be pass-thru mode, when traffic is not copied by ng_tee, but passed thru
ng_netflow.
Changes made:
- In ng_netflow_rcvdata() do all necessary pulluping: Ethernet header,
IP header, and TCP/UDP header.
- Pass only pointer to struct ip to ng_netflow_flow_add(). Any TCP/UDP
headers are guaranteed to by after it.
- Merge make_flow_rec() function into ng_netflow_flow_add().
be pass-thru mode, when traffic is not copied by ng_tee, but passed thru
ng_netflow.
Changes made:
- In ng_netflow_rcvdata() do all necessary pulluping: Ethernet header,
IP header, and TCP/UDP header.
- Pass only pointer to struct ip to ng_netflow_flow_add(). Any TCP/UDP
headers are guaranteed to by after it.
- Merge make_flow_rec() function into ng_netflow_flow_add().
precision when IP packet may travel through internet for several seconds.
Also uptime measured in milliseconds overflows every 48+ days.
But we have to do same to keep compatibility with Cisco and flow-tools.
Make a macro MILLIUPTIME, which does overflowable multiplication to 1000.
Requested by: Sergey Ryabin, Oleg Bulyzhin
MFC after: 1 week
its return value and free resources if function returns error. Plug
several memory leaks with this change.
Submitted by: archie
Found by: Coverity Prevent analysis tool
a socket from a regular socket to a listening socket able to accept new
connections. As part of this state transition, solisten() calls into the
protocol to update protocol-layer state. There were several bugs in this
implementation that could result in a race wherein a TCP SYN received
in the interval between the protocol state transition and the shortly
following socket layer transition would result in a panic in the TCP code,
as the socket would be in the TCPS_LISTEN state, but the socket would not
have the SO_ACCEPTCONN flag set.
This change does the following:
- Pushes the socket state transition from the socket layer solisten() to
to socket "library" routines called from the protocol. This permits
the socket routines to be called while holding the protocol mutexes,
preventing a race exposing the incomplete socket state transition to TCP
after the TCP state transition has completed. The check for a socket
layer state transition is performed by solisten_proto_check(), and the
actual transition is performed by solisten_proto().
- Holds the socket lock for the duration of the socket state test and set,
and over the protocol layer state transition, which is now possible as
the socket lock is acquired by the protocol layer, rather than vice
versa. This prevents additional state related races in the socket
layer.
This permits the dual transition of socket layer and protocol layer state
to occur while holding locks for both layers, making the two changes
atomic with respect to one another. Similar changes are likely require
elsewhere in the socket/protocol code.
Reported by: Peter Holm <peter@holm.cc>
Review and fixes from: emax, Antoine Brodin <antoine.brodin@laposte.net>
Philosophical head nod: gnn
- refactor ngd_constructor, so that make_dev() is called without
any locks held, since it mallocs memory with M_WAITOK flag.
- rename global mtx, to have name different to per-node mtx
MFC after: 2 weeks
removes netgraph node and unwraps Ethernet interface.
This gives us ability to unload ng_ether.ko, when all interfaces
are detached, making ng_ether(4) developers happy.
Reviewed by: ru
a definite setup was broken: two ng_ksockets are connected to each other,
connect()ed to different remote hosts, and bind()ed to different local
interfaces. In this case one ng_ksocket is fooled with tag from the other
one.
Put node id into tag. In rcvdata method utilize tag only if it has our
own id inside or id equals zero. The latter case is added to support
packets send by some third, not ng_ksocket node.
MFC after: 1 week
with net byte order. Change byte order to net in ng_ipfw_input(), change
byte order to host before ip_output(), do not change before ip_input().
In collaboration with: ru
The difference is that the callout function installed via the
ng_callout() method is guaranteed to NOT fire after the shutdown
method was run (when a node is marked NGF_INVALID). Also, the
shutdown method and the callout function are guaranteed to NOT
run at the same time, as both require the writer lock. Thus
we can safely ignore a zero return value from ng_uncallout()
(callout_stop()) in shutdown methods, and go on with freeing
the node.
The said revision broke the node shutdown -- ng_bridge_timeout()
is no longer fired after ng_bridge_shutdown() was run, resulting
in a memory leak, dead nodes, and inability to unload the module.
Fix this by cancelling the callout on shutdown, and moving part
responsible for freeing a node resources from ng_bridge_timer()
to ng_bridge_shutdown().
Noticed by: ru
Submitted by: glebius, ru
before entering ng_netflow. In this case it will have not NULL m_pkthdr.rcvif.
However, it will enter ng_iface soon with another index. So let in_ifIndex
value configured by user override m_pkthdr.rcvif.
Reported by: Damir Bikmuhametov
MFC after: 1 week
so we need to acquire Giant in netgraph methods, so that we don't
race with line discipline methods. Remove NET_NEEDS_GIANT.
- Packets coming into node from netgraph are queued in ifqueue
attached to node private data.
- Mutex in struct ifqueue is used to lock not only the queue, but
the whole private data, and tp->t_lsc field.
- tp->t_lsc pointer is used to indicate whether line discipline is
attached to netgraph or not.
- Use FLG_DIE flag to indicate that node may be destroyed.
(This protection doesn't work, and it didn't before. Must be redesigned.)
- Increment ngt_unit atomically, removing mutex.
- Acquire Giant, when executing ngt_start() from netgraph context.
- Acquire Giant, when {,de}registering line discipline.
- Uncomment forcing queue mode on peers hook, since this is reasonable.
- Force queue mode on our hook, to avoid acquiring Giant when coming from
network stack. We may already hold some mutexes at this point.
Cleanups:
- Use callout_pending() instead of our own flag.
- Remove spl(9) calls. Now we can use return() instead of ERROUT().
style(9):
- Sort includes.
- Sparse initializer for struct linesw.
- Remove some empty lines, sort declarations.
Reviewed by: julian, phk
MFC after: 1 month
- Use callout_pending() instead of our own flags.
- Remove home-grown protection of node, which has a scheduled
callout().
- Remove spl(9) calls.
Tested by: bz
This is just a workaround for a know problem with Motorola E1000
phone. Something is wrong with the configuration of L2CAP/RFCOMM
channel. Even though we set L2CAP MTU to 132 bytes (default RFCOMM
MTU 127 + 5 bytes RFCOMM frame header) and the phone accepts it,
the phone still sends oversized L2CAP packets. It appears that the
phone wants to use bigger (667 bytes) RFCOMM frames, but it does
not segment them according to the configured L2CAP MTU. The 667
bytes RFCOMM frame size corresponds to the default L2CAP MTU of
672 bytes (667 + 5 bytes RFCOMM frame header).
This problem only appears if connection was initiated from the
phone. I'm not sure who is at fault here, so for now just put
workaround in place. Quick look at the spec did not reveal any
anwser.
Tested by: Jes < jjess at freebsd dot polarhome dot com >
MFC after: 3 days
- Introduce another ng_ether(4) callback ng_ether_link_state_p, which
is called from if_link_state_change(), every time link is changed.
- In ng_ether_link_state() send netgraph control message notifying
of link state change to a node connected to "lower" hook.
Reviewed by: sam
MFC after: 2 weeks
SI_SUB_INIT_IF but before SI_SUB_DRIVERS. Make Netgraph(4)
framework initialize at SI_SUB_NETGRAPH level.
This does not address the bigger problem: MODULE_DEPEND
does not seem to work when modules are compiled in the
kernel, but it fixes the problem with Netgraph Bluetooth
device drivers reported by a few folks.
PR: i386/69876
Reviewed by: julian, rik, scottl
MFC after: 3 days
- Do not put/remove node references, since this no longer
needed.
- Remove timerActive flag, use callout flags.
- Schedule next callout after doing current one.
Reviewed by: archie
Approved by: julian (mentor)
- Always check that index number passed from userland
is <= NG_NETFLOW_MAXIFACES. [1]
- Increase NG_NETFLOW_MAXIFACES up to 512. [2]
Noticed by: Roman Palagin [1]
Requested by: Yuri Y. Bushmelev [2]
MFC after: 1 week
call net_add_domain(). Calling this function too early (or late) breaks
assertations about the global domains list.
Actually it should be forbidden to call net_add_domain() outside of
SI_SUB_PROTO_DOMAIN completely as there are many places where we traverse
the domains list unprotected, but for now we allow late calls (mostly to
support netgraph). In order to really fix this we have to lock the domains
list in all places or find another way to ensure that we can safely walk the
list while another thread might be adding a new domain.
Spotted by: se
Reviewed by: julian, glebius
PR: kern/73321 (partly)
normal PPP compression, as a workaround for certain (arguably) broken
Linux PPP implementations that can't handle this particular case.
MFC after: 1 week
It means, that node listens to flow control messages from downstreams
and removes link from list of active links whenever a LINK_IS_DOWN message
is received. If LINK_IS_UP message is received, then links is put
back into list of active links.
Approved by: julian (mentor), implicitly
MFC after: 1 week
o Implement some netgraph flow control:
- Whenever status of HDLC heartbeat from pear is timed out,
send NGM_LINK_IS_DOWN message.
- If HDLC link changes status from down to up, send
NGM_LINK_IS_UP message.
Approved by: julian (mentor), implicitly
MFC after: 1 week
out c->c_func, we can't take it after callout_stop(). To take it before
we need to acquire callout_lock, to avoid race. This commit narrows
down area where lock is held, but hack is still present.
This should be redesigned.
Approved by: julian (mentor)
field created for line disciplne drivers private use. Also add NET_NEEDS_GIANT
warning. For whatever reason ng_tty(4) was fixed but ng_h4(4) was not :(
(sorele()/sotryfree()):
- This permits the caller to acquire the accept mutex before the socket
mutex, avoiding sofree() having to drop the socket mutex and re-order,
which could lead to races permitting more than one thread to enter
sofree() after a socket is ready to be free'd.
- This also covers clearing of the so_pcb weak socket reference from
the protocol to the socket, preventing races in clearing and
evaluation of the reference such that sofree() might be called more
than once on the same socket.
This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket. The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets. The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.
RELENG_5_3 candidate.
MFC after: 3 days
Reviewed by: dwhite
Discussed with: gnn, dwhite, green
Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by: Vlad <marchenko at gmail dot com>
List of functional changes:
- Make a single device per single node with a single hook.
This gives us parrallelizm, which can't be achieved on a single
node with many devices/hooks. This also gives us flexibility - we
can play with a particular device node, not affecting others.
- Remove read queue as it is. Use struct ifqueue instead. This change
removes a lot of extra memcpy()ing, m_devget()ting and m_copymem()ming.
In ng_device_receivedata() we enqueue an mbuf and wake readers.
In ngdread() we take one mbuf from qeueue and uiomove() it to
userspace. If no mbuf is present we optionally block. [1]
- In ngdwrite() we create an mbuf from uio using m_uiotombuf().
This is faster then uiomove() into buffer, and then m_copydata(),
and this is much better than huge m_pullup().
- Perform locking of device
- Perform locking of connection list.
- Clear out _rcvmsg method, since it does nothing good yet.
- Implement NGM_DEVICE_GET_DEVNAME message.
- #if 0 ioctl method, while nothing is done here yet.
- Return immediately from ngdwrite() if uio_resid == 0.
List of tidyness changes:
- Introduce device2priv(), to remove cut'n'paste.
- Use MALLOC/FREE, instead of malloc/free.
- Use unit2minor().
- Use UID_ROOT/GID_WHEEL instead of 0/0.
- Define NGD_DEVICE_DEVNAME, use it.
- Use more nice macros for debugging. [2]
- Return Exxx, not -1.
style(9) changes:
- No "#endif" after short block.
- Break long lines.
- Remove extra spaces, add needed spaces.
[1] Obtained from: if_tun.c
[2] Obtained from: ng_pppoe.c
Reviewed by: marks
Approved by: julian (mentor)
MFC after: 1 month
The original idea was to use it for firmware upgrading and similar
operations. In real life almost all Bluetooth USB devices do not
need firmware download. If device does require firmware download
then ugen(4) (or specialized driver like ubtbcmfw(8)) should be
used instead.
MFC after: 3 days
- push all bridge logic from if_ethersubr.c into bridge.c
make bridge_in() return mbuf pointer (or NULL).
- call only bridge_in() from ether_input(), after ng_ether_input()
was optinally called.
- call bridge_in() from ng_ether_rcv_upper().
Long description: http://lists.freebsd.org/mailman/htdig/freebsd-net/2004-May/003881.html
Reported by: Jian-Wei Wang <jwwang at FreeBSD.csie.NCTU.edu.tw>
Tested by: myself, Sergey Lyubka
Reviewed by: sam
Approved by: julian (mentor)
MFC after: 2 months
loss links, and 1 second appeared to be too small for high latency links.
If we will receive more complaints, we should make this parameter configurable.
PR: kern/69536
Approved by: archie, julian (mentor)
MFC after: 3 days