1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-17 10:26:15 +00:00
Commit Graph

46971 Commits

Author SHA1 Message Date
Maxime Henrion
0f1db1d60f Use the device sysctl tree instead of rolling our own. Some of the
sysctls were global (hw.fxp_rnr and hw.fxp_noflow), all of them are
now per-device.  Sample output of "sysctl dev.fxp0" with this patch,
with the standard %foo nodes removed :

dev.fxp0.int_delay: 1000
dev.fxp0.bundle_max: 6
dev.fxp0.rnr: 0
dev.fxp0.noflow: 0
2004-06-02 22:52:18 +00:00
Maxime Henrion
2e34ae7a26 As discussed on arch@, flatten the device sysctl tree to make it
more convenient to deal with.  The notion of hierarchy is however
preserved by adding a new %parent node.
2004-06-02 22:43:35 +00:00
Christian Weisgerber
16b4a34316 Add helper functions to calculate the standard ethernet CRC in
little/big endian fashion, so that network drivers can just reference
the standard implementation and don't have to bring their own.

As discussed on arch@.

Obtained from:	NetBSD
2004-06-02 21:34:14 +00:00
Scott Long
03b5fe51bf Collapse sync fib locking into normal i/o locking. The former didn't
protect the registers so it was trivially possible for a sync command and
i/o command to fight each other and confuse the controller.  Make the
sync fib alloc/release functions inline and remove the somewhat worthless
AAC_SYNC_LOCK_FORCE flag.  Thanks to Adil Katchi for helping me to track
this down in RELENG_4.
2004-06-02 18:15:48 +00:00
Max Khon
9455fe9a67 Remove extra semicolon. 2004-06-02 18:03:07 +00:00
Hajimu UMEMOTO
3c751c1b6c do not check super user privilege in ip6_savecontrol. It is
meaningless and can even be harmful.

Obtained from:	KAME
MFC after:	3 days
2004-06-02 15:41:18 +00:00
Tim J. Robbins
e4e815db72 Remove a redundant "td = curthread" statement from profclock(). 2004-06-02 12:05:06 +00:00
Poul-Henning Kamp
d7d485798f Some embedded platforms have no keyboard controller. Give up waiting
for it to react after a timeout.
2004-06-02 09:38:32 +00:00
Tim J. Robbins
aa0aa7a113 Move TDF_SA from td_flags to td_pflags (and rename it accordingly)
so that it is no longer necessary to hold sched_lock while
manipulating it.

Reviewed by:	davidxu
2004-06-02 07:52:36 +00:00
Jeff Roberson
dc03363dd8 - Run sched_balance() and sched_balance_groups() from hardclock via
sched_clock() rather than using callouts.  This means we no longer have to
   take the load of the callout thread into consideration while balancing and
   should make the balancing decisions simpler and more accurate.

Tested on:	x86/UP, amd64/SMP
2004-06-02 05:46:48 +00:00
Robert Watson
2658b3bb8e Integrate accept locking from rwatson_netperf, introducing a new
global mutex, accept_mtx, which serializes access to the following
fields across all sockets:

          so_qlen          so_incqlen         so_qstate
          so_comp          so_incomp          so_list
          so_head

While providing only coarse granularity, this approach avoids lock
order issues between sockets by avoiding ownership of the fields
by a specific socket and its per-socket mutexes.

While here, rewrite soclose(), sofree(), soaccept(), and
sonewconn() to add assertions, close additional races and  address
lock order concerns.  In particular:

- Reorganize the optimistic concurrency behavior in accept1() to
  always allocate a file descriptor with falloc() so that if we do
  find a socket, we don't have to encounter the "Oh, there wasn't
  a socket" race that can occur if falloc() sleeps in the current
  code, which broke inbound accept() ordering, not to mention
  requiring backing out socket state changes in a way that raced
  with the protocol level.  We may want to add a lockless read of
  the queue state if polling of empty queues proves to be important
  to optimize.

- In accept1(), soref() the socket while holding the accept lock
  so that the socket cannot be free'd in a race with the protocol
  layer.  Likewise in netgraph equivilents of the accept1() code.

- In sonewconn(), loop waiting for the queue to be small enough to
  insert our new socket once we've committed to inserting it, or
  races can occur that cause the incomplete socket queue to
  overfill.  In the previously implementation, it was sufficient
  to simply tested once since calling soabort() didn't release
  synchronization permitting another thread to insert a socket as
  we discard a previous one.

- In soclose()/sofree()/et al, it is the responsibility of the
  caller to remove a socket from the incomplete connection queue
  before calling soabort(), which prevents soabort() from having
  to walk into the accept socket to release the socket from its
  queue, and avoids races when releasing the accept mutex to enter
  soabort(), permitting soabort() to avoid lock ordering issues
  with the caller.

- Generally cluster accept queue related operations together
  throughout these functions in order to facilitate locking.

Annotate new locking in socketvar.h.
2004-06-02 04:15:39 +00:00
Robert Watson
f3d055b6de Rather than assert f_type==DTYPE_VNODE, conditionally perform the
file lock release based on f_type==DTYPE_VNODE.  vn_closefile() is
used by non-vnode types as well (fifo).
2004-06-01 23:36:47 +00:00
Bill Paul
a1b1f3821d Explicitly #include <sys/module.h> in these files too (they use
MODULE_DEPEND()).
2004-06-01 23:27:36 +00:00
Bill Paul
8c2dd02b27 Explicitly #include <sys/module.h> instead of depending on <sys/kernel.h>
to do it for us.
2004-06-01 23:24:17 +00:00
Poul-Henning Kamp
9c01a318d2 A major overhaul of the nmdm(4) driver:
It was based on the pty(4) driver which as a tty side an a non-tty side.

Nmdm(4) seems to have inherited two symmetric sides from pty but
unfortunately they are not quite ttys.  Running a getty one one
side and tip on the other failed to produce NL->CRNL mapping for
instance.

Rip out the basically bogus cdevsw->{read,write} functions and rely
on ttyread() and ttywrite() which does the same thing.

Use taskqueue_swi_giant to run a task for either side to do what
needs to be done.  (Direct calling is not an option as it leads to
recursion.)  Trigger the task from the t_oproc and t_stop methods.

Default the ports to not ECHO.  Since we neither rate limiting nor
emulation, two ports echoing each other is a really bad idea, which
can only be properly mitigated by rate limiting, rate emulation or
intelligent detection.  Rate emulation would be a neat feature.

Ditch the modem-line emulation, if needed for some app, it needs
to be thought much more about how it interacts with the open/close
logic.
2004-06-01 22:53:00 +00:00
John Baldwin
c9702514c0 - Add a function ioapic_program_intpin() that completely programs an I/O
APIC interrupt pin based on the settings in the corresponding interrupt
  source structure.
- Use ioapic_program_intpin() in place of manual frobbing of the intpin
  configuration in ioapic_program_destination() and ioapic_register().
- Use ioapic_program_intpin() to implement suspend/resume support for I/O
  APICs.
2004-06-01 20:28:42 +00:00
Joerg Wunsch
c7cfc3b129 Add SVR4-compatible VTOC-style elements to the Sun label. The
FreeBSD kernel doesn't use them but sunlabel(8) shortly will,
and both these files are used by sunlabel(8).
2004-06-01 20:18:25 +00:00
John Baldwin
4468ab0a61 Allow the pir0 device add to fail since pir0 may already exist. This should
fix the panics in device_set_ivars() that people were seeing on boxes with
multiple Host-PCI bridges but not using ACPI.
2004-06-01 19:51:29 +00:00
John Baldwin
d5c5dcf762 Fix legacy_add_child() to properly handle the case where
device_add_child_ordered() fails (due to a duplicate device add for
example) and properly cleanup and return NULL.
2004-06-01 19:50:42 +00:00
John Baldwin
405747db0b Use the local APIC ID rather than the ACPI Processor ID to index the array
of CPUs since local APIC IDs are bounded but ACPI IDs are not bounded.
2004-06-01 19:49:38 +00:00
Robert Watson
302b450136 Replace current locking comments for struct socket/struct sockbuf
with new ones.  Annotate constant-after-creation fields as such.  The
comments describe a number of locks that are not yet merged.
2004-06-01 19:33:06 +00:00
Poul-Henning Kamp
4b5f407579 Remove unused variable. 2004-06-01 19:02:51 +00:00
Robert Watson
948a4734ed Add GIANT_REQUIRED to kqueue_close(), since kqueue currently requires
Giant.
2004-06-01 18:05:41 +00:00
Robert Watson
63732dce22 Push the VOP_ADVLOCK() call to release advisory locks on vnode file
descriptors out of fdrop_locked() and into vn_closefile().  This
removes all knowledge of vnodes from fdrop_locked(), since the lock
behavior was specific to vnodes.  This also removes the specific
requirement for Giant in fdrop_locked(), it's now only required by
code that it calls into.

Add GIANT_REQUIRED to vn_closefile() since VFS requires Giant.
2004-06-01 18:03:20 +00:00
Bosko Milekic
6bc72ab95a Fix a couple of bugs in the mbuf and packet ctors. In the latter case,
nextpkt within the m_hdr was not being initialized to NULL for
!M_PKTHDR cases.  *Maybe* this will fix weird socket buffer
inconsistency panics, but we'll see.
2004-06-01 16:17:10 +00:00
Scott Long
614c22b2a2 Commit the correct version of the patch from last night. This fixes an
immediate panic when doing any i/o, and it closes a completion race.
2004-06-01 15:50:11 +00:00
Poul-Henning Kamp
138fbf675a Gainfully employ the new ttyioctl in the trivial cases. 2004-06-01 13:49:28 +00:00
Poul-Henning Kamp
3a95025ffc Introduce a ttyioctl() cdevsw default function. 2004-06-01 13:39:02 +00:00
Ruslan Ermilov
58acf05ade Removed a leftover from the previous change.
Submitted by:	Gleb Smirnoff
2004-06-01 13:15:32 +00:00
Søren Schmidt
86e711a393 When waiting for drive to become ready, reinit the request params as they
might get trashed by autosensing.
2004-06-01 12:28:45 +00:00
Søren Schmidt
c83c43b66e Use the right cmd+errorcode if we are in autosense/not. 2004-06-01 12:26:08 +00:00
Poul-Henning Kamp
be9bd88238 There is no need to explicitly call the stop function. In all likelyhood
->l_close() did it and ttyclose certainly will.
2004-06-01 11:57:15 +00:00
Poul-Henning Kamp
847dc1fe88 shift the four cdevsw functions for ttys to sys/conf.h and prototype
them with the correct typedef.
2004-06-01 11:56:04 +00:00
Poul-Henning Kamp
a1cda79464 There is no need to explicitly call ttwakeup() and ttwwakeup() after
ttyclose() has been called.  It's already been done once by ttyclose,
and probably once by the line-discipline too.
2004-06-01 11:38:06 +00:00
Søren Schmidt
92b3fb2908 Only set and report error if not set already. 2004-06-01 11:37:24 +00:00
Søren Schmidt
541cd509d3 Dont retry on devices that left the system.
Ignore "fake" devices that has 0x7f status.
2004-06-01 11:34:46 +00:00
Poul-Henning Kamp
bda4474a59 ttyclose() increments t_gen. Remove redundant increments in drivers. 2004-06-01 10:15:56 +00:00
Don Lewis
38665043ab Whitespace correction - #define should be followed by a tab. 2004-06-01 08:59:03 +00:00
Seigo Tanimura
a5dc42def9 Axe the old midi drivers and framework. matk has developed a new
module-friendly midi subsystem to be merged soon.
2004-06-01 06:22:59 +00:00
Scott Long
397fa34f51 Collapse aac_map_command() into aac_startio(). Check the AAC_QUEUE_FRZN in
every iteration of aac_startio().  This ensures that a command that is
deferred for lack of resources doesn't immediately get retried in the
aac_startio() loop.  This avoids an almost certain livelock.
2004-06-01 05:32:26 +00:00
Robert Watson
d087080c1f Add a global mutex, accept_filter_mtx, to protect the global list of
accept filters and prevent read-modify-write races.
2004-06-01 04:08:48 +00:00
Robert Watson
36568179e3 The SS_COMP and SS_INCOMP flags in the so_state field indicate whether
the socket is on an accept queue of a listen socket.  This change
renames the flags to SQ_COMP and SQ_INCOMP, and moves them to a new
state field on the socket, so_qstate, as the locking for these flags
is substantially different for the locking on the remainder of the
flags in so_state.
2004-06-01 02:42:56 +00:00
Bosko Milekic
b83e441b9f Fix a comment above uma_zsecond_create(), describing its arguments.
It doesn't take 'align' and 'flags' but 'master' instead, which is
a reference to the Master Zone, containing the backing Keg.

Pointed out by: Tim Robbins (tjr)
2004-06-01 01:36:26 +00:00
Don Lewis
866046f5a6 Add MSG_NBIO flag option to soreceive() and sosend() that causes
them to behave the same as if the SS_NBIO socket flag had been set
for this call.  The SS_NBIO flag for ordinary sockets is set by
fcntl(fd, F_SETFL, O_NONBLOCK).

Pass the MSG_NBIO flag to the soreceive() and sosend() calls in
fifo_read() and fifo_write() instead of frobbing the SS_NBIO flag
on the underlying socket for each I/O operation.  The O_NONBLOCK
flag is a property of the descriptor, and unlike ordinary sockets,
fifos may be referenced by multiple descriptors.
2004-06-01 01:18:51 +00:00
Nate Lawson
8fb8b5fb27 Remove debugging printf that never triggered because acpi is the first
user of nexus::bus_get_resource.
2004-06-01 01:04:25 +00:00
Max Laier
1fb675e712 "Get rid of the nested include of <sys/module.h> from <sys/kernel.h>" or
better do no longer depend on it.

Requested-by:	phk
Approved-by:	bms(mentor)
2004-05-31 22:48:19 +00:00
Bosko Milekic
099a0e588c Bring in mbuma to replace mballoc.
mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.

Extensions to UMA worth noting:
  - Better layering between slab <-> zone caches; introduce
    Keg structure which splits off slab cache away from the
    zone structure and allows multiple zones to be stacked
    on top of a single Keg (single type of slab cache);
    perhaps we should look into defining a subset API on
    top of the Keg for special use by malloc(9),
    for example.
  - UMA_ZONE_REFCNT zones can now be added, and reference
    counters automagically allocated for them within the end
    of the associated slab structures.  uma_find_refcnt()
    does a kextract to fetch the slab struct reference from
    the underlying page, and lookup the corresponding refcnt.

mbuma things worth noting:
  - integrates mbuf & cluster allocations with extended UMA
    and provides caches for commonly-allocated items; defines
    several zones (two primary, one secondary) and two kegs.
  - change up certain code paths that always used to do:
    m_get() + m_clget() to instead just use m_getcl() and
    try to take advantage of the newly defined secondary
    Packet zone.
  - netstat(1) and systat(1) quickly hacked up to do basic
    stat reporting but additional stats work needs to be
    done once some other details within UMA have been taken
    care of and it becomes clearer to how stats will work
    within the modified framework.

From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used.  The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.

Additional things worth noting/known issues (READ):
   - One report of 'ips' (ServeRAID) driver acting really
     slow in conjunction with mbuma.  Need more data.
     Latest report is that ips is equally sucking with
     and without mbuma.
   - Giant leak in NFS code sometimes occurs, can't
     reproduce but currently analyzing; brueffer is
     able to reproduce but THIS IS NOT an mbuma-specific
     problem and currently occurs even WITHOUT mbuma.
   - Issues in network locking: there is at least one
     code path in the rip code where one or more locks
     are acquired and we end up in m_prepend() with
     M_WAITOK, which causes WITNESS to whine from within
     UMA.  Current temporary solution: force all UMA
     allocations to be M_NOWAIT from within UMA for now
     to avoid deadlocks unless WITNESS is defined and we
     can determine with certainty that we're not holding
     any locks when we're M_WAITOK.
   - I've seen at least one weird socketbuffer empty-but-
     mbuf-still-attached panic.  I don't believe this
     to be related to mbuma but please keep your eyes
     open, turn on debugging, and capture crash dumps.

This change removes more code than it adds.

A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.

Testing and Debugging:
    rwatson,
    brueffer,
    Ketrien I. Saihr-Kesenchedra,
    ...
Reviewed by: Lots of people (for different parts)
2004-05-31 21:46:06 +00:00
Robert Watson
e79962dbce Assert Giant in vn_start_write() and vn_finished_write(). 2004-05-31 20:56:10 +00:00
Bosko Milekic
d1fd2228b8 Giant wasn't dropped here if we have to return EBUSY. This is bad. 2004-05-31 20:21:06 +00:00
Robert Watson
69af1dccdc Release NFS subsystem lock and acquire Giant when calling into
vn_start_write().
2004-05-31 19:08:22 +00:00