mirror of https://git.FreeBSD.org/src.git synced 2024-12-21 11:13:30 +00:00
169688 Commits

Author SHA1 Message Date
Nathan Whitehorn
629e40e45e Give the kernel pmap lock a different name than user pmap locks. It has
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.

MFC after:	5 days
2012-04-06 16:00:37 +00:00
Luigi Rizzo
87a9e4379e we need to specify a NETMAP_API version or the ioctl() will fail. 2012-04-06 14:26:05 +00:00
Alexander V. Chernikov
9431cc1696 Fix build broken by r233938.
Pointed by:     David Wolfskill <david@catwhisker.org>
Approved by:    kib (mentor)
Pointy hat to:  melifaro
2012-04-06 13:34:19 +00:00
Ed Schouten
6ee5808be7 Properly clear the O_NONBLOCK flag after opening the TTY.
Though we should open the TTY with O_NONBLOCK to prevent rc(8) execution
from potentially stalling, we must not forget to clear the flag later
on, to prevent read(2) calls from failing later on.

This prevented the shell pathname prompt from working properly.

Reported by:	kib
2012-04-06 13:06:01 +00:00
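
The flag handling described in 6ee5808be7 can be sketched in userland C as follows. This is only a minimal illustration of clearing O_NONBLOCK with fcntl(2) once a descriptor has been opened; the helper name clear_nonblock() is hypothetical and is not taken from the commit itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Clear O_NONBLOCK on an already-open descriptor while leaving the
 * other file status flags untouched.  Returns 0 on success, -1 on error.
 * (Hypothetical helper, for illustration only.)
 */
static int
clear_nonblock(int fd)
{
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return (-1);
	if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
		return (-1);
	return (0);
}
```

Reading the flags first and masking out a single bit keeps any other status flags (for example O_APPEND) intact.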
Andriy Gapon
bec9e056eb retrofit Safe Mode loader menu item actions
The menu item is now made completely independent of the ACPI item - most
modern systems seem to require ACPI and become even more "unsafe"
without it.
Safe Mode no longer disables APIC for the same reason.
kbdmux is not disabled as this feature has proven itself stable.

New actions:
- SMP is disabled in the Safe Mode now
- eventtimers are forced to periodic mode (some real and virtual systems
  seem to have problems otherwise)
- geom extra vigorous integrity checking is disabled, this is to
  facilitate migration from previous versions

Possible short term to do:
- make SMP switch a separate menu item
- restore APIC switch as a separate menu item

Longer term to do:
- turn various tweaks into separate menu items in a Safe Mode sub-menu

Please consider adding a safety tweak to Safe Mode when introducing
new major features or changes that may cause instabilities.

Discussed with:	jhb, scottl, Devin Teske
MFC after:	3 weeks (stable/9 only)
2012-04-06 09:36:22 +00:00
Michael Tuexen
17b611fb21 Remove duplicate condition in if statement.
Obtained from: brucec@
MFC after: 3 days
2012-04-06 09:03:02 +00:00
Sergey Kandaurov
e84459d04b Free ballooned pages with the corresponding malloc type.
MFC after:	1 week
2012-04-06 08:13:29 +00:00
Alexander V. Chernikov
51ec1eb70d - Improve performance for writer-only BPF users.
Linux and Solaris (at least OpenSolaris) have PF_PACKET socket families to send
raw ethernet frames. The only FreeBSD interface that can be used to send raw frames
is BPF. As a result, many programs like cdpd, lldpd and various dhcp tools use
BPF only to send data. This leads to a situation where software like cdpd,
run on a high-traffic-volume interface, significantly reduces overall performance
since we have to acquire additional locks for every packet.

Here we add a sysctl that changes BPF behavior in the following way:
If a program opens a BPF socket without explicitly specifying a read filter, we
assume it to be write-only and add it to a special writer-only per-interface list.
This makes bpf_peers_present() return 0, so no additional overhead is introduced.
After a filter is supplied, the descriptor is added to the original per-interface list,
permitting packets to be captured.

Unfortunately, pcap_open_live() sets a catch-all filter itself for the purpose of
setting the snap length.

Fortunately, most programs explicitly set an (even catch-all) filter after that.
tcpdump(1) is a good example.

So a somewhat hackish approach is taken: we upgrade the descriptor only after the second
BIOCSETF is received.

Sysctl is named net.bpf.optimize_writers and is turned off by default.

- While here, document all sysctl variables in bpf.4

Sponsored by Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      4 weeks
2012-04-06 06:55:21 +00:00
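
The writer-only behaviour described in 51ec1eb70d can be modelled roughly as below. This is a simplified userland sketch, not the kernel code: the model_* types and functions are hypothetical, and the per-interface lists are reduced to plain counters. It only shows the rule that a descriptor moves to the regular list after the second BIOCSETF-like call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one BPF descriptor; not the kernel's struct bpf_d. */
struct model_bpf_d {
	int  setf_calls;	/* how many BIOCSETF-like calls were seen */
	bool on_reader_list;	/* false: still on the writer-only list */
};

/* Hypothetical model of the per-interface state. */
struct model_bpf_if {
	int readers;		/* descriptors on the regular list */
	int writers;		/* descriptors on the writer-only list */
};

/* Attach a new descriptor; with the optimization on it starts as a writer. */
static void
model_open(struct model_bpf_if *bp, struct model_bpf_d *d, bool optimize_writers)
{
	d->setf_calls = 0;
	d->on_reader_list = !optimize_writers;
	if (d->on_reader_list)
		bp->readers++;
	else
		bp->writers++;
}

/* Record a filter change; upgrade to the regular list on the second call. */
static void
model_setfilter(struct model_bpf_if *bp, struct model_bpf_d *d)
{
	d->setf_calls++;
	if (!d->on_reader_list && d->setf_calls >= 2) {
		assert(bp->writers > 0);
		bp->writers--;
		bp->readers++;
		d->on_reader_list = true;
	}
}

/* Models bpf_peers_present(): only regular-list descriptors count. */
static bool
model_peers_present(const struct model_bpf_if *bp)
{
	return (bp->readers > 0);
}
```

In this model a program that only ever calls pcap_open_live() (one filter change) stays on the writer-only list, so the packet path sees no listeners.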
Alexander V. Chernikov
e4b3229aa5 - Improve BPF locking model.
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greatly improves performance: in the most common case we need to acquire 1
reader lock instead of 2 mutexes.

- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.

- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is a temporary solution;
struct bpf_if should be made opaque for any external caller.

Found by:       Dmitrij Tejblum <tejblum@yandex-team.ru>
Sponsored by:   Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      3 weeks
2012-04-06 06:53:58 +00:00
Stanislav Sedov
3ef51c5fb9 - Do not use deprecated krb5 error message reporting functions in libtelnet. 2012-04-06 00:03:45 +00:00
Konstantin Belousov
3f4e35f752 Properly handle absent AT_CANARY aux entry.
Submitted by:	Andrey Zonov <andrey zonov org>
MFC after:	3 days
2012-04-05 18:47:54 +00:00
John Baldwin
35818d2e94 Add new ktrace records for the start and end of VM faults. This gives
a pair of records similar to syscall entry and return that a user can
use to determine how long page faults take.  The new ktrace records are
enabled via the 'p' trace type, and are enabled in the default set of
trace points.

Reviewed by:	kib
MFC after:	2 weeks
2012-04-05 17:13:14 +00:00
Pedro F. Giffuni
a90710e961 Fix a typo in GCC affecting calculations with -ffast-math.
The fix is similar to the one applied in GCC-4.3 in
GCCSVN-r117929 under the GPLv2.

Submitted by:	Andrey Simonenko
Reviewed by:	mm
Approved by:	jhb (mentor)
MFC after:	3 days
2012-04-05 15:16:51 +00:00
Andriy Gapon
70542ee01f zfs_ioctl: no need for ddi_copyin/out here because sys_ioctl handles that
On FreeBSD the direct ioctl argument is automatically copied in/out
as necessary by the kernel ioctl entry point.

PR:		kern/164445
Submitted by:	Luis Garces-Erice <lge@ieee.org>
Tested by:	Attila Nagy <bra@fsn.hu>
MFC after:	5 days
2012-04-05 07:59:59 +00:00
Andrey V. Elsukov
59894e4a44 Fix VIMAGE build. 2012-04-05 04:41:06 +00:00
Doug Barton
d0f6280db7 Update to version 9.8.2, the latest from ISC, which contains numerous bug fixes. 2012-04-05 04:29:35 +00:00
David Xu
8931e524bf In sem_post, the field _has_waiters is no longer used, because some
applications destroy the semaphore after sem_wait returns. Just enter
the kernel to wake up sleeping threads, and only update _has_waiters if
it is safe. While here, check if the value exceeds SEM_VALUE_MAX and
return EOVERFLOW if this is true.
2012-04-05 03:05:02 +00:00
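
The overflow check mentioned in 8931e524bf can be illustrated with C11 atomics. This is a hedged sketch rather than the libc implementation: model_sem_post() and MODEL_SEM_VALUE_MAX are hypothetical names, the limit is deliberately tiny, and the kernel wakeup is only indicated by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_SEM_VALUE_MAX 3u	/* hypothetical small limit for illustration */

/*
 * Increment the count unless it is already at the maximum.
 * Returns 0 on success or EOVERFLOW when the limit would be exceeded.
 */
static int
model_sem_post(atomic_uint *count)
{
	unsigned int old;

	assert(count != NULL);
	old = atomic_load(count);
	do {
		if (old >= MODEL_SEM_VALUE_MAX)
			return (EOVERFLOW);
		/* On failure the CAS reloads `old`, so the check is repeated. */
	} while (!atomic_compare_exchange_weak(count, &old, old + 1));
	/* The real code would now enter the kernel to wake sleepers. */
	return (0);
}
```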
David Xu
17ce606321 The umtx operation UMTX_OP_MUTEX_WAKE has a side effect: it accesses
a mutex after a thread has unlocked it, and it even writes data to the mutex
memory to clear the contention bit. There is a race in which other threads
can lock the mutex, unlock it and then destroy it, so the operation should not
write data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 tries to fix the problem. It
requires the thread library to clear the lock word entirely and then
call the WAKE2 operation to check if there is any waiter in the kernel
and to wake up a thread; if necessary, the contention bit is set again
by the operation. This also mitigates the chance that other threads find
the contention bit and try to enter the kernel to compete with each other
to wake up a sleeping thread, which is unnecessary. With this change, the
mutex owner no longer holds the mutex until it reaches a point
where the kernel umtx queue is locked; it releases the mutex as soon as
possible.
Performance is improved when the mutex is heavily contested.  On an Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, which is even better than UMTX_OP_MUTEX_WAKE, which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
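
The unlock sequence described in 17ce606321 can be sketched as a single-threaded model. Everything here is hypothetical (the struct, the MODEL_CONTESTED bit and both functions); the "kernel" is a plain function and no real umtx call is made. The point is the ordering only: the lock word is cleared entirely first, and the wake step touches the word again only when sleepers remain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_CONTESTED 0x80000000u	/* hypothetical contention bit */

struct model_mutex {
	_Atomic uint32_t word;	/* owner id, optionally ORed with MODEL_CONTESTED */
	int waiters;		/* stand-in for the kernel's sleep queue length */
	int wakeups;		/* how many threads the "kernel" has woken */
};

/* Stand-in for the kernel side of a WAKE2-style operation. */
static void
model_kernel_wake2(struct model_mutex *m)
{
	if (m->waiters == 0)
		return;		/* no waiter: do not touch the lock word */
	m->waiters--;
	m->wakeups++;
	if (m->waiters > 0)	/* more sleepers remain: set the bit again */
		atomic_fetch_or(&m->word, MODEL_CONTESTED);
}

/* Userland unlock: release first, then ask the kernel to wake someone. */
static void
model_unlock(struct model_mutex *m)
{
	uint32_t old = atomic_exchange(&m->word, 0);

	assert((old & ~MODEL_CONTESTED) != 0);	/* must have been owned */
	if (old & MODEL_CONTESTED)
		model_kernel_wake2(m);
}
```

Because the word is already zero when the wake step runs, another thread could take the mutex immediately, which matches the commit's note that the owner releases the mutex as soon as possible.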
Doug Barton
6188bf47c8 Add Bv9ARM.pdf to the list of docs to install. 2012-04-04 23:58:41 +00:00
Adrian Chadd
88b3d48316 Implement BAR TX.
A BAR frame must be transmitted when a frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission.  The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.

In order to do this:

* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue need to be completed, whether
  they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
  via a BAR frame;
* Once the BAR frame has been successfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
  is torn down.

This is a first pass that implements the above.  What needs to be done/
tested:

* What happens during say, a channel reset / stuck beacon _and_ BAR
  TX.  It _should_ be correctly buffered and retried once the
  reset has completed.  But if a bgscan occurs (and they shouldn't,
  grr) the BAR frame will be forcibly failed and the aggregation session
  will be torn down.

  Yes, another reason to disable bgscan until I've figured this out.

* There's way too much locking going on here.  I'm going to do a couple
  of further passes of sanitising and refactoring so the (re) locking
  isn't so heavy.  Right now I'm going for correctness, not speed.

* The BAR TX can fail if the hardware TX queue is full.  Since there's
  no "free" space kept for management frames, a full TX queue (from eg
  an iperf test) can race with your ability to allocate ath_buf/mbufs
  and cause issues.  I'll knock this on the head with a subsequent
  commit.

* I need to do some _much_ more thorough testing in hostap mode to ensure
  that many concurrent traffic streams to different end nodes are correctly
  handled.  I'll find and squish whichever bugs show up here.

But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.
2012-04-04 23:45:15 +00:00
Nathan Whitehorn
ae2e7abf3c Fix typo.
Submitted by:	pawel dot worach at gmail dot com
MFC after:	3 days
2012-04-04 23:14:01 +00:00
Doug Barton
42d3eba523 Vendor import of BIND 9.8.2 2012-04-04 23:11:25 +00:00
Adrian Chadd
084c471979 Track and optionally log the actual sync interrupt cause.
These are involved in tracking host interface issues (ie, PCI/PCIe/AHB
interface.)
2012-04-04 22:51:50 +00:00
Adrian Chadd
d6b2002327 Disable the HWQ contents upon a TX queue reset, rather than a TX queue flush.
This is designed to assist in figuring out what the hardware state is
when something like a queue hang has occurred.
2012-04-04 22:24:11 +00:00
Adrian Chadd
d743debcbf Now that I've fixed the BAW TX hangs, disable this verbose debugging
again.
2012-04-04 22:22:50 +00:00
Jung-uk Kim
4db9b6c3c6 Save and restore VGA display memory between suspend and resume. 2012-04-04 22:02:54 +00:00
Adrian Chadd
33d340324a Correctly handle AR_MoreAggr when assembling multi-descriptor final frames.
Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.

Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".

The correct behaviour is:

* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
  in an aggregate.

Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.

I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.
2012-04-04 21:49:49 +00:00
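
The flag rule stated in 33d340324a can be written down as a small model. The structures below are hypothetical stand-ins, not the real ath(4) descriptor layout; the two booleans merely model AR_IsAggr and AR_MoreAggr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TX descriptor; not the real ath(4) layout. */
struct model_desc {
	bool is_aggr;	/* models AR_IsAggr */
	bool more_aggr;	/* models AR_MoreAggr */
};

/* One sub-frame of the aggregate, made up of ndesc descriptors. */
struct model_subframe {
	struct model_desc *descs;
	size_t ndesc;
};

/*
 * Apply the rule to every descriptor of every sub-frame:
 * is_aggr is always set, more_aggr is set for all but the last sub-frame.
 */
static void
model_mark_aggregate(struct model_subframe *sf, size_t nsub)
{
	assert(sf != NULL && nsub > 0);
	for (size_t i = 0; i < nsub; i++) {
		bool last = (i == nsub - 1);

		for (size_t j = 0; j < sf[i].ndesc; j++) {
			sf[i].descs[j].is_aggr = true;
			sf[i].descs[j].more_aggr = !last;
		}
	}
}
```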
Jung-uk Kim
4c7a7f266f Do not copy VESA state buffer if the VBE call has failed for any reason.
Do not unnecessarily clear the state buffer before calling the function.
2012-04-04 21:38:26 +00:00
John Baldwin
6267007aa6 Disable INET6 support in modules when building the LINT-NOINET6 kernel.
Reviewed by:	bz
MFC after:	1 week
2012-04-04 21:31:20 +00:00
Jung-uk Kim
a3ad2822c4 Remove a useless warning. The mode information has been unused for a very long time
and this function may be used with VESA mode since r232069.
2012-04-04 21:19:55 +00:00
Marius Strobl
26a93551be - Const'ify the device lookup-table.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
- Enable support for flow control.
  Tested by: yongari

MFC after:	1 week
2012-04-04 21:09:02 +00:00
Adrian Chadd
2fe1131c7b Add a threadid to the ah_decode API.
This adds the current thread ID to each logged register and mark entry,
allowing for easier debugging of concurrent/overlapping NIC operations.
2012-04-04 20:46:20 +00:00
Marius Strobl
bb16310a44 Refine r233827; as it turns out, controllers with a device ID of 0x0059
can be upgraded to MegaRAID mode, in which case mfi(4) should attach to
these based on the sub-vendor and -device ID instead (not currently done).
Therefore, let mpt_pci_probe() return BUS_PROBE_LOW_PRIORITY.
While here, let mpt_pci_probe() return BUS_PROBE_DEFAULT instead of 0 in
the default case.

MFC after:	3 days
2012-04-04 20:42:45 +00:00
Adrian Chadd
7961e32527 Disable a specific Merlin hardware workaround which may cause hangs on some
PCIe controllers.

Obtained from:	Atheros / Linux
2012-04-04 20:42:32 +00:00
Jung-uk Kim
3df9a2bf07 - Do not include machine/atomic.h. It is no longer necessary since r233768.
- Remove bogus "atomic" macros and a read-only variable from softc.

Reviewed by:	ambrisko
2012-04-04 16:15:40 +00:00
Jaakko Heinonen
63dfd849a1 Add a check for unsupported file flags to ufs_setattr().
Discussed with:	bde
MFC after:	2 weeks
2012-04-04 14:50:21 +00:00
Gleb Smirnoff
07b6b55dce Merge from OpenBSD:
revision 1.173
  date: 2011/11/09 12:36:03;  author: camield;  state: Exp;  lines: +11 -12
  State expire time is a baseline time ("last active") for expiry
  calculations, and does _not_ denote the time when to expire.  So
  it should never be added to (set into the future).

  Try to reconstruct it with an educated guess on state import and
  just set it to the current time on state updates.

  This fixes a problem on pfsync listeners where the expiry time
  could be double the expected value and cause a lot more states
  to linger.
2012-04-04 14:47:59 +00:00
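
The distinction drawn in 07b6b55dce between a baseline time and an expiry time can be sketched as follows. The model_* names are hypothetical, and the import formula (current time minus the lifetime already consumed) is an assumption made for illustration; the commit message only says the baseline is reconstructed "with an educated guess".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a state entry; not pf's struct pf_state. */
struct model_state {
	uint32_t expire;	/* baseline: time the state was last active */
	uint32_t timeout;	/* configured lifetime in seconds */
};

/* A state is due once `timeout` seconds have passed since the baseline. */
static bool
model_is_expired(const struct model_state *st, uint32_t now)
{
	assert(now >= st->expire);	/* the baseline is never in the future */
	return (now - st->expire >= st->timeout);
}

/* On a state update the baseline is simply the current time. */
static void
model_update(struct model_state *st, uint32_t now)
{
	st->expire = now;
}

/* On import, guess the baseline from the remaining lifetime a peer reported. */
static void
model_import(struct model_state *st, uint32_t now, uint32_t remaining)
{
	st->expire = now;
	if (remaining < st->timeout)
		st->expire -= st->timeout - remaining;
}
```

With a 60 second timeout and 20 seconds remaining at import time 1000, the baseline becomes 960 and the state is due at 1020, rather than lingering for a doubled period.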
John Baldwin
20b5d3bf40 Add descriptions after the 'device' line for several NICs to match the
existing style.
2012-04-04 13:49:22 +00:00
John Baldwin
8b39d094aa Fix build on i386. 2012-04-04 11:55:20 +00:00
Gleb Smirnoff
16410af7fa With pf 4.5 import the name of pfsync stats sysctl has changed, thus
'netstat -sp pfsync' got broken. Fix this.
2012-04-04 08:30:32 +00:00
Navdeep Parhar
60a305887a - Remove redundant call to pr_ctloutput from code that handles SO_SETFIB.
- Add a check for errors during copyin while here.

Reviewed by:	julian, bz
MFC after:	2 weeks
2012-04-03 18:38:00 +00:00
Gleb Smirnoff
79f6687df0 Document syncdev, syncpeer and defer keywords for
pfsync(4) interfaces.
2012-04-03 18:11:30 +00:00
Gleb Smirnoff
74e9ff6519 Make it possible to switch pfsync(4) deferral mechanism on/off.
Obtained from:	OpenBSD
2012-04-03 18:10:48 +00:00
Gleb Smirnoff
64484cf630 Since pf 4.5 import pf(4) has a mechanism to defer
forwarding a packet, that creates state, until
pfsync(4) peer acks state addition (or 10 msec
timeout passes).

This is needed for active-active CARP configurations,
which are poorly supported in FreeBSD and arguably
not a good idea at all.

Unfortunately by the time of import this feature in
OpenBSD was turned on, and did not have a switch to
turn it off. This leaked to FreeBSD.

This change makes it possible to turn this feature
off via ioctl() and turns it off by default.

Obtained from:	OpenBSD
2012-04-03 18:09:20 +00:00
Bernhard Schmidt
a2a4a2aa53 Add basic HT channel setup to ieee80211_init_channels(), this will be
used by at least ral(4).

Reviewed by:	ray
2012-04-03 17:48:42 +00:00
Marius Strobl
6fe8be540f Fix probing of SAS1068E with a device ID of 0x0059 after r232411.
Reported by:	infofarmer

MFC after:	3 days
2012-04-03 08:28:43 +00:00
Kirk McKusick
23d6e518da A file cannot be deallocated until its last name has been removed
and it is no longer referenced by a user process. The inode for a
file whose name has been removed, but is still referenced at the
time of a crash will still be allocated in the filesystem, but will
have no references (i.e., it will have no names referencing it
from any directory).

With traditional soft updates these unreferenced inodes will be
found and reclaimed when the background fsck is run. When using
journaled soft updates, the kernel must keep track of these inodes
so that it can find and reclaim them during the cleanup process.
Their existence cannot be stored in the journal as the journal only
handles short-term events, and they may persist for days. So, they
are tracked by keeping them in a linked list whose head pointer is
stored in the superblock. The journal tracks them only until their
linked list pointers have been committed to disk. Part of the cleanup
process involves traversing the list of unreferenced inodes and
reclaiming them.

This bug was triggered when confusion arose in the commit steps
of keeping the unreferenced-inode linked list coherent on disk.
Notably, a race between the link() system call adding a link-count
to a file and the unlink() system call removing a link-count to
the file. Here if the unlink() ran after link() had looked up
the file but before link() had incremented the link-count of the
file, the file's link-count would drop to zero before the link()
incremented it back up to one. If the file was referenced by a
user process, the first transition through zero made it appear
that it should be added to the unreferenced-inode list when in
fact it should not have been added. If the new name created by
link() was deleted within a few seconds (with the file still
referenced by a user process) it would legitimately be a candidate
for addition to the unreferenced-inode list. The result was that
there were two attempts to add the same inode to the unreferenced-inode
list which scrambled the unreferenced-inode list's pointers leading
to a panic. The fix is to detect and avoid the false attempt at
adding it to the unreferenced-inode list by having the link()
system call check to see if the link count is zero before it
increments it. If it is, the link() fails with ENOENT (showing that
it has failed the link()/unlink() race).

While tracking down this bug, we have added additional assertions
to detect the problem sooner and also simplified some of the code.

Reported by:      Kirk Russell
Fix submitted by: Jeff Roberson
Tested by:        Peter Holm
PR:               kern/159971
MFC (to 9 only):  2 weeks
2012-04-02 21:58:37 +00:00
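
The fix described in 23d6e518da, having link() refuse to revive a file whose link count has already reached zero, can be reduced to this small model. The model_inode structure and both functions are hypothetical; the real change lives in the UFS and soft updates code and involves locking that is not shown here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of an inode; only the link count matters here. */
struct model_inode {
	int nlink;
};

/* Add a name: fail with ENOENT if the count already dropped to zero. */
static int
model_link(struct model_inode *ip)
{
	assert(ip->nlink >= 0);
	if (ip->nlink == 0)
		return (ENOENT);	/* lost the race with unlink() */
	ip->nlink++;
	return (0);
}

/* Remove a name. */
static void
model_unlink(struct model_inode *ip)
{
	assert(ip->nlink > 0);
	ip->nlink--;
}
```

Because the count never goes from zero back to one, the inode can only become a candidate for the unreferenced-inode list once.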
Konstantin Belousov
5085ecb75a When a process exits, not only shall the children be reparented to
init, but also the orphans shall be removed from the orphan list,
because the list header is destroyed.

Reported and tested by:	pho
MFC after:	3 days
2012-04-02 19:35:36 +00:00
Konstantin Belousov
2e39e24f64 Add helper function to remove the process from the orphans list and
use it instead of inlined code.

Tested by:	pho
MFC after:	3 days
2012-04-02 19:34:56 +00:00
Doug Ambrisko
7e4dd9e112 Move struct megasas_sge from mfi_ioctl.h to mfivar.h so we can
remove including machine/bus.h.  Add some more mfi_ prefixes to
avoid name space pollution.

This should address the last tinderbox issues.
2012-04-02 19:13:02 +00:00