1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-15 10:17:20 +00:00
Commit Graph

173244 Commits

Author SHA1 Message Date
Gleb Smirnoff
1d6139c0e4 Make ruleset anchors in pf(4) reentrant. We've got two problems here:
1) Ruleset parser uses a global variable for anchor stack.
2) When processing a wildcard anchor, matching anchors are marked.

To fix the first one:

o Allocate anchor processing stack on stack. To make this allocation
  as small as possible, following measures taken:
  - Maximum stack size reduced from 64 to 32.
  - The struct pf_anchor_stackframe trimmed by one pointer - parent.
    We can always obtain the parent via the rule pointer.
  - When pf_test_rule() calls pf_get_translation(), the former lends
    its stack to the latter, to avoid recursive allocation 32 entries.

The second one appeared more tricky. The code, that marks anchors was
added in OpenBSD rev. 1.516 of pf.c. According to commit log, the idea
is to enable the "quick" keyword on an anchor rule. The feature isn't
documented anywhere. The most obscure part of the 1.516 was that code
examines the "match" mark on a just processed child, which couldn't be
put here by current frame. Since this wasn't documented even in the
commit message and functionality of this is not clear to me, I decided
to drop this examination for now. The rest of 1.516 is redone in a
thread safe manner - the mark isn't put on the anchor itself, but on
current stack frame. To avoid growing stack frame, we utilize LSB
from the rule pointer, relying on kernel malloc(9) returning pointer
aligned addresses.

Discussed with:		dhartmei
2012-09-18 10:54:56 +00:00
Gleb Smirnoff
9e8c4accee - Add $FreeBSD$ to allow modifications to this file.
- Move $OpenBSD$ to a more standard place.
2012-09-18 10:52:46 +00:00
Adrian Chadd
f1bc738ece Implement my first cut at filtered frames in aggregation sessions.
The hardware can optionally "filter" frames if successive transmissions
to a given node (ie, "entry in the keycache") fail.  That way the hardware
can implement a kind of early abort of all the other frames queued to
that destination, rather than simply trying to TX each frame to that
destination (and failing.)

The background:

* If a frame comes back as being filtered, the hardware didn't try to
  TX it (or it was outside the TX burst opportunity.) So, take it as a hint
  that some (but not all, see below) frames to the destination may be
  filtered.

* If the CLRDMASK bit is set in a TX descriptor, the "filter to this
  destination" bit in the keycache entry is cleared and TX to that host
  will be unconditionally retried.

* Right now everything has the CLRDMASK bit set, so filtered frames
  tend to be aggregates and frames that fall outside of the WME burst
  window. It was a bit worse in the past as I had messed up the TX
  flags and CLRDMASK wasn't being set on aggregate frames.

The annoying bits:

* It's easy (ish) to do for aggregate session frames - firstly, they
  can be retried in any order as long as they're within the BAW, and
  there's already a bunch of infrastructure tracking how many frames
  the TID has queued to the hardware (tid->hwq_depth.) However, for
  frames that bypassed the software queue, hwq_depth doesn't get
  incremented. I'll fix that in a subsequent commit.

* For non-aggregate session frames, the only retries that can occur
  are ones for sequence numbers that hvaen't successfully been TXed yet.
  Since there's no re-ordering going on in non-aggregate sessions, if any
  subsequent seqno frames make it out, any filtered frames before that
  seqno need to be dropped.

  Hence why this initially is just for aggregate session frames.

* Since there may be intermediary frames to the destination that
  have CLRDMASK set - for example, any directly dispatched management
  frames to that destination - it's possible that there will be some
  filtered frames followed up by some non filtered frames.  Thus,
  it can't be assumed that once you see a filtered frame for the given
  destination node, all subsequent frames for all TIDs will be filtered.

Ok, with that in mind:

* Create a per-TID filtered frame queue for frames that the hardware
  returns as filtered.

* Track filtered frames per-tid, rather than per-node.  It just makes
  the locking much easier.

* When a filtered frame appears in the completion function, the node
  transitions to "filtered", and all subsequent completed error frames
  (filtered or otherwise) are put on the filtered frame queue.  The TID
  is paused once (during the transition from non-filtered to filtered).

* If a filtered frame retry count exceeds SWMAX_RETRIES, a BAR should be
  sent.

* Once all the frames queued to the hardware for the given filtered frame
  TID, transition back from filtered frame to non-filtered frame, which
  means pre-pending all the filtered frames onto the head of the software
  queue, clearing the filtered frame state and unpausing the TID.

Things get quite hairy around handling completion (aggr, non-aggr, norm,
direct-dispatched frames to a hardware queue); whether it's an "error",
"cleanup" or "BAR" state as well as filtered, which order to do things
in (eg do filtered BEFORE checking for BAR, as the filter completion
may be needed to actually transmit a BAR frame.)

This work has definitely reminded me that I have to tidy up all the locking
and remove some of the ridiculous lock/unlock/lock/unlock going on in the
completion functions.

It's also reminded me that I should really split out TID versus hardware TXQ
locking, even if the underlying locking is still the destination hardware TXQ.

Finally, this is all pre-requisite for working on AP mode power save support
(PS-POLL, uAPSD) as well as improving performance to misbehaving nodes (as
they can transition into filter mode, stopping any TX until everything has
caught up.)

Finally (ish) - this should also be done for non-aggregate sessions as
there are still plenty of laptops and mobile devices that don't speak
802.11n but do wish for stable, useful power save AP support where packets
aren't simply dropped.  This requires software retransmission for
non-aggregate sessions to be implemented, which includes the caveats I've
mentioned above.

Finally finally - this doesn't yet do anything about the CLRDMASK bit in the
TX descriptor.  That's still unconditionally set to 1.  I'll debug the
current work (mostly ensuring I haven't busted up the hairy transitions
between BAR, filtered, error (all frames in an aggregate failing) and
cleanup (when transitioning from aggregation -> non-aggregation.))

Finally finally finally - this is all original work by yours truely, rather
than ported from the Atheros internal driver codebase or Linux ath9k.

Tested:
 * AR9280, AR5416 in STA mode
 * AR9280, AR9130 in hostap mode
 * Lots and lots of iperf testing in very marginal and non-marginal conditions,
   complete with inducing filtered frames + BAR TX conditions.
2012-09-18 10:14:17 +00:00
Gleb Smirnoff
effbcf3842 Fix DIOCNATLOOK: zero key padding before performing lookup. 2012-09-18 09:15:32 +00:00
Andriy Gapon
a80a10b13b loader/i386: replace ugly inb/outb re-implementations with cpufunc.h
Use of __builtin_constant_p in a function that is only called via
a pointer is a good example of how out-of-date it was.

Suggested by:	bde
MFC after:	1 week
2012-09-18 08:53:11 +00:00
Andriy Gapon
154fc7b6c7 acpi_cpu: explicitly notify userland about c-state changes
... after they are committed.
A notification is sent per CPU.

Reviewed by:	imp
MFC after:	3 weeks
2012-09-18 08:17:29 +00:00
Joel Dahl
b93e731fd4 mdoc: remove superfluous paragraph macro. 2012-09-18 08:12:28 +00:00
Andriy Gapon
6ed9e9f32f zfs: correctly calculate dn_bonuslen for saving SAs to disk
Since all attribute values start at 8-byte aligned boundary, we would
previously incorrectly calculate dn_bonuslen if any attribute but the
last had a variable-length value with length not multiple of 8.

Reported by:	Nicolas Rachinsky <fbsd-mas-0@ml.turing-complete.org>
Tested by:	Nicolas Rachinsky <fbsd-mas-0@ml.turing-complete.org>
Reviewed by:	Matthew Ahrens <mahrens@delphix.com> (for upstream)
MFC after:	2 weeks
2012-09-18 08:02:54 +00:00
Andriy Gapon
ea559fb573 zfs: allow both DEBUG and ZFS_DEBUG to be defined on command line
Discussed with:	pjd
MFC after:	10 days
2012-09-18 08:00:56 +00:00
Kevin Lo
9f614af4cf Add missing break 2012-09-18 08:00:43 +00:00
Andriy Gapon
85f5b9aa70 g_disk_flushcache definitely should not be traced under G_T_TOPOLOGY
... use G_T_BIO instead

MFC after:	1 week
2012-09-18 07:57:34 +00:00
Benjamin Kaduk
3a99e819f3 Whitespace cleanup for ipfw.8 -- start each sentence on a new line,
and put a comma after e.g. and i.e..  While here, wrap long lines.

PR:		docs/157452
Approved by:	hrs (mentor)
2012-09-18 02:33:23 +00:00
Kevin Lo
08466b02d4 Remove bogus break statements.
Obtained from:	DragonFly
2012-09-18 02:19:43 +00:00
Adrian Chadd
8122c3163f Add a couple of accessor inline functions for state that exists in net80211.
Obtained from:	Qualcomm Atheros
2012-09-18 01:27:24 +00:00
Attilio Rao
6a612df12c Remove namespace pollution in _rmlock.h by defining rm_queue structure
directly in _rmlock.h and then including it (and its dependencies)
in pcpu.h. This leads to few _*.h headers to be included in pcpu.h
but this is not considered a big deal.

Really pc_rm_queue should be implemented as a dynamic member with
DPCPU interface, but we really want to keep the read acquisition as
fast as possible, so even the further pc_dynamic indirection should be
avoided, and the pollution is dealt like this.

Discussed with:	jhb
MFC after:	1 week
2012-09-18 00:43:15 +00:00
Adrian Chadd
d94f2d7f34 Rename AH_MIMO_MAX_CHAINS to AH_MAX_CHAINS, for compatibility with
internal atheros HAL code.
2012-09-17 23:24:45 +00:00
David E. O'Brien
e279d5688e yes(1) actually comes from V7.
Submitted by:	Simon Gerraty <sjg@juniper.net>
2012-09-17 23:04:15 +00:00
Jim Harris
a724927cf3 Integrate nvmecontrol(8) into the amd64 and i386 builds.
This includes adding NVMe header files to /usr/include/dev/nvme.

Sponsored by:  Intel
2012-09-17 21:41:38 +00:00
Jim Harris
4cb79292f7 Add nvmecontrol(8) source code and beginnings of a man page to the tree.
Sponsored by:		Intel
Contributions from:	Joe Golio/EMC <joseph dot golio at emc dot com>
2012-09-17 21:26:55 +00:00
Jim Harris
978b27047d Add nvme(4) and nvd(4) Makefiles to the tree.
Noticed by:	pluknet
Pointy-hat to:  jimharris
2012-09-17 19:58:02 +00:00
Jim Harris
eb85d44f06 Integrate nvme(4) and nvd(4) into the amd64 and i386 builds.
Sponsored by:	Intel
2012-09-17 19:26:33 +00:00
Jim Harris
bb0ec6b359 This is the first of several commits which will add NVM Express (NVMe)
support to FreeBSD.  A full description of the overall functionality
being added is below.  nvmexpress.org defines NVM Express as "an optimized
register interface, command set and feature set fo PCI Express (PCIe)-based
Solid-State Drives (SSDs)."

This commit adds nvme(4) and nvd(4) driver source code and Makefiles
to the tree.

Full NVMe functionality description:
Add nvme(4) and nvd(4) drivers and nvmecontrol(8) for NVM Express (NVMe)
device support.

There will continue to be ongoing work on NVM Express support, but there
is more than enough to allow for evaluation of pre-production NVM Express
devices as well as soliciting feedback.  Questions and feedback are welcome.

nvme(4) implements NVMe hardware abstraction and is a provider of NVMe
namespaces.  The closest equivalent of an NVMe namespace is a SCSI LUN.
nvd(4) is an NVMe consumer, surfacing NVMe namespaces as GEOM disks.
nvmecontrol(8) is used for NVMe configuration and management.

The following are currently supported:
nvme(4)
- full mandatory NVM command set support
- per-CPU IO queues (enabled by default but configurable)
- per-queue sysctls for statistics and full command/completion queue
     dumps for debugging
- registration API for NVMe namespace consumers
- I/O error handling (except for timeoutsee below)
- compilation switches for support back to stable-7

nvd(4)
- BIO_DELETE and BIO_FLUSH (if supported by controller)
- proper BIO_ORDERED handling

nvmecontrol(8)
- devlist: list NVMe controllers and their namespaces
- identify: display controller or namespace identify data in
      human-readable or hex format
- perftest: quick and dirty performance test to measure raw
      performance of NVMe device without userspace/physio/GEOM
      overhead

The following are still work in progress and will be completed over the
next 3-6 months in rough priority order:
- complete man pages
- firmware download and activation
- asynchronous error requests
- command timeout error handling
- controller resets
- nvmecontrol(8) log page retrieval

This has been primarily tested on amd64, with light testing on i386.  I
would be happy to provide assistance to anyone interested in porting
this to other architectures, but am not currently planning to do this
work myself.  Big-endian and dmamap sync for command/completion queues
are the main areas that would need to be addressed.

The nvme(4) driver currently has references to Chatham, which is an
Intel-developed prototype board which is not fully spec compliant.
These references will all be removed over time.

Sponsored by:        Intel
Contributions from:  Joe Golio/EMC <joseph dot golio at emc dot com>
2012-09-17 19:23:01 +00:00
Hans Petter Selasky
d7dd13419e Add UQ_UMS_IGNORE quirk.
Wrap two long lines.
Some minor spelling correction.

PR:	usb/171721
2012-09-17 19:06:35 +00:00
Hans Petter Selasky
e2524b2ec9 Implement support for USB Audio v2.0. Remove some redundant
USB audio v1.0 debug data, hence userspace tools like lsusb
exist to show this information properly.
2012-09-17 15:43:57 +00:00
John Baldwin
0fca6f8bf5 Add locking to mlx(4) to make it MPSAFE along with some other fixes:
- Use callout(9) rather than timeout(9).
- Add a mutex as an I/O lock that protects the adapter and is used
  for the I/O path.
- Add an sx lock as a configuration lock that protects the relationship
  of configured volumes.
- Freeze the request queue when a DMA load is deferred with EINPROGRESS
  and unfreeze the queue when the DMA callback is invoked.
- Explicitly poll the hardware while waiting to submit a command to
  allow completed commands to free up slots in the command ring.
- Remove driver-wide 'initted' variable from mlx_*_fw_handshake() routines.
  That state should be per-controller instead.  Add it as an argument
  since the first caller knows when it is the first caller.
- Remove explicit bus_space tag/handle and use bus_*() rather than
  bus_space_*().
- Move duplicated PCI device ID probing into a  mlx_pci_match() routine.
- Don't check for PCIM_CMD_MEMEN (the PCI bus will enable that when
  allocating the resource) and use pci_enable_busmaster() rather than
  manipulating the register directly.

Tested by:	no one despite multiple requests (hope it works)
2012-09-17 15:27:30 +00:00
Alexander V. Chernikov
8b3daf895a Make systat(1) accept fractional number of seconds.
Make old alarm(3)-based code use select(2).

MFC after:	2 weeks
2012-09-17 13:36:47 +00:00
Gavin Atkinson
058ede33bf - Add #defines for the bits within the iPCI Express PCIR_EXPRESS_LINK_CTL
register
- Add missing register PCIR_EXPRESS_ROOT_CAP
- Correct a spelling mistake (SLAT -> SLOT) [1]

Reviewed by:	jhb [1]
2012-09-17 12:51:48 +00:00
Kevin Lo
4e4eb12038 Remove unused variable cd.
This variable is initialized but not used.
2012-09-17 09:32:11 +00:00
Andrew Turner
71f5a44d88 Add a kernel config for the Toshiba AC100. The AC100 is an ARM laptop with
an NVidia Tegra 2 CPU.

Tegra 2 needs an external patch to pmap for atomic operations to work. Even
with this the Kernel only gets to the mount root prompt. As such Tegra
support is considered experimental, however adding the kernel config will
help ensure the Tegra code builds.
2012-09-17 09:22:59 +00:00
Mikolaj Golub
16c3b091ae In snmp_hostres, device_map table is used for consistent device table
indexing. When a device has gone it is not removed from device_map
table but just its entry_p field is set to NULL.

So when traversing device_map in disk_OS_get_ATA_disks() and
disk_OS_get_MD_disks() check for entry_p being NULL, otherwise the
bsnmpd crash is possible when a removed map entry is dereferenced.

Before the fix, for disk_OS_get_ATA_disks() the crash could be easily
reproduced running:

  atacontrol detach ata1

The crash was not observed in disk_OS_get_MD_disks() because currently
snmp_hostres does no see md(4) disks: to get the device list it uses
devinfo(3), which does not return md devices.

Reported by:	Miroslav Lachman 000.fbsd quip.cz
MFC after:	1 week
2012-09-17 07:32:53 +00:00
Andrew Turner
a7dc3573ca Add the Tegra2 DTS files. Now our dtc supports including other files use
this support to pull out the SoC specific parts of the dts file.
2012-09-17 07:14:07 +00:00
Adrian Chadd
c6e9cee205 Take credit for the work I've done in this source file. 2012-09-17 03:17:42 +00:00
Matt Jacob
6f7aeb5fe3 Minor correction.
MFC after:	1 day
2012-09-17 02:50:16 +00:00
Matt Jacob
8b382bc2b5 Add some edits to the changed comments so that they make more sense.
MFC after:	1 day
2012-09-17 02:49:02 +00:00
Glen Barber
0cd4fb9206 Update release(7) to reflect changes from r240586 and r240587:
- Remove cvs(1) references.
 - Remove CVS* environment references.
 - Add default entries for the default SVNROOT for the Ports
   Collection, and Documentation Project.
 - While here, update 'SGML-based documentation' to 'XML-based',
   since the recent SGML->XML conversion.
 - Update an example providing SVNROOT environment usage.

Reminded by:	nwhitehorn
MFC After:	1 week
X-MFC-With:	r240586, r240587
2012-09-17 02:45:52 +00:00
Glen Barber
214e898796 Update usage() in comment section.
Remove CVS_* variables in comments.

MFC after:	1 week
X-MFC-With:	r240586
2012-09-17 02:35:00 +00:00
Glen Barber
738257e157 Update generate-release.sh script:
- Use svn for ports and doc trees
 - When installing a binary textproc/docproj package,
   switch pkg_add(1) to pkg(8) [1]

Reviewed by:	nwhitehorn
Approved by:	nwhitehorn
Enhanced by:	glebius [1]
MFC After:	1 week
X-MFC-To:	9-only
2012-09-17 02:23:03 +00:00
Adrian Chadd
de8e4d6436 Add a per-TID filter queue and filter state bits.
These are intended for software TX filtering support, where the NIC
decides there has been too many successive failues to a destination
and will filter it.

Although the filtering is done per-destination (via the keycache),
the state and queue is kept per-TID for now.  It simplifies the overall
architecture design and locking.

Whilst here, add ATH_TID_UNLOCK_ASSERT().
2012-09-17 01:21:55 +00:00
Adrian Chadd
355cae39e9 Add a debug bit for TX destination filtering. 2012-09-17 01:18:47 +00:00
Adrian Chadd
e69db8df7d Improve performance of the Sample rate algorithm on 802.11n networks.
* Don't treat high percentage failures as "sucessive failures" - high
  MCS rates are very picky and will quite happily "fade" from low
  to high failure % and back again within a few seconds.  If they really
  don't work, the aggregate will just plain fail.

* Only sample MCS rates +/- 3 from the current MCS.  Sample will back off
  quite quickly, so there's no need to sample _all_ MCS rates between
  a high MCS rate and MCS0; there may be a lot of them.

* Modify the smoothing rate to be 75% rather than 95% - it's more adaptive
  but it comes with a cost of being slightly less stable at times.
  A per-node, hysterisis behaviour would be nicer.
2012-09-17 01:09:17 +00:00
Edward Tomasz Napierala
d8c4c83314 Remove references to userstat(1) and jailstat(1). Those tools were never
merged from the Perforce branch.  They might be brought in when %CPU limits
go into the tree.

PR:		docs/171240
MFC after:	2 weeks
2012-09-16 23:25:13 +00:00
Adrian Chadd
7b5a343596 Fix a crash bug introduced in the iterate node work recently done.
When resuming, the first VAP is checked for max_aid; however if there
is no VAP, this results in a NULL pointer dereference and kernel
panic.
2012-09-16 22:45:00 +00:00
Joel Dahl
9f668383b7 Remove trailing whitespace. 2012-09-16 21:17:28 +00:00
John-Mark Gurney
3dad5721a6 fix the kernel files to match our standard "option<space><tab>" format
such that when commenting/uncommentting lines, horizontal spacing is
maintained...

Also fix some minor comment formatting to line things up, etc...

Reviewed by:	gnn, imp
MFC after:	1 week
2012-09-16 19:48:48 +00:00
John-Mark Gurney
2b539bdcb3 remove some unnecessary debugging statements, dead code and incorrect
comment...

Reviewed by:	gnn, imp
2012-09-16 19:42:27 +00:00
Tijl Coosemans
71dad5d6ad Optimise i387 trigonometric functions. Replace "andw 0x400,%ax \ jnz" with
"sahf \ jp", "fprem1" with "fprem" and "fstsw %ax" with "fnstsw %ax".
2012-09-16 16:58:49 +00:00
Eitan Adler
65e415b16e Revert 240527:
mntbuf can poit to memory allocated by getmntinfo(3) which can't be freed

PR:		bin/171634
Approved by:	cperciva (implicit)
2012-09-16 16:08:20 +00:00
Dag-Erling Smørgrav
6cbae38f63 Warn about filesystem-based attacks. 2012-09-16 15:22:15 +00:00
Andrey Zonov
5695afded4 - Make truss thread-aware.
Approved by:	kib (mentor)
MFC after:	2 weeks
2012-09-16 14:38:01 +00:00
Alexander V. Chernikov
54202ab3d1 Add section describing existing filtering points.
Document byteorder behavior in AF_INET[6] hooks in new section.

MFC after:	2 weeks
2012-09-16 13:13:02 +00:00