o code reorg (relative to old netbsd-derived code) for future growth
o drivers now specify available channels and rates and 802.11 layer handles
almost all ifmedia actions
o multi-mode support for 11a/b/g devices
o 11g protocol additions (incomplete)
o new element id additions (for other than 11g)
o node/station table redone for proper locking and to eliminate driver
incestuousness
o split device flags and capabilities to reduce confusion and provide room
for expansion
o incomplete power management infrastructure (need to revisit)
o incomplete hooks for software retry
o more...
read permision only required for listing, read/write required for
read/write to registers
fix a possible memory leak
clean up error handling a bit
Reviewed by: silence
the MAC policy modules to improve robustness against C string
bugs and vulnerabilities. Following these revisions, all
string construction of labels for export to userspace (or
elsewhere) is performed using the sbuf API, which prevents
the consumer from having to perform laborious and intricate
pointer and buffer checks. This substantially simplifies
the externalization logic, both at the MAC Framework level,
and in individual policies; this becomes especially useful
when policies export more complex label data, such as with
compartments in Biba and MLS.
Bundled in here are some other minor fixes associated with
externalization: including avoiding malloc while holding the
process mutex in mac_lomac, and hence avoid a failure mode
when printing labels during a downgrade operation due to
the removal of the M_NOWAIT case.
This has been running in the MAC development tree for about
three weeks without problems.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
attributes from objects over vop_setextattr() with a NULL uio; if
the file system doesn't support the vop_rmextattr() method, fall
back to the vop_setextattr() method.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
interface, rather than relying on a NULL uio for the deletion
operation.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
specify what credential to use when authorizing vn_open() and later
write operations, rather than curthread->td_ucred.
When writing KTR traces to an ALQ, specify the credential of the thread
generating the sysctl request.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
releasing the lock only if we are about to sleep (e.g., vm_pager_get_pages()
or vm_pager_has_pages()). If we sleep, we have marked the vm object with
the paging-in-progress flag.
"ipid" options. This feature has been requested by several users.
On passing, fix some minor bugs in the parser. This change is fully
backward compatible so if you have an old /sbin/ipfw and a new
kernel you are not in trouble (but you need to update /sbin/ipfw
if you want to use the new features).
Document the changes in the manpage.
Now you can write things like
ipfw add skipto 1000 iplen 0-500
which some people were asking to give preferential treatment to
short packets.
The 'MFC after' is just set as a reminder, because I still need
to merge the Alpha/Sparc64 fixes for ipfw2 (which unfortunately
change the size of certain kernel structures; not that it matters
a lot since ipfw2 is entirely optional and not the default...)
PR: bin/48015
MFC after: 1 week
policy definition structure; this permits policies to reduce their
number of gratuitous includes for required for entry points they
don't implement. This also facilitates building the MAC Framework
on Darwin.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.
By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.
At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
console, even if there is a TIOCCONS console tty. We were already
doing this after a panic, but it's also useful when entering DDB
for some other reason too.
TIOCCONS console (e.g. xconsole) via a timeout routine instead of
calling into the tty code directly from printf(). This fixes a
number of cases where calling printf() at the wrong time (such as
with locks held) would cause a panic if xconsole is running.
The TIOCCONS message buffer is 8k in size by default, but this can
be changed with the kern.consmsgbuf_size sysctl. By default, messages
are checked for 5 times per second. The timer runs and the buffer
memory remains allocated only at times when a TIOCCONS console is
active.
Discussed on: freebsd-arch
Reorder how the pci probing in handled. before adding devices, check to
see if the slot is a multi-function device to see if we should probe all
the functions.
Original idea by: imp
with a new implementation that has a mostly reentrant "addchar"
routine, supports multiple message buffers in the kernel, and hides
the implementation details from callers.
The new code uses a kind of sequence number to represend the current
read and write positions in the buffer. This approach (suggested
mainly by bde) permits the read and write pointers to be maintained
separately, which reduces the number of atomic operations that are
required. The "mostly reentrant" above refers to the way that while
it is now always safe to have any number of concurrent writers,
readers could see the message buffer after a writer has advanced
the pointers but before it has witten the new character.
Discussed on: freebsd-arch
data access errors when trying to read/write to non-existant PCI devices.
fix the psycho bridge to use peek for probing devices. This no longer
fakes it if the OFW node doesn't exist (and the reg == 0).
Reviewed by: jake, tmm
when serving up more than about 32 active files. For details see
section 6.3 (pg 111) of Daniel Ellard and Margo Seltzer, ``NFS
Tricks and Benchmarking Traps'' in the Proceedings of the Usenix
2003 Freenix Track, June 9-14, 2003 pg 101-114.
Obtained from: Daniel Ellard <ellard@eecs.harvard.edu>
Sponsored by: DARPA & NAI Labs.
causing poor interactive performance while unnice processes were running.
The new scheme still allows nice to have an effect on priority but it is
not as dramatic as the effect of the interactivity score.
- Avoid calling bread() with different sizes on the same blkno.
Although the buffer cache is designed to handle differing size
buffers, it erroneously tries to write the incorrectly-sized buffer
buffer back to disk before reading the correctly-sized one, even
when it's not dirty. This behaviour caused a panic for read-only
NTFS mounts when INVARIANTS was enabled ("bundirty: buffer x still
on queue y"), reported by NAKAJI Hiroyuki.
- Fix a bug in the code handling holes: a variable was incremented
instead of decremented, which could cause an infinite loop.
for vnodes reached through double indirection (i.e. **vpp). This
is worked-around by special-casing the identifier "vpp" (adding one
level of indirection).
The alternative fix mentioned in the PR had required substantial
changes to this script.
In case there are locking violations that had been hidden without
this patch, they may suddenly show up, now ...
This change does not affect code compiled without DEBUG_VFS_LOCKS.
PR: kern/46652
before calling it for bound thread. To avoid this problem, change
thread_schedule_upcall to not put new thread on run queue, let caller
do it, so we can tweak the new thread before setting it to run.
Reported by: pho
threads in the process have already masked the signal, so job control
is delayed. But later a thread unmasking the STOP signal should enable
job control, so in issignal(), scanning all threads in process to see
if we can direct suspend some of them, not just suspend current thread.
we can deadlock because of lock order reversals. This was not
caught because Witness ignores pool mutexes right now.
Diagnosis and help: truckman
Noticed by: pho
"maxproc limit exceeded by uid %i, please see tuning(7) and login.conf(5)."
Which will be triggered whenever a user hits his/her maxproc limit or
the systemwide maxproc limit is reached.
MFC after: 1 week
- Don't require all receivers of ipis to wait for all other receivers,
only that the sender wait for all receivers. This should reduce the
amount of time spent with interrupts disabled, which may be a cause
of ipi timeouts.
Discussed with: tmm
mutexes are supposed to only be used as leaf mutexes, and what appear
to be separate pool mutexes could be aliased together, it is bad idea
for a thread to attempt to hold two pool mutexes at the same time.
Slightly rearrange the code in kern_open() so that FILE_UNLOCK() is
called before calling VOP_GETVOBJECT(), which will grab the v_vnlock
mutex.
drivers that implemnt the i2c bit banging bus interface not have to
recompile iicbb in order to add an attachment for it.
This will mean the bktr and other definitions can go back to their
respective drivers.
- Mark that it cannot handle greater than 4GB of RAM at this time. Fixing
that will come later. Fail any attempts to dump above thati limit.
- If a call to aac_disk_dump() needs to be split into multiple i/o's,
increment the virtual offset after each i/o instead of just dumping the
same offset over and over again.
- Bail out if bus_dmamap_load() returns an error. Error recovery is likely
not possible.
systems to fail more gracefully when a file descriptor exhaustion situation
occurs.
Original patch by: David G. Andersen <dga@lcs.mit.edu>
PR: 45353
MFC after: 1 week
the values that are "reserved", but they are not reserved anywhere else
so I'm assuming this is what they were unreserved for. Unfortunately
some of the values for local syscons types overlap the values used for
sbus adapters elsewhere, so we can't have all the same values.
- Move prototypes for sparc64-specific helper functions from bus.h to
bus_private.h
- Move the method pointers from struct bus_dma_tag into a separate
structure; this saves some memory, and allows to use a single method
table for each busdma backend, so that the bus drivers need no longer
be changed if the methods tables need to be modified.
- Remove the hierarchical tag method lookup. It was never really useful,
since the layering is fixed, and the current implementations do not
need to call into parent implementations anyway. Each tag inherits
its method table pointer and cookie from the parent (or the root tag)
now, and the method wrapper macros directly use the method table
of the tag.
- Add a method table to the non-IOMMU backend, remove unnecessary
prototypes, remove the extra parent tag argument.
- Rename sparc64_dmamem_alloc_map() and sparc64_dmamem_free_map() to
sparc64_dma_alloc_map() and sparc64_dma_free_map(), move them to a
better place and use them for all map allocations and deallocations.
- Add a method table to the iommu backend, and staticize functions,
remove the extra parent tag argument.
- Change the psycho and sbus drivers to just set cookie and method table
in the root tag.
- Miscellaneous small fixes.
redundant paths to the same device.
This class reacts to a label in the first sector of the device,
which is created the following way:
# "0123456789abcdef012345..."
# "<----magic-----><-id-...>
echo "GEOM::FOX someid" | dd of=/dev/da0 conv=sync
NB: Since the fact that multiple disk devices are in fact the same
device is not known to GEOM, the geom taste/spoil process cannot
fully catch all corner cases and this module can therefore be
confused if you do the right wrong things.
NB: The disk level drivers need to do the right thing for this to
be useful, and that is not by definition currently the case.
smbfs_close(). This fixes paging to and from mmap()'d regions of smbfs
files after the descriptor has been closed, and makes thttpd, GNU ld,
and perhaps more things work that depend on being able to do this.
PR: 48291
because the run time exceeds the largest value a signed int can hold.
The real solution involves calculating how far we are over the limit.
To quickly solve this problem we loop removing 1/5th of the current value
until it falls below the limit. The common case requires no passes.
- Emulate lock draining (LK_DRAIN) in null_lock() to avoid deadlocks
when the vnode is being recycled.
- Don't allow null_nodeget() to return a nullfs vnode from the wrong
mount when multiple nullfs's are mounted. It's unclear why these checks
were removed in null_subr.c 1.35, but they are definitely necessary.
Without the checks, trying to unmount a nullfs mount will erroneously
return EBUSY, and forcibly unmounting with -f will cause a panic.
- Bump LOG2_SIZEVNODE up to 8, since vnodes are >256 bytes now. The old
value (7) didn't cause any problems, but made the hash algorithm
suboptimal.
These changes fix nullfs enough that a parallel buildworld succeeds.
Submitted by: tegge (partially; LK_DRAIN)
Tested by: kris
and run time.
- Scale the sleep and run time back via sched_interact_update() in more
places. This is to keep the statistic more accurate.
- Charge a parent one tick for forking a child.
- Add only the run time and not the sleep time to the parents kg when a
thread exits. This allows us to give a penalty for having an expensive
thread exit but does not give a bonus for having an interactive thread
exit.
- Change the SLP_RUN_THROTTLE to limit us to 4/5th and not 1/2.
- Change the SLP_RUN_MAX to two seconds. This keeps bursty interactive
applications like mozilla and openoffice in the interactive range even
through expensive tasks.
- Recalculate the slice after every sleep. This ensures that once a task
has been marked interactive it only has a slice of 1 at the risk of
giving tasks that sleep for a very brief period a longer time slice.
this makes connect act more sensibly in these cases.
PR: 50839
Submitted by: Barney Wolff <barney@pit.databus.com>
Patch delayed by laziness of: silby
MFC after: 1 week
This version of the driver code is compatible with near-release FreeBSD 5.1
kernel/driver interfaces.
modules/Makefile, man page and other bindings to follow shortly, once I get
this part of the check-in right.
Approved by: markm(mentor)
state, as inp_socket will then be NULL. This fixes a panic that occurs when one
tries to bind a port that was previously binded with remaining TIME_WAIT
sockets.
the terminating '\0'. Since the initialisation of rootpath in
libstand/bootp.c may copy junk into the rest of the buffer, it was
possible for the code to find a ':' after the '\0' and do the wrong
thing.
Reviewed by: ps
MFC after: 1 week
resource deallocation back to fifo_close(). This eliminates any
stale data that might be stuck in the socket buffers after all the
readers and writers have closed the fifo.
Tested by: Thorsten Schroeder <ths@katjusha.de>
on friday 13th and without making a universe). This adds struct and
constant definitions for ATM traffic parameters and re-enables the
build of the midway driver.
Tested by: make universe
Previously, any normal I/O on an fdc(4) device would fail with ENXIO
if the device had been opened in non-blocking mode and then closed
prior to the conventional access; that would last until the floppy
disk was ejected and re-inserted to raise the unit attention condition.
Add a clarifying comment.
populated. Apparently, if you use an ehci controller, it's not.
Use usbd_device2interface_handle() to retrieve the interface handle.
NOTE: uaa->iface is populated in the probe routine, so I suspect the
fact that it's NULL in the attach routine is a bug in the ehci driver.
Also, don't depend on the PHY addresses returned by the AXE_CMD_READ_PHYID
command. The address is correct for my LinkSys NIC, but a user has
reported that with a D-Link NIC, the PHYID command returns address 4
while the attached Broadcom PHY is in fact strapped for address 0.
Instead, latch onto the first PHY address that returns valid data
during a readreg operation.
- Add vm page queue locking in certain places that are only needed on
sparc64.
This should make pmap_qenter and pmap_qremove MP-safe.
Discussed with: alc
as should every block device strategy routine.
There was at least one evil consequence of not doing so:
Some errors returned by fdstrategy() could be lost (EAGAIN,
in particular.)
PR: kern/52338 (in the audit-trail)
Discussed with: bde
to access floppy parameters through it.
Note: The DIOCGSECTORSIZE and DIOCGMEDIASIZE handlers withing
fdioctl() couldn't be just moved to below the existing check
for blocking mode because fd->ft can be non-NULL while still
in non-blocking mode (fd->ft can be set with the FD_STYPE ioctl.)
PR: kern/52338
No MFC: Not applicable to STABLE
schedules an upcall. Signal delivering to a bound thread is same as
non-threaded process. This is intended to be used by libpthread to
implement PTHREAD_SCOPE_SYSTEM thread.
2. Simplify kse_release() a bit, remove sleep loop.
ulpt_status() afterwards. This fixes a crash that can occur if a
USB printer is power-cycled when printing is just starting. The
problem is similar to that fixed in revision 1.33, but it is much
less likely to occur.
MFC after: 1 week
panics. Before revision 1.38, we used to just point panicstr at the
format string if panicstr was NULL, but since we now use a static
buffer for the formatted panic message, we have to be careful to
only write to it during the first panic.
Pointed out by: bde
UFS quota implementation. Push some quite broken access control
logic out of ufs_quotactl() into the individual command
implementations in ufs_quota.c; fix that logic. Pass in the thread
argument to any quotactl command that will need to perform access
control.
o quotaon() requires privilege (PRISON_ROOT).
o quotaoff() requires privilege (PRISON_ROOT).
o getquota() requires that:
If the type is USRQUOTA, either the effective uid match the
requested quota ID, that the unprivileged_get_quota flag be
set, or that the thread be privileged (PRISON_ROOT).
If the type is GRPQUOTA, require that either the thread be
a member of the group represented by the requested quota ID,
that the unprivileged_get_quota flag be set, or that the
thread be privileged (PRISON_ROOT).
o setquota() requires privilege (PRISON_ROOT).
o setuse() requires privilege (PRISON_ROOT).
o qsync() requires no special privilege (consistent with what
was present before, but probably not very useful).
Add a new sysctl, security.bsd.unprivileged_get_quota, which when
set to a non-zero value, will permit unprivileged users to query user
quotas with non-matching uids and gids. Set this to 0 by default
to be mostly consistent with the previous behavior (the same for
USRQUOTA, but not for GRPQUOTA).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
from the tree until it is fixed. Since it is an atm driver, it isn't
commonly used so this will not negatively impact too many people.
harti can reconnect it when he resurfaces and corrects the en module
problems. This should allow snapshots to start succeeding again.
Reported by: lots of people
which meant no process would run for longer than 20ms.
- Slightly redo the interactivity scorer. It follows the same algorithm but
in a slightly more correct way. Previously values above half were
incorrect.
- Lower the interactivity threshold to 20. It seems that in testing non-
interactive tasks are hardly ever near there and expensive interactive
tasks can sometimes surpass it. This area needs more testing.
- Remove an unnecessary KTR.
- Fix a case where an idle thread that had an elevated priority due to
priority prop. would be placed back on the idle queue.
- Delay setting NEEDRESCHED until userret() for threads that haad their
priority elevated while in kernel. This gives us the same context switch
optimization as SCHED_4BSD.
- Limit the child's slice to 1 in sched_fork_kse() so we detect its behavior
more quickly.
- Inhert some of the run/slp time from the child in sched_exit_ksegrp().
- Redo some of the priority comparisons so they are more clear.
- Throttle the frequency of sched_pctcpu_update() so that rounding errors
do not make it invalid.
and return NULL.
vinum_scandisk: Don't handle NULL device pointers.
Only look at compatibility partition for i386. This
is a kludge which should go away once I have adequate
documentation for the New World Order.
Together, these fixes remove occasional error messages about
non-existent drives. They may also fix a number of problems that have
been reported without a PR.
PRs: None
directory vnodes use to refer to their constituent vnodes, into
union_dircache_free(). Also s/union_dircache/union_dircache_get/ and
tweak the structure of union_dircache_r().
MFC after: 3 days
to the machine-independent parts of the VM. At the same time, this
introduces vm object locking for the non-i386 platforms.
Two details:
1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set
KSTACK_GUARD_PAGES to 0.
2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().
Devices below may experience a change in geometry.
* Due to a bug, aic(4) never used extended geometry. Changes all drives
>1G to now use extended translation.
* sbp(4) drives exactly 1 GB in size now no longer use extended geometry.
* umass(4) drives exactly 1 GB in size now no longer use extended geometry.
For all other controllers in this commit, this should be a no-op.
Looked over by: scottl
Devices below may experience a change in geometry.
* Due to a bug, aic(4) never used extended geometry. Changes all drives
>1G to now use extended translation.
* sbp(4) drives exactly 1 GB in size now no longer use extended geometry.
* umass(4) drives exactly 1 GB in size now no longer use extended geometry.
For all other controllers in this commit, this should be a no-op.
PR:
Submitted by:
Looked over by: scottl
Approved by:
Obtained from:
MFC after:
in smb_fphelp(): the parent vnode may have already been recycled
since we don't hold a reference to it. Fixes a panic when rebooting
with mdconfig -t vnode devices referring to vnodes on a smbfs mount.
how we registered pccard_intr, it is MPSAFE. However, since we
register the pccard_intr handler with the flags of the ISR we call,
that is the gating factor. We need do nothing specific here.
Prompted by: seeing pccard_intr in a panic.
are the same that those of the kernel in the KLD_MODULE case. If
we ever want to detect that kind of problems, this is not the right
place to do this since every network driver would be affected by
such desynchronisation.
toggle several media options (sonet/sdh, for example) with ifconfig and
to see the carrier state in ifconfig's output. It gives also read/write
access (given the right privilegs) to the S/Uni registers to user space
programs.
been tested extensively, but -CURRENT testing has been hampered by a
number of panics that also occur without the patch. Since the
destabilizing changes between 4.X and 5.X are external to unionfs,
I believe this patch applies equally well to both.
Thanks to scrappy for assistance testing these and other changes.
MFC after: 4 days
small but noticeable increase in performance for name lookup operations.
The code uses two zones, one for short names (less than 32 characters)
and one for long names (up to NAME_MAX). Since most file names are
fairly short, this saves a considerable amount of space that would
otherwise be wasted if we always allocated NAME_MAX bytes. The cutoff
value of 32 characters was picked arbitrarily and may benefit from some
tweaking; it could also be made into a tunable.
Submitted by: hmp
Restructure the error handling portion of the resource allocation
code to eliminate duplicated code.
Test for the O_NONBLOCK && fi_readers == 0 case before incrementing
fi_writers and modifying the the socket flag to avoid having to
undo these operations in this error case.
Restructure and simplify the code that handles blocking opens.
There should be no change to functionality.
than once. This appears to work around the hanging issues, at the
expense of warnings about bad RID allocations. I'm not sure this is a
permanant workaround, but does appear to help in the tests that I've
done here.
has not been cleaned in the meantime, since this can happen during
a forced unmount. Also add a comment that nfs_removeit() should
really be locking the directory vnode before calling nfs_removerpc().
Reported by: mbr
Tested by: mbr
MFC after: 1 week
It currently supports the PMC Sierra Lite, Ultra and 622 chips and
the IDT 77105. The driver handles media options and state in a consistent
manner for ATM drivers. The next commit to the midway driver will make
it use utopia.
doesn't have one. The test was bogus on these architectures, but
recent changes broke it altogether.
Prompted by: phk
This should fix the recent SPARC 64 build problems.
defined, then #error out. This is protected inside of #ifdef _KERNEL.
This allows one to tag code in the tree that will be deleted in 6.x
with the 'OBSOLETE_IN_6 #define at the top of the file. This makes
for easy grepping, plus a mechanism that automatically fails the
compilation of those files that are so tagged after we do the cutover.
However, GENERIC has wdc commented out, and COMPAT_OLDISA is required
for that. Comment out COMPAT_OLDISA and sdd a comment to this effect
near wdc.
Reviewed by: nyan@
o Register ISR INTR_MPSAFE.
o Loop on KTHREAD_DONE == 0 in the thread.
o Safe the INTR_MPSAFE flag for client drivers (don't know if there are any
CardBus/PCI drivers that are INTR_MPSAFE)
o Read status after acquiring mtx_lock(Giant) rather than before so that we
catch state changes that happen while Giant is being acquired.
o Turn off the CD bit when we see a CD interrupt, and turn it back on after
we've attached/detached the card.
o On suspend, actually set the CBB_SOCKET_MASK to zero rather than oring
in '0' to turn it off on suspend.
o If the ISR that's registerd is MPSAFE, don't acquire Giant around call to
client ISR.
o Fix comments to reflect these changes.
PCB contains FP registers, whose alignment must be 16 bytes at least.
Since the PCB pointed to by pc_pcb is immediately after the PCPU
itself, round-up the size of thge PCPU to a multiple of 16 bytes. The
PCPU is page aligned.
This fixes a misalignment trap caused by stopping a CPU in a SMP
kernel, such as been done when entering the debugger.
Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>