missed in the previous commit; a line that exceeded 80 characters. No
functional changes, but the object file's md5 checksum changes because some
lines have been displaced.
1) Allow the sending of more than one control message at a time
over a unix domain socket. This should cover the PR 29499.
2) This requires that unp_{ex,in}ternalize and unp_scan understand
mbufs with more than one control message at a time.
3) Internalize and externalize used to work on the mbuf in-place.
This made life quite complicated and the code for sizeof(int) <
sizeof(file *) could end up doing the wrong thing. The patch always
create a new mbuf/cluster now. This resulted in the change of the
prototype for the domain externalise function.
4) You can now send SCM_TIMESTAMP messages.
5) Always use CMSG_DATA(cm) to determine the start where the data
in unp_{ex,in}ternalize. It was using ((struct cmsghdr *)cm + 1)
in some places, which gives the wrong alignment on the alpha.
(NetBSD made this fix some time ago).
This results in an ABI change for discriptor passing and creds
passing on the alpha. (Probably on the IA64 and Spare ports too).
6) Fix userland programs to use CMSG_* macros too.
7) Be more careful about freeing mbufs containing (file *)s.
This is made possible by the prototype change of externalise.
PR: 29499
MFC after: 6 weeks
We only support the size of frame we are currently allocating, which
is MCLBYTES - sizeof (struct ether_header) usable, so don't set an
MTU that would go over this.
first thought and may require serious work to the VOP_RENAME() api itself.
Basically, by the time the VOP_RENAME() function is called, it's already
too late.
it now really gets in the way.
This allows us to fix several problems- not least of which was problems
of ordering about when you'd have a device softc for an miibus child
available or not. Move some steps of things around.
Put the ifnet/arpcom structure at the head of the softc (PR 29249).
Don't do tx gc in the interrupt service routine- that seems to make
things a bit more efficient.
Enable jumbo support by default- but this version of 'jumbo' is broken
because it really is just using multiple tfd/rfd's to match a packet,
which will never be > CLSIZE anyway.
This should begin the first steps toward cleaning this driver up.
PR: 29249
MFC after: 1 week
with an ifnet structure (so device_get_softc will get one).
If memory allocation fails in mii_phy_probe, don't just march ahead into
a panic- return ENOMEM.
MFC after: 1 week
will be one for struct proc in stable. those drivers needing to have
cross version portability should use d_thread_t instead of inventing
their own means. Non-drivers, and drivers that either only run on
-current or must look under the covers of the struct proc/thread
should must not use this.
As noted in arch@, this minorly violates style(9), but the sys/conf.h
devsw already violates this and all I'm doing is extending the
violation to ease the burdon on device driver writers. It was judged
that this minor violation, which doesn't impact userland or those
people not using it, was preferable to the alternatives (eg #define
proc thread). C does not allow a way to rename or alias structs
easily, so we fall back to using a typedef.
Bump FreeBSD_version to reflect this change (porters guide to be done
in a separate commit).
in vfs_syscalls.c:
if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid &&
(error = suser_td(td)) != 0) {
unwrap_lots_of_stuff();
return (error);
}
to:
if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid) {
error = suser_td(td);
if (error) {
unwrap_lots_of_stuff();
return (error);
}
}
This makes the code more readable when complex clauses are in use,
and minimizes conflicts for large outstanding patchsets modifying the
kernel authorization code (of which I have several), especially where
existing authorization and context code are combined in the same if()
conditional.
Obtained from: TrustedBSD Project
- When the video BIOS is called to clear the region (x, y)-(79, 24)
(by scrolling), the slashed region in Fig.1 is cleared. CD() is
supposed to clear the region shown in Fig.2.
x x
+-------+ +-------+
| | | |
y| ////| y| ////|
| ////| |///////|
| ////| |///////|
+-------+ +-------+
Fig.1 Fig.2
- Don't move the cursor during this operation.
- Be consistent about placing spaces around keywords and
operators; don't mix statements like "if(A==B)" and "if (X == Y)",
"return(0)" and "return (-1)", "P=10" and "Q = 0", etc.
- Consitently indent lines. It's not good to indent by 8 columns
in one part of the file, and by 4 columns in the other part.
to avoid removing higher level directory vnodes from the namecache has
no perceivable effect and will be removed. This is especially true
when vmiodirenable is turned on, which it is by default now. ( vmiodirenable
makes a huge difference in directory caching ). The vfs.vmiodirenable and
vfs.nameileafonly sysctls have been left in to allow further testing, but
I expect to rip out vfs.nameileafonly soon too.
I have also determined through testing that the real problem with numvnodes
getting too large is due to the VM Page cache preventing the vnode from
being reclaimed. The directory stuff made only a tiny dent relative
to Poul's original code, enough so that some tests succeeded. But tests
with several million small files show that the bigger problem is the VM Page
cache. This will have to be addressed by a future commit.
MFC after: 3 days
YA pseudofs megacommit, part 2:
- Merge the pfs_vnode and pfs_vdata structures, and make the vnode cache
a doubly-linked list. This eliminates the need to walk the list in
pfs_vncache_free().
- Add an exit callout which revokes vnodes associated with the process
that just exited. Since it needs to lock the cache when it does this,
pfs_vncache_mutex needs MTX_RECURSE.
- Add a third callback to the pfs_node structure. This one simply returns
non-zero if the specified requesting process is allowed to access the
specified node for the specified target process. This is used in
addition to the usual permission checks, e.g. when certain files don't
make sense for certain (system) processes.
- Make sure that pfs_lookup() and pfs_readdir() don't yap about files
which aren't pfs_visible(). Also check pfs_visible() before performing
reads and writes, to prevent the kind of races reported in SA-00:77 and
SA-01:55 (fork a child, open /proc/child/ctl, have that child fork a
setuid binary, and assume control of it).
- Add some more trace points.
per-command component that we *don't* try and pass thru CAM. CAM just
is too risky and too much of a pain- structures get copied, but not
all info of interest can be considered safely transported thru all
consumers (including user space) from the incoming ATIO to the outgoing
CTIO- it's just much safer to have a buddy structure, identified by the
command's tag which *does* make it thru safely.
Pay attention to link speed and report 200MB/s xfer speed for a
23XX card in 2GPs mode.
MFC after: 1 week
Handle overlap in bcopy.
Add routines for copying and zeroing pages using physical addresses
directly.
Remove all the hacks to account for calling the firmware on its own
trap table, we use the kernel trap table. There is still a problem
with OF_exit().
- Rearrange the flag constants a little to simplify specifying and testing
for readability and writeability.
pseudofs_vnops.c:
- Track the aforementioned change.
- Add checks to pfs_open() to prevent opening read-only files for writing
or vice versa (pfs_{read,write} would block the actual reads and writes,
but it's still a bug to allow the open() to succeed). Also, return
EOPNOTSUPP if the caller attempts to lock the file.
- Add more trace points.
easier and hopefully this code is done changing radically.
Don't use the mmu tlb register to address the kernel page table, nor
the 8k pointer register. The hardware will do some of the page table
lookup by storing the the base address in an internal register and
calculating the address of the tte in the table. However it is limited
to a 1 meg tsb, which only maps 512 megs. The kernel page table only
has one level, so its easy to just do it by hand, which has the advantage
of supporting abitrary amounts of kvm and only costs a few more instructions.
Increase kvm to 1 gig now that its easy to do so and so we don't waste
most of a 4 meg page.
Fix some traces. Fix more proc locking.
Call tsb_stte_promote if we get a soft fault on a mapping in the upper
levels of the tsb. If there is an invalid or unreferenced mapping
in the primary tsb, it will be replaced.
Immediately fail for faults occuring in {f,s}uswintr.
one 4 meg page can map both the kernel and the openfirmware mappings.
Add the openfirmware mappings to the kernel tsb so we can call the firmware
on the kernel trap table and access kernel memory normally.
Implement pmap_swapout_proc, pmap_swapin_proc, pmap_swapout_thread,
pmap_swapin_thread, pmap_activate, pmap_page_exists, and pmap_phys_address.
Add a guard page at the bottom of the kernel stack. Its unclear how easy
it will be to detect these faults and do something useful.
Setup the registers on exec how the c runtime expects.
Implement various {fill,set}_*regs.
Fix proc locking.
when I changed the allocator bits. This implements per-CPU mbtypes
stats by keeping net number of decrements/increments of a given mbtype
per-CPU and then summing all of the per-CPU mbtypes to produce the total
net number of allocated mbufs of the given mbtype.
Counters are carefully balanced to avoid/prevent underflows/overflows.
mbtypes stats are re-enabled with the idea that we may occasionally
(although very rarely) observe slight inconsistencies in the stat
reporting. Most of the time, we should be fine, though.
Also make appropriate modifications to netstat(1) and systat(1) to do
the necessary reporting.
Submitted by: Jiangyi Liu <jyliu@163.net>
built without support for miibus PHYs. Most ed cards don't need
miibus support, so it's useful to be able to avoid the bloat of
all the mii devices for small fixed-purpose kernels.
once so there isn't a window with the ones for the 23XX cards being wrong.
When being verbose, print out some more FC NVRAM values (like framesize).
MFC after: 1 week
. Make internal service routines static.
. Use a consistent ordering of checks in MII_TICK. Do the work in the
mii_phy_tick() subroutine if appropriate.
. Call mii_phy_update() to trigger the callbacks.
the static callout list allocated by the system.
Change malloc type from M_TEMP to M_KQUEUE to better track memory.
Add a kern.kq_calloutmax to globally limit the amount of kernel memory
that can be allocated by callouts.
Submitted by: iedowse (items 1, 2)
appear in /dev. Interface hardware ioctls (not protocol or routing) can
be performed on the descriptor. The SIOCGIFCONF ioctl may be performed
on the special /dev/network node.
- Remove hardcoded uid, gid, mode from struct pfs_node; make pfs_getattr()
smart enough to get it right most of the time, and allow for callbacks
to handle the remaining cases. Rework the definition macros to match.
- Add lots of (conditional) debugging output.
- Fix a long-standing bug inherited from procfs: don't pretend to be a
read-only file system. Instead, return EOPNOTSUPP for operations we
truly can't support and allow others to fail silently. In particular,
pfs_lookup() now treats CREATE as LOOKUP. This may need more work.
- In pfs_lookup(), if the parent node is process-dependent, check that
the process in question still exists.
- Implement pfs_open() - its only current function is to check that the
process opening the file can see the process it belongs to.
- Finish adding support for writeable nodes.
- Bump module version number.
- Introduce lots of new bugs.
next to equivalent m_len adjustments. Move the nfsm_subs.h macros
into groups depending on which phase they are used in, since that
affects the error recovery requirements. Collect some of the common error
checking into a single macro as preparation for unwinding some more.
Have nfs_rephead return a value instead of secretly modifying args.
Remove some unused function arguments that were being passed around.
Clarify nfsm_reply()'s error handling (I hope).
broken and fixing it only creates a duplicate of what is already
in the FreeBSD kernel. Therefore, map the syscall directly to
getpgid().
PR: kern/21402
Submitted by: Christian Weisgerber <naddy@mips.inka.de>
While here, redefine the second entry for setpgid() so that we
don't need a stub. This is achieved by giving the second instance
the type NODEF.
broken and fixing it only creates a duplicate of what is already
in the FreeBSD kernel. Therefore, map the syscall directly to
getpgid().
PR: kern/21402
Submitted by: Christian Weisgerber <naddy@mips.inka.de>
have its entry in the syscall table added. Nothing else is
done. This differs from type NOPROTO in that NOPROTO adds a
definition to syscall.h besides adding a sysent. A syscall can
now have multiple entries without conflict. Note that the
argssize is fixed and depends on the syscall name.
and provide a valid STDC/C++ definition for function NDINIT
queue.h libkern.h: put explicit casts from void * in insque, remque and memset
(for the records, these changes are necessary to let the files
compile with g++, which is used to build a FreeBSD module
for "Click" -- see www.pdos.lcs.mit.edu/click/ .
Given that they have zero impact on our code, it is worthwhile
to have them in.
MFC after: 3 days
ethernet controllers. This adds support for the 3Com 3c996-T, the
SysKonnect SK-9D21 and SK-9D41, and the built-in gigE NICs on
Dell PowerEdge 2550 servers. The latter configuration hauls ass:
preliminary measurements show TCP speeds of over 900Mbps using
only normal size frames.
TCP/IP checksum offload, jumbo frames and VLAN tag insertion/stripping
are supported, as well as interrupt moderation.
Still need to fix autonegotiation support for 1000baseSX NICs, but
beyond that, driver is pretty solid.
+ implement "limit" rules, which permit to limit the number of sessions
between certain host pairs (according to masks). These are a special
type of stateful rules, which might be of interest in some cases.
See the ipfw manpage for details.
+ merge the list pointers and ipfw rule descriptors in the kernel, so
the code is smaller, faster and more readable. This patch basically
consists in replacing "foo->rule->bar" with "rule->bar" all over
the place.
I have been willing to do this for ages!
MFC after: 1 week
- Move the SPECIAL_FLAG #define up next to the NOHOLDER #define and fix a
little nit that caused it to be defined as -(sizeof (struct thread) + 1)
instead of -2.
the current interrupt thread routines will guarantee the condition this is
checking for at a higher level but inthand_add() and inthand_remove() as
they currently exist don't satisfy this condition. (Which does need to be
fixed but which will take a bit more work.) This fixes shared interrupts.
sio_lock has been initialized. This prevents the low level console
output (kernel printf) from clobbering the sio settings if the system
happens to be in the middle of comstart().
not referenced in Stevens, and does not compile with g++.
There is an equivalent structure, struct ipoption in ip_var.h
which is actually used in various parts of the kernel, and also referenced
in Stevens.
Bill Fenner also says:
... if you want the trivia, struct ip_opts was introduced
in in.h SCCS revision 7.9, on 6/28/1990, by Mike Karels.
struct ipoption was introduced in ip_var.h SCCS revision 6.5,
on 9/16/1985, by... Mike Karels.
MFC-after: 3 days
PRISON_ROOT to the suser_xxx() check. Since securelevels may now
be raised in specific jails, use of system flags can still be
restricted in jail(), but in a more configurable way.
o Users of jail() expecting system flags (such as schg) to restrict
jail()'s should be sure to set the securelevel appropriately in
jail()'s.
o This fixes activities involving automated system flag removal in
jail(), including installkernel and friends.
Obtained from: TrustedBSD Project
securelevel_gt(), determine first if a local securelevel exists --
if so, perform the check based on imax(local, global). Otherwise,
simply use the global value.
o Note: even though local securelevels might lag below the global one,
if the global value is updated to higher than local values, maximum
will still be used, making the global dominant even if there is local
lag.
Obtained from: TrustedBSD Project
one is present in the current jail, otherwise, to return the global
securelevel.
o If the securelevel is being updated, require that it be greater than
the maximum of local and global, if a local securelevel exists,
otherwise, just maximum of the global. If there is a local
securelevel, update the local one instead of the global one.
o Note: this does allow local securelevels to lag behind the global one
as long as the local one is not updated following a global increase.
Obtained from: TrustedBSD Project
a time change, and callers so that they provide td->td_proc.
o Modify settime() to use securevel_gt() for securelevel checking.
Obtained from: TrustedBSD Project
in vn_rdwr_inchunks(), allowing other processes to gain an exclusive
lock on the vnode. Specifically: directory scanning, to avoid a race to the
root directory, and multiple child processes coring simultaniously so they
can figure out that some other core'ing child has an exclusive adv lock and
just exit instead.
This completely fixes performance problems when large programs core. You
can have hundreds of copies (forked children) of the same binary core all
at once and not notice.
MFC after: 3 days
We still have to account for a copyin. Make sure the copyin will
succeed by passing the FreeBSD syscall a pointer to userspace,
albeit one that's automagically mapped into kernel space.
Reported by: mr, Mitsuru IWASAKI <iwasaki@jp.FreeBSD.org>
Tested by: Mitsuru IWASAKI <iwasaki@jp.FreeBSD.org>
with weird PCI-PCI bridge configurations to work. Defining
PCI_ALLOW_UNSUPPORTED_IO_RANGE causes the sanity checks to pass even
with out of range values.
Reviewed by: msmith
for securelevel_ge() and securelevel_gt(), I was a little surprised,
but fixed it. Turns out that it was the code that was inverted, during
a whitespace cleanup in my commit tree. This commit inverts the
checks, and restores the comment.
This fixed the problem on the 3 platforms I've been able to test on.
I'm still of the oppinion that the BIOS should take care of this,
however some board makers only apply this when they spot a
SBLive! soundcard, but the problem exists even without a SBLive!.
This fix should probably go somewhere else, but for now I'll
keep it here since we havn't got a central place to put
such things.
problems currently experienced in -CURRENT.
This should fix the problem that the PS/2 mouse is detected
twice if the acpi module is not loaded on some systems.
refering to securelevels; also, update the unprivileged process text
to better indicate the scope of actions permittable when any system
flags are already set (limited).
Submitted by: Udo Schweigert <udo.schweigert@siemens.com>
sizeof(struct inode) into a new malloc bucket on the i386. This
didn't happen in -current due to the removal of i_lock, but it does
no harm to apply the workaround to -current first.
Reduce the size of the i_spare[] array in struct inode from 4 to
3 entries, and change ext2fs to use i_din.di_spare[1] so that it
does not need i_spare[3].
Reviewed by: bde
MFC after: 3 days
we're at least consistent with what tcsendbreak(3) is documented
to do.
MFC after: 2 weeks
Note, the MFC will be to sys/dev/dgb/dgm.c on the RELENG_4 branch
I am not sure if this is absolutely necessary on all systems. Yet,
there certainly are motherboards and notebook systems which require
this, although there are other systems which just don't. I hope we
shall know when to do this on which systems, as the development of our
ACPI subsystem progresses... (I know we didn't need this for the APM
resume.)
all the debugging code into the function versions of the mutex operations
in kern_mutex.c. This reduced the __mtx_* macros to simply wrappers of
the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in
favor of just calling the _{get,rel}_lock_* macros. The tangled hairy mass
of macros calling macros is at least a bit more sane now.
* Don't get confused when memory regions don't lie on page boundaries -
remember our page size is typically larger than the firmware's page size.
* Add a function ia64_running_in_simulator() which is intended to detect
whether the kernel is running in SKI or on real hardware.