This is a pretty invasive change, but there are three good
reasons to do this:
1. We'll never have > 16 bits of handle.
2. We can (eventually) enable the RIO (Reduced Interrupt Operation)
bits which return multiple completing 16 bit handles in mailbox
registers.
3. The !)$*)$*~)@$*~)$* Qlogic target mode for parallel SCSI spec
changed such that at_reserved (which was 32 bits) was split into
two pieces- and one of which was a 16 bit handle id that functions
like the at_rxid for Fibre Channel (a tag for the f/w to correlate
CTIOs with a particular command). Since we had to muck with that
and this changed the whole handler architecture, we might as well...
Propagate new at_handle on through int ct_fwhandle. Follow
implications of changing to 16 bit handles.
These above changes at least get Qlogic 1040 cards working in target
mode again. 1080/12160 cards don't work yet.
In isp.c:
Prepare for doing all loop management in outer layers.
with egcs-1.1.1. bus_space_write_multi_2() had an extra operation that
should have been removed.
Remove it.
This fixes the panic when bus_space_write_multi_2() is used.
Obtained from: jake
- Add a KASSERT() to ensure an ithread has a backing kernel thread when we
schedule it.
- Don't attempt to preemptively switch to an ithread if p_stat of curproc
is not SRUN.
newbus in revision 1.19. As a result, lnc was, I believe, broken
for all PCI cards. The softc fields `lnc_btag' and `lnc_bhandle'
were not initialised, `rap', `rdp' and `bdp' were initialised to
the wrong values, and the size of the DMA ring memory was calculated
incorrectly.
Paul Richards has further cleanups in the pipeline, but this at
least is enough to make the driver usable with VMware.
Approved by: paul
is to return EINPROGRESS, EALREADY, (so_error ONCE), EISCONN. Certain
linux applications rely on the so_error (normally 0) being returned in
order to operate properly.
Tested by: Thomas Moestl <tmoestl@gmx.net>
An initial tidyup of the mount() syscall and VFS mount code.
This code replaces the earlier work done by jlemon in an attempt to
make linux_mount() work.
* the guts of the mount work has been moved into vfs_mount().
* move `type', `path' and `flags' from being userland variables into being
kernel variables in vfs_mount(). `data' remains a pointer into
userspace.
* Attempt to verify the `type' and `path' strings passed to vfs_mount()
aren't too long.
* rework mount() and linux_mount() to take the userland parameters
(besides data, as mentioned) and pass kernel variables to vfs_mount().
(linux_mount() already did this, I've just tidied it up a little more.)
* remove the copyin*() stuff for `path'. `data' still requires copyin*()
since its a pointer into userland.
* set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each
filesystem. This variable is generally initialised with `path', and
each filesystem can override it if they want to.
* NOTE: f_mntonname is intiailised with "/" in the case of a root mount.
of memory, rather than from the start.
This fixes problems allocating bouncebuffers on alphas where there is only
1 chunk of memory (unlike PCs where there is generally at least one small
chunk and a large chunk). Having 1 chunk had been fatal, because these
structures take over 13MB on a machine with 1GB of ram. This doesn't leave
much room for other structures and bounce buffers if they're at the front.
Reviewed by: dfr, anderson@cs.duke.edu, silence on -arch
Tested by: Yoriaki FUJIMORI <fujimori@grafin.fujimori.cache.waseda.ac.jp>
for selecting unit). Instead, use the resource hints mechanism.
One unfortunate situation here is that there is no resource_quad_value
function- which is what I needed for WWN boot time replacement. Worse-
you can't store the hint as just plain
hint.isp.0.nodewwn="0x50000000aaaa0001"
because this gets interpreted as an int- incorrectly because it can't
be converted to an int. I can't even get this as a string. To work
around this particular case for nodewwn && portwwn setting, this
rather grotesque form will be used:
hint.isp.0.nodewwn="w50000000aaaa0001"
hint.isp.0.portwwn="w50000000aaaa0002"
At the same time, if we have no hinted WWN, set the default WWN (which, btw,
gets overridden if the card has valid NVRAM, which is usual) to
0x400000007F000009ull (which translates to NAA == IPv4, 127.0.0.9).
Eliminate more printf's and replace them either with device_printf or
isp_prt calls.
ones where we have a CAM path) and replacing them with calls to isp_prt.,
Eliminate isp_unit references- we no longer have an isp_unit- we now
have an isp_dev that device_get_unit can work with.
`rootvnode' pointer, but vfs_syscalls.c's checkdirs() assumed that
it did. This bug reliably caused a panic at reboot time if any
filesystem had been mounted directly over /.
The checkdirs() function is called at mount time to find any process
fd_cdir or fd_rdir pointers referencing the covered mountpoint
vnode. It transfers these to point at the root of the new filesystem.
However, this process was not reversed at unmount time, so processes
with a cwd/root at a mount point would unexpectedly lose their
cwd/root following a mount-unmount cycle at that mountpoint.
This change should fix both of the above issues. Start_init() now
holds an extra vnode reference corresponding to `rootvnode', and
dounmount() releases this reference when the root filesystem is
unmounted just before reboot. Dounmount() now undoes the actions
taken by checkdirs() at mount time; any process cdir/rdir pointers
that reference the root vnode of the unmounted filesystem are
transferred to the now-uncovered vnode.
Reviewed by: bde, phk
a regular basis. Adjust our linux emulation to conform. This will
cause more dirty pages to be left for the pagedaemon to deal with,
but our new low-memory handling code can deal with it. The linux
way appears to be a trend, and we may very well make MAP_NOSYNC the
default for FreeBSD as well (once we have reasonable sequential
write-behind heuristics for random faults).
(will be MFC'd prior to 4.3 freeze)
Suggested by: Andrew Gallatin
make sure that PG_NOSYNC is properly set. Previously we only set it
for a write-fault, but this can occur on a read-fault too.
(will be MFCd prior to 4.3 freeze)
hit on the client side and prevent the server side from retiring writes.
Pipeline operations turned off for all READs (no big loss since reads are
usually synchronous) and for NFS writes, and left on for the default bwrite().
(MFC expected prior to 4.3 freeze)
Testing by: mjacob, dillon
update native priority, it is diffcult to get right and likely
to end up horribly wrong. Use an honestly wrong fixed value
that seems to work; PUSER for user threads, and the interrupt
priority for ithreads. Set it once when the process is created
and forget about it.
Suggested by: bde
Pointy hat: me
if an arriving packet belongs to us, also check that the packet arrived
through the correct interface. Skip this check if the packet was locally
generated.
CVSrepo deletion of the previous attempt will be requested:
--original message--
Add the 'virtual nulmodem driver'
Particularly useful for debuging kernels using vmware.
If your name is Bruce evans and you are a WIZ at tty interfaces,
then you should probably rip this to shreds and offer lots of suggestions and
patches. I've been using this since 4.0-CURRENT and it's never caused
problems but I'm sure I got something wrong. This is similar to the pty/cty
device driver except that both sides are ttys. Even minor numbers
are side A and odd minor numbers are side B.
Work needs to be done regarding what happens to the other side when you
close a node.
to use with vmware, configure vmware to redirect COM2 out to side A of one
of these and boot a kernel with teh gdb remote port set to sio1.
AFTER dropping into the gdb kernel debugger in your test kernel,
fire up gdb with it's remote port pointing at the appropriate side B.
To catch all console output, you can boot the vmware kernel with a serial
console, (COM1) similarly redirected to a nulmodem, and use 'tip' to observe it.
This is practically unaltered since pre 4.0 days except for
changes made along the way needed to make it compile, so any suggestions
or offers of total rewrites will be listenned to :-)
When we recieve a fragmented TCP packet (other than the first) we can't
extract header information (we don't have state to reference). In a rather
unelegant fashion we just move on and assume a non-match.
Recent additions to the TCP header-specific section of the code neglected
to add the logic to the fragment code so in those cases the match was
assumed to be positive and those parts of the rule (which should have
resulted in a non-match/continue) were instead skipped (which means
the processing of the rule continued even though it had already not
matched).
Fault can be spread out over Rich Steenbergen (tcpoptions) and myself
(tcp{seq,ack,win}).
rwatson sent me a patch that got me thinking about this whole situation
(but what I'm committing / this description is mine so don't blame him).
rather than in silly places like "VFS Cluster debugging". People
should really be using COMPAT_LINUX instead of the linux module on
dynamic systems like -current.
process's priority go through the roof when it released a (contested)
mutex. Only set the native priority in mtx_lock if hasn't already
been set.
Reviewed by: jhb
in VMware reports 0x00000000 in the PCI subsystem ID register, but
0x10001000 when you read the mirror registers in I/O space. This causes
pcn_probe() to think it's found a card in 32-bit mode, and performing
a 32-bit I/O access makes on a 16-bit port makes VMware go boom. Special
case the 0x10001000 value until somebody at VMware grows a clue.
Finally discovered by: Andrew Gallatin
For TCP, verify that the sequence number in the ICMP packet falls within
the tcp receive window before performing any actions indicated by the
icmp packet.
Clean up some layering violations (access to tcp internals from in_pcb)
handle read and write requests for widths of multiple bytes. This
can be used to read 16-bit battery status registers for example.
- Remove some unused variables and #if 0'd debugging cruft.
- Don't complain about a GPE query that fails due to AE_NOT_FOUND if the
query method was _Q00.
from a BIF, use the size of the destinatino buffer, not the length of the
string to determine where to put the nul char. As a side effect, the
old code would truncate the string by one character while it was possibly
overflowing the buffer.
This piece of code has not been referenced since it was put there
in 1995. Also done a codebased search on popular networking libraries
and third-party applications. This is an orphan.
Reviewed by: jesper
o Allocate memory mapped by pcic even when not used for ncv.
This is for PC-Cards which needs offset, because I/O space should not be
used by other devices.
Pointed-out-by: YAMAMOTO Shigeru <shigeru@iij.ad.jp>
Incredibally useful for debugging kernels using vmware.
Vmware com1 is diverted to one side, and gdb listens to the other side.
viola.. instant debugging sandbox on one system.
Since we know there's always an upper bound we force that bound,
otherwise users can cause a panic via malloc getting hit with a
odd (huge or negative) amount of memory to allocate.
Tested by: kris
Pointed out by: Andrey Valyaev <dron@infosec.ru>
If you ever want to run midi(4) out of the giant lock, uncomment
MIDI_OUTOFGIANT in midi.h. Confirmed to work for csamidi with WITNESS
and INVARIANTS.
- midi_info, midi_open and seq_info are now tailqs, allowing arbitrary
numbers of devices to be configured.
- Do not send an active sensing message to reset midi modules.
- Clone /dev/sequencer*. /dev/sequencer0 and /dev/sequencer are generated
upon initialization.
Includes the following revisions from KAME (two of these were actually
committed previously but the CVS revisions weren't documented):
1.40 kame/kame/sys/netinet6/ah_core.c (committed in previous rev)
1.41 kame/kame/sys/netinet6/ah_core.c
1.28 kame/kame/sys/netinet6/ah_output.c (committed in previous rev)
1.29 kame/kame/sys/netinet6/ah_output.c
1.30 kame/kame/sys/netinet6/ah_output.c
1.129 kame/kame/sys/netinet6/nd6.c
1.130 kame/kame/sys/netinet6/nd6.c
1.24 kame/kame/sys/netinet6/dest6.c
1.25 kame/kame/sys/netinet6/dest6.c
Obtained from: KAME
- Convert to a more efficient queueing implementation.
- Don't allocate command buffers on the fly; simply work from a
static pool.
- Add a control device interface, for later use.
- Handle controller overload better as a consequence of the
improved queue implementation.
- Add support for the XPT_GET_TRAN_SETTINGS ccb, and correctly
set the virtual SCSI channels up for multiple outstanding I/Os.
- Update copyrights for 2001.
- Some whitespace fixes to improve readability.
Due to a misunderstanding on my part, previous versions of the
driver were limited to a single outstanding I/O per virtual drive.
Needless to say, this update improves performance substantially.
connection, but send it immediately. Prior to this change, it was possible
to delay a delayed-ack for multiple times, resulting in degraded TCP
behavior in certain corner cases.
o Offset and period in synch messages and width negotiation should be
done for per target not per lun. Move these from *lun_info to
*targ_info.
o Change in handling XPT_RESET_DEV and XPT_GET_TRAN_SETTINGS .
o Change CAM_* xpt_done return values.
o Busy loop did not timeout. Change this to timeout as original NetBSD/pc98.
Reviewed by: bsd-nomads ML
by the compiler. ie: char foo[0] comes out as 4 bytes on a.out, and
we depended on it coming out as 0 for the script version. :-(
Make double sure that genassym.o is built and nm'ed in elf mode.
(ia64 skipped since it is stuck on the linux toolchain and doesn't
understand the -elf switches)
gcc -aout -mno-underscores. The bioscall.s tweak is not an a.out
requirement really, but to work around the bugs in the antique version of
gas that used for a.out. Makefile hacks are all that is needed to
get an a.out kernel. There is no telling if it will work though.
This is little more than an academic curiosity anyway since all it is
good for is situations where the boot code is hard wired, eg: rom
bootstraps (such as the gnat box).
GENERIC:
...
size -aout kernel ; chmod 755 kernel
text data bss dec hex
3051520 368640 198688 3618848 373820
and used in C or vice versa. The elf compiler uses the same names
for both. Remove asnames.h with great prejudice; it has served its
purpose.
Note that this does not affect the ability to generate an aout kernel
due to gcc's -mno-underscores option.
moral support from: peter, jhb
ehternet frames to a netgraph hook.
Submitted by: "Vitaly V. Belekhov" <vitaly@riss-telecom.ru>
translated to 5.0 by me. man page not yet written.
This node still needs a little work.. don't use yet. Not yet linked into
the build.
This helps to stop it from geting out of sync.
It is not part of the normal build but I can use it with all the others
when I make changes to netgraph to ensure it is buildable.
to be more like Xint0x80_syscall and less like c function syscall().
- Reduce code duplication between the int0x80 and lcall handlers by
shuffling the elfags into the right place, saving the sizeof the
instruction in tf_err and jumping into the common int0x80 code.
Reviewed by: peter
passed in filename and line number in the KTR tracepoint message.
- Even though it is #if 0'd code, change the code to detect that a process
is an interrupt thread to check p->p_ithd against NULL rather than
checking non-existant process flags from BSD/OS.
- Use '%p' to print pointers in KTR log messages instead of assuming
sizeof(int) == sizeof(void *).
- Don't set p_mtxname to NULL when releasing a mutex. It doesn't hurt
to leave it set (we don't clear w_mesg for example) and at least at
one time in the past, there used to be race conditions in the kernel
that would result in setting this to NULL causing the kernel to
dereference NULL.
- Make the _mtx_assert() function be compiled in if INVARIANTS_SUPPORT is
defined rather than if INVARIANTS is defined so that a KLD compiled
with INVARIANTS that uses mtx_assert() can be used with a kernel that
just has INVARIANT_SUPPORT compiled in.
allow the watermark to be passed in via the data field during the EV_ADD
operation.
Hook this up to the socket read/write filters; if specified, it overrides
the so_{rcv|snd}.sb_lowat values in the filter.
Inspired by: "Ronald F. Guilmette" <rfg@monkeys.com>
the current socket error in fflags. This may be useful for determining
why a connect() request fails.
Inspired by: "Jonathan Graehl" <jonathan@graehl.org>
error will be passed up to the user, who will close the connection, so
it does not appear to make a sense to leave the connection open.
This also fixes a bug with kqueue, where the filter does not set EOF
on the connection, because the connection is still open.
Also remove calls to so{rw}wakeup, as we aren't doing anything with
them at the moment anyway.
Reviewed by: alfred, jesper
reset TCP connections which are in the SYN_SENT state, if the sequence
number in the echoed ICMP reply is correct. This behavior can be
controlled by the sysctl net.inet.tcp.icmp_may_rst.
Currently, only subtypes 2,3,10,11,12 are treated as such
(port, protocol and administrative unreachables).
Assocaiate an error code with these resets which is reported to the
user application: ENETRESET.
Disallow resetting TCP sessions which are not in a SYN_SENT state.
Reviewed by: jesper, -net
this information via the vm.nswapdev sysctl (number of swap areas)
and vm.swapdevX nodes (where X is the device), which contain the MIBs
dev, blocks, used, and flags. These changes are required to allow
top and other userland swap-monitoring utilities to run without
setgid kmem.
Submitted by: Thomas Moestl <tmoestl@gmx.net>
Reviewed by: freebsd-audit
and add a sysctl to pppoe to activate non standard ethertypes
so that idiot ISPs (apparently in France) who use
equipment from idiot suppliers (rumour says 3com)
who use nonstandard ethertypes can still connect.
"yep, sure we do pppoe, we use a different identifier to that dictated in
the standard, but sure it's pppoe!"
sysctl -w net.graph.stupid_isp=1 enables the changeover.
attached and ifconfigable. The card doesn't interrupt yet.
Also, move towards bus space by introducing new macros/inline
functions which make such a move much easier than before.
These inline functions are setup now to work around an IBM EtherJet
pccard cardbus bridge incompatibility. The card works in 8 bit mode,
but not in 16-bit mode when it is connected to a cardbus bridge for
reasons unknown. The Linux driver also has a similar workaround in
it.
Future work will include making the above workaround runtime
conditional rather than compile time conditional, as well as fixing
the interrupts in pccards and converting it to bus space.
for the ICB firmware options meant- *I* had taken it to
mean that if you set it, Node Name would be ignored and
derived from Port Name. Actually, it meant the opposite.
As a consequence- change ICBOPT_USE_PORTNAME to the
define ICBOPT_BOTH_WWNS- makes more sense.
Fix wrong input bitmap for MBOX_DUMP_RAM command. Call
ISP_DUMPREGS if we get a f/w crash. Add ISPCTL_RUN_MBOXCMD
control command (so outer layers can run a mailbox command
directly) and add a ISPASYNC_UNHANDLED_RESPONSE hook so
outer layers can understand response queue entries we
might not know about.
depend on this. The linux ABI emulator tries to use it for some linux
binaries too. VM86 had a bigger cost than this and it was made default
a while ago.
Reviewed by: jhb, imp
and 1.84 of src/sys/netinet/udp_usrreq.c
The changes broken down:
- remove 0 as a wildcard for addresses and port numbers in
src/sys/netinet/in_pcb.c:in_pcbnotify()
- add src/sys/netinet/in_pcb.c:in_pcbnotifyall() used to notify
all sessions with the specific remote address.
- change
- src/sys/netinet/udp_usrreq.c:udp_ctlinput()
- src/sys/netinet/tcp_subr.c:tcp_ctlinput()
to use in_pcbnotifyall() to notify multiple sessions, instead of
using in_pcbnotify() with 0 as src address and as port numbers.
- remove check for src port == 0 in
- src/sys/netinet/tcp_subr.c:tcp_ctlinput()
- src/sys/netinet/udp_usrreq.c:udp_ctlinput()
as they are no longer needed.
- move handling of redirects and host dead from in_pcbnotify() to
udp_ctlinput() and tcp_ctlinput(), so they will call
in_pcbnotifyall() to notify all sessions with the specific
remote address.
Approved by: jlemon
Inspired by: NetBSD
interrupts.
Protect usage of the per processor switchtime variable against
interrupts in calcru().
This seem to eliminate the "microuptime() went backwards" warnings.
the the original trapframe of the syscall, trap, or interrupt that entered
the kernel. Before SMPng, ast's were handled via a psuedo trap at the
end of doerti. With the SMPng commit, ast's were broken out into a
separate ast() function that was called from doreti to match the behavior
of other architectures. Unfortunately, when this was done, the
p_md.md_regs member of curproc was not updateda in ast(), thus when
signals are handled by userret() after an interrupt that returns to
userland, we end up using a stale trapframe that will result in the
registers from the old trapframe overwriting the real trapframe and
smashing all the registers right before we return to usermode. The saved
%cs:%eip from where we were in usermode are saved in the trapframe for
example.
- Don't use an atomic operation to update cnt.v_soft in ast(). This is
the only place the variable is written to, and sched_lock is always
held when it is written, so it is already protected and the mutex release
of sched_lock asserts a memory barrier that ensures the value will be
updated in a timely fashion.
packet flow into two unidirectional flows.
Part of a suite of nodes developed for packet flow control.
More to follow as I have time to port them to 5.x or
as others do so. The ipfw node will be the hardest..
Submitted by: "Vitaly V. Belekhov" <vitaly@riss-telecom.ru>
- Remove unneeded spl()'s around mi_switch() in userret().
- Don't hold sched_lock across addupc_task().
- Remove the MD function child_return() now that the MI function
fork_return() is used instead.
- Use TRAPF_USERMODE() instead of dinking with the trapframe directly to
check for ast's in kernel mode.
- Check astpending(curproc) and resched_wanted() in ast() and return if
neither is true.
- Use astoff() rather than setting the non-existent per-cpu variable
astpending to 0 to clear an ast.
- Don't hold sched_lock around addupc_task() as this apparently breaks
profiling badly due to sched_lock being held across copyin().
Reported by: bde (2)
for us.
- Change the switch_trampoline() to call fork_exit() passing in the
required arguments instead of calling the fork trampoline callout
function directly.
Warning: this hasn't been tested.
Looked over by: dfr
an interrupt thread while the interrupt thread is blocked on Giant waiting
to execute the interrupt handler being removed. The result was that the
intrhand structure would be free'd, and we would call 0xdeadc0de. The work
around is to check to see if the interrupt thread is idle when removing a
handler. If not, then we mark the interrupt handler as being dead using
the new IH_DEAD flag and don't remove it from the interrupt threads' list
of handlers. When the interrupt thread resumes, it will see a dead handler
while traversing the list of handlers and will remove the handler then.
work because opt_preemption.h wasn't #include'd. Instead, make use of the
do_switch parameter to ithread_schedule() and do the check in the alpha
interrupt code.
- Use pci_get_powerstate()/pci_set_powerstate() in all the other drivers
that need them so we don't have to fiddle with the PCI power management
registers directly.
- Use pci_enable_busmaster()/pci_enable_io() to turn on busmastering and
PIO/memory mapped accesses.
- Add support to the RealTek driver for the D-Link DFE-530TX+ which has
a RealTek 8139 with its own PCI ID. (Submitted by Jason Wright)
- Have the SiS 900/National DP83815 driver be sure to disable PME
mode in sis_reset(). This apparently fixes a problem on some
motherboards where the DP83815 chip fails to receive packets.
(Submitted by Chuck McCrobie <mccrobie@cablespeed.com>)
Use the target offset rather than the target Id to reference
the untagged SCB array. The offset and id are identical save
in the twin channel case. This should correct several issues
with the 2742T.
Set the user and goal settings prior to setting the current
settings. This allows the async update routine to filter out
intermediate transfer negotiation updates that may be less
than interesting. The Linux OSM uses this to reduce the amount
of stuff printed to the console.
aic7xxx.seq:
Correct an issue with the aic7770 in twin channel mode.
We could continually attempt to start a selection even
though a selection was already occurring on one channel.
This might have the side effect of hanging our selection
or causing us to select the wrong device.
While here, create a separate polling loop for when we
have already started a selection. This should reduce
the latency of our response to a (re)selection. The diffs
look larger than they really are due to some code rearrangement
to optimize out a jmp.
aic7xxx_freebsd.c:
Use the target offset rather than the target Id to reference
the untagged SCB array. The offset and id are identical save
in the twin channel case. This should correct several issues
with the 2742T.
aic7xxx_inline.h:
Get back in sync with perforce revision ID.
aic7xxx_pci.c:
Identify adapters in ARO mode as such.
Ensure that not only the subvendor ID is correct (9005)
but also that the controller type field is valid before
looking at other information in the subdevice id. Intel
seems to have decided that their subdevice id of 8086
is more appropriate for some of their MBs with aic7xxx
parts than Adaptec's sanctioned scheme.
Add an exclusion entry for SISL (AAC on MB based adapters).
Adapters in SISL mode are owned by the RAID controller, so
even if a driver for the RAID controller is not present,
it isn't safe for us to touch them.
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
pr_free(), invoked by the similarly named credential reference
management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
mutex use.
Notes:
o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
required to protect the reference count plus some fields in the
structure.
Reviewed by: freebsd-arch
Obtained from: TrustedBSD Project
treat 0 as a wildcard in src/sys/in_pbc.c:in_pcbnotify()
It's sufficient to check for src|local port, as we'll have no
sessions with src|local port == 0
Without this a attacker sending ICMP messages, where the attached
IP header (+ 8 bytes) has the address and port numbers == 0, would
have the ICMP message applied to all sessions.
PR: kern/25195
Submitted by: originally by jesper, reimplimented by jlemon's advice
Reviewed by: jlemon
Approved by: jlemon
userland tool:
Use the vfs.devfs.generation sysctl to test for devfs presense
(thanks phk!) when devfs is active it will not try to create the
device nodes in /dev and therefore will not complain about the
failure to do so.
Revert the change in the #define for VINUM_DIR in the kernel
header so that vinum can find its device nodes.
Replace perror() with vinum_perror() to print file/line when
DEVBUG is defined (not defined by default).
kernel:
Don't use the #define names for the "superdev" creation since
they will be prepended by "/dev/" (based on VINUM_DIR), instead
use string constants.
Create both debug and non-debug "superdev" nodes in the devfs.
Problem noticed and fix tested by: Martin Blapp <mblapp@fuchur.lan.attic.ch>
remove_sd_entry() to:
Simplify (hopefully) it by moving all error returns closer to
the beginning of the function.
Return an error when "Error removing subdisk %s: not found in
plex %s\n" would have been reported, as I doubt that we are "OK"
after printing that error message.
Adding make_dev() and destroy_dev() calls in (hopefully) the right
places.
This is done by calling make_dev() in each object constructor and
caching the dev_t's returned from make_dev() in each struct
'subdisk'(sd), 'plex' and 'volume' such that the 'object'_free()
functioncs can call destroy dev.
This change makes a subset of the old /dev/vinum appear under devfs.
Enough nodes appear such that I'm able to mount my striped volume.
There may be more work needed to get vinum configuration working
properly.
that was introduced in revision 1.80. The problem manifested
itself with a `locking against myself' panic and could also
result in soft updates inconsistences associated with inodedeps.
The two problems are:
1) One of the background operations could manipulate the bitmap
while holding it locked with intent to create. This held lock
results in a `locking against myself' panic, when the background
processing that we have been coopted to do tries to lock the bitmap
which we are already holding locked. To understand how to fix this
problem, first, observe that we can do the background cleanups in
inodedep_lookup only when allocating inodedeps (DEPALLOC is set in
the call to inodedep_lookup). Second observe that calls to
inodedep_lookup with DEPALLOC set can only happen from the following
calls into the softdep code:
softdep_setup_inomapdep
softdep_setup_allocdirect
softdep_setup_remove
softdep_setup_freeblocks
softdep_setup_directory_change
softdep_setup_directory_add
softdep_change_linkcnt
Only the first two of these can come from ffs_alloc.c while holding
a bitmap locked. Thus, inodedep_lookup must not go off to do
request_cleanups when being called from these functions. This change
adds a flag, NODELAY, that can be passed to inodedep_lookup to let
it know that it should not do background processing in those cases.
2) The return value from request_cleanup when helping out with the
cleanup was 0 instead of 1. This meant that despite the fact that
we may have slept while doing the cleanups, the code did not recheck
for the appearance of an inodedep (e.g., goto top in inodedep_lookup).
This lead to the softdep inconsistency in which we ended up with
two inodedep's for the same inode.
Reviewed by: Peter Wemm <peter@yahoo-inc.com>,
Matt Dillon <dillon@earth.backplane.com>
filename insteada of copying the first 32 characters of it.
- Add in const modifiers for the passed in format strings and filenames
and their respective members in the ktr_entry struct.
scheduling an interrupt thread to run when needed. This has the side
effect of enabling support for entropy gathering from interrupts on
all architectures.
- Change the software interrupt and x86 and alpha hardware interrupt code
to use ithread_schedule() for most of their processing when scheduling
an interrupt to run.
- Remove the pesky Warning message about interrupt threads having entropy
enabled. I'm not sure why I put that in there in the first place.
- Add more error checking for parameters and change some cases that
returned EINVAL to panic on failure instead via KASSERT().
- Instead of doing a documented evil hack of setting the P_NOLOAD flag
on every interrupt thread whose pri was SWI_CLOCK, set the flag
explicity for clk_ithd's proc during start_softintr().
- Add pager capability to the 'show ktr' command. It functions much like
'ps': Enter at the prompt displays one more entry, Space displays
another page, and any other key quits.
This is useful when doing copies of packet where some leading
space has been preallocated to insert protocol headers.
Note that there are in fact almost no users of m_copypacket.
MFC candidate.
in mi_switch() just before calling cpu_switch() so that the first switch
after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of
cpu_switch(), specifically set the p_oncpu and p_lastcpu members of
proc in mi_switch(), and handle the sched_lock state change across a
context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we
have to setup an initial state for sched_lock in fork_exit() before we
release it.
Please note:
When committing changes to this file, it is important to note that
linux is not freebsd -- their system call numbers (and sometimes names)
are different on different platforms. When in doubt (and you always need
to be) check the arch-specific unistd.h and entry.S files in the linux
kernel sources to see what the syscall numbers really are.
is sent to a process, psignal() needs to schedule an AST for the
process if the process is runnable, not just if it is current, so that
pending signals get checked for on the next return of the process to
user mode. This wasn't practical until recently because the AST flag
was per-cpu so setting it for a non-current process would usually just
cause a bogus AST for the current process.
For non-current processes looping in user mode, it took accidental
(?) magic to deliver signals at all. Signals were usually delivered
late as a side effect of rescheduling (need_resched() sets astpending,
etc.). In pre-SMPng, delivery was delayed by at most 1 quantum (the
need_resched() call in roundrobin() is certain to occur within 1
quantum for looping processes). In -current, things are complicated
by normal interrupt handlers being threads. Missing handling of the
complications makes roundrobin() a bogus no-op, but preemptive
scheduling sort of works anyway due to even larger bogons elsewhere.
always on curproc. This is needed to implement signal delivery properly
(see a future log message for kern_sig.c).
Debogotified the definition of aston(). aston() was defined in terms
of signotify() (perhaps because only the latter already operated on
a specified process), but aston() is the primitive.
Similar changes are needed in the ia64 versions of cpu.h and trap.c.
I didn't make them because the ia64 is missing the prerequisite changes
to make astpending and need_resched per-process and those changes are
too large to make without testing.
tsc_present in the right places (together with other variables of the
same linkage), and don't use messy ifdefs just to avoid exporting it in
some cases.
actually in the kernel. This structure is a different size than
what is currently in -CURRENT, but should hopefully be the last time
any application breakage is caused there. As soon as any major
inconveniences are removed, the definition of the in-kernel struct
ucred should be conditionalized upon defined(_KERNEL).
This also changes struct export_args to remove dependency on the
constantly-changing struct ucred, as well as limiting the bounds
of the size fields to the correct size. This means: a) mountd and
friends won't break all the time, b) mountd and friends won't crash
the kernel all the time if they don't know what they're doing wrt
actual struct export_args layout.
Reviewed by: bde
Add new PRC_UNREACH_ADMIN_PROHIB in sys/sys/protosw.h
Remove condition on TCP in src/sys/netinet/ip_icmp.c:icmp_input
In src/sys/netinet/ip_icmp.c:icmp_input set code = PRC_UNREACH_ADMIN_PROHIB
or PRC_UNREACH_HOST for all unreachables except ICMP_UNREACH_NEEDFRAG
Rename sysctl icmp_admin_prohib_like_rst to icmp_unreach_like_rst
to reflect the fact that we also react on ICMP unreachables that
are not administrative prohibited. Also update the comments to
reflect this.
In sys/netinet/tcp_subr.c:tcp_ctlinput add code to treat
PRC_UNREACH_ADMIN_PROHIB and PRC_UNREACH_HOST different.
PR: 23986
Submitted by: Jesper Skriver <jesper@skriver.dk>
case there is nothing to do. This happens normally when the card shares
the interrupt line with other devices.
This code saves a couple of microseconds per interrupt even on a
fast CPU. You normally would not care, except under heavy tinygram
traffic where you can have some 50-100.000 interrupts per second...
On passing, correct a spelling error.
lookup vop so that it defaulted to using vop_eopnotsupp for strange
lookups like the ones for open("/dev/null/", ...) and stat("/dev/null/",
...). This mainly caused the wrong errno to be returned by vfs syscalls
(EOPNOTSUPP is not in POSIX, and is not documented in connection with
specfs in open.2 and is not documented in stat.2 at all). Also, lookup
vops are apparently required to set *ap->a_vpp to NULL on error, but
vop_eopnotsupp is too broken to do this.