files which Compaq open-sourced (with a BSD license).
This commit adds support for proper PCI interrupt mapping and much
better support for swizzling between "standard" isa IRQs and the stdio
irqs used by the t2. This also adds enabling/disabling/eoi support
for AlphaServer 2100A machines. The 2100A (or lynx) interrupt
hardware is is very different (and much nicer) than the 2100.
Previously, only AS2100 and AS2000 machines worked.
This commits also lays the groundwork for supporting ExtIO modules.
These modules are essentially a second hose. This work is left
unfinished pending testing on real hardware. Wilko tells me that
ExtIO modules are quite rare, and may not actually exist in the wild.
Obtained from: Tru64
Tested by: wilko
Backout the previous delta (rev 1.4), it didn't make any difference.
If the requested handle is NULL then don't add it to the list of
objects, to be found by handle.
The problem is that when asking for a NULL handle you are implying
you want a new object. Because objects with NULL handles were
being added to the list, any further requests for phys backed
objects with NULL handles would return a reference to the initial
NULL handle object after finding it on the list.
Basically one couldn't have more than one phys backed object without
a handle in the entire system without this fix. If you did more
than one shared memory allocation using the phys pager it would
give you your initial allocation again.
Deal with excessive dirty buffers when msync() syncs non-contiguous
dirty buffers by checking for the case in UFS *before* checking for
clusterability.
__P() prototypes when an ansi-style static inline is a prototype already.
Since vnode_if.[ch] are generated on the fly, there are no CVS diffs to
mess up.
SMP problem. Compaq, in their infinite wisdom, forgot to put the IO apic
intpin #0 connection to the 8259 PIC into the mptable. This hack is to
look and see if intpin #0 has *no* table entry and adds a fake ExtInt
entry for the remap routines to use. isa/clock.c will still test the
interrupts. This entry is only ever used on an already broken system.
where fork1() could put the process on the run queue where it could be
snatched up by another CPU before kthread_create() had set the proper
fork handler. Instead, we put the new kthread on the runqueue after its
fork handler has been sent.
Noticed by: jake
Looked over by: peter
mutex operations. In the future they may call functions that verify
correct locking order between processes in the INVARIANTS case.
- Lock the process in PHOLD() and PREL().
mpapic.c. This gives us the benefit of C type checking. These functions
are not called in any critical paths and are not used by the interrupt
routines.
place the LOCKing macros within the areas within if_wxvar.h that
is set aside for them. Put any platform specific data also in those
areas.
For ease of maintenance purposes, merge in the OpenBSD version codebase here.
- Whitespace fixes.
- Comment fixes (mostly capitalization and trailing periods).
- Reorderings to put variables all in the same place, function prototypes
sorted alphabetically, etc.
2. Removed unused and #ifdef'd out members from struct ithd.
3. Changed ESTCPULIM() to use an explicit (PRIO_MAX - PRIO_MIN) rather than
PRIO_TOTAL.
4. Remove <sys/lock.h> #include and resort #include's.
Submitted by: bde (1, 3, 4)
a tunstart function, which is called when a packet is sucessfully
placed on the queue. This allows us to properly do output byte accounting
within the handoff routine.
Add a test against isp->isp_osinfo.islocked prior to trying to see
whether --isp->isp_osinfo.islocked is zero to cause us to unlock
(non-SMPLOCK case).
in the future:
o Remove pcic_softc from pcic_handle and replace it with a void *
o Reduce dependence on accessing softc via a pcic_handle
Minor cleanups:
o Define a macro to count the size of an array and use it.
o Minor whitespace alignment
o make no slots found a printf not a panic.
spending, which was unused now that all software interrupts have
their own thread. Make the legacy schednetisr use an atomic op
for setting bits in the netisr mask.
Reviewed by: jhb
HID passed in as an argument at all; callers are typically going to be
sending us static strings anyway.
Submitted by: Munehiro Matsuda <haro@tk.kubota.co.jp>
Also, while here, run up to 32 interrupt sources on APIC systems.
Normalize INTREN/INTRDIS so they are the same on both UP and SMP systems
rather than sometimes a macro, and sometimes a function.
Reviewed by: jhb, jakeb
field in CDB' error when attempting to start a caddy-type CD drive,
since those drives apparently in general refuse to load a medium. Since
we never advertised the feature to load the medium upon calling
cdstartunit() (i. e. upon receipt of a CDIOCSTART ioctl command), nobody
should have relied on it. Besides, nobody noticed so far at all that
this command is failing for caddy-type drives... Only few applications
seem to use it at all (among them is workman, which made me notice it).
Reviewed by: ken
tweak to enable/disable interrupt sources. Seems to work. It is unclear
how many of the PC164 models actually might needs this, and whether or
not there are other hidden issues.
Obtained from:Bernd Walter <ticso@cicely8.cicely.de>
MPLOCKED macro
(2) Use decimal 12 rather than hex 0xc in an addl
(3) Implement MTX_ENTER for the I386_CPU case
(4) Use semi-colons between instructions to allow MTX_ENTER
and MTX_ENTER_WITH_RECURSION to be assembled
(5) Use incl instead of incw to increment the recusion count
(6) 10 is not a valid label, use 7, 8 and 9 rather than 8, 9 and 10
(7) Sort numeric labels
Submitted by: bde (2, 4, and 5)
on. So stop failing the attach if the IRQ is unassigned. With this
patch, I can now boot with PNP-OS YES in my BIOS no differently than
PNP-OS NO (which is a good thing since Windows hangs with PNP-OS NO).
Obtained from: msmith
- standardise error reporting for commands
- simplify the driver-to-controller bio transfer
- add bio in/out accounting
- correctly preserve the command ID in twe_ioctl (thanks to joel@3ware)
pushl that of the new process, rather than doing a movl (%esp) and
assuming that the stack has been setup right. This make the initial
stack setup slightly more sane, and will make it easier to stick
an interrupted process onto the run queue without its knowing.
and numvnodes are longs in the kernel. They should remain longs in systat,
what really needs to change is that they should be using SYSCTL_LONG rather
than SYSCTL_INT. I also changed wantfreevnodes to SYSCTL_LONG because I
happened to notice it.
I wish there was a way to find all of these automatically..
Pointed out by: bde
There is no more TAILQ fifo to harvest the entropy; instead, there
is a circular buffer of constant size (changeable by macro) that
pretty dramatically improves the speed and fixes potential slowdowns-
by-locking.
Also gone are a slew of malloc(9) and free(9) calls; all harvesting
buffers are static.
All-in-all, this is a good performance improvement.
Thanks-to: msmith for the circular buffer concept-code.
(specifically, how many entries we've looked at so far). Maintain
interrupt instrumentation. Use USEC_SLEEP instead of USEC_DELAY in
a number of places (this allows us to drop locks and sleep instead
of spin). Track changes to configuration options for topology preference.
Fix botched order of printout for Channel, Target, Lun.
from struct proc, which are now unused (p_nthread already was).
Remove process flag P_KTHREADP which was untested and only set
in vfs_aio.c (it should use kthread_create). Move the yield
system call to kern_synch.c as kern_threads.c has been removed
completely.
moral support from: alfred, jhb
not return ENOEXEC. This is because image activators should return -1 if they
don't claim an image. They should return ENOEXEC if they do claim it,
but cannot load it due to sime problem with the image. This bug was
preventing static compilation of the osf/1 module. I'm surprised it
did not cause more problems.
EOI after the ithread runs, send the EOI when we get the interrupt and
disable the source. After the ithread is run, the source is renabled.
Also, add isa_handle_fast_intr() which handles fast interrupts by sending
an EOI after the handler is run.
This fixes the chronic missing interrupt problems under heavy NFS load
on my UP1000 and should result in greater stability for alphas which
route all irqs through an isa pic.
Discussed with: jhb, bde (sending non-specific EOIs early was bde's idea)
like the args to the config space accessors these functions replaced.
This reduces the likelyhood of overflow when the args are used in
macros on the alpha. This prevents memory management faults when
probing the pci bus on sables, multias and nonames.
Approved by: dfr
Tested by: Bernd Walter <ticso@cicely8.cicely.de>
- Use ACPI_PHYSICAL_ADDRESS
- RSDT -> XSDT
- FACP -> FADT
- No APIC table support
- Don't install a global EC handler; this has bad side-effects
(it invokes _REG in *all* EC spaces in the namespace!)
- Check for PCI bus instances already existing before adding them
but a hack! Add `flags 0x8000' to the psm driver to enable it.
The psm driver will try to get out of out-of-sync situation
by disabling the mouse and immediately enable it again.
If you are seeing this out-of-sync problem because of an
incompetent(?!) KVM switch, this hack will NOT be good
for you. However, if you are occasionally seeing the
problem because of lost mouse interrupt, this might help.
lock. Otherwise, if we block on the backing mutex while releasing the
allproc lock, then when we resume, we will be at SRUN, and we will stay
that way all the way through cpu_exit. As a result, our parent will never
harvest us.
depend on MUTEX_DEBUG. The MUTEX_DEBUG option turns on extra assertions
and checks to verify that mutexes themselves are implemented properly.
The WITNESS option uses extra checks and diagnostics to verify that other
code is using mutexes properly.
passed vnode must be locked; this is the case because of calls
to VOP_GETATTR(), VOP_ACCESS(), and VOP_OPEN(). This becomes
more of an issue when VOP_ACCESS() gets a bit more complicated,
which it does when you introduce ACL, Capability, and MAC
support.
Obtained from: TrustedBSD Project
recently discussed at -hackers. The problem is a null-pointer
dereference that happens in kern/vfs_lookup.c when accessing ".."
with a v_mount entry for the current directory vnode of NULL. This
happens when a volume is forcibly unmounted, and the vnode for a
working directory in the mounted volume is cleared.
PR: 23191
Submitted by: Thomas Moestl <tmoestl@gmx.net>
locking the global hash on each uifree()
make struct uidinfo only visible to the kernel
make uihold() a function rather than a macro to reduce bloat
swap the order of a spl/mutex to maintain consistancy
process is on the alternate stack or not. For compatibility
with sigstack(2) state is being updated if such is needed.
We now determine whether the process is on the alternate
stack by looking at its stack pointer. This allows a process
to siglongjmp from a signal handler on the alternate stack
to the place of the sigsetjmp on the normal stack. When
maintaining state, this would have invalidated the state
information and causing a subsequent signal to be delivered
on the normal stack instead of the alternate stack.
PR: 22286
-current and RELENG_4 with GENERIC.
NKPT is the number of initial bootstrap page table pages we create for
the kernel during startup. Once VM is up, we resize it as needed, but
with 4G ram, the size of the vm_page_t structures was pushing it over
the limit. The fact that trimmed down kernels boot on 4G ram machines
suggests that we were pretty close to the edge.
The "30" is arbitary, but smaller than the 'nkpt' variable on all
machines that I checked.
- Use a better test for determining when a process is running.
- Convert some checks to assertions.
- Remove unnecessary tests.
- Save the priority before acquiring a mutex rather than in msleep(9).
4) The cardbus CIS code treats the CIS_PTR as a mapping register if
it is mentioned in the CIS. I don't have a spec handy to understand
why the CIS_PTR is mentioned in the CIS, but allocating a memory range
for it is certainly bogus. My patch ignores bar #6 to prevent the
mapping.
[The pccard spec says that BAR 0 and 7 (-1 and 6 in thic case since we
did a minus one) is "reserved". The off by 1 error has been fixed.
also bar=5 is invalid for IO maps, so we check it.]
5) The CIS code allocated duplicate resources to those already found
by cardbus_add_resources(). The fix is to pass in the bar computed
from the CIS instead of the particular resource ID for that bar,
so bus_generic_alloc_resource succeeds in finding the old resource.
[fixed, also removed superfluous (and incorrect) writing back to the
PCI config space.]
7) The CIS code seems to use the wrong bit to determine rather a particular
register mapping is for I/O or memory space. From looking at the
two cards I have, it seems TPL_BAR_REG_AS should be 0x10 instead
of 0x08. Otherwise, all registers that should be I/O mapped gain
a second mapping in memory space.
[Oops, the spec does say 0x10..., fixed]
Submitted by: Justin Gibbs
Providing shell scripts that do nothing but load a similarly named
kernel loadable module in out of vogue.
There is no ibcs2(4) manual page, and I haven't managed to coax
anyone into contributing one based on the linux(4) manual page.
our kernel module system learned how to handle dependencies.
Providing a whole bunch of shell scripts that do nothing but load
a similarly named kernel loadable module is out of vogue.
The svr4(8) manual page has been replaced with a much better svr4(4)
page.
timeout. If DIAGNOSTIC is turned on, then display a message to the console
with a map of which CPUs failed to stop or restart. This gives an SMP box
at least a fighting chance of getting into DDB if one of the other CPUs has
interrupts disabled.
Fix amr_map_command so that 40LD-specific commands get the scatter-gather
list count in the right place. I don't understand why AMI did it like
this, but now the AMI MegaManager can talk to the newer (1600 and later)
controllers.
Remove an unused variable.
Include <machine/clock.h> when necessary.
Tweak some debugging levels to make things more intelligible.
Alter consumers of this method to conform to the new convention.
Minor cosmetic adjustments to bus.h.
This isn't of concern as this interface isn't in use yet.
io or memory space access enabled. This patch defers the setting
of these bits until after all of the mapping registers are probed.
It might be even better to defer this until a particular mapping
is activated and to disable that type of access when a new
register is activated.
2) The PCI spec is very explicit about how mapping registers and
the expansion ROM mapping register should be probed. This patch
makes cardbus_add_map() follow the spec.
3) The PCI spec allows a device to use the same address decoder for
expansion ROM access as is used for memory mapped register access.
This patch carefully enables and disables ROM access along with
resource (de)activiation.
This doesn't include the prefetching detection stuff (maybe later when code is written to actually turn on prefetching). It also does not use the PCI definitions (yet, I'll try to put this in all at once later)
Submitted by: Justin T. Gibbs
- Make pccbb/cardbus kld loadable and unloadable.
- Make pccbb/cardbus use the power interface from pccard instead of inventing its own.
- some other minor fixes
The release engineer keeps using the wrong /boot/cdboot when creating the
ISO images. So we'll add the 4.0-RELEASE cdboot to the tree until someone
bothers to fix the source so a working `cdboot' is built.
1) mpsafe (protect the refcount with a mutex).
2) reduce duplicated code by removing the inlined crdup() from crcopy()
and make crcopy() call crdup().
3) use M_ZERO flag when allocating initial structs instead of calling bzero
after allocation.
4) expand the size of the refcount from a u_short to an u_int, by using
shorts we might have an overflow.
Glanced at by: jake
use a mutex lock when looking up/deleting entries on the hashlist
use a mutex lock on each uidinfo when updating fields
make uifree() a void function rather than 'int' since no one cares
allocate uidinfo structs with the M_ZERO flag and don't explicitly initialize
them
Assisted by: eivind, jhb, jakeb
call instead.
This makes a pretty dramatic difference to the amount of work that
the harvester needs to do - it is much friendlier on the system.
(80386 and 80486 class machines will notice little, as the new
get_cyclecounter() call is a wrapper round nanotime(9) for them).
the interface to use callout_* instead of timeout(). Also add an
IS_MPSAFE #define (currently off) which will mark the driver as mpsafe
to the upper layers.
before adding/removing packets from the queue. Also, the if_obytes and
if_omcasts fields should only be manipulated under protection of the mutex.
IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on
the queue. An IF_LOCK macro is provided, as well as the old (mutex-less)
versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which
needs them, but their use is discouraged.
Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF,
which takes care of locking/enqueue, and also statistics updating/start
if necessary.
using a cardbus based system with pccbb providing the pcic interface).
Something isn't quite right.. when the driver allocates and activates
its resources, the IO space that was requested reads as all zeros (versus
the original 0xff's as it normally is when there is no device responding).
Also, deactivate the resources before releasing them. OLDCARD doesn't
seem to care but NEWCARD/CARDBUS get rather unhappy if you release
a resource that hasn't been deactivated yet.
Make pcic_p.c only compile with oldcard kernels.
(identified by the IO map being 256 bytes long instead of 128)
This chip works very unreliably on my Lanner embedded PC with the rl driver.
Lots of watchdog timeouts or poor performance.
Forcing the media type to 10 Meg (ifconfig rl0 media 10baseT/UTP) is a good
workaround.
This looks very similar to the problem reported in PR kern/18790
It is interesting to note that the linux driver has lots of special
case code for this chip.
* Some dummynet code incorrectly handled a malloc()-allocated pseudo-mbuf
header structure, called "pkt," and could consequently pollute the mbuf
free list if it was ever passed to m_freem(). The fix involved passing not
pkt, but essentially pkt->m_next (which is a real mbuf) to the mbuf
utility routines.
* Also, for dummynet, in bdg_forward(), made the code copy the ethernet header
back into the mbuf (prepended) because the dummynet code that follows expects
it to be there but it is, unfortunately for dummynet, passed to bdg_forward
as a seperate argument.
PRs: kern/19551 ; misc/21534 ; kern/23010
Submitted by: Thomas Moestl <tmoestl@gmx.net>
Reviewed by: bmilekic
Approved by: luigi
struct sigframe. We need more than only the signal context.
o Properly convert the signal mask when setting up the signal
frame in linux_sendsig and properly convert it back in
linux_sigreturn.
Do some cleanups and improve style while here.
can unload. Doing so leaves the linuxulator in a crippled
state (no ioctl support) when Linux binaries are run at
unload time.
While here, consistently spell ELF in capitals and perform
some minor style improvements.
ELF spelling submitted by: asmodai
is already in 32-bit mode, we need to be able to detect this and still
read the chip ID code. Detecting 32-bit mode is actually a little
tricky, since we want to avoid turning it on accidentally. The easiest
way to do it is to just try and read the PCI subsystem ID from the
bus control registers using 16-bit accesses and compare that with the
value read from PCI config space. If they match, then we know we're in
16-bit mode, otherwise we assume 32-bit mode.
instead of ng_send_data().
The latter could lead to running the IP stack at splimp
instead of splnet, (among other problems) (that MAY be safe
but I wouldn't count on it).
Noticed while preparing a new set of netgraph stuff.
counter register in-CPU.
This is to be used as a fast "timer", where linearity is more important
than time, and multiple lines in the linearity caused by multiple CPUs
in an SMP machine is not a problem.
This adds no code whatsoever to the FreeBSD kernel until it is actually
used, and then as a single-instruction inline routine (except for the
80386 and 80486 where it is some more inline code around nanotime(9).
Reviewed by: bde, kris, jhb
a kevent upon completion of the I/O. Specifically, introduce a new type
of sigevent notification, SIGEV_EVENT. If sigev_notify is SIGEV_EVENT,
then sigev_notify_kqueue names the kqueue that should receive the event
and sigev_value contains the "void *" is copied into the kevent's udata
field.
In contrast to the existing interface, this one: 1) works on
the Alpha 2) avoids the extra copyin() call for the kevent because all
of the information needed is in the sigevent and 3) could be
applied to request a single kevent upon completion of an entire lio_listio().
Reviewed by: jlemon
- move the call to cia_init_sgmap() to after we've determined if we're a pyxis
- convert needed splhigh() in cia_sgmap_invalidate_pyxis() to disable_intr()
Previously, any isa DMA on a pyxis based machine would cause a panic
in cia_sgmap_invalidate_pyxis() because the pyxis workaround was never
setup.
- while i'm at it, convert needed splhigh() in cia_swiz_set_hae_mem to
disable_intr()
in the face of multiple processes doing massive numbers of filesystem
operations. While this patch will work in nearly all situations, there
are still some perverse workloads that can overwhelm the system.
Detecting and handling these perverse workloads will be the subject
of another patch.
Reviewed by: Paul Saab <ps@yahoo-inc.com>
Obtained from: Ethan Solomita <ethan@geocast.com>
could not compress into clusters. This could result in lots of
wasted clusters while recieving small packets from an interface
that uses clusters for all it's packets.
Patch is partially from BSDi (limiting the size of the copy) and
based on a patch for 4.1 by Ian Dowse <iedowse@maths.tcd.ie> and
myself.
Reviewed by: bmilekic
Obtained From: BSDi
Submitted by: iedowse
- Use the mutex in hardclock to ensure no races between it and
softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
CALLOUT_MPSAFE, which specifies that a callout handler does not
need giant. There is still no way to set this flag when
regstering a callout.
Reviewed by: -smp@, jlemon
Removed most of the hacks that were trying to deal with low-memory
situations prior to now.
The new code is based on the concept that I/O must be able to function in
a low memory situation. All major modules related to I/O (except
networking) have been adjusted to allow allocation out of the system
reserve memory pool. These modules now detect a low memory situation but
rather then block they instead continue to operate, then return resources
to the memory pool instead of cache them or leave them wired.
Code has been added to stall in a low-memory situation prior to a vnode
being locked.
Thus situations where a process blocks in a low-memory condition while
holding a locked vnode have been reduced to near nothing. Not only will
I/O continue to operate, but many prior deadlock conditions simply no
longer exist.
Implement a number of VFS/BIO fixes
(found by Ian): in biodone(), bogus-page replacement code, the loop
was not properly incrementing loop variables prior to a continue
statement. We do not believe this code can be hit anyway but we
aren't taking any chances. We'll turn the whole section into a
panic (as it already is in brelse()) after the release is rolled.
In biodone(), the foff calculation was incorrectly
clamped to the iosize, causing the wrong foff to be calculated
for pages in the case of an I/O error or biodone() called without
initiating I/O. The problem always caused a panic before. Now it
doesn't. The problem is mainly an issue with NFS.
Fixed casts for ~PAGE_MASK. This code worked properly before only
because the calculations use signed arithmatic. Better to properly
extend PAGE_MASK first before inverting it for the 64 bit masking
op.
In brelse(), the bogus_page fixup code was improperly throwing
away the original contents of 'm' when it did the j-loop to
fix the bogus pages. The result was that it would potentially
invalidate parts of the *WRONG* page(!), leading to corruption.
There may still be cases where a background bitmap write is
being duplicated, causing potential corruption. We have identified
a potentially serious bug related to this but the fix is still TBD.
So instead this patch contains a KASSERT to detect the problem
and panic the machine rather then continue to corrupt the filesystem.
The problem does not occur very often.. it is very hard to
reproduce, and it may or may not be the cause of the corruption
people have reported.
Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>)
Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
Pre-rfork code assumed inherent locking of a process's file descriptor
array. However, with the advent of rfork() the file descriptor table
could be shared between processes. This patch closes over a dozen
serious race conditions related to one thread manipulating the table
(e.g. closing or dup()ing a descriptor) while another is blocked in
an open(), close(), fcntl(), read(), write(), etc...
PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>