Logical volumes on these devices show up as LUNs behind another
controller (also known as proxy controller). In order to issue
firmware commands for a volume on a proxy controller, they must be
targeted at the address of the proxy controller it is attached to,
not the Host/PCI controller.
A proxy controller is defined as a device listed in the INQUIRY
PHYSICAL LUNS command who's L2 and L3 SCSI addresses are zero. The
corresponding address returned defines which "bus" the controller
lives on and we use this to create a virtual CAM bus.
A logical volume's addresses first byte defines the logical drive
number. The second byte defines the bus that it is attached to
which corresponds to the BUS of the proxy controller's found or the
Host/PCI controller.
Change event notification to be handled in its own kernel thread.
This is needed since some events may require the driver to sleep
on some operations and this cannot be done during interrupt context.
With this change, it is now possible to create and destroy logical
volumes from FreeBSD, but it requires a native application to
construct the proper firmware commands which is not publicly
available.
Special thanks to John Cagle @ HP for providing remote access to
all the hardware and beating on the storage engineers at HP to
answer my questions.
Specifically, we used to enable the source after locking sched_lock
and just before we had already decided to do a context switch.
This meant that an ithread could never process more than one interrupt
per context switch. Enabling earlier in the loop before sched_lock is
acquired allows an ithread to handle multiple interrupts per context
switch if interrupts fire very rapidly. For the case of heavy interrupt
load this can reduce the number of context switches (and thus overhead)
as well as reduce interrupt latency.
- Now that we can handle multiple interrupts per context switch, add simple
interrupt storm protection to threaded interrupts. If X number of
consecutive interrupts are triggered before the itherad voluntarily
yields to another thread, then the interrupt thread will sleep with the
associated interrupt source disabled (masked) for 1/10th of a second.
The default value of X is 500, but it can be tweaked via the tunable/
sysctl hw.intr_storm_threshold. If an interrupt storm is detected, then
a message is output to the kernel console on the first occurrence per
interrupt thread. Interrupt storm protection can be disabled completely
by setting this value to 0. There is no scientific reasoning for the
1/10th of a second or 500 interrupts values, so they may require tweaking
at some point in the future.
Tested by: rwatson (an earlier version w/o the storm protection)
Tested by: mux (reportedly made a machine with two PCI interrupts
storming usable rather than hard locked)
Reviewed by: imp
different BIOSs use the same exact settings to mean two very different and
incompatible things for the SCI. Thus, if the SCI is remapped to a PCI
interrupt, we now trust the trigger/polarity that the MADT provides by
default. However, the SCI can be forced to level/lo as 1.10 did by setting
the tunable "hw.acpi.force_sci_lo" to a non-zero value from the loader.
Thus, if rev 1.10 caused an interrupt storm, it should nwo fix your
machine. If rev 1.10 fixed an interrupt storm on your machine, you
probably need to set the aforementioned tunable in /boot/loader.conf to
prevent the interrupt storm.
The more general problem of getting the SCI's trigger/polarity programmed
"correctly" (for some value of correctly meaning several workarounds for
broken BIOSs and inconsistent "implementations" of the ACPI standard) is
going to require more work, but this band-aid should improve the current
situation somewhat.
Requested by: njl
uiomove(9) is not properly locked. So, return to NEEDGIANT
mode. Later, when uiomove is finely locked, I'll revisit.
While I'm here, provide some temporary debugging output to
help catch blocking startups.
unconditionally initialize the mbuf header even if cluster allocation
failed, which could result in a NULL pointer dereference in low-memory
conditions.
PR: kern/65548
Submitted by: Stephan Uphoff <ups@tree.com>
the TAILQ_FOREACH() form.
Comment the need to store the same info (mac address for ethernet-type
devices) in two different places.
No functional changes. Even the compiler output should be unmodified
by this change.
of an interface. No functional change.
On passing, comment an useless invocation of TAILQ_INIT(&ifp->if_addrhead)
which could probably be removed in the interest of clarity.
of an interface. No functional change.
On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist()
which, if confirmed, would deserve to be fixed and MFC'ed
ntoskrnl_unlocl_dpc().
- hal_raise_irql(), hal_lower_irql() and hal_irql() didn't work right
on SMP (priority inheritance makes things... interesting). For now,
use only two states: DISPATCH_LEVEL (PI_REALTIME) and PASSIVE_LEVEL
(everything else). Tested on a dual PIII box.
- Use ndis_thsuspend() in ndis_sleep() instead of tsleep(). (I added
ndis_thsuspend() and ndis_thresume() to replace kthread_suspend()
and kthread_resume(); the former will preserve a thread's priority
when it wakes up, the latter will not.)
- Change use of tsleep() in ndis_stop_thread() to prevent priority
change on wakeup.
if the link-level address has been initialized already.
The majority of modern drivers never does this and works fine, which
makes me think that the check is totally unnecessary and a residue
of cut&paste from other drivers.
This change is done to simplify locking because now almost none of the
drivers uses this field. The exceptions are "ct" "ctau" and "cx"
where i am not sure if i can remove that part.
This avoids presenting invalid data to the client's applications
when the file is modified, and then extended within the window of
the resolution of the modifcation timestamp.
Reviewed By: iedowse
PR: kern/64091
because they bogusly check for defined(INTR_MPSAFE) -- something which
never was a #define. Correct the definitions.
This make INTR_TYPE_AV finally get used instead of the lower-priority
INTR_TYPE_TTY, so it's quite possible some improvement will be had
on sound driver performance. It would also make all the drivers
marked INTR_MPSAFE actually run without Giant (which does seem to
work for me), but:
INTR_MPSAFE HAS BEEN REMOVED FROM EVERY SOUND DRIVER!
It needs to be re-added on a case-by-case basis since there is no one
who will vouch for which sound drivers, if any, willy actually operate
correctly without Giant, since there hasn't been testing because of
this bug disabling INTR_MPSAFE.
Found by: "Yuriy Tsibizov" <Yuriy.Tsibizov@gfk.ru>
attempting to duplicate Windows spinlocks. Windows spinlocks differ
from FreeBSD spinlocks in the way they block preemption. FreeBSD
spinlocks use critical_enter(), which masks off _all_ interrupts.
This prevents any other threads from being scheduled, but it also
prevents ISRs from running. In Windows, preemption is achieved by
raising the processor IRQL to DISPATCH_LEVEL, which prevents other
threads from preempting you, but does _not_ prevent device ISRs
from running. (This is essentially what Solaris calls dispatcher
locks.) The Windows spinlock itself (kspin_lock) is just an integer
value which is atomically set when you acquire the lock and atomically
cleared when you release it.
FreeBSD doesn't have IRQ levels, so we have to cheat a little by
using thread priorities: normal thread priority is PASSIVE_LEVEL,
lowest interrupt thread priority is DISPATCH_LEVEL, highest thread
priority is DEVICE_LEVEL (PI_REALTIME) and critical_enter() is
HIGH_LEVEL. In practice, only PASSIVE_LEVEL and DISPATCH_LEVEL
matter to us. The immediate benefit of all this is that I no
longer have to rely on a mutex pool.
Now, I'm sure many people will be seized by the urge to criticize
me for doing an end run around our own spinlock implementation, but
it makes more sense to do it this way. Well, it does to me anyway.
Overview of the changes:
- Properly implement hal_lock(), hal_unlock(), hal_irql(),
hal_raise_irql() and hal_lower_irql() so that they more closely
resemble their Windows counterparts. The IRQL is determined by
thread priority.
- Make ntoskrnl_lock_dpc() and ntoskrnl_unlock_dpc() do what they do
in Windows, which is to atomically set/clear the lock value. These
routines are designed to be called from DISPATCH_LEVEL, and are
actually half of the work involved in acquiring/releasing spinlocks.
- Add FASTCALL1(), FASTCALL2() and FASTCALL3() macros/wrappers
that allow us to call a _fastcall function in spite of the fact
that our version of gcc doesn't support __attribute__((__fastcall__))
yet. The macros take 1, 2 or 3 arguments, respectively. We need
to call hal_lock(), hal_unlock() etc... ourselves, but can't really
invoke the function directly. I could have just made the underlying
functions native routines and put _fastcall wrappers around them for
the benefit of Windows binaries, but that would create needless bloat.
- Remove ndis_mtxpool and all references to it. We don't need it
anymore.
- Re-implement the NdisSpinLock routines so that they use hal_lock()
and friends like they do in Windows.
- Use the new spinlock methods for handling lookaside lists and
linked list updates in place of the mutex locks that were there
before.
- Remove mutex locking from ndis_isr() and ndis_intrhand() since they're
already called with ndis_intrmtx held in if_ndis.c.
- Put ndis_destroy_lock() code under explicit #ifdef notdef/#endif.
It turns out there are some drivers which stupidly free the memory
in which their spinlocks reside before calling ndis_destroy_lock()
on them (touch-after-free bug). The ADMtek wireless driver
is guilty of this faux pas. (Why this doesn't clobber Windows I
have no idea.)
- Make NdisDprAcquireSpinLock() and NdisDprReleaseSpinLock() into
real functions instead of aliasing them to NdisAcaquireSpinLock()
and NdisReleaseSpinLock(). The Dpr routines use
KeAcquireSpinLockAtDpcLevel() level and KeReleaseSpinLockFromDpcLevel(),
which acquires the lock without twiddling the IRQL.
- In ndis_linksts_done(), do _not_ call ndis_80211_getstate(). Some
drivers may call the status/status done callbacks as the result of
setting an OID: ndis_80211_getstate() gets OIDs, which means we
might cause the driver to recursively access some of its internal
structures unexpectedly. The ndis_ticktask() routine will call
ndis_80211_getstate() for us eventually anyway.
- Fix the channel setting code a little in ndis_80211_setstate(),
and initialize the channel to IEEE80211_CHAN_ANYC. (The Microsoft
spec says you're not supposed to twiddle the channel in BSS mode;
I may need to enforce this later.) This fixes the problems I was
having with the ADMtek adm8211 driver: we were setting the channel
to a non-standard default, which would cause it to fail to associate
in BSS mode.
- Use hal_raise_irql() to raise our IRQL to DISPATCH_LEVEL when
calling certain miniport routines, per the Microsoft documentation.
I think that's everything. Hopefully, other than fixing the ADMtek
driver, there should be no apparent change in behavior.