freebsd

mirror of https://git.FreeBSD.org/src.git synced 2025-01-25 16:13:17 +00:00

Author	SHA1	Message	Date
John Baldwin	a5b6b9a68e	Fix the triple fault used as a last resort during a reboot to actually fault. The previous method zero'd out the page tables, invalidated the TLB, and then entered a spin loop. The idea was that the instruction after the TLB invalidate would result in a page fault and the page fault and subsequent double fault wouldn't be able to determine the physical page for their fault handlers' first instruction. This stopped working when PGE (PG_G PTE/PDE bit) support was added as a TLB invalidate via %cr3 reload doesn't clear TLB entries with PG_G set. Thus, the CPU was still able to map the virtual address for the spin loop and happily performed its infinite loop. The triple fault now uses a much more deterministic sledge-hammer approach to generate a triple fault. First, the IDT descriptor is set to point to an empty IDT, so any interrupts (including a double fault) will instantly fault. Second, we trigger a int 3 breakpoint to force an interrupt and kick off a triple fault. MFC after: 3 days	2007-04-24 21:17:45 +00:00
John Baldwin	b72d374cee	Update comments for the 0xcf9 and 0x92 reset methods to explain what we are actually doing and what the various bits mean.	2007-04-24 15:16:27 +00:00
John Baldwin	194617769b	Tweak printf string.	2007-04-23 22:53:01 +00:00
Robert Watson	c14d15ae3e	Remove MAC Framework access control check entry points made redundant with the introduction of priv(9) and MAC Framework entry points for privilege checking/granting. These entry points exactly aligned with privileges and provided no additional security context: - mac_check_sysarch_ioperm() - mac_check_kld_unload() - mac_check_settime() - mac_check_system_nfsd() Add mpo_priv_check() implementations to Biba and LOMAC policies, which, for each privilege, determine if they can be granted to processes considered unprivileged by those two policies. These mostly, but not entirely, align with the set of privileges granted in jails. Obtained from: TrustedBSD Project	2007-04-22 15:31:22 +00:00
Stephan Uphoff	31b4f4a916	Modify TLB invalidation handling. Reviewed by: alc@, peter@ MFC after: 1 week	2007-04-21 14:17:30 +00:00
Stephane E. Potvin	0e5179e441	Add support for specifying a minimal size for vm.kmem_size in the loader via vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will be at least 256mb (for example) without forcing a particular value via vm.kmem_size. Approved by: njl (mentor) Reviewed by: alc	2007-04-21 01:14:48 +00:00
Poul-Henning Kamp	3f17cc74af	style nit	2007-04-19 09:18:51 +00:00
Poul-Henning Kamp	cc76e59ded	On AMD's Geode LX: Force the TSC to run through core-suspension so we can use it as a timecounter. Sponsored by: Soekris Engineering	2007-04-18 10:08:24 +00:00
John Baldwin	88a5255bc4	Honor the BUS_DMA_NOCACHE flag to bus_dmamem_alloc() on amd64 and i386 by mapping the pages as UC (uncacheable) using pmap_change_attr(). MFC after: 1 week Requested by: ariff Reviewed by: scottl	2007-04-17 21:05:34 +00:00
Alan Cox	0b76504872	Eliminate the misuse of PG_FRAME to truncate a virtual address to a virtual page boundary. Reviewed by: ru@	2007-04-13 16:07:29 +00:00
Alan Cox	1434c1f34b	MFamd64 Define PGEX_RSV.	2007-04-12 17:00:56 +00:00
Ruslan Ermilov	f76f6cfd25	Fix PAE on SMP by enabling EFER_NXE on all APs. Reported by: kris Diagnosed by: alc	2007-04-12 11:05:24 +00:00
Pawel Jakub Dawidek	fef2a25971	Remove trailing '.' for consistency!	2007-04-10 21:40:13 +00:00
Pawel Jakub Dawidek	57bcf75fd2	Add UFS_GJOURNAL options to the GENERIC kernel. Approved by: re (kensmith)	2007-04-10 16:49:41 +00:00
Ruslan Ermilov	2e137367b4	Add the PG_NX support for i386/PAE. Reviewed by: alc	2007-04-06 18:15:03 +00:00
Jung-uk Kim	357afa7113	MFP4: Turn emul_lock into a mutex. Submitted by: rdivacky	2007-04-02 18:38:13 +00:00
John Baldwin	4e7f640dfb	Optimize sx locks to use simple atomic operations for the common cases of obtaining and releasing shared and exclusive locks. The algorithms for manipulating the lock cookie are very similar to that rwlocks. This patch also adds support for exclusive locks using the same algorithm as mutexes. A new sx_init_flags() function has been added so that optional flags can be specified to alter a given locks behavior. The flags include SX_DUPOK, SX_NOWITNESS, SX_NOPROFILE, and SX_QUITE which are all identical in nature to the similar flags for mutexes. Adaptive spinning on select locks may be enabled by enabling the ADAPTIVE_SX kernel option. Only locks initialized with the SX_ADAPTIVESPIN flag via sx_init_flags() will adaptively spin. The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock() are now performed inline in non-debug kernels. As a result, <sys/sx.h> now requires <sys/lock.h> to be included prior to <sys/sx.h>. The new kernel option SX_NOINLINE can be used to disable the aforementioned inlining in non-debug kernels. The size of struct sx has changed, so the kernel ABI is probably greatly disturbed. MFC after: 1 month Submitted by: attilio Tested by: kris, pjd	2007-03-31 23:23:42 +00:00
Jung-uk Kim	46bd727a1e	Correct BB-profiling and adjust comments. Pointed out by: bde Reviewed by: bde	2007-03-31 01:47:37 +00:00
Jung-uk Kim	6a4abad780	Fix off-by-4 error in address validation for i386, reduce PCB reloading, and fix more style(9) nits. Pointed out by: bde Discussed with: kib Reviewd by: bde	2007-03-30 23:19:08 +00:00
Jung-uk Kim	80f87d5e55	Fix more style(9) nits[1] and remove unnecessary use of '#if !defined(_KERNEL)'. Pointed out by: bde[1]	2007-03-30 19:33:53 +00:00
Jung-uk Kim	6403d3a160	Use the same wisdom of sys/i386/i386/support.s 1.97 to remove obfuscation. Pointed out by: bde	2007-03-30 18:27:57 +00:00
Jung-uk Kim	a328699b34	MFP4: Linux futex support for amd64. Initial patch was submitted by kib and additional work was done by Divacky Roman. Tested by: emulation	2007-03-30 01:07:28 +00:00
Julian Elischer	6734f35eac	Implement the openat() linux syscall Submitted by: Roman Divacky (rdivacky@) MFC after: 2 weeks	2007-03-29 02:11:46 +00:00
Nick Hibma	f29fa1dfa4	Revisit the watchdogs: Resetting the error to EINVAL after failing to set the watchdog might hide the succesful arming of an earlier one. Accept that on failing to arm any watchdog (because of non-supported timeouts) EOPNOTSUPP is returned instead of the more appropriate EINVAL. MFC after: 3 days	2007-03-27 21:03:37 +00:00
Kris Kennaway	67eae018cb	Remove unnecessary giant acquisition around panic in #ifdef DIAGNOSTIC code. # There is some question about whether this code is even relevant any # longer (it dates back to prehistoric times, i.e. present in r1.1), # especially on amd64. Reviewed by: jhb	2007-03-26 21:45:44 +00:00
Nate Lawson	0d4ac62a35	Add an interface for drivers to be notified of changes to CPU frequency. cpufreq_pre_change is called before the change, giving each driver a chance to revoke the change. cpufreq_post_change provides the results of the change (success or failure). cpufreq_levels_changed gives the unit number of the cpufreq device whose number of available levels has changed. Hook in all the drivers I could find that needed it. * TSC: update TSC frequency value. When the available levels change, take the highest possible level and notify the timecounter set_cputicker() of that freq. This gets rid of the "calcru: runtime went backwards" messages. * identcpu: updates the sysctl hw.clockrate value * Profiling: if profiling is active when the clock changes, let the user know the results may be inaccurate. Reviewed by: bde, phk MFC after: 1 month	2007-03-26 18:03:29 +00:00
John Baldwin	9a53f6d97a	Fix a silly bogon that broke ibcs2_rename(). CID: 1065 Found by: Coverity Prevent (tm) Reported by: netchild	2007-03-26 15:39:49 +00:00
Alan Cox	8a5e898d63	In order to satisfy ACPI's need for an identity mapping, modify the temporary mapping created by locore so that the lowest two to four megabytes can become a permanent identity mapping. This implementation avoids any use of a large page mapping.	2007-03-24 19:53:22 +00:00
Jung-uk Kim	2be4e4713a	Catch up with ACPI-CA 20070320 import.	2007-03-22 18:16:43 +00:00
John Baldwin	d66ff27773	Change the amd64, i386, and ia64 nexus drivers to setup bus space tags and handles when activating a resource via bus_activate_resource() rather than doing some of the work in bus_alloc_resource() and some of it in bus_activate_resource(). One note is that when using isa_alloc_resourcev() on PC-98, drivers now need to just use bus_release_resource() without explicitly calling bus_deactivate_resource() first. nyan@ has already fixed all of the PC-98 drivers.	2007-03-21 15:36:38 +00:00
John Baldwin	b8783b00f8	Add a new apic0 psuedo-device to claim memory resources for the memory address ranges used by local and I/O APICs in the system. Some systems also reserve these ranges as system resources via either PnPBIOS or ACPI, so this device currently attaches after acpi0 and legacy0 so that the system resources are given precedence.	2007-03-20 21:53:31 +00:00
John Baldwin	95a07592ee	Add a new ram0 pseudo-device that claims memory resouces for physical addresses corresponding to system RAM. On amd64 ram0 uses the SMAP and claims all the type 1 SMAP regions. On i386 ram0 uses the dump_avail[] array. Note that on i386 we have to ignore regions above 4G in PAE kernels since bus resources use longs.	2007-03-20 21:08:39 +00:00
Jung-uk Kim	2498f259d4	- Add macros for newly added CPUID bits in the corresponding header files. - Use correct capticalization in xTPR as Intel uses in their documents. - Use proper description instead of vendor code name in comment.	2007-03-20 20:22:45 +00:00
John Baldwin	ce533e82a2	Tweak the probe/attach order of devices on the x86 nexus devices. Various BIOS-related psuedo-devices are added at an order of 5. acpi0 is added at an order of 10, and legacy0 is added at an order of 11.	2007-03-20 20:21:44 +00:00
Sam Leffler	09dce7a13d	display two new Intel feature bits Submitted by: "Rui Paulo" <rpaulo@gmail.com> MFC after: 2 weeks	2007-03-19 05:23:42 +00:00
Alan Cox	8cfba7267f	Eliminate an unused parameter.	2007-03-17 19:42:06 +00:00
Nate Lawson	9b0df55b61	Create an identity mapping (V=P) super page for the low memory region on boot. Then, just switch to the kernel pmap when suspending instead of allocating/freeing our own mapping every time. This should solve a panic of pmap_remove() being called with interrupts disabled. Thanks to Alan Cox for developing this patch. Note: this means that ACPI requires super page (PG_PS) support in the CPU. This has been present since the Pentium and first documented in the Pentium Pro. However, it may need to be revisited later. Submitted by: alc MFC after: 1 month	2007-03-14 22:30:02 +00:00
Jung-uk Kim	ab5916a526	Add another CPUID for AMD CPUs and fix style(9) while I am here.	2007-03-12 20:27:21 +00:00
Alan Cox	c640357f04	Push down the implementation of PCPU_LAZY_INC() into the machine-dependent header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it atomic with respect to interrupts. Reviewed by: bde, jhb	2007-03-11 05:54:29 +00:00
John Baldwin	07bb813626	Defer calling lapic_init() until we've completed the 'MPTable: <...>' printf. Otherwise, printfs inside of lapic_init() (such as during a verbose boot) can uglify the output.	2007-03-09 15:49:57 +00:00
Mohan Srinivasan	f9bb753844	Over NFS, an open() call could result in multiple over-the-wire GETATTRs being generated - one from lookup()/namei() and the other from nfs_open() (for cto consistency). This change eliminates the GETATTR in nfs_open() if an otw GETATTR was done from the namei() path. Instead of extending the vop interface, we timestamp each attr load, and use this to detect whether a GETATTR was done from namei() for this syscall. Introduces a thread-local variable that counts the syscalls made by the thread and uses <pid, tid, thread syscalls> as the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on thread state that could be used as the timestamp with minimal overhead.	2007-03-09 04:02:38 +00:00
Scott Long	94e6bc303f	Don't increment total_bounced when doing no-op dmamap_sync ops.	2007-03-06 18:28:43 +00:00
John Baldwin	4c5bec1161	Change the x86 interrupt code to use FreeBSD CPU IDs (i.e. PCPU_GET(cpuid)) rather than local APIC IDs to keep track of CPUs which can handle interrupts.	2007-03-06 17:16:47 +00:00
John Baldwin	6cd68eaa7e	Trim trailing whitespace.	2007-03-05 22:27:55 +00:00
Alan Cox	8da3fc95a7	Acquiring smp_ipi_mtx on every call to pmap_invalidate_() is wasteful. For example, during a buildworld more than half of the calls do not generate an IPI because the only TLB entry invalidated is on the calling processor. This revision pushes down the acquisition and release of smp_ipi_mtx into smp_tlb_shootdown() and smp_targeted_tlb_shootdown() and instead uses sched_pin() and sched_unpin() in pmap_invalidate_() so that thread migration doesn't lead to a missed TLB invalidation. Reviewed by: jhb MFC after: 3 weeks	2007-03-05 21:40:10 +00:00
John Baldwin	aa7a005ee0	Use vm_paddr_t rather than uintptr_t when passing the physical address of APICs to lapic_init() and ioapic_create().	2007-03-05 20:35:17 +00:00
John Baldwin	db181dc741	Add a simple device driver to "eat" any I/O APICs that show up as PCI devices. MFC after: 1 week	2007-03-05 16:22:49 +00:00
Bruce Evans	d78180f8f5	Partial fix for a bug in rev.1.231. If suspend/resume clobbers the RTC state, then it may clobber the RTC index register, so the index register must be restored before using it to restore control registers in rtc_restore(). The following problems remain: - rtc_restore() is only called if pmtimer is configured. Buggy suspend/resumes are more likely to clobber the index register than a control register, so pmtimer is more needed than it used to be. - pmtimer doesn't exist for amd64. - Restoring of the RTC state may race with rtcintr(). If an RTC interrupt is handled before the state is restored, then rtcin(RTC_INTR) in rtcintr() may read from the wrong register, so rtcintr() may spin forever. This may be mitigated by the most common state clobbering being to turn off RTC interrupts.	2007-03-05 09:10:17 +00:00
Yoshihiro Takahashi	de038d5ffc	style(9).	2007-03-04 04:55:19 +00:00
Jung-uk Kim	a4e3bad794	MFP4: 115220, 115222 - Fix style(9) and reduce diff between amd64 and i386. - Prefix Linuxulator macros with LINUX_ to prevent future collision.	2007-03-02 00:08:47 +00:00
John Baldwin	4d70511ac3	Use pause() rather than tsleep() on stack variables and function pointers.	2007-02-27 17:23:29 +00:00
Jung-uk Kim	6a5964d385	MFP4: 115094 Linux does not check file descriptor when MAP_ANONYMOUS is set. This should fix recent LTP test regressions. Reported by: Scot Hetzel (swhetzel at gmail dot com) netchild	2007-02-27 02:08:01 +00:00
Alexander Leidinger	802e08a360	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Dont "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
John Baldwin	860d8e2312	Use ih_filter instead of ih_handler in a couple of places. This fixes most INTR_FAST handlers on i386. Reviewed by: piso	2007-02-23 20:03:24 +00:00
Paolo Pisati	ef544f6312	o break newbus api: add a new argument of type driver_filter_t to bus_setup_intr() o add an int return code to all fast handlers o retire INTR_FAST/IH_FAST For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current Reviewed by: many Approved by: re@	2007-02-23 12:19:07 +00:00
Konstantin Belousov	3c97ab97bf	Unbreak ddb stepping over special frames after the following commit: Revision Changes Path 1.113 +4 -2 src/sys/i386/i386/apic_vector.s 1.117 +7 -1 src/sys/i386/i386/exception.s 1.36 +7 -7 src/sys/i386/i386/local_apic.c 1.298 +61 -63 src/sys/i386/i386/trap.c 1.62 +15 -22 src/sys/i386/i386/vm86.c 1.32 +4 -2 src/sys/i386/i386/vm86bios.s 1.21 +2 -2 src/sys/i386/include/apicvar.h 1.27 +2 -2 src/sys/i386/isa/atpic.c 1.50 +2 -1 src/sys/i386/isa/atpic_vector.s 1.35 +1 -1 src/sys/i386/isa/icu.h Tested by: kris, Peter Holm No objections from: kmacy	2007-02-19 10:57:47 +00:00
Alan Cox	ae0663a383	Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.	2007-02-18 06:33:02 +00:00
John Baldwin	9be403be00	Add bootverbose printfs to indicate which IDT vectors are assigned to MSI interrupts.	2007-02-15 22:22:57 +00:00
Jung-uk Kim	1441cab7b2	Regen.	2007-02-15 00:57:04 +00:00
Jung-uk Kim	10931a467a	MFP4: 113025, 113146, 113177, 113203, 113500, 113546, 113570 - PROT_READ, PROT_WRITE, or PROT_EXEC implies PROT_READ and PROT_EXEC. Linux/ia64's i386 emulation layer does this and it complies with Linux header files. This fixes mmap05 LTP test case on amd64. - Do not adjust stack size when failure has occurred. - Synchronize i386 mmap/mprotect with amd64.	2007-02-15 00:54:40 +00:00
Brooks Davis	983f970981	Include GEOM_LABEL in GENERIC. It's very useful and not well publicized enough. Approved by: pjd	2007-02-09 19:03:18 +00:00
John Baldwin	71ddf30bd2	Don't send interrupts to CPUs disabled via lapic hints. Reported by: Ludger Bolmerg <lbolmerg ! web.de> MFC after: 3 days Pointy hat to: jhb	2007-02-08 16:49:59 +00:00
Marcel Moolenaar	1d3aed33e8	Evolve the ctlreq interface added to geom_gpt into a generic partitioning class that supports multiple schemes. Current schemes supported are APM (Apple Partition Map) and GPT. Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM and GEOM_PART_GPT (resp). The ctlreq interface supports verbs to create and destroy partitioning schemes on a disk; to add, delete and modify partitions; and to commit or undo changes made.	2007-02-07 18:55:31 +00:00
Bruce Evans	1300fd67f3	Fixed some style bugs. Routine except: - don't use __GNUCLIKE___OFFSETOF, since __offsetof() is a standard FreeBSD implementaion detail which has nothing to do with GNUC.	2007-02-06 18:04:02 +00:00
Bruce Evans	3764a82377	Simplified PCPU_GET() and PCPU_SET(). We must copy through a temporary variable to avoid invalid constraints in dead code. Use an array of u_char's (inside a struct) instead of a char/short/int/long variable so that the variable and its accesses can be spelled in the same way in all cases and code doesn't need to be cloned just to hold the spelling differences. Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET(). Cast to (void ) as in rev.1.37 of the i386 version where the errors were fixed for the i386 PCPU_GET() only. It would be more correct to copy to and from the temp. variable using memcpy(), but then an ifdef tangle would be required to ensure using the builtin memcpy(). We depend on fairly aggressive optimization to put the temp. variable only in a register despite it being copied using (type )(void )&anothertype and could depend on this when using memcpy() too. This seems to work right even for -O0, but the -O0 case has not been completely tested. This change gives identical object code for all object files in LINT on amd64 (except for one file with a __TIME__ stamp). For LINT on i386 it gives unimportant differences in instruction order and padding in a few object files. This was only tested for -O. This change (actually a previous version of it) gives the following reductions in the number of object files in LINT that fail to compile with -O2 but without the -fno-strict-aliasing kludge: - amd64: 29 (down from 211) - i386: 36 (down from 47) gcc-3.4.6 actually allows the invalid constraints that result from not using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes on them and I don't want to depend on compiler bugs.	2007-02-06 16:21:09 +00:00
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	a9ccaccfc3	Fix LOR that occurs because proctree_lock was acquired while holding emuldata lock by moving the code upwards outside the emul_lock coverage. Submitted by: rdivacky	2007-02-01 13:27:52 +00:00
Robert Watson	b1e5dcf778	As we now have an SFB_NOWAIT flag, change 'will' to 'may' where the comment for sf_buf_alloc(9) talks about sleeping.	2007-01-28 17:39:03 +00:00
Bruno Ducrot	8867dfa953	o introduce a flags 'errata' for HW bugs onto the softc. o remove errata_a0 and introduce the corresponding flags into 'errata'. o introduce a new errata for K8, namely some platform might set the PENDING_BIT but aren't able to unset it, also don't loop forever waiting PENDING_BIT being cleared. o try to introduce a workaround for the PENDING_BIT stuck problem, o support now half multipliers for K8. Tested by: Abdullah Al-Marrie Approved by: njl	2007-01-23 19:20:30 +00:00
Jeff Roberson	f0393f063a	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.	2007-01-23 08:46:51 +00:00
Jeff Roberson	3c93ca7d2f	- Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision for this behavior on the initiator side.	2007-01-23 08:38:39 +00:00
Bruce Evans	71799af2d5	Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported. Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers. This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.	2007-01-23 08:01:20 +00:00
John Baldwin	5fe82bca57	Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support. - First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages for example where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages. - Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than default of using the first N slots (or indicies) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ. Tested by: scottl	2007-01-22 21:48:44 +00:00
Alexander Leidinger	d071f5048c	MFp4 (113077, 113083, 113103, 113124, 113097): Dont expose em->shared to the outside world before its properly initialized. Might not affect anything but its at least a better coding style. Dont expose em via p->p_emuldata until its properly initialized. This also enables us to get rid of some locking and simplify the code because we are workin on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
Xin LI	f67af5c918	Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.	2007-01-17 15:05:52 +00:00
Alexander Leidinger	973ac082f8	MFp4 (112893): Make linux_vfork() actually work. This enables make to work again with 2.6. It also fixes the LTP vfork tests. Submitted by: rdivacky	2007-01-14 16:20:37 +00:00
Kip Macy	6f3a846eeb	exclude the icu and clock lock from LOCK_PROFILING	2007-01-14 02:13:07 +00:00
Warner Losh	fed32d7544	Remove 3rd clause, renumber, ok per email	2007-01-12 07:26:21 +00:00
John Baldwin	cad688f011	Remove magic from rman_activate_resource() that uses the direct map at KERNBASE for the first 1 MB of RAM instead of calling pmap_mapdev(). pmap_mapdev() knows how to handle the first 1 MB (and has known for a while now) and properly maps the memory as UC to boot. MFC after: 2 weeks	2007-01-11 19:40:19 +00:00
Jeff Roberson	b31e373bf7	- Use the correct test in the ipi bitmask handler for IPI_PREEMPT so that we actually issue preemptions. - Remove the #ifdef IPI_PREEMPTION so it is always compiled in. Leave the option which optionally enables support in sched_4bsd. sched_ule.c will soon use this functionality as a run time rather than compile time option. - Compare against the idlethread rather than the priority. There are some idle prio tasks that we can preempt. Discussed with: ups Tested on: i386, amd64	2007-01-11 00:17:02 +00:00
Jung-uk Kim	5efc6c44ff	Add SSSE3 extensions and correct CNXT-ID spelling for Intel processors.	2007-01-09 19:23:22 +00:00
Alexander Leidinger	1c65504ca8	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
Alexander Leidinger	99e9dcf022	regen after addition of linux_utimes and linux_rt_sigtimedwait	2006-12-31 13:20:31 +00:00
Alexander Leidinger	c9447c7551	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
Bruce Evans	0b194ec872	Fix oops in previous commit.	2006-12-29 15:48:18 +00:00
Bruce Evans	f28e1c8f99	Fixed some style bugs (mainly assorted errors in comments, and inconsistent spelling of `result').	2006-12-29 15:29:49 +00:00
Bruce Evans	6c296ffa81	Fixed some style bugs (whitespace only).	2006-12-29 14:28:23 +00:00
Bruce Evans	7e4277e591	Try harder to garbage-collect the "LOCORE" (really asm) version of MPLOCKED. The cleaning in rev.1.25 was supposed to have been undone by rev.1.26, but 1.26 could never have actually affected asm files since atomic.h is full of C declarations so including it in asm files would just give syntax errors. The asm MPLOCKED is even less needed than when misplaced definitions of it were first removed, and is now unused in any asm file in the src tree except in anachronismns in sys/i386/i386/support.s.	2006-12-29 13:36:26 +00:00
Bruce Evans	26ab2d1d23	Avoid an instruction in atomic_cmpset_{int_long)() in most cases. These functions are used a lot for mutexes, so this reduces the text size of an average kernel by about 0.75%. This wasn't intended to be a significant optimization, but it somehow increased the maximum number of packets per second that can be transmitted by my bge hardware from 320000 to 460000 (this benchmark is CPU-bound and remarkably sensitive to changes in the text section). Details: we would prefer to leave the result of the cmpxchg in %al, but cannot tell gcc that it is there, so we have to convert it to an integer register. We converted to %al, then to %[re]ax, but the latter step is usually wasted since gcc usually only wants the condition code and can recover it from %al just as easily as from %[re]ax. Let gcc promote %al in the few cases where this is needed. Nearby style fixes; - let gcc manage the load of `res', and don't abuse `res' for a copy of `exp' - don't echo `res's name in comments - consistently spell the condition code as 'e' after comparison for equality - don't hard-code %al anywhere except in constraints - for the version that doesn't use cmpxchg, there is no requirement to use %al anywhere, so don't hard-code it in the constraints either. Style non-fix: - for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al) for the main output operand, although this is not required. The input and output operands that use the "a" constraint are now decoupled, and this makes things clearer except for the reason that the output register is hard-coded. It is now just a hack to tell gcc that the input "a" has been clobbered without increasing the number of operands.	2006-12-27 20:26:00 +00:00
Jung-uk Kim	5e448826b7	Regen (just to fix 'generated from' line from the previous commit).	2006-12-20 20:42:58 +00:00
Jung-uk Kim	8187e7d7ad	Add linux_nanosleep() and regen.	2006-12-20 20:21:48 +00:00
Jung-uk Kim	77424f4177	MFP4: 109655 - Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to src/sys/compat/linux/linux_time.c. - Validate timespec ranges before use as Linux kernel does. - Fix l_timespec structure. - Clean up style(9) nits.	2006-12-20 20:17:35 +00:00
David Xu	4e32b7b3cc	Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allow us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stablize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance. Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130	2006-12-20 04:40:39 +00:00
Kip Macy	a5c5d4402c	Evidently FreeBSD has long relied on the compiler to treat structures passed by value (trap frames) as if they were in fact being passed by reference. For better or worse, this incorrect behaviour is no longer present in gcc 4.1. In this patch I convert all trapframe arguments to be explicitly pass by reference. I also remove vm86_initflags, pushing the very little work that it actually does up into vm86_prepcall. Reviewed by: kan Tested by: kan	2006-12-17 05:07:01 +00:00
Kip Macy	2c1709c67b	vm86_initflags was causing gcc41 and even gcc346 to get rather confused - de-obfuscate Suggested by: kan Reviewed by: kan Tested by: kan	2006-12-17 03:17:46 +00:00
Nick Hibma	9079fff550	Align the interfaces for the various watchdogs and make the interface behave as expected. Also: - Return an error if WD_PASSIVE is passed in to the ioctl as only WD_ACTIVE is implemented at the moment. See sys/watchdog.h for an explanation of the difference between WD_ACTIVE and WD_PASSIVE. - Remove the I_HAVE_TOTALLY_LOST_MY_SENSE_OF_HUMOR define. If you've lost your sense of humor, than don't add a define. Specific changes: i80321_wdog.c Don't roll your own passive watchdog tickle as this would defeat the purpose of an active (userland) watchdog tickle. ichwd.c / ipmi.c: WD_ACTIVE means active patting of the watchdog by a userland process, not whether the watchdog is active. See sys/watchdog.h. kern_clock.c: (software watchdog) Remove a check for WD_ACTIVE as this does not make sense here. This reverts r1.181.	2006-12-15 21:44:49 +00:00
Pyun YongHyeon	1f90cf9895	Add msk(4) to the list of drivers supported by GENERIC kernel.	2006-12-13 03:41:47 +00:00
John Baldwin	8964299ac8	Give Host-PCI bridge drivers their own pcib_alloc_msi() and pcib_alloc_msix() methods instead of using the method from the generic PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI specific logic soon.	2006-12-12 19:27:01 +00:00
John Baldwin	fde45e231a	Sort function prototypes.	2006-12-12 19:24:45 +00:00

1 2 3 4 5 ...

11010 Commits