freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-23 11:18:54 +00:00

Author	SHA1	Message	Date
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	a9ccaccfc3	Fix LOR that occurs because proctree_lock was acquired while holding emuldata lock by moving the code upwards outside the emul_lock coverage. Submitted by: rdivacky	2007-02-01 13:27:52 +00:00
Robert Watson	b1e5dcf778	As we now have an SFB_NOWAIT flag, change 'will' to 'may' where the comment for sf_buf_alloc(9) talks about sleeping.	2007-01-28 17:39:03 +00:00
Bruno Ducrot	8867dfa953	o introduce a flags 'errata' for HW bugs onto the softc. o remove errata_a0 and introduce the corresponding flags into 'errata'. o introduce a new errata for K8, namely some platform might set the PENDING_BIT but aren't able to unset it, also don't loop forever waiting PENDING_BIT being cleared. o try to introduce a workaround for the PENDING_BIT stuck problem, o support now half multipliers for K8. Tested by: Abdullah Al-Marrie Approved by: njl	2007-01-23 19:20:30 +00:00
Jeff Roberson	f0393f063a	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.	2007-01-23 08:46:51 +00:00
Jeff Roberson	3c93ca7d2f	- Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision for this behavior on the initiator side.	2007-01-23 08:38:39 +00:00
Bruce Evans	71799af2d5	Cleaned up declaration and initialization of clock_lock. It is only used by clock code, so don't export it to the world for machdep.c to initialize. There is a minor problem initializing it before it is used, since although clock initialization is split up so that parts of it can be done early, the first part was never done early enough to actually work. Split it up a bit more and do the first part as late as possible to document the necessary order. The functions that implement the split are still bogusly exported. Cleaned up initialization of the i8254 clock hardware using the new split. Actually initialize it early enough, and don't work around it not being initialized in DELAY() when DELAY() is called early for initialization of some console drivers. This unfortunately moves a little more code before the early debugger breakpoint so that it is harder to debug. The ordering of console and related initialization is delicate because we want to do as little as possible before the breakpoint, but must initialize a console.	2007-01-23 08:01:20 +00:00
John Baldwin	5fe82bca57	Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support. - First off, device drivers really do need to know if they are allocating MSI or MSI-X messages. MSI requires allocating powerof2() messages for example where MSI-X does not. To address this, split out the MSI-X support from pci_msi_count() and pci_alloc_msi() into new driver-visible functions pci_msix_count() and pci_alloc_msix(). As a result, pci_msi_count() now just returns a count of the max supported MSI messages for the device, and pci_alloc_msi() only tries to allocate MSI messages. To get a count of the max supported MSI-X messages, use pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix(). pci_release_msi() still handles both MSI and MSI-X messages, however. As a result of this change, drivers using the existing API will only use MSI messages and will no longer try to use MSI-X messages. - Because MSI-X allows for each message to have its own data and address values (and thus does not require all of the messages to have their MD vectors allocated as a group), some devices allow for "sparse" use of MSI-X message slots. For example, if a device supports 8 messages but the OS is only able to allocate 2 messages, the device may make the best use of 2 IRQs if it enables the messages at slots 1 and 4 rather than default of using the first N slots (or indices) at 1 and 2. To support this, add a new pci_remap_msix() function that a driver may call after a successful pci_alloc_msix() (but before allocating any of the SYS_RES_IRQ resources) to allow the allocated IRQ resources to be assigned to different message indices. For example, from the earlier example, after pci_alloc_msix() returned a value of 2, the driver would call pci_remap_msix() passing in array of integers { 1, 4 } as the new message indices to use. The rid's for the SYS_RES_IRQ resources will always match the message indices. Thus, after the call to pci_remap_msix() the driver would be able to access the first message in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based rather than 0-based so that they will always correspond to the rid values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt). To support this API, a new PCIB_REMAP_MSIX() method was added to the pcib interface to change the message index for a single IRQ. Tested by: scottl	2007-01-22 21:48:44 +00:00
Alexander Leidinger	d071f5048c	MFp4 (113077, 113083, 113103, 113124, 113097): Don't expose em->shared to the outside world before it's properly initialized. Might not affect anything but it's at least a better coding style. Don't expose em via p->p_emuldata until it's properly initialized. This also enables us to get rid of some locking and simplify the code because we are working on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
Xin LI	f67af5c918	Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.	2007-01-17 15:05:52 +00:00
Alexander Leidinger	973ac082f8	MFp4 (112893): Make linux_vfork() actually work. This enables make to work again with 2.6. It also fixes the LTP vfork tests. Submitted by: rdivacky	2007-01-14 16:20:37 +00:00
Kip Macy	6f3a846eeb	exclude the icu and clock lock from LOCK_PROFILING	2007-01-14 02:13:07 +00:00
Warner Losh	fed32d7544	Remove 3rd clause, renumber, ok per email	2007-01-12 07:26:21 +00:00
John Baldwin	cad688f011	Remove magic from rman_activate_resource() that uses the direct map at KERNBASE for the first 1 MB of RAM instead of calling pmap_mapdev(). pmap_mapdev() knows how to handle the first 1 MB (and has known for a while now) and properly maps the memory as UC to boot. MFC after: 2 weeks	2007-01-11 19:40:19 +00:00
Jeff Roberson	b31e373bf7	- Use the correct test in the ipi bitmask handler for IPI_PREEMPT so that we actually issue preemptions. - Remove the #ifdef IPI_PREEMPTION so it is always compiled in. Leave the option which optionally enables support in sched_4bsd. sched_ule.c will soon use this functionality as a run time rather than compile time option. - Compare against the idlethread rather than the priority. There are some idle prio tasks that we can preempt. Discussed with: ups Tested on: i386, amd64	2007-01-11 00:17:02 +00:00
Jung-uk Kim	5efc6c44ff	Add SSSE3 extensions and correct CNXT-ID spelling for Intel processors.	2007-01-09 19:23:22 +00:00
Alexander Leidinger	1c65504ca8	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
Alexander Leidinger	99e9dcf022	regen after addition of linux_utimes and linux_rt_sigtimedwait	2006-12-31 13:20:31 +00:00
Alexander Leidinger	c9447c7551	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
Bruce Evans	0b194ec872	Fix oops in previous commit.	2006-12-29 15:48:18 +00:00
Bruce Evans	f28e1c8f99	Fixed some style bugs (mainly assorted errors in comments, and inconsistent spelling of `result').	2006-12-29 15:29:49 +00:00
Bruce Evans	6c296ffa81	Fixed some style bugs (whitespace only).	2006-12-29 14:28:23 +00:00
Bruce Evans	7e4277e591	Try harder to garbage-collect the "LOCORE" (really asm) version of MPLOCKED. The cleaning in rev.1.25 was supposed to have been undone by rev.1.26, but 1.26 could never have actually affected asm files since atomic.h is full of C declarations so including it in asm files would just give syntax errors. The asm MPLOCKED is even less needed than when misplaced definitions of it were first removed, and is now unused in any asm file in the src tree except in anachronisms in sys/i386/i386/support.s.	2006-12-29 13:36:26 +00:00
Bruce Evans	26ab2d1d23	Avoid an instruction in atomic_cmpset_{int,long}() in most cases. These functions are used a lot for mutexes, so this reduces the text size of an average kernel by about 0.75%. This wasn't intended to be a significant optimization, but it somehow increased the maximum number of packets per second that can be transmitted by my bge hardware from 320000 to 460000 (this benchmark is CPU-bound and remarkably sensitive to changes in the text section). Details: we would prefer to leave the result of the cmpxchg in %al, but cannot tell gcc that it is there, so we have to convert it to an integer register. We converted to %al, then to %[re]ax, but the latter step is usually wasted since gcc usually only wants the condition code and can recover it from %al just as easily as from %[re]ax. Let gcc promote %al in the few cases where this is needed. Nearby style fixes: - let gcc manage the load of `res', and don't abuse `res' for a copy of `exp' - don't echo `res's name in comments - consistently spell the condition code as 'e' after comparison for equality - don't hard-code %al anywhere except in constraints - for the version that doesn't use cmpxchg, there is no requirement to use %al anywhere, so don't hard-code it in the constraints either. Style non-fix: - for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al) for the main output operand, although this is not required. The input and output operands that use the "a" constraint are now decoupled, and this makes things clearer except for the reason that the output register is hard-coded. It is now just a hack to tell gcc that the input "a" has been clobbered without increasing the number of operands.	2006-12-27 20:26:00 +00:00
Jung-uk Kim	5e448826b7	Regen (just to fix 'generated from' line from the previous commit).	2006-12-20 20:42:58 +00:00
Jung-uk Kim	8187e7d7ad	Add linux_nanosleep() and regen.	2006-12-20 20:21:48 +00:00
Jung-uk Kim	77424f4177	MFP4: 109655 - Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to src/sys/compat/linux/linux_time.c. - Validate timespec ranges before use as Linux kernel does. - Fix l_timespec structure. - Clean up style(9) nits.	2006-12-20 20:17:35 +00:00
David Xu	4e32b7b3cc	Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allows us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stabilize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance. Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used as a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130	2006-12-20 04:40:39 +00:00
Kip Macy	a5c5d4402c	Evidently FreeBSD has long relied on the compiler to treat structures passed by value (trap frames) as if they were in fact being passed by reference. For better or worse, this incorrect behaviour is no longer present in gcc 4.1. In this patch I convert all trapframe arguments to be explicitly pass by reference. I also remove vm86_initflags, pushing the very little work that it actually does up into vm86_prepcall. Reviewed by: kan Tested by: kan	2006-12-17 05:07:01 +00:00
Kip Macy	2c1709c67b	vm86_initflags was causing gcc41 and even gcc346 to get rather confused - de-obfuscate Suggested by: kan Reviewed by: kan Tested by: kan	2006-12-17 03:17:46 +00:00
Nick Hibma	9079fff550	Align the interfaces for the various watchdogs and make the interface behave as expected. Also: - Return an error if WD_PASSIVE is passed in to the ioctl as only WD_ACTIVE is implemented at the moment. See sys/watchdog.h for an explanation of the difference between WD_ACTIVE and WD_PASSIVE. - Remove the I_HAVE_TOTALLY_LOST_MY_SENSE_OF_HUMOR define. If you've lost your sense of humor, then don't add a define. Specific changes: i80321_wdog.c Don't roll your own passive watchdog tickle as this would defeat the purpose of an active (userland) watchdog tickle. ichwd.c / ipmi.c: WD_ACTIVE means active patting of the watchdog by a userland process, not whether the watchdog is active. See sys/watchdog.h. kern_clock.c: (software watchdog) Remove a check for WD_ACTIVE as this does not make sense here. This reverts r1.181.	2006-12-15 21:44:49 +00:00
Pyun YongHyeon	1f90cf9895	Add msk(4) to the list of drivers supported by GENERIC kernel.	2006-12-13 03:41:47 +00:00
John Baldwin	8964299ac8	Give Host-PCI bridge drivers their own pcib_alloc_msi() and pcib_alloc_msix() methods instead of using the method from the generic PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI specific logic soon.	2006-12-12 19:27:01 +00:00
John Baldwin	fde45e231a	Sort function prototypes.	2006-12-12 19:24:45 +00:00
John Baldwin	d748ef4792	Replace a few magic numbers.	2006-12-12 19:23:52 +00:00
John Baldwin	c304531851	Add a function to return the MD interrupt source cookie associated with an interrupt event. Use this in the x86 code to fixup the intrcnt names when an interrupt handler is removed.	2006-12-12 19:20:19 +00:00
Maxim Sobolev	efa43a53bd	Allow machdep.cpu_idle_hlt to be set from the loader. This should allow to workaround the problem with SMP kernels on Turion64 X2 processors described in kern/104678 and may be useful in other situations too. MFC after: 3 days	2006-12-06 18:27:17 +00:00
Julian Elischer	ad1e7d285a	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
Bruce Evans	b73057227b	Optimized RTC accesses by avoiding null writes to the index register and by only delaying when an RTC register is written to. The delay after writing to the data register is now not just a workaround. This reduces the number of ISA accesses in the usual case from 4 to 1. The usual case is 2 rtcin()'s for each RTC interrupt. The index register is almost always RTC_INTR for this. The 3 extra ISA accesses were 1 for writing the index and 2 for delays. Some delays are needed in theory, but in practice they now just slow down slow accesses some more since almost everyone including us does them wrong so modern systems enforce sufficient delays in hardware. I used to have the delays ifdefed out, but with the index register optimization the delays are rarely executed so the old magic ones can be kept or even implemented non-magically without significant cost. Optimizing RTC interrupt handling is more interesting than it used to be because RTC interrupts are currently needed to fix the more efficient apic timer interrupts on some systems. apic_timer_hz is normally 2000 so the RTC interrupt rate needs to be 2048 to keep the apic timer firing on such systems. Without these changes, each RTC interrupt normally took 10 ISA accesses (2 PIC accesses and 2 sets of 4 RTC accesses). Each ISA access takes 1-1.5uS so 10 of them at 2048 Hz takes 2-3% of a CPU. Now 4 of them take 0.8-1.2% of a CPU.	2006-12-03 03:49:28 +00:00
John Birrell	e0b651251d	Turn console printf buffering into a kernel option and only on by default for sun4v where it is absolutely required. This change moves the buffer from struct pcpu to the stack to avoid using the critical section which created a LOR in a couple of cases due to interaction with the tty code and kqueue. The LOR can't be fixed with the critical section and the pcpu buffer can't be used without the critical section. Putting the buffer on the stack was my initial solution, but it was pointed out that the stress on the stack might cause problems depending on the call path. We don't have a way of creating tests for those possible cases, so it's best to leave this as an option for the time being. In time we may get enough data to enable this option more generally.	2006-11-30 04:17:05 +00:00
Ruslan Ermilov	ca0fa71fde	Tweak the comment about mapping a kernel using large pages.	2006-11-25 23:00:46 +00:00
Alan Cox	da44960498	The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)	2006-11-19 20:54:58 +00:00
John Baldwin	7693afca4e	- Add macro constants for the various fields in %dr7 and use them in place of various scattered magic values. - Pretty print the address of hardware watchpoints in 'show watch' rather than just displaying hex. - Expand address field width on amd64 for 64-bit pointers.	2006-11-17 19:20:32 +00:00
John Baldwin	5527d3ed75	Trim some noise from bootverbose: - Drop the printf in intr_machdep.c when we assign an interrupt source to a CPU. Each source already has a more detailed printf. - Don't output a line for each ioapic pin showing its initial state, this has outlived its usefulness. - When an APIC enumerator sets the bus, polarity, or trigger mode of an ioapic pin, just return success without printing anything if the new value matches the current one. MFC after: 2 weeks	2006-11-17 16:41:03 +00:00
John Baldwin	5d346a567c	A few more style fixes.	2006-11-17 16:37:35 +00:00
Maxim Konovalov	79ba24ca87	o Make pv_maxchunks no less than maxproc. This helps to survive a forkbomb explosion. Reviewed by: alc Security: local DoS X-MFC after: RELENG_6 is not affected due to a different pv_entry allocation code.	2006-11-16 11:46:24 +00:00
John Baldwin	71f4007710	Various whitespace and style fixes.	2006-11-15 19:53:48 +00:00
John Baldwin	15f266289d	Fix a typo that broke MSI (MSI-X worked fine) in the later revisions of the MSI patches.	2006-11-15 18:40:00 +00:00
John Baldwin	4184900911	MD support for PCI Message Signalled Interrupts on amd64 and i386: - Add a new apic_alloc_vectors() method to the local APIC support code to allocate N contiguous IDT vectors (aligned on a M >= N boundary). This function is used to allocate IDT vectors for a group of MSI messages. - Add MSI and MSI-X PICs. The PIC code here provides methods to manage edge-triggered MSI messages as x86 interrupt sources. In addition to the PIC methods, msi.c also includes methods to allocate and release MSI and MSI-X messages. For x86, we allow for up to 128 different MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs, 16-254 for APIC PCI IRQs, and IRQ 255 is reserved). - Add pcib_(alloc|release)_msi[x]() methods to the MD x86 PCI bridge drivers to bubble the request up to the nexus driver. - Add pcib_(alloc|release)_msi[x]() methods to the x86 nexus drivers that ask the MSI PIC code to allocate resources and IDT vectors. MFC after: 2 months	2006-11-13 22:23:34 +00:00
Ruslan Ermilov	d77f5882e7	Fix NKPT comments to match reality. Note that the current value of NKPT is no longer enough to run amd64 with 16G of RAM, as it doesn't have space for mapping a kernel (16M kernel would require additionally 8 page tables).	2006-11-13 20:33:54 +00:00

10894 Commits