freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-19 10:53:58 +00:00

Author	SHA1	Message	Date
John Baldwin	8964299ac8	Give Host-PCI bridge drivers their own pcib_alloc_msi() and pcib_alloc_msix() methods instead of using the method from the generic PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI specific logic soon.	2006-12-12 19:27:01 +00:00
John Baldwin	fde45e231a	Sort function prototypes.	2006-12-12 19:24:45 +00:00
John Baldwin	c304531851	Add a function to return the MD interrupt source cookie associated with an interrupt event. Use this in the x86 code to fixup the intrcnt names when an interrupt handler is removed.	2006-12-12 19:20:19 +00:00
Maxim Sobolev	efa43a53bd	Allow machdep.cpu_idle_hlt to be set from the loader. This should make it possible to work around the problem with SMP kernels on Turion64 X2 processors described in kern/104678 and may be useful in other situations too. MFC after: 3 days	2006-12-06 18:27:17 +00:00
Julian Elischer	ad1e7d285a	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
Ruslan Ermilov	3cbc967ef7	Use a different bitmask for superpages' base address so that it doesn't conflict with the PG_PDE_PAT bit. (We still don't mask off all the reserved bits but that's okay for now.) Reviewed by: alc	2006-12-05 11:31:33 +00:00
Alexander Leidinger	786e4fc47d	MFP4 (110939): MFi386: return EOPNOTSUPP for unknown module events. Submitted by: rdivacky	2006-12-03 21:06:07 +00:00
Alexander Leidinger	43d9d89b3f	Sync with i386 (remove the LINUX stuff) now that the module is usable.	2006-12-03 21:02:09 +00:00
Bruce Evans	b73057227b	Optimized RTC accesses by avoiding null writes to the index register and by only delaying when an RTC register is written to. The delay after writing to the data register is now not just a workaround. This reduces the number of ISA accesses in the usual case from 4 to 1. The usual case is 2 rtcin()'s for each RTC interrupt. The index register is almost always RTC_INTR for this. The 3 extra ISA accesses were 1 for writing the index and 2 for delays. Some delays are needed in theory, but in practice they now just slow down slow accesses some more since almost everyone including us does them wrong so modern systems enforce sufficient delays in hardware. I used to have the delays ifdefed out, but with the index register optimization the delays are rarely executed so the old magic ones can be kept or even implemented non-magically without significant cost. Optimizing RTC interrupt handling is more interesting than it used to be because RTC interrupts are currently needed to fix the more efficient apic timer interrupts on some systems. apic_timer_hz is normally 2000 so the RTC interrupt rate needs to be 2048 to keep the apic timer firing on such systems. Without these changes, each RTC interrupt normally took 10 ISA accesses (2 PIC accesses and 2 sets of 4 RTC accesses). Each ISA access takes 1-1.5uS so 10 of them at 2048 Hz takes 2-3% of a CPU. Now 4 of them take 0.8-1.2% of a CPU.	2006-12-03 03:49:28 +00:00
John Birrell	e0b651251d	Turn console printf buffering into a kernel option and only on by default for sun4v where it is absolutely required. This change moves the buffer from struct pcpu to the stack to avoid using the critical section which created a LOR in a couple of cases due to interaction with the tty code and kqueue. The LOR can't be fixed with the critical section and the pcpu buffer can't be used without the critical section. Putting the buffer on the stack was my initial solution, but it was pointed out that the stress on the stack might cause problems depending on the call path. We don't have a way of creating tests for those possible cases, so it's best to leave this as an option for the time being. In time we may get enough data to enable this option more generally.	2006-11-30 04:17:05 +00:00
Ruslan Ermilov	34028cf7d1	Differentiate between data and instruction fetch in the fatal page fault trap handler. Reviewed by: alc	2006-11-28 20:04:00 +00:00
Ruslan Ermilov	ca830b9a74	Use a define instead of a "magic" value.	2006-11-23 21:37:04 +00:00
Ruslan Ermilov	7b0381568e	Finish the PG_NX support at the pmap level. Reviewed by: alc	2006-11-23 21:36:02 +00:00
Ruslan Ermilov	f27eb21694	It's been possible to build linprocfs as a module for some time now. Submitted by: rdivacky	2006-11-22 10:34:12 +00:00
Alan Cox	da44960498	The global variable avail_end is redundant and only used once. Eliminate it. Make avail_start static to the pmap on amd64. (It no longer exists on other architectures.)	2006-11-19 20:54:58 +00:00
John Baldwin	81efc3d94c	Add support for 8 byte hardware watches in long mode. Kernel hardware watches support 8 byte watches. For userland, we disallow 8 byte watches for 32-bit tasks.	2006-11-17 20:27:01 +00:00
John Baldwin	7693afca4e	- Add macro constants for the various fields in %dr7 and use them in place of various scattered magic values. - Pretty print the address of hardware watchpoints in 'show watch' rather than just displaying hex. - Expand address field width on amd64 for 64-bit pointers.	2006-11-17 19:20:32 +00:00
John Baldwin	5527d3ed75	Trim some noise from bootverbose: - Drop the printf in intr_machdep.c when we assign an interrupt source to a CPU. Each source already has a more detailed printf. - Don't output a line for each ioapic pin showing its initial state, this has outlived its usefulness. - When an APIC enumerator sets the bus, polarity, or trigger mode of an ioapic pin, just return success without printing anything if the new value matches the current one. MFC after: 2 weeks	2006-11-17 16:41:03 +00:00
John Baldwin	5d346a567c	A few more style fixes.	2006-11-17 16:37:35 +00:00
John Baldwin	71f4007710	Various whitespace and style fixes.	2006-11-15 19:53:48 +00:00
John Baldwin	15f266289d	Fix a typo that broke MSI (MSI-X worked fine) in the later revisions of the MSI patches.	2006-11-15 18:40:00 +00:00
John Baldwin	4184900911	MD support for PCI Message Signalled Interrupts on amd64 and i386: - Add a new apic_alloc_vectors() method to the local APIC support code to allocate N contiguous IDT vectors (aligned on a M >= N boundary). This function is used to allocate IDT vectors for a group of MSI messages. - Add MSI and MSI-X PICs. The PIC code here provides methods to manage edge-triggered MSI messages as x86 interrupt sources. In addition to the PIC methods, msi.c also includes methods to allocate and release MSI and MSI-X messages. For x86, we allow for up to 128 different MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs, 16-254 for APIC PCI IRQs, and IRQ 255 is reserved). - Add pcib_(alloc\|release)_msi[x]() methods to the MD x86 PCI bridge drivers to bubble the request up to the nexus driver. - Add pcib_(alloc\|release)_msi[x]() methods to the x86 nexus drivers that ask the MSI PIC code to allocate resources and IDT vectors. MFC after: 2 months	2006-11-13 22:23:34 +00:00
John Baldwin	818b0b4bdf	Various fixes: - Remove an extra entry from the array for 0x0f prefixed instruction groups. This fixes decoding of instructions where the second opcode >= 0x80. - Add support for the 64-bit immediate mov instructions. - When short_addr is enabled, don't parse the modr/m byte for a 16-bit address, but as a 32-bit address. - Support %rip relative addressing. - Don't print a displacement of 0 if there is a base or index register. MFC after: 3 days	2006-11-13 21:14:54 +00:00
Ruslan Ermilov	d77f5882e7	Fix NKPT comments to match reality. Note that the current value of NKPT is no longer enough to run amd64 with 16G of RAM, as it doesn't have space for mapping a kernel (16M kernel would require additionally 8 page tables).	2006-11-13 20:33:54 +00:00
Ruslan Ermilov	26af9ac7d0	Fix a comment.	2006-11-13 06:26:57 +00:00
Alan Cox	44b8bd66f9	Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)	2006-11-12 21:48:34 +00:00
Ruslan Ermilov	9f70620442	Regen. Forgotten by: trhodes	2006-11-11 21:49:08 +00:00
Ruslan Ermilov	7eae4829bf	Spelling.	2006-11-07 21:57:18 +00:00
Ruslan Ermilov	81490cbe6f	Line up memory amount reporting that got broken when s/real/usable/.	2006-11-07 21:55:39 +00:00
John Baldwin	6ddd7e6a5a	Add a new 'union l_sigval' to use in place of 'union sigval' in the linux siginfo structure. l_sigval uses a l_uintptr_t for sival_ptr so that sival_ptr is the right size for linux32 on amd64. Since no code currently uses 'lsi_ptr' this is just a cosmetic nit rather than a bug fix.	2006-11-07 18:53:49 +00:00
John Baldwin	3900a3be21	Remove duplicate IDTVEC macro definition, it's already defined in <machine/intr_machdep.h>.	2006-11-07 18:46:33 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
John Birrell	8391a99bf7	Remove the KDTRACE option again because of the complaints about having it as a default. For the record, the KDTRACE option caused _no_ additional source files to be compiled in; certainly no CDDL source files. All it did was to allow existing BSD licensed kernel files to include one or more CDDL header files. By removing this from DEFAULTS, the onus is on a kernel builder to add the option to the kernel config, possibly by including GENERIC and customising from there. It means that DTrace won't be a feature available in FreeBSD by default, which is the way I intended it to be. Without this option, you can't load the dtrace module (which contains the dtrace device and the DTrace framework). This is equivalent to requiring an option in a kernel config before you can load the linux emulation module, for example. I think it is a mistake to have DTrace ported to FreeBSD, but not to have it available to everyone, all the time. The only exception to this is the companies which distribute systems with FreeBSD embedded. Those companies will customise their systems anyway. The KDTRACE option was intended for them, and only them.	2006-11-04 23:50:12 +00:00
John Birrell	1f80cd9398	Build in kernel support for loading DTrace modules by default. This adds the hooks that DTrace modules register with, and adds a few functions which have the dtrace_ prefix to allow the DTrace FBT (function boundary trace) provider to avoid tracing because they are called from the DTrace probe context. Unlike other forms of tracing and debug, DTrace support in the kernel incurs negligible run-time cost. I think the only reason why anyone wouldn't want to have kernel support enabled for DTrace would be due to the license (CDDL) under which DTrace is released.	2006-11-04 04:58:10 +00:00
John Birrell	3d068827c2	Add a cnputs() function to write a string to the console with a lock to prevent interspersed strings written from different CPUs at the same time. To avoid putting a buffer on the stack or having to malloc one, space is incorporated in the per-cpu structure. The buffer size is 128 bytes; chosen because it's the next power of 2 size up from 80 characters. String writes to the console are buffered up to the end of the line or until the buffer fills. Then the buffer is flushed to all console devices. Existing low level console output via cnputc() is unaffected by this change. ithread calls to log() are also unaffected to avoid blocking those threads. A minor change to the behaviour in a panic situation is that console output will still be buffered, but won't be written to a tty as before. This should prevent interspersed panic output as a number of CPUs panic before we end up single threaded running ddb. Reviewed by: scottl, jhb MFC after: 2 weeks	2006-11-01 04:54:51 +00:00
Konstantin Belousov	d4d2a400e4	Fix a typo resulting in truncated linux32 signal trampoline code copied to the usermode. Usually, signal handler segfaulted on return. Reviewed by: jhb MFC after: 3 days	2006-10-31 17:53:02 +00:00
Alexander Leidinger	96ed72ac81	regen after linux_io_* backout	2006-10-29 14:12:44 +00:00
Alexander Leidinger	3680a41902	Backout the linux aio stuff. Several problems were identified and the dynamic nature (if no native aio code is available, the linux part returns ENOSYS because of missing requisites) should be solved differently than it is. All this will be done in P4. Not included in this commit is a backout of the changes to the native aio code (removing static in some places). Those changes (and some more) will also be needed when the reworked linux aio stuff will reenter the tree. Requested by: rwatson Discussed with: rwatson	2006-10-29 14:02:39 +00:00
Bruce Evans	6a70163fcc	Removed some SMP ifdefs so that using the TSC as a cputime clock is not completely decided at config time. Just don't default to using the TSC if there are multiple active CPUs. Also, don't default to using the TSC if it is broken. SMP ifdefs are still used to disallow using perfmon since perfmon is always broken if SMP is just configured. This only helps much for SMP kernels running on 1 CPU. The overheads for using the i8254 cputime clock were a bit too high on 486/33's, and now on multi-GHz CPUs they are usually in the 99-99.9% range. Switching from the old default of an i8254 clock to the TSC works poorly because the overheads are not recalibrated. Use the same condition for declaring perfmon stuff as for using it.	2006-10-29 09:48:44 +00:00
Bruce Evans	91b4d1bfc2	In the userland .mcount(): - Don't use a frame pointer. Our callers need a frame pointer, but we could only use one to support things that aren't supported. (These things are: - profiling of profiling - debugging of profiling. The core ENTRY() macro doesn't support forcing a frame pointer for debugging, so don't do more here.) - Ensure that we are in the text section and have normal alignment. - Use the normal syntax for `.type'.	2006-10-28 13:12:06 +00:00
Alexander Leidinger	c1ea90bfd3	regen (prctl addition)	2006-10-28 11:24:38 +00:00
Bruce Evans	43f0ea0a27	i386/include/profile.h: Fixed a syntax error for the (!__KERNEL && !__GNUCLIKE_ASM) case in rev.1.36. Apparently, this case has never been reached even by lint. Submitted by: stefanf {amd64,i386}/include/profile.h: In case the above case is actually reached, break it properly by providing null support that will fail at link time instead of a stub that gives wrong (null) profiling at runtime.	2006-10-28 11:03:03 +00:00
Alexander Leidinger	955d762aca	MFP4: Implement prctl(). Submitted by: rdivacky Tested with: LTP	2006-10-28 10:59:59 +00:00
Bruce Evans	853b92dacf	In MCOUNT_OVERHEAD(label), actually use the `label' parameter. We were still using the global label named "profil", and this worked accidentally because all callers use the same name.	2006-10-28 07:59:11 +00:00
Bruce Evans	3a110062fd	Cleaned up includes. <machine/profile.h> was unused. <machine/timerreg.h> was only used in the GUPROF case, so the messes to get its i386 prerequisites included shouldn't have been needed. Fixed some style bugs. Quote #error contents, and don't repeat an #error directive on amd64.	2006-10-28 06:38:51 +00:00
Bruce Evans	94450a83e8	Removed all traces of HIDENAME() in amd64 and i386 kernel code. Using this used to be slightly cleaner than using ifdefs in a few places to support both a.out and elf, but using it now just causes messes and unportabilities. It seems to be impossible to implement the elf HIDENAME() portably in cpp (since token pasting of "." and <name> is invalid). */prof_machdep.c: - Removed all uses of CNAME(). CNAME() is easy enough to use in pure asm code, but using it in inline asm requires messy quoting. The core pure asm code has been hacked on more and all uses of CNAME() in it have already gone away. Just assume the elf convention here too. - Removed now-unneeded include of <machine/asmacros.h>. - Removed the workaround for a namespace conflict with this include.	2006-10-28 06:04:29 +00:00
Bruce Evans	447647908c	Don't call mexitcount or provide a stub mexitcount to call when profiling is configured but high resolution profiling is not configured. Only functions in *.[Ss] called the stub, so efficiency was not significantly affected.	2006-10-27 14:17:50 +00:00
John Birrell	3750d1ecad	Remove the KSE option now that it's in DEFAULTS on these arches/machines. The 'nooption' kernel config entry has to be used to turn KSE off now. This isn't my preferred way of dealing with this, but I'll defer to scottl's experience with the io/mem kernel option change and the grief experienced over that. Submitted by: scottl@	2006-10-26 22:11:35 +00:00
John Birrell	013d6d8cb4	Add 'options KSE' to the kernel config DEFAULTS on all arches/machines except sun4v. This change makes the transition from a default to an option more transparent and is an attempt to head off all the complaints that are likely from people who don't read UPDATING, based on experience with the io/mem change. Submitted by: scottl@	2006-10-26 22:05:25 +00:00
John Birrell	8460a577a4	Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@	2006-10-26 21:42:22 +00:00
Ruslan Ermilov	837f167eb2	Move "device splash" back to MI NOTES and "files", it's MI.	2006-10-23 13:23:14 +00:00
Alan Cox	43200cd3ed	Eliminate unnecessary PG_BUSY tests.	2006-10-22 04:18:01 +00:00
Ruslan Ermilov	7971a9bc04	MFi386: 1.13: Fix booting with ps2 keyboards.	2006-10-21 12:52:46 +00:00
Dag-Erling Smørgrav	c43ac89acc	Move more MD devices and options out of MI NOTES.	2006-10-20 09:52:27 +00:00
Bruce Evans	045f738b58	Don't show debug registers in "show registers". Special registers should be displayed specially, and debug registers are among the least interesting special registers (far behind %cr3). The debug registers are still accessible as variables and displayed in another bogus place ("show watches").	2006-10-20 09:44:21 +00:00
Dag-Erling Smørgrav	c276283866	The VGA_DEBUG option only exists on {amd64,i386,ia64}. Also remove 'device io' from amd64 NOTES; DEFAULTS takes care of it.	2006-10-20 08:56:26 +00:00
Warner Losh	e54ad0a189	Remove references to pccard.conf	2006-10-19 05:17:55 +00:00
David Xu	5f641fc0fb	o Add keyword volatile for user mutex owner field. o Fix type consistency problem by using type long for old umtx and wait channel. o Rename casuptr to casuword.	2006-10-17 02:24:47 +00:00
John Baldwin	b85360078a	Add one more include to fix the case of !DDB and !atpic.	2006-10-16 21:40:46 +00:00
Hiroki Sato	b84baf83b9	Add a newline to the printf(). Spotted by: Peter Carah <pete@altadena.net> MFC after: 3 days	2006-10-15 16:52:59 +00:00
Alexander Leidinger	95f2da66d3	regen (linux AIO stuff)	2006-10-15 14:24:10 +00:00
Alexander Leidinger	6a1162d4cd	MFP4 (with some minor changes): Implement the linux_io_* syscalls (AIO). They are only enabled if the native AIO code is available (either compiled in to the kernel or as a module) at the time the functions are used. If the AIO stuff is not available there will be a ENOSYS. From the submitter: ---snip--- DESIGN NOTES: 1. Linux permits a process to own multiple AIO queues (distinguished by "context"), but FreeBSD creates only one single AIO queue per process. My code maintains a request queue (STAILQ of queue(3)) per "context", and throws all AIO requests of all contexts owned by a process into the single FreeBSD per-process AIO queue. When the process calls io_destroy(2), io_getevents(2), io_submit(2) and io_cancel(2), my code can pick out requests owned by the specified context from the single FreeBSD per-process AIO queue according to the per-context request queues maintained by my code. 2. The request queue maintained by my code stores contrast information between Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks (struct aiocb). FreeBSD IO control block actually exists in userland memory space, required by FreeBSD native aio_XXXXXX(2). 3. It is quite troubling that the function io_getevents() of libaio-0.3.105 needs to use Linux-specific "struct aio_ring", which is a partial mirror of context in user space. I would rather take the address of context in kernel as the context ID, but the io_getevents() of libaio forces me to take the address of the "ring" in user space as the context ID. To my surprise, one comment line in the file "io_getevents.c" of libaio-0.3.105 reads: Ben will hate me for this REFERENCE: 1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/ (include/linux/aio_abi.h, fs/aio.c) 2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/ (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2)) 3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html The design notes: http://lse.sourceforge.net/io/aionotes.txt 4. The package libaio, both source and binary: http://rpmfind.net/linux/rpm2html/search.php?query=libaio Simple transparent interface to Linux AIO system calls. 5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/ POSIX AIO implementation based on Linux AIO system calls (depending on libaio). ---snip--- Submitted by: Li, Xiao <intron@intron.ac>	2006-10-15 14:22:14 +00:00
Alexander Leidinger	0a62e03542	MFP4 (106538 + 106541): Implement CLONE_VFORK. This fixes the clone05 LTP test. Submitted by: rdivacky	2006-10-15 13:39:40 +00:00
Alexander Leidinger	2482245b0c	Revert my previous commit, I mismerged this to the wrong place. Pointy hat to: netchild	2006-10-15 13:30:45 +00:00
Alexander Leidinger	21aed094a9	MFP4 (106541): Fix the clone05 test in the LTP. Submitted by: rdivacky	2006-10-15 13:25:23 +00:00
Alexander Leidinger	4b3583a354	MFP4 (107144[1]): Implement CLONE_FS on i386[1] and amd64. Submitted by: rdivacky [1]	2006-10-15 13:22:14 +00:00
John Baldwin	d3998dcf2e	Move the 2 additional #includes down into the #ifndef DEV_ATPIC section.	2006-10-13 17:31:57 +00:00
John Birrell	e70cbcb5ba	Attempt to fix the GENERIC kernel build which has been failing on tinderbox for a couple of days.	2006-10-13 04:53:22 +00:00
John Baldwin	5d54487ef2	Fix nodevice atpic compile. Pointy hat to: jhb	2006-10-12 12:48:21 +00:00
John Baldwin	520ffff83e	Change the x86 interrupt code to suspend/resume interrupt controllers (PICs) rather than interrupt sources. This allows interrupt controllers with no interrupt pins (such as the 8259As when APIC is in use) to participate in suspend/resume. - Always register the 8259A PICs even if we don't use any of their pins. - Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't included. - Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC on resume. This gets suspend/resume working with APIC on UP systems. SMP still needs more work to bring the APs back to life. The MFC after is tentative. Tested by: anholt (i386) Submitted by: Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3) MFC after: 1 week	2006-10-10 23:23:12 +00:00
John Baldwin	6e20fe33ba	Oops, fix sign bug in #ifdef for value of INTRCNT_COUNT. PR: kern/99870 Submitted by: jkim MFC after: 3 days	2006-10-10 19:26:35 +00:00
Simon L. B. Nielsen	4517aab293	- Remove SCHED_ULE from GENERIC to better avoid foot-shooting by unsuspecting users. - Add a comment in NOTES about experimental status of SCHED_ULE. - Make warning about experimental status in sched_ule(4) a bit stronger. Suggested and reviewed by: dougb Discussed on: developers MFC after: 3 days	2006-10-05 20:31:58 +00:00
David Xu	c6511aea86	Move some declaration of 32-bit signal structures into file freebsd32-signal.h, implement sigtimedwait and sigwaitinfo system calls.	2006-10-05 01:56:11 +00:00
John Birrell	6825d60738	Move the relocation definitions to the common elf header so that DTrace can use them on one architecture targeted to a different one. Add the additional ELF types defines in Sun's "Linker and Libraries" manual.	2006-10-04 21:37:10 +00:00
Poul-Henning Kamp	e5037a18a9	Use utc_offset() where applicable, and hide the internals of it as static variables.	2006-10-02 18:23:37 +00:00
Poul-Henning Kamp	b69f71eb29	Second part of a little cleanup in the calendar/timezone/RTC handling. Split subr_clock.c in two parts (by repo-copy): subr_clock.c contains generic RTC and calendaric stuff. etc. subr_rtc.c contains the newbus'ified RTC interface. Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock} sysctls and associated variables into subr_clock.c. They are not machine dependent and we have generic code that relies on being present so they are not even optional.	2006-10-02 15:42:02 +00:00
Poul-Henning Kamp	f645b0b51c	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & space-efficient BCD routines.	2006-10-02 12:59:59 +00:00
Maxim Sobolev	2c473eaf67	Extend comment explaining why code is conditional at !defined(SCHED_ULE). Suggested by: ru	2006-09-27 22:09:35 +00:00
Maxim Sobolev	6e93c19e3d	Since ULE doesn't honor hlt_cpus_mask don't compile code that prevents timer interrupt servicing for disabled HTT cores in ULE case. Should be probably fixed in ULE code instead, but we have no real maintainer for ULE to do it. PR: 103697	2006-09-27 18:51:19 +00:00
Ruslan Ermilov	6c9fdda750	Added COMPAT_FREEBSD6 option.	2006-09-26 12:36:34 +00:00
David Xu	07a8ebcc75	Stop reloading %fs and %gs, since it causes the base address from GDT to be loaded into FS.base and GS.base, these values of course are not the values set by sysarch() with I386_SET_FSBASE and I386_SET_GSBASE, the change fixed a crash for 32bit libthr after signal handler returned and normal code is accessing thread pointer, for example: movl %gs:8, %eax.	2006-09-23 13:42:09 +00:00
John Baldwin	d72a078647	Update the ipmi(4) driver: - Split out the communication protocols into their own files and use a couple of function pointers in the softc that the communication protocols setup in their own attach routine. - Add support for the SSIF interface (talking to IPMI over SMBus). - Add an ACPI attachment. - Add a PCI attachment that attaches to devices with the IPMI interface subclass. - Split the ISA attachment out into its own file: ipmi_isa.c. - Change the code to probe the SMBIOS table for an IPMI entry to just use pmap_mapbios() to map the table in rather than trying to setup a fake resource on an isa device and then activating the resource to map in the table. - Make bus attachments leaner by adding attach functions for each communication interface (ipmi_kcs_attach(), ipmi_smic_attach(), etc.) that setup per-interface data. - Formalize the model used by the driver to handle requests by adding an explicit struct ipmi_request object that holds the state of a given request and reply for the entire lifetime of the request. By bundling the request into an object, it is easier to add retry logic to the various communication backends (as well as eventually support BT mode which uses a slightly different message format than KCS, SMIC, and SSIF). - Add a per-softc lock and remove D_NEEDGIANT as the driver is now MPSAFE. - Add 32-bit compatibility ioctl shims so you can use a 32-bit ipmitool on FreeBSD/amd64. - Add ipmi(4) to i386 and amd64 NOTES. Submitted by: ambrisko (large portions of 2 and 3) Sponsored by: IronPort Systems, Yahoo! MFC after: 6 days	2006-09-22 22:11:29 +00:00
Alexander Kabaev	d9cb97ff9d	Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes the former and __builtin_va_start was present in all GCC version 3.1 and later.	2006-09-21 01:37:02 +00:00
Wojciech A. Koszek	dec10b39fd	Correct 'interrupt interrupt' -> 'interrupt' in the comment. Requested by: jhb Approved by: cognet (mentor)	2006-09-20 20:52:11 +00:00
David Xu	103c065406	Make cpu_set_upcall_kse() and cpu_set_user_tls() work for 32bit process.	2006-09-17 14:54:14 +00:00
John Baldwin	884ff1813f	Add a new ddb command 'show lapic' to dump details about the local APIC registers for the current CPU. MFC after: 3 days	2006-09-11 20:12:42 +00:00
John Baldwin	5c15c7e71d	Actually hook up the IPI_INVLCACHE IDT vectors backing pmap_invalidate_cache() in the SMP case so pmap_mapdev() in multiuser doesn't panic with a trap 30. I broke this many months ago when I added pmap_invalidate_cache() as early parts of the PAT work. Patience from: jmg Pointy hat: jhb	2006-09-11 20:10:42 +00:00
John Baldwin	9914a8cc7d	- Fix rman_manage_region() to be a lot more intelligent. It now checks for overlaps, but more importantly, it collapses adjacent free regions. This is needed to cope with BIOSen that split up ports for system devices (like IPMI controllers) across multiple system resource entries. - Now that rman_manage_region() is not so dumb, remove extra logic in the x86 nexus drivers to populate the IRQ rman that manually coalesced the regions. MFC after: 1 week	2006-09-11 19:31:52 +00:00
Alexander Leidinger	bb59e63f8f	Change futex lock from mutex to sx. Make futex_get atomic (protected by the futex lock). Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:25:25 +00:00
John Baldwin	f6c48c1932	Use a single constant to define the sizes of the physmap[], phys_avail[], and dump_avail[] arrays so they are in sync (previously it was possible to store more entries in the physmap[] then we could store in phys_avail[], which was pointless). While I'm here, bump up the length of these tables to hold 30 entries on amd64 and 16 on i386. This allows machines with fairly fragmented memory maps to boot ok (at least one machine would not boot FreeBSD/i386 but would boot FreeBSD/amd64 because amd64 allowed for more fragments). MFC after: 3 days	2006-09-07 15:03:02 +00:00
Maxim Sobolev	e7d33dcbc5	Unbreak in the case when device apic is compiled into non-SMP kernel. Reported by: jhay MFC after: 2 weeks	2006-09-06 22:05:34 +00:00
Maxim Sobolev	23da540855	By default, FreeBSD "disables" hyper-threading cores by not scheduling any threads to them. However, it still counts those cores as "active but permanently idle" when calculating system-wide CPU statistics. This is incorrect, since it skews statistics quite a bit and creates real problems for certain types of applications (monitoring applications for example), by making them believe that the system does have enough idle CPU resources, while in fact it does not. Correct the problem by not calling performance counting routines on "disabled" cores. The cleaner solution would be to just disable APIC timer interrupts on those cores completely, but ENOTIME here and it is not clear if the additional complexity is really worth the minor performance gain. Reviewed by: ssouhlal Sponsored by: Sippy Software, Inc. MFC after: 2 weeks	2006-09-05 17:15:24 +00:00
Alexander Leidinger	4038a816f8	MFi386 parts of rev 1.55 (modulo real MD parts): - implement CLONE_PARENT semantic - lock proc in the currently disabled part of CLONE_THREAD Submitted by: rdivacky	2006-08-28 13:09:24 +00:00
David Xu	66e1c26dba	Implement casuword32, compare and set user integer. Thanks to Marcel Moolenaar, who wrote the IA64 version of casuword32.	2006-08-28 02:28:15 +00:00
Alexander Leidinger	084556f5d7	regen	2006-08-27 08:58:00 +00:00
Alexander Leidinger	835e506190	Add the linux statfs64 call. This allows Tivoli backup to proceed a little bit further on -current (still not successful, but a step in the right direction). Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: Paul Mather <paul@gromit.dlib.vt.edu>	2006-08-27 08:56:54 +00:00
Alexander Leidinger	40f734dd0d	Emulate what vfork does instead of using it in linux_vfork. This way we can do the stuff we need to do with linux processes at fork and don't panic the kernel at exit of the child. Submitted by: rdivacky Tested with: tst-vfork* (glibc regression tests) Tested by: netchild	2006-08-25 11:59:56 +00:00
Alexander Leidinger	1a28c0df09	Sync the MI parts for amd64 with i386 and remove the corresponding special handling for amd64 in the common code. The MD parts for amd64 are still outstanding, but at least this fixes some panics on amd64. Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: bsam	2006-08-20 13:50:27 +00:00
Alexander Leidinger	29ddc19bbf	Get rid of some nested includes. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb	2006-08-19 15:13:01 +00:00
Alexander Leidinger	94cb2ecf79	Move some stuff into headers where they belong. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-17 21:06:48 +00:00
Alexander Leidinger	c632e9d3cc	Initialize the emul sx-lock. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-17 10:04:49 +00:00
David Xu	55fd8f9e8c	Change xorq back to xorl. Noticed by: bde	2006-08-16 22:22:28 +00:00
Alexander Leidinger	0eef2f8a4e	Style fixes to comments. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-16 18:54:51 +00:00
David Xu	421cb5a9f2	Backout revision 1.117, xorl and xorq have same result, but xorq needs longer decoding.	2006-08-15 22:43:02 +00:00
John Baldwin	f8f1f7fb85	Regen to propagate <prefix>_AUE_<mumble> changes as well as the earlier systrace changes.	2006-08-15 17:37:01 +00:00
John Baldwin	df78f6d313	- Remove unused sysvec variables from various syscalls.conf. - Send the systrace_args files for all the compat ABIs to /dev/null for now. Right now makesyscalls.sh generates a file with a hardcoded function name, so it wouldn't work for any of the ABIs anyway. Probably the function name should be configurable via a 'systracename' variable and the functions should be stored in a function pointer in the sysvec structure.	2006-08-15 17:25:55 +00:00
Alexander Leidinger	7c09e6c0bd	Initialize the eventhandlers, mutexes and sx locks. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 14:58:15 +00:00
Alexander Leidinger	77b959aa51	add autogenerated systrace_args stuff for dtrace	2006-08-15 12:56:36 +00:00
Alexander Leidinger	9b44bfc556	Add the linux 2.6.x stuff (not used by default!): - TLS - complete - pid/tid mangling - complete - thread area - complete - futexes - complete with issues - clone() extension - complete with some possible minor issues - mq/timer/clock* stuff - complete but untested and the mq* stuff is disabled when not built as part of the kernel with native FreeBSD mq* support (module support for this will come later) Tested with: - linux-firefox - works, tested - linux-opera - works, tested - linux-realplay - doesn't work, issue with futexes - linux-skype - doesn't work, issue with futexes - linux-rt2-demo - works, tested - linux-acroread - doesn't work, unknown reason (coredump) and sometimes issue with futexes - various unix utilities in linux-base-gentoo3 and linux-base-fc4: everything tried worked On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. To test this new stuff, you have to run sysctl compat.linux.osrelease=2.6.16; to switch back, use sysctl compat.linux.osrelease=2.4.2 Don't switch while running a linux program, strange things may or may not happen. Sponsored by: Google SoC 2006 Submitted by: rdivacky Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild	2006-08-15 12:54:30 +00:00
Alexander Leidinger	c107650561	regen	2006-08-15 12:51:45 +00:00
David Xu	2328274aec	Because fuword on AMD64 returns 64bit long integer -1 on fault, clear entire %rax to zero instead of only clearing %eax, otherwise it will leave garbage data in upper 32 bits.	2006-08-15 12:45:51 +00:00
Alexander Leidinger	b4359bd8e5	Add new syscalls in the linuxolator (only used when the sysctl compat.linux.osrelease is changed to "2.6.16" or similar). On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 12:28:14 +00:00
Alan Cox	c3b182d235	Eliminate an unnecessary initialization from trap_pfault() that also happens to contain a style error.	2006-08-14 19:53:53 +00:00
John Baldwin	d25fdf539e	Don't try to preserve PAT bits in pmap_enter(). We currently only use PAT bits on pages that aren't mapped via pmap_enter() (KVA). We will eventually support PAT bits on user pages, but those will require some sort of MI caching mode stored in the vm_page. Reviewed by: alc	2006-08-14 15:39:41 +00:00
Alan Cox	3173042ecc	It's not entirely obvious that PGEX_I must be zero if no-execute is neither supported nor enabled. Just to be sure, verify that no-execute is enabled before passing VM_PROT_EXECUTE to vm_fault(). Suggested by: tegge@	2006-08-14 06:15:16 +00:00
John Baldwin	7e9f73f3ed	First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA. Reviewed by: alc	2006-08-11 19:22:57 +00:00
Alexander Leidinger	50e422f056	Add some more errno mappings (bsd -> linux) and a comment about the status.. Submitted by: "Intron" <mag@intron.ac>	2006-08-10 22:05:25 +00:00
Alan Cox	079ba18aac	Pass VM_PROT_EXECUTE to vm_fault() instead of VM_PROT_READ if the page fault was caused by an instruction fetch.	2006-08-08 04:01:29 +00:00
Alan Cox	14aaab5329	Eliminate the acquisition and release of the page queues lock around a call to vm_page_sleep_if_busy().	2006-08-06 06:29:16 +00:00
Alan Cox	f8883c0160	Define the additional page fault error codes that are implemented by amd64.	2006-08-02 16:24:23 +00:00
Alan Cox	78985e424a	Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change. The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access. Reviewed by: marcel@ (ia64)	2006-08-01 19:06:06 +00:00
David E. O'Brien	a4755e0e13	Correct spelling of 3DNow!.	2006-08-01 01:23:39 +00:00
Marcel Moolenaar	302981e72a	Remove sio(4) and related options from MI files to amd64, i386 and pc98 MD files. Remove nodevice and nooption lines specific to sio(4) from ia64, powerpc and sparc64 NOTES. There were no such lines for arm yet. sio(4) is usable on less than half the platforms, not counting a future mips platform. Its presence in MI files is therefore increasingly becoming a burden.	2006-07-29 18:38:54 +00:00
John Baldwin	cb76d9b05c	Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is now back to just being an argument count.	2006-07-28 20:22:58 +00:00
John Baldwin	91ce2694d1	Regen for MPSAFE flag removal.	2006-07-28 19:08:37 +00:00
John Baldwin	af5bf12239	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
John Baldwin	e0b4add8d8	Various fixes to comments in the syscall master files including removing cruft from the audit import and adding mention of COMPAT4 to freebsd32.	2006-07-28 18:55:18 +00:00
John Baldwin	22ea1bc57a	Unify the checking for lock misbehavior in the various syscall() implementations and adjust some of the checks while I'm here: - Add a new check to make sure we don't return from a syscall in a critical section. - Add a new explicit check before userret() to make sure we don't return with any locks held. The advantage here is that we can include the syscall number and name in syscall() whereas that info is not available in userret(). - Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by the more general checks just added. MFC after: 2 weeks	2006-07-27 22:32:30 +00:00
John Baldwin	0c5d1dbd43	Add KTR_SYSC tracing to the syscall() implementations that didn't have it yet. MFC after: 1 week	2006-07-27 21:25:50 +00:00
John Baldwin	00f1856905	Add missing ptrace(2) system-call stops to various syscall() implementations. MFC after: 1 week	2006-07-27 19:50:16 +00:00
John Baldwin	57b16b0882	Don't allow MAXMEM or hw.physmem to extend the top of memory if our memory map was obtained from the SMAP. SMAP is trustworthy, and the memory extending feature is a band-aid for older systems where FreeBSD's methods of detecting memory were not always trustworthy. This fixes the issue where using hw.physmem could result in the ACPI tables getting trashed breaking ACPI. MFC after: 3 days Tested on: i386	2006-07-27 19:47:22 +00:00
David Xu	14f5d6fd7d	Remove a duplicated line.	2006-07-24 12:24:56 +00:00
Alan Cox	3cad40e517	Add pmap_clear_write() to the interface between the virtual memory system's machine-dependent and machine-independent layers. Once pmap_clear_write() is implemented on all of our supported architectures, I intend to replace all calls to pmap_page_protect() by calls to pmap_clear_write(). Why? Both the use and implementation of pmap_page_protect() in our virtual memory system has subtle errors, specifically, the management of execute permission is broken on some architectures. The "prot" argument to pmap_page_protect() should behave differently from the "prot" argument to other pmap functions. Instead of meaning, "give the specified access rights to all of the physical page's mappings," it means "don't take away the specified access rights from all of the physical page's mappings, but do take away the ones that aren't specified." However, owing to our i386 legacy, i.e., no support for no-execute rights, all but one invocation of pmap_page_protect() specifies VM_PROT_READ only, when the intent is, in fact, to remove only write permission. Consequently, a faithful implementation of pmap_page_protect(), e.g., ia64, would remove execute permission as well as write permission. On the other hand, some architectures that support execute permission have basically ignored whether or not VM_PROT_EXECUTE is passed to pmap_page_protect(), e.g., amd64 and sparc64. This change represents the first step in replacing pmap_page_protect() by the less subtle pmap_clear_write() that is already implemented on amd64, i386, and sparc64. Discussed with: grehan@ and marcel@	2006-07-20 17:48:41 +00:00
Alan Cox	e4cec28398	Now that free_pv_entry() accesses the pmap, call free_pv_entry() in pmap_remove_all() before rather than after the pmap is unlocked. At present, the page queues lock provides sufficient synchronization. In the future, the page queues lock may not always be held when free_pv_entry() is called.	2006-07-17 03:10:17 +00:00
Jung-uk Kim	0758eaa227	Sync specialreg.h changes between amd64 and i386 with few fixes.	2006-07-13 16:09:40 +00:00
John Baldwin	19e9205a23	Simplify the pager support in DDB. Allowing different db commands to install custom pager functions didn't actually happen in practice (they all just used the simple pager and passed in a local quit pointer). So, just hardcode the simple pager as the only pager and make it set a global db_pager_quit flag that db commands can check when the user hits 'q' (or a suitable variant) at the pager prompt. Also, now that it's easy to do so, enable paging by default for all ddb commands. Any command that wishes to honor the quit flag can do so by checking db_pager_quit. Note that the pager can also be effectively disabled by setting $lines to 0. Other fixes: - 'show idt' on i386 and pc98 now actually checks the quit flag and terminates early. - 'show intr' now actually checks the quit flag and terminates early.	2006-07-12 21:22:44 +00:00
Jung-uk Kim	444576c0c4	Add two new CPUID bits for AMD CPUs, i. e., SVM and extended APIC register.	2006-07-12 06:04:12 +00:00
John Baldwin	90aff9de2d	Regen.	2006-07-11 20:55:23 +00:00
John Baldwin	be5747d5b5	- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.	2006-07-11 20:52:08 +00:00
Matt Jacob	086ba9f74f	Make the firmware assist driver resident in preparation for isp using it.	2006-07-09 16:40:31 +00:00
John Baldwin	ec982ae761	Regen.	2006-07-06 21:43:14 +00:00
John Baldwin	ad6d226d43	- Protect the list of linux ioctl handlers with an sx lock. - Hold Giant while calling linux ioctl handlers for now as they aren't all known to be MPSAFE yet. - Mark linux_ioctl() MPSAFE.	2006-07-06 21:42:36 +00:00
Alan Cox	9a147235a5	Make two simplifications to pmap_ts_referenced(): Eliminate an unnecessary test and exit the loop in a shorter way.	2006-07-06 06:17:08 +00:00
Alan Cox	7eb8cd27f8	pmap_clear_ptes() is already convoluted. This will worsen with the implementation of superpages. Eliminate it and add pmap_clear_write(). There are no functional changes. Checked by: md5	2006-07-05 07:04:31 +00:00
David Xu	20197a8c49	Temporarily remove SCHED_CORE; it seems I have a lot of other work to do now, one example being POSIX priority mutexes for libthr.	2006-07-05 02:32:55 +00:00
Alan Cox	da536e6348	Correct an error in the new pmap_collect(), thus only affecting HEAD. Specifically, the pv entry was always being freed to the caller's pmap instead of the pmap to which the pv entry belongs.	2006-07-02 18:22:47 +00:00
Alan Cox	87e9885fe4	Tidy up pmap_ts_referenced(): Eliminate excessive white space. Eliminate an initialized but otherwise unused variable. Explicitly check a pointer against NULL. There are no functional changes. Checked by: md5	2006-07-01 23:43:54 +00:00
Alan Cox	ad84c5de83	Eliminate the remaining uses of "register". Convert the remaining K&R-style function declarations to ANSI-style.	2006-07-01 05:01:05 +00:00
John Baldwin	cec34dbf79	Regen.	2006-06-27 18:32:16 +00:00
John Baldwin	49d409a108	- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.	2006-06-27 18:28:50 +00:00
John Baldwin	0cceebeeb2	Regen.	2006-06-27 14:47:08 +00:00
John Baldwin	597d608f86	- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.	2006-06-27 14:46:31 +00:00
Alan Cox	8e0e1e2239	Correct a very old and very obscure bug: vmspace_fork() calls pmap_copy() if the mapping is VM_INHERIT_SHARE. Suppose the mapping is also wired. vmspace_fork() clears the wiring attributes in the vm map entry but pmap_copy() copies the PG_W attribute in the PTE. I don't think this is catastrophic. It blocks pmap_remove_pages() from destroying the mapping and corrupts the pmap's wiring count. This revision fixes the problem by changing pmap_copy() to clear the PG_W attribute. Reviewed by: tegge@	2006-06-27 04:28:23 +00:00
David E. O'Brien	bfc788c283	Add a pure open source nForce Ethernet driver, under BSDL. This driver was ported from OpenBSD by Shigeaki Tagashira <shigeaki@se.hiroshima-u.ac.jp> and posted at http://www.se.hiroshima-u.ac.jp/~shigeaki/software/freebsd-nfe.html It was additionally cleaned up by me. It is still a work-in-progress and thus is purposefully not in GENERIC. And it conflicts with nve(4), so only one should be loaded.	2006-06-26 23:41:07 +00:00
Sergey Babkin	d81175c738	Backed out the change by request from rwatson. PR: kern/14584	2006-06-26 22:03:22 +00:00
John Baldwin	b820787fb3	Regen.	2006-06-26 18:37:36 +00:00
John Baldwin	cf837b8943	linux_brk() is MPSAFE.	2006-06-26 18:36:16 +00:00
Alan Cox	feb0c348cf	Eliminate a comment that became stale after revision 1.540. Wrap a nearby line.	2006-06-25 22:22:37 +00:00
Sergey Babkin	7a799f1ef0	The common UID/GID space implementation. It has been discussed on -arch in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR. PR: kern/14584 Submitted by: babkin	2006-06-25 18:37:44 +00:00
Alexander Leidinger	adc250e2c5	Commit the DUMMY stuff (printing messages for missing syscalls) for amd64 too. Submitted by: rdivacky Sponsored by: Google SoC 2006 Noticed by: jkim Pointyhat to: netchild	2006-06-21 08:45:40 +00:00
Alan Cox	f05446648b	Change get_pv_entry() such that the call to vm_page_alloc() specifies VM_ALLOC_NORMAL instead of VM_ALLOC_SYSTEM when try is TRUE. In other words, when get_pv_entry() is permitted to fail, it no longer tries as hard to allocate a page. Change pmap_enter_quick_locked() to fail rather than wait if it is unable to allocate a page table page. This prevents a race between pmap_enter_object() and the page daemon. Specifically, an inactive page that is a successor to the page that was given to pmap_enter_quick_locked() might become a cache page while pmap_enter_quick_locked() waits and later pmap_enter_object() maps the cache page violating the invariant that cache pages are never mapped. Similarly, change pmap_enter_quick_locked() to call pmap_try_insert_pv_entry() rather than pmap_insert_entry(). Generally speaking, pmap_enter_quick_locked() is used to create speculative mappings. So, it should not try hard to allocate memory if free memory is scarce. Add an assertion that the object containing m_start is locked in pmap_enter_object(). Remove a similar assertion from pmap_enter_quick_locked() because that function no longer accesses the containing object. Remove a stale comment. Reviewed by: ups@	2006-06-20 20:52:11 +00:00
Alexander Leidinger	aff681d258	regen after change to syscalls.master	2006-06-20 20:41:29 +00:00
Alexander Leidinger	502195ac72	Switch to using the DUMMY infrastructure instead of UNIMPL for the new syscalls. This way there will be a log message printed to the console (this time for real). Note: UNIMPL should be used for syscalls we do not implement ever, e.g. syscalls to load linux kernel modules. Submitted by: rdivacky Sponsored by: Google SoC 2006 P4 IDs: 99600, 99602	2006-06-20 20:38:44 +00:00
Yaroslav Tykhiy	15a901e263	We no longer need to disable interrupts in MD trap machinery when we're about to call kdb_trap() because the latter MI function can disable interrupts by itself now. Pointed out by: bde X-MFC remark: depends on kern/subr_kdb.c#1.18 Sponsored by: RiNet (Cronyx Plus LLC)	2006-06-20 12:44:21 +00:00
David Xu	7da6810b11	Add variable cpu_mxcsr_mask to save valid bits of mxcsr register.	2006-06-19 22:59:28 +00:00
David Xu	4d70df3fee	MFi386: Use the method described in IA-32 Intel Architecture Software Developer's Manual chapter 11.6.6 to get valid mxcsr bits, use the mxcsr mask to clear invalid bits passed by user code.	2006-06-19 22:36:01 +00:00
Alexander Leidinger	28a3ae7f88	Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's an explicit comment that it's needed for the linuxolator. This is not the case anymore. For all other architectures there was only a "KEEP THIS". I (and other people too) have been running a COMPAT_43-less kernel, since it's not necessary anymore for the linuxolator. Roman has been running such a kernel for a longer time. No problems so far. And I doubt other (newer than ia32 or alpha) architectures really depend on it. This may result in a small performance increase for some workloads. If the removal of COMPAT_43 results in a program that does not work, please recompile it and all dependencies and try again before reporting a problem. The only place where COMPAT_43 is needed (as in: does not compile without it) is in the (outdated/not usable since too old) svr4 code. Note: this does not remove the COMPAT_43TTY option. Nagging by: rdivacky	2006-06-15 19:58:53 +00:00
Stephan Uphoff	2053c12705	Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps	2006-06-15 01:01:06 +00:00
Alexander Leidinger	4946fe7c4d	regen after MFP4 (soc2006/rdivacky_linuxolator) of syscalls.master P4-Changes: similar to 98673 and 98675 but regenerated locally Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-06-13 18:48:30 +00:00
Alexander Leidinger	c8b579c182	MFP4 (soc2006/rdivacky_linuxolator) Update of syscall.master: o Adding of several new dummy syscalls (268-310) o Synchronization of amd64 syscall.master with i386 one o Auditing added to amd64 syscall.master o Change auditing type for lstat syscall (bugfix). [1] P4-Changes: 98672, 98674 Noticed by: rwatson [1] Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-06-13 18:43:55 +00:00
David Xu	b41f1452d9	Add scheduler CORE, work I did half a year ago and recently picked up again. The scheduler is forked from ULE, but the algorithm to detect an interactive process is almost completely different from ULE's; it comes from the Linux paper "Understanding the Linux 2.6.8.1 CPU Scheduler", although I still use the same word "score" for a priority boost as in the ULE scheduler. Briefly, the scheduler has the following characteristics: 1. A timesharing process's nice value is seriously respected; the timeslice and interaction-detection algorithm are based on the nice value. 2. Per-CPU scheduling queues and load balancing. 3. O(1) scheduling. 4. Some CPU affinity code in the wakeup path. 5. Support for POSIX SCHED_FIFO and SCHED_RR. Unlike the 4BSD and ULE schedulers, which use fuzzy RQ_PPQ, the scheduler uses 256 priority queues. Unlike ULE, which uses pull and push, the scheduler uses only the pull method; the main reason is to let a relatively idle CPU do the work. Currently, however, the whole scheduler is protected by the big sched_lock, so the benefit is not visible; it really can be worse than nothing because all other CPUs are locked out while balancing work is done, a problem the 4BSD scheduler does not have. The scheduler does not support hyperthreading very well; in fact, it does not distinguish between physical and logical CPUs. This should be improved in the future. The scheduler has a priority inversion problem on MP machines; it is not good for realtime scheduling and can cause realtime process starvation. As a result, it seems that MySQL super-smack runs better on my Pentium-D machine when using libthr, regardless of UP or SMP kernel.	2006-06-13 13:12:56 +00:00
John Baldwin	e3d7caf487	Enable a few more things in x86 NOTES to get broader LINT coverage: - Turn on iwi(4), ipw(4), and ndis(4) on amd64 and i386. - Turn on ral(4) and ural(4) on i386, pc98, and amd64.	2006-06-12 20:38:17 +00:00
Alan Cox	b74a62d602	Don't invalidate the TLB in pmap_qenter() unless the old mapping was valid. Most often, it isn't. Reviewed by: tegge@	2006-06-12 20:05:27 +00:00
Warner Losh	78878cef94	Add the ability to subset the devices that UART pulls in. This allows the arm to compile without all the extras that don't appear, at least not in the flavors of ARM I deal with. This helps us save about 100k. If I've botched the available devices on a platform, please let me know and I'll correct ASAP.	2006-06-12 04:21:50 +00:00
Alan Cox	ce142d9ec0	Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects. Reviewed by: tegge@	2006-06-05 20:35:27 +00:00
Mike Silbersack	f25d341cfb	After much discussion with mjacob and scottl, change bus_dmamem_alloc so that it just warns the user with a printf when it misaligns a piece of memory that was requested through a busdma tag. Some drivers (such as mpt, and probably others) were asking for alignments that could not be satisfied, but as far as driver operation was concerned, that did not matter. On the theory that other drivers will fall into this same category, we agreed that panicking or making the allocation fail will cause more hardship than is necessary. The printf should be sufficient motivation to get the driver glitch fixed.	2006-06-01 04:49:29 +00:00
Matt Jacob	aa57a87a56	Turn the panic on not being able to meet alignment constraints in bus_dmamem_alloc into the more reasonable EINVAL return. Also, reclaim memory allocated but then not used if we had an error return.	2006-05-31 00:37:56 +00:00
Mike Silbersack	3d31890277	MFi386 rev 1.78: Add a quick hack to ensure that bus_dmamem_alloc properly aligns small allocations with large alignment requirements. Add a panic to detect cases where we've still failed to properly align.	2006-05-28 18:31:32 +00:00
Maxim Sobolev	aa1807d5d6	Move clock_lock prototype into <machine/clock.h>, where it is more appropriate. Discussed with: jhb	2006-05-19 18:53:50 +00:00
Marius Strobl	8df071afd9	Add le(4). I could actually only test it on alpha, i386 and sparc64 but given that this includes the more problematic platforms I see no reason why it shouldn't also work on amd64 and ia64.	2006-05-17 20:45:45 +00:00
Poul-Henning Kamp	c40da00ca3	Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.	2006-05-16 14:37:58 +00:00
Ruslan Ermilov	155d9f6a98	Kill more references to lnc(4). Submitted by: grep(1)	2006-05-16 12:15:39 +00:00
Marius Strobl	055abe9af2	Remove some remnants of lnc(4).	2006-05-14 18:49:25 +00:00
Poul-Henning Kamp	5405ab4889	Clean out sysctl machdep.* related defines. The cmos clock related stuff should really be in MI code.	2006-05-11 17:29:25 +00:00
Alexander Leidinger	ba5bd0001c	regen (linux rt_sigpending)	2006-05-10 18:19:51 +00:00
Alexander Leidinger	17138b619c	Implement rt_sigpending in the linuxolator. PR: 92671 Submitted by: Markus Niemistö <markus.niemisto@gmx.net>	2006-05-10 18:17:29 +00:00
Doug Ambrisko	32397ce071	Add in linsysfs. A linux 2.6 like sys filesystem to pacify the Linux LSI MegaRAID SAS utility. Sponsored by: IronPort Systems Man page help from: brueffer	2006-05-09 22:27:01 +00:00
Doug Ambrisko	387196bf56	Forgot the amd/linux32 part since sys/*/linux didn't match :-( Pointed out by: Alexander (thanks)	2006-05-06 17:26:45 +00:00
Sam Leffler	57d6ae0689	add ath and wlan crypto support MFC after: 1 month	2006-05-03 18:15:36 +00:00
Scott Long	8d59dfff98	Allow bus_dmamap_load() to pass ENOMEM back to the caller. This puts it into conformance with the mbuf and uio load routines. ENOMEM can only happen when BUS_DMA_NOWAIT is passed in, thus the deferrals are disabled. I don't like doing this, but fixing this fixes assumptions in other important drivers, which is a net benefit for now.	2006-05-03 04:14:17 +00:00
John Baldwin	2b8a339c7e	Add various constants for the PAT MSR and the PAT PTE and PDE flags. Initialize the PAT MSR during boot to map PAT type 2 to Write-Combining (WC) instead of Uncached (UC-). MFC after: 1 month	2006-05-01 22:07:00 +00:00
John Baldwin	4ac60df584	Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the wbinvd() instruction. This includes a new IPI so that all CPU caches on all CPUs are flushed for the SMP case. MFC after: 1 month	2006-05-01 21:36:47 +00:00
Alan Cox	e9ba21a5bb	Eliminate unnecessary, recursive acquisitions and releases of the page queues lock by free_pv_entry() and pmap_remove_pages(). Reduce the scope of the page queues lock in pmap_remove_pages().	2006-04-29 00:59:15 +00:00
Marcel Moolenaar	64220a7e28	Rewrite of puc(4). Significant changes are: o Properly use rman(9) to manage resources. This eliminates the need for puc-specific hacks to rman. It also allows devinfo(8) to be used to find out the specific assignment of resources to serial/parallel ports. o Compress the PCI device "database" by optimizing for the common case and using a procedural interface to handle the exceptions. The procedural interface also generalizes the need to set up the hardware (program chipsets, program clock frequencies). o Eliminate the need for PUC_FASTINTR. Serdev devices are fast by default and non-serdev devices are handled by the bus. o Use the serdev I/F to collect interrupt status and to handle interrupts across ports in priority order. o Sync the PCI device configuration to include devices found in NetBSD and not yet merged to FreeBSD. o Add support for Quatech 2, 4 and 8 port UARTs. o Add support for a couple dozen Timedia serial cards as found in Linux.	2006-04-28 21:21:53 +00:00
Scott Long	27aafcda76	Enable the rr232x driver for amd64.	2006-04-28 05:23:10 +00:00
Alan Cox	7dece6c7d9	In general, bits in the page directory entry (PDE) and the page table entry (PTE) have the same meaning. The exception to this rule is the eighth bit (0x080). It is the PS bit in a PDE and the PAT bit in a PTE. This change avoids the possibility that pmap_enter() confuses a PAT bit with a PS bit, avoiding a panic(). Eliminate a diagnostic printf() from the i386 pmap_enter() that serves no current purpose, i.e., I've seen no bug reports in the last two years that are helped by this printf(). Reviewed by: jhb	2006-04-27 21:26:25 +00:00
Peter Wemm	0be8b8cee8	Move vm.pmap.pv_entry_count out from the PV_STATS ifdefs. It is always available and is a real counter, not a statistic.	2006-04-26 21:34:07 +00:00
Jung-uk Kim	daea0aad84	Check if reported HTT cores are physical cores. This commit does not affect AMD CPUs at all because HTT bit is disabled earlier. Intel multicore CPUs and ULE scheduler may be affected.	2006-04-25 00:06:37 +00:00
Jung-uk Kim	091c9b4961	Add another Intel CPU feature flag, xTPR (Send Task Priority Messages).	2006-04-24 22:56:57 +00:00
Jung-uk Kim	cf24d86bcc	Check if deterministic cache parameters leaf is valid before use.	2006-04-24 22:23:52 +00:00
Colin Percival	8b4553119e	Adjust dangerous-shared-cache-detection logic from "all shared data caches are dangerous" to "a shared L1 data cache is dangerous". This is a compromise between paranoia and performance: Unlike the L1 cache, nobody has publicly demonstrated a cryptographic side channel which exploits the L2 cache -- this is harder due to the larger size, lower bandwidth, and greater associativity -- and prohibiting shared L2 caches turns Intel Core Duo processors into Intel Core Solo processors. As before, the 'machdep.hyperthreading_allowed' sysctl will allow even the L1 data cache to be shared. Discussed with: jhb, scottl Security: See FreeBSD-SA-05:09.htt for background material.	2006-04-24 21:17:01 +00:00
Xin LI	3b28c0c6f9	Move AHC_REG_PRETTY_PRINT and AHD_REG_PRETTY_PRINT below their corresponding devices.	2006-04-24 08:44:34 +00:00
Peter Wemm	9bbf94367c	Oops. Minidumps were developed on 6.x, without the small pv entry code. Add some strategic dump_add_page()/dump_drop_page() lines to include pv chunks in the minidumps - these operate in the direct map region like UMA.	2006-04-21 04:50:18 +00:00
Peter Wemm	c0345a84aa	Introduce minidumps. Full physical memory crash dumps are still available via the debug.minidump sysctl and tunable. Traditional dumps store all physical memory. This was once a good thing when machines had a maximum of 64M of ram and 1GB of kvm. These days, machines often have many gigabytes of ram and a smaller amount of kvm. libkvm+kgdb don't have a way to access physical ram that is not mapped into kvm at the time of the crash dump, so the extra ram being dumped is mostly wasted. Minidumps invert the process. Instead of dumping physical memory in order to guarantee that all of kvm's backing is dumped, minidumps instead dump only memory that is actively mapped into kvm. amd64 has a direct map region that things like UMA use. Obviously we cannot dump all of the direct map region because that is effectively an old style all-physical-memory dump. Instead, introduce a bitmap and two helper routines (dump_add_page(pa) and dump_drop_page(pa)) that allow certain critical direct map pages to be included in the dump. uma_machdep.c's allocator is the intended consumer. Dumps are a custom format. At the very beginning of the file is a header, then a copy of the message buffer, then the bitmap of pages present in the dump, then the final level of the kvm page table trees (2MB mappings are expanded into 4K page mappings), then the sparse physical pages according to the bitmap. libkvm can now conveniently access the kvm page table entries. Booting my test 8GB machine, forcing it into ddb and forcing a dump leads to a 48MB minidump. While this is a best case, I expect minidumps to be in the 100MB-500MB range. Obviously, never larger than physical memory of course. minidumps are on by default. It would only be necessary to turn them off if it was necessary to debug corrupt kernel page table management as that would mess up minidumps as well. Both minidumps and regular dumps are supported on the same machine.	2006-04-21 04:24:50 +00:00
Warner Losh	59b8f529ca	Set the rid for a resource allocated with rman_reserve_resource.	2006-04-20 04:16:34 +00:00
Colin Percival	2652af563e	Correct a local information leakage bug affecting AMD FPUs. Security: FreeBSD-SA-06:14.fpu	2006-04-19 07:00:19 +00:00
Peter Wemm	714d4fe9b6	If we're doing a try-alloc of a pv entry and give up early, do not forget to reduce the pv_entry_count counter. This was found by Tor Egge. In the same email, Tor also pointed out the pv_stats problem in the previous commit, but I'd forgotten about it until I went looking for this email about this allocation problem.	2006-04-18 20:17:32 +00:00
Peter Wemm	bac58593f1	pv_entry_count is more than a statistic. It is used for resource limiting. Do not compile out its counter updates if pv entry stats are turned off.	2006-04-18 20:11:00 +00:00
Alan Cox	ad740f9081	Include opt_pmap.h for PMAP_SHPGPERPROC. PR: 94509	2006-04-13 03:31:48 +00:00
Alan Cox	826c207263	Retire pmap_track_modified(). We no longer need it because we do not create managed mappings within the clean submap. To prevent regressions, add assertions blocking the creation of managed mappings within the clean submap. Reviewed by: tegge	2006-04-12 04:22:52 +00:00
Paul Saab	d8636a9ab7	Hook bce up to the build	2006-04-10 20:04:22 +00:00
John Baldwin	907d4d7f45	Cache the value of the lower half of each I/O APIC redirection table entry so that we only have to do an ioapic_write() instead of an ioapic_read() followed by an ioapic_write() every time we mask and unmask level triggered interrupts. This cuts the execution time for these operations roughly in half. Profiled by: Paolo Pisati <p.pisati@oltrelinux.com> MFC after: 1 week	2006-04-05 20:43:19 +00:00
Peter Wemm	2e4548288a	Convert pv_entry_frees and pv_entry_allocs stats counters from int to long, they wrap way too quickly.	2006-04-04 20:17:35 +00:00
Marcel Moolenaar	b1fb1bb19a	Sync with i386: Map exceptions to signals in gdb_cpu_signal() so that kgdb(1) gets a SIGTRAP when it needs to. Pointed out by: grehan@	2006-04-04 03:00:20 +00:00
Marcel Moolenaar	470d831703	The PC is register 16, not 18. Pointed out by: grehan@	2006-04-04 02:44:51 +00:00
Marcel Moolenaar	bfcdefd8aa	Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the PCB in which the context of stopped CPUs is stored. To access this PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The definition, when present, lives in <machine/kdb.h> and abstracts where MD code saves the context. Define KDB_STOPPEDPCB on i386, amd64, alpha and sparc64 in accordance with previous code.	2006-04-03 22:51:47 +00:00
Peter Wemm	68ac481184	Shrink the amd64 pv entry from 48 bytes to about 24 bytes. On a machine with large mmap files mapped into many processes, this saves hundreds of megabytes of ram. pv entries were individually allocated and had two tailq entries and two pointers (or addresses). Each pv entry was linked to a vm_page_t and a process's address space (pmap). It had the virtual address and a pointer to the pmap. This change replaces the individual allocation with a per-process allocation system. A page ("pv chunk") is allocated and this provides 168 pv entries for that process. We can now eliminate one of the 16 byte tailq entries because we can simply iterate through the pv chunks to find all the pv entries for a process. We can eliminate one of the 8 byte pointers because the location of the pv entry implies the containing pv chunk, which has the pointer. After overheads from the pv chunk bitmap and tailq linkage, this works out that each pv entry has an effective size of 24.38 bytes. Future work still required, and other problems: * when running low on pv entries or system ram, we may need to defrag the chunk pages and free any spares. The stats (vm.pmap.*) show that this doesn't seem to be that much of a problem, but it can be done if needed. * running low on pv entries is now a much bigger problem. The old get_pv_entry() routine just needed to reclaim one other pv entry. Now, since they are per-process, we can only use pv entries that are assigned to our current process, or by stealing an entire page worth from another process. Under normal circumstances, the pmap_collect() code should be able to dislodge some pv entries from the current process. But if needed, it can still reclaim entire pv chunk pages from other processes. * This should port to i386 really easily, except there it would reduce pv entries from 24 bytes to about 12 bytes. (I have integrated Alan's recent changes.)	2006-04-03 21:36:01 +00:00
Peter Wemm	b9eee07e36	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
Alan Cox	9c6a71e4ca	Introduce pmap_try_insert_pv_entry(), a function that conditionally creates a pv entry if the number of entries is below the high water mark for pv entries. Use pmap_try_insert_pv_entry() in pmap_copy() instead of pmap_insert_entry(). This avoids possible recursion on a pmap lock in get_pv_entry(). Eliminate the explicit low-memory checks in pmap_copy(). The check that the number of pv entries was below the high water mark was largely ineffective because it was located in the outer loop rather than the inner loop where pv entries were allocated. Instead of checking, we attempt the allocation and handle the failure. Reviewed by: tegge Reported by: kris MFC after: 5 days	2006-04-02 05:45:05 +00:00
Maksim Yevmenkin	b643101293	Add kbdmux(4) to GENERIC on amd64 Requested by: scottl Tested by: scottl	2006-03-31 23:04:48 +00:00
Scott Long	7f631a410c	Hook the MFI driver up to the build.	2006-03-29 09:57:22 +00:00
John Baldwin	8283c726e7	If the XSDT address in the RSDP for an ACPI 2.0 machine is NULL, then fall back to using the RSDT instead. ACPI-CA already follows this same strategy as a workaround for yet another instance of brain-damaged BIOS writers. PR: i386/93963 Submitted by: Masayuki FUKUI <fukui.FreeBSD@fanet.net>	2006-03-27 15:59:48 +00:00
Alan Cox	fa8053e9a9	Eliminate unnecessary invalidations of the entire TLB by pmap_remove(). Specifically, on mappings with PG_G set pmap_remove() not only performs the necessary per-page invlpg invalidations but also performs an unnecessary invalidation of the entire set of non-PG_G entries. Reviewed by: tegge	2006-03-21 18:07:42 +00:00
David Xu	39d3e6198d	Remove stale KSE code. Reviewed by: alc	2006-03-21 06:46:27 +00:00
John Baldwin	aef8cd01ed	Drop some unneeded casts since we program the kernel in C rather than C++.	2006-03-20 19:39:08 +00:00
Alexander Leidinger	79d8404261	regen: fix of linuxolator with testing in a cross-build	2006-03-20 18:54:29 +00:00
Alexander Leidinger	3a192a2050	Fix the linuxolator on amd64 (cross-build).	2006-03-20 18:53:26 +00:00
Ruslan Ermilov	e4e272bfbf	Regen.	2006-03-19 11:12:41 +00:00
Ruslan Ermilov	aefce619cf	Unbreak COMPAT_LINUX32 option support on amd64. Broken by: netchild	2006-03-19 11:10:33 +00:00
Alexander Leidinger	c85625bfe7	regen	2006-03-18 20:49:01 +00:00
Stephan Uphoff	4c0e9e8c79	Enable global pages TLB extension on Application Processors. MFC after: 3 days	2006-03-18 19:32:46 +00:00
Alexander Leidinger	1f7642e058	regen after COMPAT_43 removal	2006-03-18 18:24:38 +00:00
Alexander Leidinger	5c8919adf4	Get rid of the need of COMPAT_43 in the linuxolator. Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Obtained from: DragonFly (some parts)	2006-03-18 18:20:17 +00:00
John Baldwin	39092e79ed	Don't allow userland to set hardware watch points on kernel memory at all. Previously, we tried to allow this only for root. However, we were calling suser() on the target process rather than the current process. This means that if you can ptrace() a process running as root you can set a hardware watch point in the kernel. In practice I think you probably have to be root in order to pass the p_candebug() checks in ptrace() to attach to a process running as root anyway. Rather than fix the suser(), I just axed the entire idea, as I can't think of any good reason _at all_ for userland to set hardware watch points for KVM. MFC after: 3 days Also thinks hardware watch points on KVM from userland are bad: bde, rwatson	2006-03-14 16:13:55 +00:00
Peter Wemm	8d0593f54e	Merge/sync with i386: various cosmetic tweaks	2006-03-14 00:01:56 +00:00
Peter Wemm	cfa7ffb1d7	MFi386: The SIGFPE macros were moved to signal.h (FPE_INTOVF etc)	2006-03-14 00:01:22 +00:00
Peter Wemm	31b2d08a2d	MFi386: rename pcib_devclass to hostb_devclass (cosmetic here)	2006-03-13 23:58:40 +00:00
Peter Wemm	c8df689359	MFi386: add a TRAP_INTERRUPT case	2006-03-13 23:56:44 +00:00
Peter Wemm	29e9282e2e	Cosmetic sync with i386	2006-03-13 23:55:31 +00:00
Paul Saab	12aff6461c	Fix the format/display descriptor of vm.kmem_size and vm.kmem_free to be 'long' instead of 'int' so that sysctl(8) correctly displays the 8 returned bytes as a single 'long' instead of two 'int' values. Submitted by: peter	2006-03-13 08:13:37 +00:00
John Baldwin	8e8f0765ab	Flip the switch and don't route interrupts to hyperthreads in a HT system. In at least one benchmark this showed around a 20% performance increase. If other workloads do benefit from having hyperthreads service interrupts, we can always make this a loader tunable. MFC after: 3 days Tested by: ps	2006-03-09 16:38:52 +00:00
Stephan Uphoff	68ff3c2445	Fix exec_map resource leaks. Tested by: kris@	2006-03-08 20:21:54 +00:00
Yaroslav Tykhiy	4ffbe6ba9f	MFi386 revision 1.1220: options TDFX_LINUX --> device tdfx_linux	2006-03-06 15:29:28 +00:00
Sam Leffler	5225f08dc9	guard function decls with _KERNEL so user code can include this file	2006-03-01 05:59:56 +00:00
John Baldwin	215e7c161a	Rework how we wire up interrupt sources to CPUs: - Throw out all of the logical APIC ID stuff. The Intel docs are somewhat ambiguous, but it seems that the "flat" cluster model we are currently using is only supported on Pentium and P6 family CPUs. The other "hierarchy" cluster model that is supported on all Intel CPUs with local APICs is severely underdocumented. For example, it's not clear if the OS needs to glean the topology of the APIC hierarchy from somewhere (neither ACPI nor MP Table include it) and setup the logical clusters based on the physical hierarchy or not. Not only that, but on certain Intel chipsets, even though there were 4 CPUs in a logical cluster, all the interrupts were only sent to one CPU anyway. - We now bind interrupts to individual CPUs using physical addressing via the local APIC IDs. This code has also moved out of the ioapic PIC driver and into the common interrupt source code so that it can be shared with MSI interrupt sources since MSI is addressed to APICs the same way that I/O APIC pins are. - Interrupt source classes grow a new method pic_assign_cpu() to bind an interrupt source to a specific local APIC ID. - The SMP code now tells the interrupt code which CPUs are available to handle interrupts in a simpler and more intuitive manner. For one thing, it means we could now choose to not route interrupts to HT cores if we wanted to (this code is currently in place in fact, but under an #if 0 for now). - For now we simply do static round-robin of IRQs to CPUs when the first interrupt handler just as before, with the change that IRQs are now bound to individual CPUs rather than groups of up to 4 CPUs. - Because the IRQ to CPU mapping has now been moved up a layer, it would be easier to manage this mapping from higher levels. For example, we could allow drivers to specify a CPU affinity map for their interrupts, or we could allow a userland tool to bind IRQs to specific CPUs.
The MFC is tentative, but I want to see if this fixes problems some folks had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose interrupts). MFC after: 1 week	2006-02-28 22:24:55 +00:00
David Malone	0cbae93607	It seems bit 5 of cpu_feature2 is the VMX (Virtual Machine Extensions) bit. While I'm here, delete a comment that was cut and pasted from the cpu_features code that doesn't belong here.	2006-02-15 14:48:59 +00:00
Poul-Henning Kamp	e8444a7e6f	CPU time accounting speedup (step 2) Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc at context switch. Don't reach across CPUs in calcru(). Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines). Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true. Use 27MHz counter on i386/Geode. Use TSC on amd64 & i386 if present. Use tick counter on sparc64	2006-02-11 09:33:07 +00:00
Poul-Henning Kamp	eb2da9a51f	Simplify system time accounting for profiling. Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int. Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.	2006-02-08 08:09:17 +00:00
Poul-Henning Kamp	5b1a8eb397	Modify the way we account for CPU time spent (step 1) Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestamps at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.	2006-02-07 21:22:02 +00:00
John Baldwin	8917b8d28c	- Always call exec_free_args() in kern_execve() instead of doing it in all the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.	2006-02-06 22:06:54 +00:00

4980 Commits