1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-21 11:13:30 +00:00
Commit Graph

1470 Commits

Author SHA1 Message Date
Marcel Moolenaar
0675c65d6b s/u_int#_t/uint#_t/g 2004-09-22 23:12:46 +00:00
Marcel Moolenaar
08d3edb315 For the atomic_{add|clear|set|subtract} family of inlines, return the
old or previous value instead of void. This is not as is documented
in atomic(9), but is API (and ABI) compatible and simply makes sense.
This feature will primarily be used for atomic PTE updates in PMAP/ng.
2004-09-22 19:58:43 +00:00
Marcel Moolenaar
5c48823c36 MFp4: various style fixes, including
o  s/u_int/uint/g
o  s/#define<sp>/#define<tab>/g
o  indent macro definitions
o  Improve vertical spacing
o  Globally align line continuation character
2004-09-22 19:47:42 +00:00
John Baldwin
76764432e4 - Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
  every 18 lines.  If you hit Enter, it will print another line and prompt
  again.  If you hit space it will output another page and then prompt.
  If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
  to print.  This is useful if your trace happens to walk back onto
  0xdeadc0de and gets stuck in an endless loop.

MFC after:	1 month
Tested on:	i386, alpha, sparc64
2004-09-20 19:05:32 +00:00
Marcel Moolenaar
13e6668525 MFp4:
Completely remove the remaining EFI includes and add our own (type)
definitions instead. While here, abstract more of the internals by
providing interface functions.
2004-09-19 03:50:46 +00:00
Alan Cox
7580b56bdc Release the page queues lock earlier in pmap_protect() and pmap_remove() in
order to reduce contention.
2004-09-18 22:56:58 +00:00
Marcel Moolenaar
9f9ae8ebb7 Provide our own FPSWA definitions, instead of depending on the Intel
EFI headers and put them all in <machine/fpu.h>. The Intel EFI headers
conflict with the Intel ACPI headers (duplicate type definitions), so
are being phased out in the kernel.
2004-09-17 22:19:41 +00:00
Marcel Moolenaar
cba8d3ae49 Remove useless inclusion of <machine/fpu.h> 2004-09-17 20:42:45 +00:00
Poul-Henning Kamp
7ce1979be6 Add new a function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
Marcel Moolenaar
fa3b7cae8d Catch up with other platforms: switch the default scheduler to 4BSD. 2004-09-12 05:50:32 +00:00
Scott Long
50736a153b Fix a problem with tag->boundary inheritence that has existed since day one
and was propagated to nearly every platform.  The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum.  However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change.  The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)
2004-09-08 04:54:19 +00:00
Marcel Moolenaar
566d143be0 Sync the busdma code with i386. The most tangible upshot is that
the alignment and boundary constraints are being respected, which
fixes the reported ATA problems with SiI chips.
I consider the busdma implementation worrisome nonetheless. Not
only is there too much MI code duplicated in MD files, there's a
lot of questionable code. I smell a wholesale, cross-platform
overhaul coming...

MT5 candidate.
2004-09-08 02:55:04 +00:00
Julian Elischer
ed062c8d66 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
Marcel Moolenaar
44af2aa001 Add aac(4) and aacp(4). The driver is 64-bit clean for roughly a year
now and has been mentioned on the freebsd-ia64 list.
2004-09-02 18:05:26 +00:00
Julian Elischer
5995adc206 Remove an unneeded argument..
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.

MFC after:	3 days
2004-08-31 07:34:54 +00:00
Julian Elischer
99e9dcb817 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
Alan Cox
bfa15df9ba Remove unnecessary check for curthread == NULL. 2004-08-30 03:52:05 +00:00
Marcel Moolenaar
224407d6a8 s/ENTRY/ENTRY_NOPROFILE/g for particular functions that do not follow
the C calling convention or are otherwise not regular functions. This
allows us to boot a profiling kernel.
2004-08-30 01:32:28 +00:00
Marcel Moolenaar
82ecff453f Catch up with the drive-by renaming of IA32 to COMPAT_IA32. Missed
11 days ago when all the other places were fixed and finally caught
by the tinderbox run...
2004-08-27 21:57:00 +00:00
Marcel Moolenaar
0f2fe153bc Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
   instruction of a functions implementation. It holds the address
   of a function descriptor. Hence the user(), btrap(), eintr() and
   bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
   be layed-out contiguously. This can not be achieved on ia64 and is
   generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
2004-08-27 19:42:35 +00:00
Alan Cox
8991a235cb The machine-independent parts of the virtual memory system always pass a
valid pmap to the pmap functions that require one.  Remove the checks for
NULL.  (These checks have their origins in the Mach pmap.c that was
integrated into BSD.  None of the new code written specifically for
FreeBSD included them.)
2004-08-27 19:06:17 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
Marcel Moolenaar
04f093dde7 Get a step closer to profiling the kernel by fixing the definitions
of the MCOUNT_ENTER, MCOUNT_EXIT and MCOUNT_DECL defines. Also make
sure there's a prototype of _MCOUNT_DECL(). This allows us to build
a kernel. There are still unresolved symbols, so linking fails.
2004-08-25 08:03:48 +00:00
Marcel Moolenaar
f0556e70bb Make profiling actually work. The gcc compiler emits a call to the
_mcount() stub when profiling is enabled. Emit this code sequence
for assembly routines as welli (MCOUNT definition in <machine/asm.h>.
We do not pass the GOT entry however as the 4th argument, because it's
not used. The _mcount() stub calls __mcount(), which does the actual
work. Define _MCOUNT_DECL to define __mcount. We do not have an
implementation of mcount(), so we define MCOUNT as empty, but have a
weak alias to _mcount() in _mcount.S.
Note that the _mcount() stub in the kernel is slightly different from
the stub in userland. This is because we do not have to worry about
nested routines in the kernel.
2004-08-25 07:42:34 +00:00
Nate Lawson
dc6851d588 Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list(). 2004-08-24 19:22:54 +00:00
David E. O'Brien
6cda6c4a35 sr(4) definately won't work on IA64. 2004-08-24 18:31:27 +00:00
Arun Sharma
2d24da614a The existing code fails some corner cases. Replace it with
ia64_bsp_adjust() which has been tested to work in all cases for
arbitrary (bsp, nslots) combinations.

reviewed by: marcel@
2004-08-16 22:09:58 +00:00
Marcel Moolenaar
344bbdbd54 As I said: the previous commit was untested... Remove an #endif which
should have ceased to exist when its corresponding #if was removed.
2004-08-16 19:05:08 +00:00
Marcel Moolenaar
97752b2cbd Catch up with the drive-by renaming of IA32 to COMPAT_IA32. It must
have been rush hour...

While here, move COMPAT_IA32 from opt_global.h to opt_compat.h like on
amd64. Consequently, it's unsafe to use the option in pcb.h. We now
unconditionally have the ia32 specific registers in the PCB.

This commit is untested.
2004-08-16 18:54:23 +00:00
Arun Sharma
646c6dd2c0 ITC.{i,d} instructions use format M41 not M42.
reviewed by: marcel@
2004-08-16 18:41:24 +00:00
Marcel Moolenaar
c66fdb617d Allocate memory in the unwinder with M_NOWAIT. We may need to provide
backtraces with locks held.
2004-08-14 05:00:37 +00:00
Marcel Moolenaar
3a00930042 In set_regs(), flush the dirty registers onto the backingstore before
we update the registers. That way we don't have any dirty registers to
worry about and also know that bsp=bspstore, which makes updating the
RSE related registers predictable.
This is not the end of it. We need more validity checks, but for now
this allows us to complete the gdb testsuite without crashing the
kernel.
2004-08-11 05:29:13 +00:00
Marcel Moolenaar
4da47b2fec Add __elfN(dump_thread). This function is called from __elfN(coredump)
to allow dumping per-thread machine specific notes. On ia64 we use this
function to flush the dirty registers onto the backingstore before we
write out the PRSTATUS notes.

Tested on: alpha, amd64, i386, ia64 & sparc64
Not tested on: arm, powerpc
2004-08-11 02:35:06 +00:00
Marcel Moolenaar
b4b7c60d70 Better preserve the original protection for the mappings we maintain.
The hardware always gives read access for privilege level 0, which
means that we cannot use the hardware access rights and privilege
level in the PTE to test whether there's a change in protection.  So,
we save the original vm_prot_t in the PTE as well.
Add pmap_pte_prot() to set the proper access rights and privilege
level on the PTE given a pmap and the requested protection.

The above allows us to compare the protection in pmap_extract_and_hold()
which was missing. While in pmap_extract_and_hold(), add pmap locking.

While here, clean up most (i.e. all but one) PTE macros we inherited
from alpha. They were either unused, used inconsistently, badly named
or simply weren't beneficial. We save the wired and managed state of
the PTE in distinct (bit) fields.

While in pte.h, s/u_int64_t/uint64_t/g

pmap locking obtained from: alc@
feedback & review by: alc@
2004-08-09 20:44:41 +00:00
Marcel Moolenaar
47a86e3cd4 Implement single stepping when we leave the kernel through the EPC syscall
path. The basic problem is that we cannot set the single stepping flag
directly, because we don't leave the kernel via an interrupt return. So,
we need another way to set the single stepping flag.
The way we do this is by enabling the lower-privilege transfer trap, which
gets raised when we drop the privilege level. However, since we're still
running in kernel space (sec), we're not yet done. We clear the lower-
privilege transfer trap, enable the taken-branch trap and continue exiting
the kernel until we branch into user space.
Given the current code, there's a total of two traps this way before
we can raise SIGTRAP.
2004-08-08 00:28:07 +00:00
Marcel Moolenaar
6aa84a056a Slightly move labels around to make sure we call ast() on our way out
after a fork(2) in fork_trampoline(). By moving the epc_syscall_return
label immediately before the call to do_ast() in epc_syscall(), we not
only achieve that but also handle the detour through exception_return
when the frame corresponds to an asynchronous kernel entry. Hence, we
simplified fork_trampoline() as a side-effect.
2004-08-07 21:55:15 +00:00
Marcel Moolenaar
7d9a8b1cd5 De-inline gdb_cpu_signal() because we need to convert the trap vectors
related to breakpoints and single stepping into SIGTRAP so gdb(1) knows
why the remote target has stopped. In particular, gdb(1) needs to know
if the reason is something of its own doing.
2004-08-07 21:40:52 +00:00
Arun Sharma
d7cf64c9a1 Use a 256MB TR instead of a 64MB TR to make sure that the kernel
text/data are covered on APs. This enables the kernel to boot on
a 4 way Intel Itanium-2 platform. This has a secondary effect of
keeping the TRs identical on BP and the APs.

reviewed by: marcel@
2004-08-04 20:09:41 +00:00
Mark Murray
d23a262fc5 Making a loadable null.ko for /dev/(null|zero) proved rather
unpopular, so remove this (mis)feature.

Encouragement provided by:	jhb (and others)
2004-08-03 19:24:54 +00:00
Maxime Henrion
9f1b87f106 Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait().  It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by:	jhb
Reviewed by:	jhb
2004-08-03 18:44:27 +00:00
Marcel Moolenaar
78b9765bee Fix 2 typos in previous commit: both s/strct/struct/ 2004-08-02 18:37:55 +00:00
Marcel Moolenaar
cb1d0dd340 Add the mem and null devices now that they are optional. 2004-08-02 17:53:06 +00:00
Marcel Moolenaar
6d28b03026 Sort the miscellaneous devices to restore ordering after the insertion
of the mem and null devices.
2004-08-02 17:50:39 +00:00
Mark Murray
a5ed4a0ad5 Remove extraneous ';'. 2004-08-01 18:51:44 +00:00
Mark Murray
8ab2f5ecc5 Break out the MI part of the /dev/[k]mem and /dev/io drivers into
their own directory and module, leaving the MD parts in the MD
area (the MD parts _are_ part of the modules). /dev/mem and /dev/io
are now loadable modules, thus taking us one step further towards
a kernel created entirely out of modules. Of course, there is nothing
preventing the kernel from having these statically compiled.
2004-08-01 11:40:54 +00:00
Alan Cox
350fb8ae6a - Add pmap locking to ia64's pmap_enter() and pmap_enter_quick(). (This
brings ia64 to parity with alpha, amd64, and i386 in this area.)
 - Prevent a race in pmap_find_pte(): If pmap_find_pte() sleeps in
   uma_zalloc(), another thread could allocate a pte at the same address.
   Instead, sleep at a higher level and retry the lookup before retrying
   the allocation.

Reviewed and tested by:	marcel@
2004-07-30 20:25:12 +00:00
Marcel Moolenaar
f95c91bcee Fix -O builds with gcc 3.4 by defining ffs as __builtin_ffs instead of
creating an inline function that just calls __builtin_ffs.
2004-07-30 07:56:53 +00:00
Poul-Henning Kamp
0658bb8ef8 Move a relic to its correct location(s): Put nfs diskless initialization
calls with the code they call.  (Yet another example of mindless copy&paste).
2004-07-28 21:54:57 +00:00
Robert Watson
1a8cfbc450 Pass a thread argument into cpu_critical_{enter,exit}() rather than
dereference curthread.  It is called only from critical_{enter,exit}(),
which already dereferences curthread.  This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.

Head nodding:	jhb, bmilekic
2004-07-27 16:41:01 +00:00
Marcel Moolenaar
6dd19a884b Work-around a gcc code generation bug for function descriptors
references (target/16559). This fixes SMP configurations.

Obtained from: arun@
2004-07-25 07:07:09 +00:00
Alan Cox
e4242deba7 In pmap_mincore() create a private copy of the pte for use after the pmap
lock is released.
2004-07-22 02:05:46 +00:00
Alan Cox
756e6d1939 Additional pmap locking
Tested by: marcel@
2004-07-21 07:01:48 +00:00
Marcel Moolenaar
fd32d93b97 Unify db_stack_trace_cmd(). All it did was look up the thread given
the thread ID and call db_trace_thread().
Since arm has all the logic in db_stack_trace_cmd(), rename the
new DB_COMMAND function to db_stack_trace to avoid conflicts on
arm.
While here, have db_stack_trace parse its own arguments so that
we can use a more natural radix for IDs. If the ID is not a thread
ID, or more precisely when no thread exists with the ID, try if
there's a process with that ID and return the first thread in it.
This makes it easier to print stack traces from the ps output.

requested by: rwatson@
tested on: amd64, i386, ia64
2004-07-21 05:07:09 +00:00
David Schultz
479f8d2214 Make FLT_ROUNDS correctly reflect the dynamic rounding mode. 2004-07-19 08:17:25 +00:00
Alan Cox
4a5be3f70a Add partial pmap locking.
Tested by: marcel@
2004-07-19 05:39:49 +00:00
Alan Cox
6fe30ff3f2 Remove unused fields from the pmap. 2004-07-16 03:42:45 +00:00
Poul-Henning Kamp
672c05d49c Preparation commit for the tty cleanups that will follow in the near
future:

rename ttyopen() -> tty_open() and ttyclose() -> tty_close().

We need the ttyopen() and ttyclose() for the new generic cdevsw
functions for tty devices in order to have consistent naming.
2004-07-15 20:47:41 +00:00
Alan Cox
3d2e54c317 Push down the acquisition and release of the page queues lock into
pmap_protect() and pmap_remove().  In general, they require the lock in
order to modify a page's pv list or flags.  In some cases, however,
pmap_protect() can avoid acquiring the lock.
2004-07-15 18:00:43 +00:00
Alan Cox
72826d0f4a A loop in pmap_remove() should use TAILQ_FOREACH_SAFE(), not
TAILQ_FOREACH(), because the loop deletes elements from the list.

Reviewed by:	marcel@
2004-07-15 03:20:00 +00:00
David Xu
53dbf30349 Add ptrace_clear_single_step(), alpha already has it for years, the function
will be used by ptrace to clear a thread's single step state.
2004-07-13 07:22:56 +00:00
Alan Cox
440382a953 Simplify pmap_protect(). 2004-07-13 06:54:23 +00:00
Alan Cox
ce8da3091f Push down the acquisition and release of the page queues lock into
pmap_remove_pages().  (The implementation of pmap_remove_pages() is
optional.  If pmap_remove_pages() is unimplemented, the acquisition and
release of the page queues lock is unnecessary.)

Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
2004-07-13 02:49:22 +00:00
Marcel Moolenaar
8bcb1e9e84 Add options KDB and GDB. KDB takes on the function of what DDB used
to be. Both DDB and GDB specify which KDB backends to include.
2004-07-11 03:20:09 +00:00
Marcel Moolenaar
1ca618fcaa Remove the now unused GDB stubs. See src/sys/gdb/* for the new KDB
backend.
2004-07-11 01:47:26 +00:00
Marcel Moolenaar
37224cd3fc Mega update for the KDB framework: turn DDB into a KDB backend.
Most of the changes are a direct result of adding thread awareness.
Typically, DDB_REGS is gone. All registers are taken from the
trapframe and backtraces use the PCB based contexts. DDB_REGS was
defined to be a trapframe on all platforms anyway.
Thread awareness introduces the following new commands:
	thread X	switch to thread X (where X is the TID),
	show threads	list all threads.

The backtrace code has been made more flexible so that one can
create backtraces for any thread by giving the thread ID as an
argument to trace.

With this change, ia64 has support for breakpoints.
2004-07-10 23:47:20 +00:00
Marcel Moolenaar
6d33366c74 Update for the KDB framework:
o  ksym_start and ksym_end changed type to vm_offset_t.
o  Make debugging support conditional upon KDB instead of DDB.
o  Call kdb_enter() instead of breakpoint().
o  Remove implementation of Debugger().
o  Call kdb_trap() according to the new world order.

unwinder:
o  s/db_active/kdb_active/g
o  Various s/ddb/kdb/g
o  Add support for unwinding from the PCB as well as the trapframe.
   Abuse a spare field in the special register set to flag whether
   the PCB was actually constructed from a trapframe so that we can
   make the necessary adjustments.

md_var.h:
o   Add RSE convenience macros.
o   Add ia64_bsp_adjust() to add or subtract from BSP while taking
    NaT collections into account.
2004-07-10 22:59:30 +00:00
Marcel Moolenaar
5a39cbaf69 Implement makectx(). The makectx() function is used by KDB to create
a PCB from a trapframe for purposes of unwinding the stack. The PCB
is used as the thread context and all but the thread that entered the
debugger has a valid PCB.
This function can also be used to create a context for the threads
running on the CPUs that have been stopped when the debugger got
entered. This however is not done at the time of this commit.
2004-07-10 19:56:00 +00:00
Marcel Moolenaar
cbc174356c Introduce the KDB debugger frontend. The frontend provides a framework
in which multiple (presumably different) debugger backends can be
configured and which provides basic services to those backends.
Besides providing services to backends, it also serves as the single
point of contact for any and all code that wants to make use of the
debugger functions, such as entering the debugger or handling of the
alternate break sequence. For this purpose, the frontend has been
made non-optional.
All debugger requests are forwarded or handed over to the current
backend, if applicable. Selection of the current backend is done by
the debug.kdb.current sysctl. A list of configured backends can be
obtained with the debug.kdb.available sysctl. One can enter the
debugger by writing to the debug.kdb.enter sysctl.
2004-07-10 18:40:12 +00:00
Marcel Moolenaar
72d44f31a6 Introduce the GDB debugger backend for the new KDB framework. The
backend improves over the old GDB support in the following ways:
o  Unified implementation with minimal MD code.
o  A simple interface for devices to register themselves as debug
   ports, ala consoles.
o  Compression by using run-length encoding.
o  Implements GDB threading support.
2004-07-10 17:47:22 +00:00
Brian Somers
0ac4013324 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
Marcel Moolenaar
1e3e78d239 MFamd64 (1.275):
Reduce the scope of the Giant lock being held for non-mpsafe syscalls.
There was way too much code being covered.
2004-07-08 21:08:07 +00:00
Marcel Moolenaar
469df33664 Better handle the break instruction trap. The runtime specification
has outlined which break numbers are software interrupts, debugger
breakpoints and ABI specific breaks. We mostly treated all break
numbers we didn't care about as debugger breakpoints.
2004-07-08 16:30:42 +00:00
Brian Somers
59e1ebc9b5 Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
Alan Cox
8fe61389a7 - Correct pmap_extract()'s return type. It should be vm_paddr_t, not
vm_offset_t.
 - Convert pmap_extract() to the ANSI style of declaration.
2004-07-05 23:18:48 +00:00
John Baldwin
0c0b25ae91 Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
Marcel Moolenaar
3d8f0528e5 Unbreak build: define __RMAN_RESOURCE_VISIBLE
See also src/sys/sys/rman.h rev. 1.21.
2004-06-30 23:55:14 +00:00
Nate Lawson
1a26ea7f2c Add machdep quirks functions. On i386, this disables acpi on systems with
BIOS dates earlier than Jan 1, 1999.  Add prototypes and quirks flags.
2004-06-30 04:42:29 +00:00
Alan Cox
2551e6f323 - Remove unused definitions.
- Move a definition inside the scope of a #ifdef _KERNEL.
2004-06-23 08:06:52 +00:00
Bruce Evans
4c5f10a672 Backed out previous commit. Blind substitution of dev_t by `struct cdev *'
was just wrong here because the dev_t's are user dev_t's.
2004-06-20 03:52:50 +00:00
Alan Cox
ffcbbfc220 Remove dead code related to pv entry allocation.
Reviewed by:	marcel@
2004-06-19 20:31:49 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Alan Cox
52fae0ba6c Neither pmap_enter() nor pmap_enter_quick() should create pv entries for
unmanaged pages.

Tested by:	marcel@
2004-06-11 20:11:41 +00:00
Poul-Henning Kamp
1930e303cf Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
Alan Cox
4e302a62f7 Reduce the number of preallocated pv entries and lpte entries in
pmap_init().

Tested by:	marcel@
2004-06-11 04:24:35 +00:00
Poul-Henning Kamp
2140d01b27 Machine generated patch which changes linedisc calls from accessing
linesw[] directly to using the ttyld...() functions

The ttyld...() functions ar inline so there is no performance hit.
2004-06-04 16:02:56 +00:00
Tim J. Robbins
cc05397ffc Remove checks for curthread == NULL - it can't happen. 2004-06-03 10:22:47 +00:00
Poul-Henning Kamp
fd360128ff Add missing <sys/module.h> instances which were shadowed by the nested
include in <sys/kernel.h>
2004-06-03 05:58:30 +00:00
Tim J. Robbins
fa2a4d0595 Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by:	jhb
2004-06-03 01:47:37 +00:00
Poul-Henning Kamp
138fbf675a Gainfully employ the new ttyioctl in the trivial cases. 2004-06-01 13:49:28 +00:00
Thomas Moestl
65e29c4822 Retire cpu_sched_exit(); it is not used any more. 2004-05-26 12:09:39 +00:00
Bruce Evans
b2321e7cdb Moved most of the "MI" definitions and declarations from <machine/profile.h>
to <sys/gmon.h>.  Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.
2004-05-19 15:41:26 +00:00
Stefan Farfeleder
b1aa0ba527 <stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is
defined.  Otherwise first including <wchar.h> and then <stdint.h> leads to no
WINT_M{AX,IN} at all.

PR:		64956
Approved by:	das (mentor)
2004-05-18 16:04:57 +00:00
Marcel Moolenaar
875bcd3528 Fix typo in comment. While here, end the sentence with a period and
remove the empty line between the fdc and sio devices. The empty
line suggests that the comment applies to fdc only while it applies
to all following devices and options.

Typo spotted by: ru@
2004-05-17 18:36:14 +00:00
Marcel Moolenaar
ef1eb2f330 Unbreak build due to previous commit: now that elf_reloc_internal()
gets the relocation base passed in relocbase, we cannot declare a
local variable with the same name. Assume the argument holds the
same value as the local variable did...
2004-05-17 07:11:37 +00:00
Marcel Moolenaar
f2ab7b8bb0 filter out the fdc(4) and sio(4) devices and corresponding options.
Note that cy(4) uses COM_MULTIPORT, so we need to keep that option.
2004-05-17 07:03:01 +00:00
Peter Wemm
e8855d4f97 Make a small revision to the api between the elf linker core and the
elf_reloc() backends for two reasons.  First, to support the possibility
of there being two elf linkers in the kernel (eg: amd64), and second, to
pass the relocbase explicitly (for relocating .o format kld files).
2004-05-16 20:00:28 +00:00
Marcel Moolenaar
4c86725c82 Revert previous commit. We should not get any FP traps from within
the kernel. We can guarantee this by resetting the FP status register.
This masks all FP traps. The reason we did get FP traps was that we
didn't reset the FP status register in all cases.

Make sure to reset the FP status register in syscall(). This is one of
the places where it was forgotten.

While on the subject, reset the FP status register only when we trapped
from user space.
2004-05-07 05:35:31 +00:00
Marcel Moolenaar
99aa060c2b Make sure to sanitize the FP status register. Specifically this
masks all FP traps, which should not happen in the kernel.
2004-05-07 05:29:12 +00:00
Nate Lawson
869ec176fc Make unnecessary globals static and remove unused includes.
Pointed out by:	cscout
2004-05-06 02:18:58 +00:00
Nate Lawson
65a7c90189 Add an MI implementation of the ACPI global lock routines and retire the
individual asm versions.  The global lock is shared between the BIOS and
OS and thus cannot use our mutexes.  It is defined in section 5.2.9.1 of
the ACPI specification.

Reviewed by:	marcel, bde, jhb
2004-05-05 20:04:14 +00:00
Marcel Moolenaar
e160e18ba6 Floating-point faults and exceptions can happen in the kernel too.
Do not panic when it happens; handle them.

Run into by: das
2004-05-03 04:13:31 +00:00
Marcel Moolenaar
1b3564abb3 Catch- and cleanup:
o  Fix and improve comments and references,
o  Add PFIL_HOOKS, UFS_ACL and UFS_DIRHASH,
o  Switch from SCHED_4BSD to SCHED_ULE,
o  Remove SCSI_DELAY (there's no SCSI support),
2004-05-03 00:10:59 +00:00
David E. O'Brien
4e744b5e7f Spell Ethernet correctly. 2004-05-02 18:57:29 +00:00
Marcel Moolenaar
73b2b50372 Verify the MADT checksum before using the table.
Submitted by: njl
2004-05-01 04:08:14 +00:00
David Schultz
be3930682a Hide FLT_EVAL_METHOD and DECIMAL_DIG in pre-C99 compilation
environments.

PR:		63935
Submitted by:	Stefan Farfeleder <stefan@fafoe.narf.at>
2004-04-25 02:36:29 +00:00
Nate Lawson
8ec94874b2 Don't check for NULL, device_get_softc() always succeeds. 2004-04-21 02:10:58 +00:00
Alan Cox
3edd4a4094 MFamd64
Simplify the sf_buf implementation.  In short, make it a veneer
 over the direct virtual-to-physical mapping.
2004-04-18 07:11:12 +00:00
Alan Cox
4e67150a95 Remove a comment that refers to avail_start and avail_end as these
variables no longer exist.
2004-04-11 06:37:36 +00:00
Alan Cox
b14d6acced - pmap_kenter_temporary() is unused by machine-independent code. Therefore,
move its declaration to the machine-dependent header file on those
   machines that use it.  In principle, only i386 should have it.
   Alpha and AMD64 should use their direct virtual-to-physical mapping.
 - Remove pmap_kenter_temporary() from ia64.  It is unused.  Approved
   by: marcel@
2004-04-10 22:41:46 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Alan Cox
a3b706071c Remove avail_end. As of yesterday, it is unused. 2004-04-06 01:38:28 +00:00
Alan Cox
c8607538c8 Remove avail_start on those platforms that no longer use it. (Only amd64
does anything with it beyond simple initialization.)
2004-04-05 04:08:00 +00:00
Alan Cox
bdb93eb248 Remove unused arguments from pmap_init(). 2004-04-05 00:37:50 +00:00
Alan Cox
121230a40d In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it
should not.  Add a new parameter so that the caller can specify which is
the case.

Reported by:	dillon
2004-04-03 09:16:27 +00:00
Marcel Moolenaar
60f66d9174 MFi386: correctly calculate the top-of-stack when a kthread is created
with a larger kernel stack. Remove inclusion of opt_kstack_pages.h now
that it's unused.
2004-03-27 17:44:25 +00:00
Marcel Moolenaar
6beee8df28 In breakpoint(), use a different immediate to make sure we can
distinguish between debugger inserted breakpoints and fixed
breakpoints. While here, make sure the break instruction never
ends up in the last slot of a bundle by forcing it to be an
M-unit instruction. This makes it easier for use to skip over
it.
2004-03-21 01:41:29 +00:00
Alan Cox
1af1ebb84e - Add uiomove_fromphys() implementations to alpha and ia64. These only
differ trivially from amd64.
 - Correct a spelling error in a comment.
2004-03-20 21:06:20 +00:00
Marcel Moolenaar
a36bdc0606 Introduce the cpumask_t type. The purpose of the type is to create a
level of abstraction for any and all CPU mask and CPU bitmap variables
so that platforms have the ability to break free from the hard limit
of 32 CPUs, simply because we don't have more bits in an u_int. Note
that the type is not supposed to solve massive parallelism, where
the number of CPUs can be larger than the width of the widest integral
type. As such, cpumask_t is not supposed to be a compound type. If
such would be necessary in the future, we can deal with the issues
then and there. For now, it can be assumed that the type is integral
and unsigned.

With this commit, all MD definitions start off as u_int. This allows
us to phase-in cpumask_t at our leasure without breaking anything.
Once cpumask_t is used consistently, platforms can switch to wider
(or smaller) types if such would be beneficial (or not; whatever :-)

Compile-tested on: i386
2004-03-20 20:41:40 +00:00
Marcel Moolenaar
e10f4ce153 Replace uint64_t with unsigned long in struct dbreg. 2004-03-20 05:27:14 +00:00
Marcel Moolenaar
5ab92b80d0 Remove the last traditional hints. These hints only served the purpose
for uart(4) to figure out which device to use as console. Use this file
to define hw.uart.console instead so that we don't have to put it in
the default loader.conf, which makes it hard to override.
2004-03-20 04:23:03 +00:00
John-Mark Gurney
4de27366d1 sync comment with i386's isa.c.. This removes a comment that is YEARS
old...
2004-03-17 21:45:55 +00:00
Alan Cox
90ecfebd82 Refactor the existing machine-dependent sf_buf_free() into a machine-
dependent function by the same name and a machine-independent function,
sf_buf_mext().  Aside from the virtue of making more of the code machine-
independent, this change also makes the interface more logical.  Before,
sf_buf_free() did more than simply undo an sf_buf_alloc(); it also
unwired and if necessary freed the page.  That is now the purpose of
sf_buf_mext().  Thus, sf_buf_alloc() and sf_buf_free() can now be used
as a general-purpose emphemeral map cache.
2004-03-16 19:04:28 +00:00
Scott Long
11d905ecd8 Now that contigfree() does not require Giant, don't grab it in busdma. 2004-03-13 15:42:59 +00:00
Marcel Moolenaar
39209a2fad Identify the Deerfield processor. Deerfield is a low-voltage variant
based on the Madison core and targeting the low end of the spectrum.
Its clock frequency is 1Ghz, whereas Madison starts at 1.3Ghz. Since
the CPUID information is the same for Madison and Deerfield, we use
the clock frequency to identify the processor.
Supposedly the Deerfield only uses 62W, which seems to be less than
modern Xeon processors (about 70W) and about half what a Madison would
need.
2004-03-10 22:23:20 +00:00
Alan Cox
fcffa790e9 Retire pmap_pinit2(). Alpha was the last platform that used it. However,
ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps,
there has been no reason for having this function on Alpha.  Briefly,
when pmap_growkernel() relied upon the list of all processes to find and
update the various pmaps to reflect a growth in the kernel's valid
address space, pmap_init2() served to avoid a race between pmap
initialization and pmap_growkernel().  Specifically, pmap_pinit2() was
responsible for initializing the kernel portions of the pmap and
pmap_pinit2() was called after the process structure contained a pointer
to the new pmap for use by pmap_growkernel().  Thus, an update to the
kernel's address space might be applied to the new pmap unnecessarily,
but an update would never be lost.
2004-03-07 21:06:48 +00:00
Alan Cox
d82852fbb9 Integrate the code from pmap_pinit2() into pmap_pinit(), leaving
pmap_pinit2() empty.

Approved by:	marcel
2004-03-07 07:43:13 +00:00
Alan Cox
925d2fedf5 Remove unused declarations. (Some time ago, these variables became fields
of vm/vm.h's struct kva_md_info.)
2004-03-07 07:13:15 +00:00
Lukas Ertl
1bcf24ee9d Fix syntax errors and wrong function prototypes in several MD header
files when using non-GNUC compilers.

PR:             kern/58515
Submitted by:   Stefan Farfeleder <stefan@fafoe.narf.at>
Approved by:    grog (mentor), obrien
2004-03-05 09:19:59 +00:00
Marcel Moolenaar
27e327fdaf Do not pre-map the I/O port space. On the Intel Tiger 4 this conflicts
with a memory mapped I/O range that's immediately before it and is
not 256MB aligned. As a result, when an address is accessed in the
memory mapped range and a direct mapping is added for it, it overlaps
with the pre-mapped I/O port space and causes a machine check.

Based on a patch from: arun@
2004-02-22 02:10:48 +00:00
Poul-Henning Kamp
dc08ffec87 Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
2004-02-21 21:10:55 +00:00
Poul-Henning Kamp
8e1f1df080 Device megapatch 3/6:
Add missing D_TTY flags to various drivers.

Complete asserts that dev_t's passed to ttyread(), ttywrite(),
ttypoll() and ttykqwrite() have (d_flags & D_TTY) and a struct tty
pointer.

Make ttyread(), ttywrite(), ttypoll() and ttykqwrite() the default
cdevsw methods for D_TTY drivers and remove the explicit initializations
in various drivers cdevsw structures.
2004-02-21 20:41:11 +00:00
Poul-Henning Kamp
c9c7976f7f Device megapatch 1/6:
Free approx 86 major numbers with a mostly automatically generated patch.

A number of strategic drivers have been left behind by caution, and a few
because they still (ab)use their major number.
2004-02-21 19:42:58 +00:00
Poul-Henning Kamp
0b7ed341e1 Change the disk(9) API in order to make device removal more robust.
Previously the "struct disk" were owned by the device driver and this
gave us problems when the device disappared and the users of that device
were not immediately disappearing.

Now the struct disk is allocate with a new call, disk_alloc() and owned
by geom_disk and just abandonned by the device driver when disk_create()
is called.

Unfortunately, this results in a ton of "s/\./->/" changes to device
drivers.

Since I'm doing the sweep anyway, a couple of other API improvements
have been carried out at the same time:

The Giant awareness flag has been flipped from DISKFLAG_NOGIANT to
DISKFLAG_NEEDSGIANT

A version number have been added to disk_create() so that we can detect,
report and ignore binary drivers with old ABI in the future.

Manual page update to follow shortly.
2004-02-18 21:36:53 +00:00
Marcel Moolenaar
a753b14687 Sort PFIL_HOOKS. 2004-01-27 20:22:53 +00:00
Jeff Roberson
048ac395be - Recruit some new ULE users by making it the default scheduler in GENERIC.
ULE will be in a probationary period to determine whether it will be left
   as the default in 5.3 which would likely mean the rest of the 5.x series.
2004-01-24 21:38:52 +00:00
Jacques Vidrine
5864cda7c6 Add PFIL_HOOKS to the GENERIC kernel configuration, primarily so
that one can load the IPFilter module (which requires PFIL_HOOKS).

Requested by:	Many, for over a year
2004-01-24 14:59:51 +00:00
Marcel Moolenaar
28466ae036 Fix handling of FP traps:
o  For traps, the cr.iip register points to the next instruction to
   execute on interrupt return (modulo slot). Since we need to get
   the bundle of the instruction that caused the FP fault/trap, make
   sure we fetch the previous bundle if the next instruction is in
   fact the first in a bundle.
o  When we call the FPSWA handler, we need to tell it whether it's
   a trap or a fault (first argument). This was hardcoded to mean a
   fault.

Also, for FP faults, when a fault is converted to a trap, adjust the
cr.iip and cr.ipsr registers to point to the next instruction. This
makes sure that the SIGFPE handler gets a consistent state.
2004-01-20 03:29:24 +00:00
Marcel Moolenaar
dd45fd9791 s/framep/tf/g -- this normalizes on the use of tf to point to the
trapframe and improves grep-ability.
2004-01-20 02:35:46 +00:00
Dag-Erling Smørgrav
2d6853a650 Whitespace nit. 2004-01-13 15:30:36 +00:00
Jacques Vidrine
e4dc8baa84 Provide sysarch(2) prototypes in the MD sysarch.h headers. While I'm
at it, use the ANSI C generic pointer type for the second argument,
thus matching the documentation.

Remove the now extraneous (and now conflicting) function declarations
in various libc sources.  Remove now unnecessary casts.

Reviewed by:	bde
2004-01-09 16:52:09 +00:00
David Xu
a30ec4b99c Make sigaltstack as per-threaded, because per-process sigaltstack state
is useless for threaded programs, multiple threads can not share same
stack.
The alternative signal stack is private for thread, no lock is needed,
the orignal P_ALTSTACK is now moved into td_pflags and renamed to
TDP_ALTSTACK.
For single thread or Linux clone() based threaded program, there is no
semantic changed, because those programs only have one kernel thread
in every process.

Reviewed by: deischen, dfr
2004-01-03 02:02:26 +00:00
Mike Silbersack
ddeb5b242e Track three new sendfile-related statistics:
- The number of times sendfile had to do disk I/O
- The number of times sfbuf allocation failed
- The number of times sfbuf allocation had to wait
2003-12-28 08:57:09 +00:00
Mike Silbersack
5caf2b00f0 Move the declaration of sfbufspeak and sfbufsused to mbuf.h,
and use imax instead of max, as sfbufspeak and sfbufsused
are signed.

Submitted by:   bde
2003-12-28 01:43:22 +00:00
Mike Silbersack
5eda9873e9 Track current and peak sfbuf usage, export the values via sysctl. 2003-12-27 07:52:47 +00:00
Marcel Moolenaar
b3378ed911 Don't use NULL with integral types. 2003-12-24 19:55:07 +00:00
Peter Wemm
7655ebdaaa Return AE_OK for stub functions returning ACPI_STATUS, not NULL 2003-12-24 05:26:26 +00:00
Peter Wemm
c15e347e22 GC the unused <machine/kse.h> file. 2003-12-24 00:51:30 +00:00
Peter Wemm
9b68618df0 Add an additional field to the elf brandinfo structure to support
quicker exec-time replacement of the elf interpreter on an emulation
environment where an entire /compat/* tree isn't really warranted.
2003-12-23 02:42:39 +00:00
Peter Wemm
59553d37df Add missing #include "opt_compat.h" so that the compatability function
freebsd4_freebsd32_sigreturn() is defined when expected.  This should
unbreak the tinderbox. Sorry.
2003-12-18 06:59:18 +00:00
Marcel Moolenaar
a8434283fc In set_mcontext(), take into account that kse_switchin(2) will
eventually be passed an async. context as well as a syscall
context.
While here, fix a serious bug in that if the trapframe is a
syscall frame, but we're restoring an async context, we need
to clear the FRAME_SYSCALL flag so that we leave the kernel
via exception_restore.
2003-12-14 01:59:31 +00:00
Peter Wemm
64d85faa1c Assimilate ia64 back into the fold with the common freebsd32/ia32 code.
The split-up code is derived from the ia64 code originally.

Note that I have only compile-tested this, not actually run-tested it.
The ia64 side of the force is missing some significant chunks of signal
delivery code.
2003-12-11 01:05:09 +00:00
Peter Wemm
ba3fa22e8c Fix last second typo. 2003-12-10 22:59:03 +00:00
Peter Wemm
419d43c635 Use gcc's superior ffs() builtin. 2003-12-10 22:51:40 +00:00
Peter Wemm
a80f5f272c Use ffs(x) == popcnt(x ^ (x - 1)) to implement 64 bit ffsl(). gcc's
ffs() builtin uses this already but truncates the upper 32 bits.
2003-12-10 22:47:02 +00:00
Marcel Moolenaar
b291daba6f Don't panic for misalignment traps when the onfault handler is set.
Not all transfers between kernel and user space are byte oriented
and thus alignment safe. Especially fuword*() and suword*() are
sensitive to alignment but in general more optimal than block copies.
By catching the misalignment trap we avoid pessimizing the common
case of properly aligned memory accesses which we would do if we
were to use byte copies or adding tests for proper alignment.

Note that the expectation that the kernel produces aligned pointers
is unchanged. This change therefore relates to possible unaligned
pointers generated in userland.
2003-12-09 09:52:14 +00:00
Nate Lawson
cac6460cfe Use the ACPI-CA definitions for the various APIC tables instead of our
own.
2003-12-09 03:04:19 +00:00
David E. O'Brien
a5b5101f5e Move the bktr(4) <arch>/include/ioctl_{bt848,meteor}.h files to dev/bktr
as these ioctl's aren't MD.  This also means they are installed in
/usr/include/dev/bktr now.  Also provide compatability wrappers for
where these headers lived in 4.x.
2003-12-08 07:22:42 +00:00
Marcel Moolenaar
47eb01b822 Simplify the contexts created by the kernel and remove the related
flags. We now create asynchronous contexts or syscall contexts only.
Syscall contexts differ from the minimal ABI dictated contexts by
having the scratch registers saved and restored because that's where
we keep the syscall arguments and syscall return values.
Since this change affects KSE, have it use kse_switchin(2) for the
"new" syscall context.
2003-12-07 20:47:33 +00:00
Warner Losh
05a463a03d Ooops. These are still used by the bktr driver. David O'Brien has
plans for dealing, but I'll let him deal.

Pointy hat to: imp@
2003-12-07 06:37:32 +00:00
Warner Losh
65b4a1b917 Remote meteor driver. It hasn't compiled in over 3 years. If someone
makes it compile again, and can test it, we can restore the driver to
the tree.
2003-12-07 04:41:11 +00:00
John Baldwin
798a45964d - Split cpu_mp_probe() into two parts. cpu_mp_setmaxid() is still called
very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid.
  cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is
  actually present and sets mp_ncpus and all_cpus.  Splitting these up
  allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just
  setting mp_maxid to MAXCPU in cpu_mp_setmaxid().  This could allow the
  CPU probing code to live in a module, for example, since modules
  sysinit's in modules cannot be invoked prior to SI_SUB_KLD.  This is
  needed to re-enable the ACPI module on i386.
- For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating
  its contents in a few places.  Also, add a smp_cpu_enabled() function
  to avoid duplicating some code.  There is room for further code
  reduction later since much of this code is also present in cpu_mp_start().
- All archs besides i386 still set mp_maxid to the same values they set it
  to before this change.  i386 now sets mp_maxid to MAXCPU.

Tested on:	alpha, amd64, i386, ia64, sparc64
Approved by:	re (scottl)
2003-11-21 22:23:26 +00:00
Marcel Moolenaar
1fc7ca0fb1 Set the ACPI processor Id in the PCPU structure so that CPU idling
on SMP systems has a chance of working. This was a loose end of the
implementation of the ACPI Cx idle states. Since our logical CPU Id
is the ACPI processor Id, we do not need to jump through hoops to
obtain it.

Approved: re@ (jhb)
2003-11-20 16:42:39 +00:00
Peter Wemm
0bfbe7b935 Widen the enable/disable helper function's argument in line with the
ithread_create() changes etc.  This should be mostly a NOP.
2003-11-17 06:10:15 +00:00
Bruce Evans
81bbee5996 Fixed a pedantic syntax error (a stray semicolon at the end of
PCPU_MD_FIELDS).
2003-11-17 03:40:41 +00:00
Alan Cox
0ec3db3072 - Remove unnecessary synchronization from sf_buf_init(). (There is only
one active CPU when sf_buf_init() is performed.)
2003-11-16 23:40:06 +00:00
Alan Cox
e45db9b837 - Modify alpha's sf_buf implementation to use the direct virtual-to-
physical mapping.
 - Move the sf_buf API to its own header file; make struct sf_buf's
   definition machine dependent.  In this commit, we remove an
   unnecessary field from struct sf_buf on the alpha, amd64, and ia64.
   Ultimately, we may eliminate struct sf_buf on those architecures
   except as an opaque pointer that references a vm page.
2003-11-16 06:11:26 +00:00
Nate Lawson
b72e9cf526 Add the pc_acpi_id PCPU member. The new acpi_cpu driver uses this to
dereference the softc.
2003-11-15 18:58:29 +00:00
Marcel Moolenaar
eea3bbdff8 Remove ia64_highfp_load() now that it's unused. 2003-11-12 03:24:34 +00:00
Marcel Moolenaar
0d9ae4e24e Further work-out the handling of the high FP registers. The most
important change is in cpu_switch() where we disable the high FP
registers for the thread that we switch-out if the CPU currently
has its high FP registers. This avoids that the high FP registers
remain enabled for the thread even when the CPU has unloaded them
or the thread migrated to another processor.
Likewise, when we switch-in a thread of that has its high FP
registers on the CPU, we enable them. This avoids an otherwise
harmless, but unnecessary trap to have them enabled.

The code that handles the disabled high FP trap (in trap()) has
been turned into a critical section for the most part to avoid
being preempted. If there's a race, we bail out and have the
processor trap again if necessary.

Avoid using the generic ia64_highfp_save() function when the
context is predictable. The function adds unnecessary overhead.
Don't use ia64_highfp_load() for the same reason. The function
is now unused and can be removed.

These changes make the lazy context switching of the high FP
registers in an UP kernel functional.
2003-11-12 01:26:02 +00:00
Marcel Moolenaar
a5ba2b5cc4 Save and restore the high FP registers in {g|s}_mcontext(). Note
that we currently do not keep track of whether the thread has
actually used the high FP registers before. If not, we should
not save them in the context which automaticly means that we
also would not restore them from the context. For now, do it
unconditionally so that we can reach functional completeness.
2003-11-11 09:53:37 +00:00
Marcel Moolenaar
9d52656a5a Fix a nasty bug that got exposed when the sendsig() and sigreturn()
functions switched to using {g|s}et_mcontext(). The problem is that
sigreturn(), being a syscall, can be given an async. context (i.e.
one corresponding to an interrupt or trap). When this happens, we
try to return to user mode via epc_syscall_return with a trapframe
that can only be used to return to user mode via exception_restore.

To fix this, we check the frame's flags immediately prior to
epc_syscall_return and branch to exception_restore for non-syscall
frames. Modify the assertion in set_mcontext() to check that if
there's a mismatch, it's because of sigreturn().
2003-11-11 09:25:19 +00:00
Marcel Moolenaar
9422d61a1f In get_mcontext(), do not update bspstore and ndirty in the trapframe.
Only update them in the newly created context to reflect the state
after copying the dirty registers onto the user stack. If we were to
update the trapframe, we lose the state at entry into the kernel. We
may need that after we create the context, such as for KSE upcalls.

We have to update the trapframe after writing the dirty registers to
the user stack for signal delivery to work. But this is best done in
sendsig() itself where it applies, not in get_mcontext() where it's
done unconditionally.
2003-11-10 05:28:05 +00:00
Marcel Moolenaar
3534a08109 When a thread is being swapped-out, save the high FP registers. We
have a pointer in the PCPU to the PCB of the thread that currently
has its high FP registers loaded.
2003-11-09 23:13:23 +00:00
Marcel Moolenaar
ac8c7680a6 Use get_mcontext() to construct the signal context in sendsig() and
use set_mcontext() to restore the context in sigreturn(). Since we
put the syscall number and the syscall arguments in the trapframe
(we don't save the scratch registers for syscalls, which allows us
to reuse the space to our advantage), create a MD specific flag so
that we save the scratch registers even for syscalls. We would not
be able to restart a syscall otherwise.

The signal trampoline does not need to flush the regiters anymore,
because get_mcontext() already handles that. In fact, if we set up
the context correctly, we do not need to have a trampoline at all.
This change however only minimally changes the trampoline code. In
follow-up commits this can be further optimized.

Note that normally we preserve cfm and iip in the trapframe created
by the EPC syscall path when we restore a context in set_mcontext()
because those fields are not normally set for a synchronuous context.
The kernel puts the return address and frame info of the syscall
stub in there. By preserving these fields we hide this detail from
userland which allows us to use setcontext(2) for user created
contexts. However, sigreturn() is commonly called from the trampoline,
which means that if we preserve cfm and iip in all cases, we would
return to the trampoline after the sigreturn(), which means we hit
the safety net: we call exit(2). So, we do not preserve cfm and iip
when we have a synchronous context that also has scratch registers
(the uncommon context created by sendsig() only), under the assumption
that if such a context is created in userland, something special is
going on and the use of cfm and iip is then just another quirk. All
this is invisible in the common case.
2003-11-09 22:17:36 +00:00
Marcel Moolenaar
fcaa2925a9 Change the clear_ret argument of get_mcontext() to be a flags argument.
Since all callers either passed 0 or 1 for clear_ret, define bit 0 in
the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI
code for possible (but unlikely) future use. The remaining bits are for
use by MD code.

This change is triggered by a need on ia64 to have another knob for
get_mcontext().
2003-11-09 20:31:04 +00:00
Marcel Moolenaar
00bd917263 Remove the atkbd, psm, sc and vga devices. Most ia64 boxes out there
are zx1 based machines and they don't particularly like it when we
poke at them with PC legacy code. The atkbd and psm devices were
disabled in the hints file so that one could enable them on machines
that support legacy devices, but that's not really something you can
expect from a first-time installer. This still leaves syscons (sc)
and the vga device, which were enabled by default and wrecking havoc
anyway. We could disable them by default like the atkbd and psm
devices, but there's really no point in pretending we're in a better
shape that way.
2003-11-08 23:19:13 +00:00
Scott Long
eb3b7bf69f Document the lockfunc and lockfuncarg arguments to bus_dma_tag_create() in
the busdma headers.
2003-11-07 23:29:42 +00:00
John Baldwin
dac33f12cc Regen. 2003-11-07 20:30:30 +00:00
John Baldwin
a060e9b7ef Sync with global syscalls.master. ptrace(), dup(), pipe(), ktrace(),
ia32_sigaltstack(), sysarch(), issetugid(), utrace(), and ia32_sigaction()
are MP safe.
2003-11-07 20:27:16 +00:00
Marcel Moolenaar
51e25af386 Add support for unaligned ld2, st2, st4 and st8. While here, make
sure we handle stacked registers properly by taking into account
that:
1. bspstore points after the frame (due to cover),
2. we need to adjust for intermediate NaT collections.
2003-11-06 04:26:40 +00:00
Marcel Moolenaar
2642a8845b Handle unaligned 4-byte loads. While in the neighborhood, remove the
cr.isr sanity check. We actually encounter insanities, which very
likely means that the insanity check itself is insane. Remove an empty
comment while I'm at it.
2003-11-03 08:04:04 +00:00
Marcel Moolenaar
6537124772 Add a bogus definition of __va_list for use by lint. Make it visible
only when lint is defined to protect builds with non-GNU compilers.
2003-11-03 05:04:09 +00:00
Marcel Moolenaar
fcca8c1dde Remove headers copied from i386 and either useless or wrong on ia64.
An example of useless is bios.h. An example of wrong is msdos.h (due
to the use of long for 32-bit fields).

display.h cannot be removed because it's used by syscons. That header
however has no platform dependency and shouldn't really be here.

Removal if these headers may cause build failures in the ports tree.
It's the ports that need fixing in that case.

Tested with: buildworld, LINT
2003-11-02 09:19:07 +00:00
Marcel Moolenaar
3bdfa17c6c When switching the RSE to use the kernel stack as backing store, keep
the RNAT bit index constant. The net effect of this is that there's
no discontinuity WRT NaT collections which greatly simplifies certain
operations. The cost of this is that there can be up to 504 bytes of
unused stack between the true base of the kernel stack and the start
of the RSE backing store. The cost of adjusting the backing store
pointer to keep the RNAT bit index constant, for each kernel entry,
is negligible.

The primary reasons for this change are:
1. Asynchronuous contexts in KSE processes have the disadvantage of
   having to copy the dirty registers from the kernel stack onto the
   user stack. The implementation we had so far copied the registers
   one at a time without calculating NaT collection values. A process
   that used speculation would not work. Now that the RNAT bit index
   is constant, we can block-copy the registers from the kernel stack
   to the user stack without having to worry about NaT collections.
   They will be in the right place on the user stack.
2. The ndirty field in the trapframe is now also usable in userland.
   This was previously not the case because ndirty also includes the
   space occupied by NaT collections. The value could be off by 8,
   depending on the discontinuity. Now that the RNAT bit index is
   contants, we have exactly the same number of NaT collection points
   on the kernel stack as we would have had on the user stack if we
   didn't switch backing stores.
3. Debuggers and other applications that use ptrace(2) can now copy
   the dirty registers from the kernel stack (using ptrace(2)) and
   copy them whereever they want them (onto the user stack of the
   inferior as might be the case for gdb) without having to worry
   about NaT collections in the same way the kernel doesn't have to
   worry about them.

There's a second order effect caused by the randomization of the
base of the backing store, for it depends on the number of dirty
registers the processor happened to have at the time of entry into
the kernel. The second order effect is that the RSE will have a
better cache utilization as compared to having the backing store
always aligned at page boundaries. This has not been measured and
may be in practice only minimally beneficial, if at all measurable.
2003-10-28 19:38:26 +00:00
Marcel Moolenaar
95b0df9df2 The previous commit removed both clause 3 and clause 4 from the UCB
license. Only clause 3 has been revoked. Restore the fourth clause
as clause 3.

Pointed out by: das@

Remove my name as a copyright holder since I don't use a BSD license
compatible or comparable to the UCB license. I choose not to add a
complete second license for my work for aesthetic reasons, nor to
replace the UCB license on grounds of rewriting more than 90% of the
source files. The rewrite can also be seen as an enhancement and since
the files were practically empty, it's rather trivial to have changed
90% of the files.
2003-10-27 22:54:34 +00:00
Marcel Moolenaar
f74fae21b8 Add support for userland to access I/O port space. This is primarily
added for XFree86. There are 2 reasons for doing this with sysarch():
1. The memory mapped I/O space is not at a fixed physical address. An
   application has to use some interface to get the base address. It
   gets worse if the machine has multiple memory mapped I/O spaces.
2. Access to the memory mapped I/O space needs to happen through a
   translation that is flagged as uncachable. There's no interface
   that allows a process to do uncached memory I/O, other than though
   /dev/mem (possibly).

So, until we either disallow direct access to I/O or bus space from
userland or have a better way of doing this, sysarch() has the least
negative impact on existing interfaces.
2003-10-27 05:45:35 +00:00
Marcel Moolenaar
3a988c5c87 Remove unused header. See also ia64/disasm/disasm.h. 2003-10-24 06:53:43 +00:00
Marcel Moolenaar
2a0a749f39 Remove ia64_pack_bundle() and ia64_unpack_bundle(). They are not
used anymore.
2003-10-24 06:52:21 +00:00
Marcel Moolenaar
4d85274d1a Remove unused file. db_disasm() has been implemented in db_interface.c
now.
2003-10-24 06:48:41 +00:00
Marcel Moolenaar
5664617492 Implement db_disasm() by using the new disassembler. Temporarily
unimplement db_write_breakpoint() and db_clear_breakpoint().
2003-10-24 06:42:03 +00:00
Arun Sharma
f47392f4c2 Use a TR of size 1 << IA64_ID_PAGE_SHIFT instead of 16M to avoid
overlapping TR/TC entries (which results in a machine check). Note
that we don't look at the size of the memory descriptor, because
it doesn't guarantee non-overlap.

With this change, a UP kernel could boot on a Intel Tiger4 machine
with the following options:

options         LOG2_ID_PAGE_SIZE=26		# 64M
options         LOG2_PAGE_SIZE=14               # 16K

Approved by: marcel
2003-10-24 04:56:58 +00:00
Marcel Moolenaar
5c03a7c7f9 Don't use fuword() or suword() unconditionally. They explicitly
disallow reading or writing.
2003-10-24 02:33:26 +00:00
Marcel Moolenaar
3fc58f92dc Remove two unused fields in the operand structure (o_read & o_write). 2003-10-24 02:05:53 +00:00
Marcel Moolenaar
764015afda Cleanup. Remove the md_flags for threads. It's not used. The flags
we had were bogus.
While here, reassign the copyright to the Project. There's nothing
in this files that originates from NetBSD, especially now that the
FreeBSD/alpha bits have been removed, but even then the amount of
inherited code that we actually used was nil.
2003-10-23 06:41:59 +00:00
Marcel Moolenaar
32efda28bf Reimplement unaligned_fixup() using the new disassembler and a
mcontext_t for the register values. Currently only ld8 and ldfd
instructions are handled as those are the ones we need now (a
misaligned ld8 occurs 4 times in ntpd(8) and a misaligned ldfd
occurs once in mozilla 1.4 and 1.5). Other instructions are added
when needed.
2003-10-23 06:32:34 +00:00
Marcel Moolenaar
49e4ce1f63 Remove unused include of <machine/inst.h> 2003-10-23 06:23:55 +00:00
Marcel Moolenaar
5a931213f0 Remove prototype of unaligned_fixup() and fix a nearby style(9)
bug.
2003-10-23 06:21:44 +00:00
Marcel Moolenaar
26c41f9dd1 Add prototypes for spillfd() and unaligned_fixup(). 2003-10-23 06:20:38 +00:00
Marcel Moolenaar
075f7fe484 Add spillfd(). This function loads a double-precision FP register
at the first address and spills it to the second address. This
allows unaligned_fixup() to update the context of the process in
a way that assures proper rounding.
Similar functions for single-and extended-precision are added when
needed.
2003-10-23 06:19:06 +00:00
Marcel Moolenaar
b9eabb421b Add a new disassembler that improves over the previous disassembler
in that it provides an abstract (intermediate) representation for
instructions. This significantly improves working with instructions
such as emulation of instructions that are not implemented by the
hardware (e.g. long branch) or enhancing implemented instructions
(e.g. handling of misaligned memory accesses). Not to mention that
it's much easier to print instructions.

Functions are included that provide a textual representation for
opcodes, completers and operands.

The disassembler supports all ia64 instructions defined by revision
2.1 of the SDM (Oct 2002).
2003-10-23 06:01:52 +00:00
Marcel Moolenaar
9ee99eb496 Remove md_bspstore from the MD fields of struct thread. Now that
the backing store is at a fixed address, there's no need for a
per-thread variable.
2003-10-21 01:13:49 +00:00
Marcel Moolenaar
bab1f05277 Put the RSE backing store at a fixed address. This change is triggered
by libguile that needs to know the base of the RSE backing store. We
currently do not export the fixed address to userland by means of a
sysctl so user code needs to hardcode it for now. This will be revisited
later.

The RSE backing store is now at the bottom of region 4. The memory stack
is at the top of region 4. This means that the whole region is usable
for the stacks, giving a 61-bit stack space.

Port: lang/guile (depended of x11/gnome2)
2003-10-20 05:34:10 +00:00
Nate Lawson
4c3655b418 Add the cpu_idle_hook() function pointer so that other idlers can be
hooked at runtime.  Make C1 sleep (e.g., HLT) be the default.  This
prepares the way for further ACPI sleep states.
2003-10-18 22:25:07 +00:00
Marcel Moolenaar
b0f865c1f3 Implement cpu_idle() on ia64. We put the processor in a lightweight
halt state that minimizes power consumption while still preserving
cache and TLB coherency. Halting the processor is not conditional at
this time. Tested with UP and SMP kernels.
2003-10-17 02:24:59 +00:00
Robert Drehmel
ea924c4cd3 Implement preliminary support for the PT_SYSCALL command to ptrace(2). 2003-10-09 10:17:16 +00:00
Marcel Moolenaar
c3f4e4fbb5 With BETA 5 of libuwx some of the application registers are renamed
from UWX_REG_MUMBLE to UWX_REG_AR_MUMBLE. Compatibility defines are
present in libuwx. Change the names here so that we don't depend on
compatibility defines.

Note that there's now an UWX_REG_PFS and an UWX_REG_AR_PFS and the
former is not a compatibility define for the latter AFAICT. Change
to UWX_REG_AR_PFS as that seems to be the one we need to handle.
2003-10-09 03:11:37 +00:00
Marcel Moolenaar
f3e533d270 Include <sys/smp.h> for the prototype of smp_rendezvous(). 2003-10-08 19:55:45 +00:00
Bruce M Simpson
2bc7dd5661 Move pmap_resident_count() from the MD pmap.h to the MI pmap.h.
Add a definition of pmap_wired_count().
Add a definition of vmspace_wired_count().

Reviewed by:	truckman
Discussed with:	peter
2003-10-06 01:47:12 +00:00
Alan Cox
566526a957 Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added.  This function
encapsulate the few lines of pmap_prefault() that actually vary from
machine to machine.  Note: pmap_is_prefaultable() and pmap_mincore() have
much in common.  Going forward, it's worth considering their merger.
2003-10-03 22:46:53 +00:00
Marcel Moolenaar
5bf2d2b6b4 Swap the syscall caller frame info (i.e. the return pointer and
frame marker) and the syscall stub frame info in the trap frame.
Previously we stored the stub frame info in (rp,pfs) and the
caller frame info in (iip,cfm). This ends up being suboptimal
for the following reasons:
1. When we create a new context, such as for an execve(2), we had
   to set the (rp,pfs) pair for the entry point when using the
   syscall path out of the kernel but we need to set the (iip,cfm)
   pair when we take the interrupt way out. This is mostly just
   an inconsistency from the kernel's point of view, but an ugly
   irregularity from gdb(1)'s point of view.
2. The getcontext(2) and setcontext(2) syscalls had to swap the
   (rp,pfs) and (iip,cfm) pairs to make the context compatible
   with one created purely in userland.

Swapping the (rp,pfs) and (iip,cfm) pairs is visible to signal
handlers that actually peek at the mcontext_t and to gdb(1).
Since this change is made for gdb(1) and we don't care about
signal handlers that peek at the mcontext_t because we're still
a tier 2 platform, this ABI breakage is academic at this moment
in time.

Note that there was no real reason to save the caller frame info
in (iip,cfm) and the stub frame info in (rp,pfs).
2003-10-03 03:50:29 +00:00
Marcel Moolenaar
c0e56dc2c3 Drop any and all support for varargs. There's no history to worry
about because we're still tier 2 and our current compiler, as well
as future compilers will not support varargs. This is mostly a
no-op in practice, because <sys/varargs.h> should already cause
compile failures.
2003-09-28 05:34:07 +00:00
Poul-Henning Kamp
26b0e90ca2 Set cn_name, not cn_dev 2003-09-26 10:37:16 +00:00
Peter Wemm
c460ac3a00 Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function.  Export the clip/default values to
sysctl under the compat.ia32 heirarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable.  This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'.  And that causes exec etc to fail since it can no
longer find space to mmap things.
2003-09-25 01:10:26 +00:00
Yoshihiro Takahashi
33e38a2cc8 Implement the bus_space_map() function to allocate resources and initialize
a bus_handle, but currently it does only initializing a bus_handle.
2003-09-23 08:22:34 +00:00
Marcel Moolenaar
719325db8e Fix the last remaining problem encountered by KSE: apparently it is
not guaranteed that the RSE writes the NaT collection immediately,
sort of atomically, to the backing store when it writes the register
immediately prior to the NaT collection point. This means that we
cannot assume that the low 9 bits of the backingstore pointer do not
point to the NaT collection. This is rather a surprise and I don't
know at this time if it's a bug in the Merced or that it's actually
a valid condition of the architecture. A quick scan over the sources
does not indicate that we depend on the false assumption elsewhere,
but it's something to keep in mind.

The fix is to write the saved contents of the ar.rnat register to
the backingstore prior to entering the loop that copies the dirty
registers from the kernel stack to the user stack.
2003-09-20 20:34:58 +00:00
Marcel Moolenaar
b8d941f010 Move uma_small_alloc() and uma_small_free() to uma_machdep.c. These
functions reference UMA internals from <vm/uma_int.h>, which makes
them highly unwanted in non-UMA specific files.

While here, prune the includes in pmap.c and use __FBSDID(). Move
the includes above the descriptive comment.

The copyright of uma_machdep.c is assigned to the project and can
be reassigned to the foundation if and when when such is preferrable.
2003-09-20 19:27:48 +00:00
Marcel Moolenaar
fe4723c884 Fix the most significant KSE breakage caused by not restoring the
restart instruction bits in the PSR. As such, we were returning
from interrupt to the instruction in the bundle that caused us
to enter the kernel, only now we're returning to a completely
different bundle.

While close here: add two KASSERTs to make sure that we restore
sync contexts only when entered the kernel through a syscall and
restore an async context only when entered the kernel through an
interrupt, trap or fault.

While not exactly here, but close enough: use suword64() when we
copy the dirty registers from the kernel stack to the user stack.
The code was intended to be be replaced shortly after being added,
but that was a couple of weeks ago. I might as well avoid that it
is a source for panics until it's replaced.
2003-09-19 22:51:26 +00:00
Marcel Moolenaar
ebe42add33 Revamp trap(): make it more explicit which kinds of traps/faults we
can get (or not) and what we do with them. This fixes the behaviour
for NaT consumption and speculation faults in that we now don't panic
for user faults.

Remove the dopanic label and move the code to a function. This makes
it easier in the simulator to set a breakpoint.

While here, remove the special handling of the old break-based syscall
path and move it to where we handle the break vector. While here,
reserve a new break immediate for KSE. We currently use the old break-
based syscall to deal with restoring async contexts. However, it has
the side-effect of also setting the signal mask and callong ast() on
the way out. The new break immediate simply restores the context and
returns without calling ast().
2003-09-19 22:41:52 +00:00
Marcel Moolenaar
d6c3e38bb2 Change TRAPF_USERMODE and CLOCKF_USERMODE to not test for CPL == 3,
but for CPL != 0. For some reason yet unknown it is possible for the
CPL to be 2. This would previously be counted as kernel mode, which
resulted in nasty panics. By changing the test it is now treated as
user mode, which is more correct. We still need to figure out how it
is possible that the privilege level can be 2 (or 1 for that matter),
because it's not used by us. We only use 3 (user mode) and 0 (kernel
mode).
2003-09-19 07:48:22 +00:00
Marcel Moolenaar
549ab7a654 Include "opt_kstack_pages.h". We export KSTACK_PAGES to assembly and
better have the right value.
2003-09-19 00:37:41 +00:00
Alan Cox
b9850eb224 Add a new parameter to pmap_extract_and_hold() that is needed to eliminate
Giant from vmapbuf().

Idea from:	tegge
2003-09-12 07:07:49 +00:00
Marcel Moolenaar
87ad0260ff Rewrite the SAPIC initialization to always program the RTEs with what
we think is the correct trigger mode and polarity. This allows us to
implement BUS_CONFIG_INTR() as an update of the RTE in question.
Consequently, we can trust the RTE when we enable an interrupt and
avoids that we need to know about the trigger mode and polarity at
that time.
2003-09-10 22:49:38 +00:00
John Baldwin
42a12b2bd8 Move the definitions for ACPI MADT table entries not present in the ACPICA
distribution to a MI header so it can be shared with other architectures.
2003-09-10 06:32:27 +00:00
Marcel Moolenaar
e6882c3469 Introduce IA64_ID_PAGE_{MASK|SHIFT|SIZE} and LOG2_ID_PAGE_SIZE. The
latter is a kernel option for IA64_ID_PAGE_SHIFT, which in turn
determines IA64_ID_PAGE_MASK and IA64_ID_PAGE_SIZE.

The constants are used instead of the literal hardcoding (in its
various forms) of the size of the direct mappings created in region
6 and 7. The default and probably only workable size is still 256M,
but for kicks we use 128M for LINT.
2003-09-09 05:59:09 +00:00
Alan Cox
ba2157f218 Introduce a new pmap function, pmap_extract_and_hold(). This function
atomically extracts and holds the physical page that is associated with the
given pmap and virtual address.  Such a function is needed to make the
memory mapping optimizations used by, for example, pipes and raw disk I/O
MP-safe.

Reviewed by:	tegge
2003-09-08 02:45:03 +00:00
Bill Paul
a94100fa9b Take the support for the 8139C+/8169/8169S/8110S chips out of the
rl(4) driver and put it in a new re(4) driver. The re(4) driver shares
the if_rlreg.h file with rl(4) but is a separate module. (Ultimately
I may change this. For now, it's convenient.)

rl(4) has been modified so that it will never attach to an 8139C+
chip, leaving it to re(4) instead. Only re(4) has the PCI IDs to
match the 8169/8169S/8110S gigE chips. if_re.c contains the same
basic code that was originally bolted onto if_rl.c, with the
following updates:

- Added support for jumbo frames. Currently, there seems to be
  a limit of approximately 6200 bytes for jumbo frames on transmit.
  (This was determined via experimentation.) The 8169S/8110S chips
  apparently are limited to 7.5K frames on transmit. This may require
  some more work, though the framework to handle jumbo frames on RX
  is in place: the re_rxeof() routine will gather up frames than span
  multiple 2K clusters into a single mbuf list.

- Fixed bug in re_txeof(): if we reap some of the TX buffers,
  but there are still some pending, re-arm the timer before exiting
  re_txeof() so that another timeout interrupt will be generated, just
  in case re_start() doesn't do it for us.

- Handle the 'link state changed' interrupt

- Fix a detach bug. If re(4) is loaded as a module, and you do
  tcpdump -i re0, then you do 'kldunload if_re,' the system will
  panic after a few seconds. This happens because ether_ifdetach()
  ends up calling the BPF detach code, which notices the interface
  is in promiscuous mode and tries to switch promisc mode off while
  detaching the BPF listner. This ultimately results in a call
  to re_ioctl() (due to SIOCSIFFLAGS), which in turn calls re_init()
  to handle the IFF_PROMISC flag change. Unfortunately, calling re_init()
  here turns the chip back on and restarts the 1-second timeout loop
  that drives re_tick(). By the time the timeout fires, if_re.ko
  has been unloaded, which results in a call to invalid code and
  blows up the system.

  To fix this, I cleared the IFF_UP flag before calling ether_ifdetach(),
  which stops the ioctl routine from trying to reset the chip.

- Modified comments in re_rxeof() relating to the difference in
  RX descriptor status bit layout between the 8139C+ and the gigE
  chips. The layout is different because the frame length field
  was expanded from 12 bits to 13, and they got rid of one of the
  status bits to make room.

- Add diagnostic code (re_diag()) to test for the case where a user
  has installed a broken 32-bit 8169 PCI NIC in a 64-bit slot. Some
  NICs have the REQ64# and ACK64# lines connected even though the
  board is 32-bit only (in this case, they should be pulled high).
  This fools the chip into doing 64-bit DMA transfers even though
  there is no 64-bit data path. To detect this, re_diag() puts the
  chip into digital loopback mode and sets the receiver to promiscuous
  mode, then initiates a single 64-byte packet transmission. The
  frame is echoed back to the host, and if the frame contents are
  intact, we know DMA is working correctly, otherwise we complain
  loudly on the console and abort the device attach. (At the moment,
  I don't know of any way to work around the problem other than
  physically modifying the board, so until/unless I can think of a
  software workaround, this will have do to.)

- Created re(4) man page

- Modified rlphy.c to allow re(4) to attach as well as rl(4).

Note that this code works for the sample 8169/Marvell 88E1000 NIC
that I have, but probably won't work for the 8169S/8110S chips.
RealTek has sent me some sample NICs, but they haven't arrived yet.
I will probably need to add an rlgphy driver to handle the on-board
PHY in the 8169S/8110S (it needs special DSP initialization).
2003-09-08 02:11:25 +00:00
Marcel Moolenaar
5e3cb29a6b Untangle the code in this file to improve understandability. Both
ia64_count_cpus() and ia64_probe_sapics() called a single function
to do the the actual work. The difference in behaviour was handled
in that function and was further complicated by adding bootverbose
related code. As such, even the simplest of changes was hard to
comprehend.

Untangling has been done by increasing code duplication and using
a more naive style of coding. FWIW, the object file is slightly
smaller than before, so things aren't as bad as it may seem.

Triggered by: a simple fix on the P4 branch that never got merged.
2003-09-07 23:09:08 +00:00
Alan Cox
5d314346f5 MFamd64/i386
Add necessary page locking to pmap_mincore().
2003-09-07 20:02:38 +00:00
Marcel Moolenaar
10a686623d MFp4: Revamped GENERIC (and hints). This is some much more pleasant
to look at...
2003-09-07 06:39:51 +00:00
Marcel Moolenaar
f1220bfe41 Replace sio(4) with uart(4). Remove the sio(4) hints and only add
those hints used by uart(4) for the determination of the serial
console in the absence of the HCDP table.
2003-09-07 05:47:10 +00:00
Marcel Moolenaar
8d8d970db1 Fix a place where I forgot to change the code that checks whether
we return to kernel or userland. This triggered a panic in a KSE
application when TDF_USTATCLOCK was set in the case userland was
interrupted, but we never called ast() on our way out. As such,
we called ast() at some other time. Unfortunately, TDF_USTATCLOCK
handling assumes running in the interrupt thread. This was not
the case anymore.

To avoid making the same mistake later, interrupt() now returns
to its caller whether we interrupted userland or not. This avoids
that we have to duplicate the check in assembly, where it's bound
to fall off the scope. Now we simply check the return value and
call ast() if appropriate.

Run into this: davidxu
2003-09-05 22:50:10 +00:00
Marcel Moolenaar
f02e8e8122 Use pmap_steal_memory() for the msgbuf instead of trying to squeeze
it in the last chunk (phys_avail block). The last chunk very often is
not larger than one or two pages, resulting in a msgbuf that's too
small to hold a complete verbose boot.
Note that pmap_steal_memory() will bzero the memory it "allocates".
Consequently, ia64 will never preserve previous msgbufs. This is not
a noticable difference in practice. If the msgbuf could be reused,
it was invariably too small to have anything preserved anyway.
2003-09-01 07:06:57 +00:00
Marcel Moolenaar
47f756866a Use direct mapped KVA for the sf_buf allocator, as made possible
by the previous commit. While here, fix a typo, reformat comments
and fix a long line.

Tested with: ftpd
2003-09-01 00:12:27 +00:00
Alan Cox
411d10a600 Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy
sockets into machine-dependent files.  The rationale for this
migration is illustrated by the modified amd64 allocator.  It uses the
amd64's direct map to avoid emphemeral mappings in the kernel's
address space.  On an SMP, the emphemeral mappings result in an IPI
for TLB shootdown for each transmitted page.  Yuck.

Maintainers of other 64-bit platforms with direct maps should be able
to use the amd64 allocator as a reference implementation.
2003-08-29 20:04:10 +00:00
Nate Lawson
5a4d072c93 Minor style cleanups. 2003-08-28 16:30:31 +00:00
Marcel Moolenaar
d0adfaea93 Change LOG2_PAGE_SIZE from 14 to 15 bits. This will cause the CTASSERT
in vm_page.h to be reached and thus slightly increases the overall
coverage of LINT on ia64.
2003-08-25 20:02:18 +00:00
Marcel Moolenaar
5b6a41bddf Add the bits for a LINT kernel. It has been verified to compile. We
may need to polish this.
2003-08-23 21:47:33 +00:00
Marcel Moolenaar
9539d5b4f6 Remove PAGE_SIZE_4K, PAGE_SIZE_8K and PAGE_SIZE_16K and replace them
with LOG2_PAGE_SIZE. A single option is better to LINT than multiple
mutual exclusive ones.
2003-08-23 03:39:55 +00:00
Marcel Moolenaar
ca668eda45 Remove unused inclusion of opt_acpi.h 2003-08-23 00:07:52 +00:00
John Baldwin
e7411b9d71 Regen. 2003-08-21 14:16:41 +00:00
John Baldwin
daf54a1e05 Swap sigaction/sigreturn since they are in the wrong order.
Noticed indirectly by:	peter
2003-08-21 14:16:00 +00:00
Marcel Moolenaar
4a98d8b095 Undo the mistake made in revision 1.77 of trap.c and which was the
ultimate trigger for the follow-up fixes in revisions 1.78, 1.80,
1.81 and 1.82 of trap.c. I was simply too pre-occupied with the
gateway page and how it blurs kernel space with user space and
vice versa that I couldn't see that it was all a load of bollocks.

It's not the IP address that matters, it's the privilege level that
counts. We never run in user space with lifted permissions and we
sure can not run in kernel space without it. Sure, the gateway page
is the exception, but not if you look at the privilege level. It's
user space if you run with user permissions and kernel space otherwise.

So, we're back to looking at the privilege level like it should be.
There's no other way.

Pointy hat: marcel
2003-08-20 05:30:35 +00:00
Gordon Tetlow
df3d69c217 Fixup the ELF branding information to point to the new home of rtld. 2003-08-17 08:08:38 +00:00
Marcel Moolenaar
710338e94f In vm_thread_swap{in|out}(), remove the alpha specific conditional
compilation and replace it with a call to cpu_thread_swap{in|out}().
This allows us to add similar code on ia64 without cluttering the
code even more.
2003-08-16 23:15:15 +00:00
Marcel Moolenaar
26502503e5 Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.

ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).

alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a seperate commit.

powerpc: MD prototypes left in cpu.h. Comment added.

Suggested by: bde
Tested with: make universe (pc98 incomplete)
2003-08-16 16:57:57 +00:00
Marcel Moolenaar
c6d402d3f2 Fix a range check bug. Don't left-shift the integer argument 'data'.
Sign extension happens after the shift, not before so that boundary
cases like 0x40000000 will not be caught properly.
Instead, right shift ndirty. It is guaranteed to be a multiple of 8.
While here, do some manual code motion and code commoning.

Range check bug pointed out by: iedowse
2003-08-16 01:49:38 +00:00
Marcel Moolenaar
1fdb0ba9bb Fix the generation of coredumps. We did not take the dirty registers
that were on the kernel stack into account. For now we write them
out to the register stack of the process before creating the dump.
This however is not the final solution. The problem is that we may
invalidate the coredump by overwriting vital information due to an
invalid backing store pointer. Instead we need to write the dirty
registers to an unused region of VM which will result in a seperate
segment in the coredump. For now we can at least get to all the
registers from a coredump.
2003-08-15 05:52:48 +00:00
Marcel Moolenaar
b00555136c Add an instruction group break after the move to application register
and the move to control register to avoid dependency violations when
these functions are used. Note that explicit data and instruction
serialization also need to be in a subsequent instruction group.
This too requires that we have an igrp break here.
2003-08-15 05:46:33 +00:00
Marcel Moolenaar
60518ee41c Introduce two machine specific ptrace(2) requests: PT_GETKSTACK and
PT_SETKSTACK. These requests allow the tracing process to access the
dirty registers of the traced process that are on the kernel stack.

Note that there's currently no way to access the rnat register for
those dirty registers that are not (yet) covered by a nat collection
point. The interface for this is still being slept on.

Also note that implied by these requests is the division of work:
The tracing process has to keep track of where registers are spilled
and is responsible to figure out where the NaT bit of the stacked
registers are at any time during the execution of the traced process.
The kernel provides the interfaces but will not abstract the fact
that the register stack can be split. This model does not follow
the approach taken in Linux where PT_PEEK and PT_POKE deals with
this automagically.
2003-08-15 05:40:59 +00:00
Marcel Moolenaar
6e1f209af1 Don't use VM_MIN_KERNEL_ADDRESS to check if the faulting address is
in user space or kernel space. VM_MIN_KERNEL_ADDRESS starts after the
gateway page, which means that improper memory accesses to the gateway
page while in user mode would panic the kernel. Use VM_MAX_ADDRESS
instead. It ends before the gateway page. The difference between
VM_MIN_KERNEL_ADDRESS and VM_MAX_ADDRESS is exactly the gateway page.
2003-08-13 03:20:10 +00:00