messages like this:
wdc0 at 0x1f0-0x1f7 irq 14 on isa
wdc0: unit 0 (wd0): <ST506>
wd0: size unknown, using BIOS values: 615 cyl, 4 head, 17 sec, bytes/sec 512
npx0 at 0xf0-0xff irq 13 on motherboard
npx0: changing root device to wd0a
^^^^^^
The spurious 'npx0: ' pops up if you have a 386 with a 387 FPU.
New functions created: vm_object_pip_wakeup and pagedaemon_wakeup, which
are used to reduce the actual number of wakeups.
New function vm_page_protect, which is used in conjunction with some new
page flags to reduce the number of calls to pmap_page_protect.
Minor changes to reduce unnecessary spl nesting.
Rewrote vm_page_alloc() to improve readability.
Various other mostly cosmetic changes.
just thinking about it.
Two changes need to be made to allow 'config kernel swap generic' to
work properly without requiring any compile-time flags:
/usr/src/usr.sbin/config/mkswapconf.c: we need to define a dummy stub
for the setconf() function to replace the one in swapgeneric.c that
isn't available in non-generic configurations.
/usr/src/sys/i386/i386/autoconf.c: the -a boot flag causes setroot()
to be skipped and lets setconf() prompt the user for a root device.
If you skip setroot() in a non-generic kernel, you could get severely
hosed. To avoid this, we silently ignore the -a flag if rootdev != NODEV.
(rootdev is always initialized to NODEV in swapgeneric.c, so if
we find that rootdev is something other than NODEV, we know we're
not using a generic configuration.)
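The rootdev check described above can be sketched roughly as follows. This
is only an illustration, not the actual autoconf.c code: the type, the
SKETCH_ constants (including the bit assumed for the -a flag) and the helper
name should_ask_for_root() are all made up for the example.

```c
#include <stdbool.h>

/* Illustrative stand-ins; the real definitions live in kernel headers. */
typedef int sketch_dev_t;
#define SKETCH_NODEV      ((sketch_dev_t)-1)
#define SKETCH_RB_ASKNAME 0x01    /* assumed bit for the -a boot flag */

/*
 * Return true only when the user asked for a prompt (-a) and no root
 * device was compiled in, i.e. the kernel uses a generic configuration.
 */
static bool
should_ask_for_root(int boothowto, sketch_dev_t rootdev)
{
    if ((boothowto & SKETCH_RB_ASKNAME) == 0)
        return false;    /* -a was not given */
    if (rootdev != SKETCH_NODEV)
        return false;    /* non-generic kernel: silently ignore -a */
    return true;
}
```

With this shape, a non-generic kernel never reaches the prompt, so setroot()
is never skipped by accident.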
briefly over it, and see some serious architectural issues in this stuff.
On the other hand, I doubt that we will have any solution to these issues
before 2.1, so we might as well leave this in.
Most of the stuff is bracketed by #ifdef's so it shouldn't matter too much
in the normal case.
Reviewed by: phk
Submitted by: HOSOKAWA, Tatsumi <hosokawa@mt.cs.keio.ac.jp>
entire kernel.
Unfortunately we didn't send him a copy of the style guide before he did it.
I'm trying to find all the benign and downright sound bits and will commit
them without any other explanation than "YF fix" if they are merely cosmetic.
Reviewed by: phk
Submitted by: yves@dutncp8.tn.tudelft.nl (Yves Fonk)
some comparisons as it is more correct (we want the kernel page tables
included).
Reorganized some of the expressions for efficiency.
Fixed the new pmap_prefault() routine - it would sometimes pick up the
wrong page if the page in the shadow was present but the page in object
was paged out. The routine remains unused and commented out, however.
Explicitly free zero reference count page tables (rather than waiting
for the pagedaemon to do it).
Submitted by: John Dyson
Now it matches the man page and also the only other commercial implementation
I have found so far (Solaris 2.x).
Changed the name from ss_base to ss_sp.
Handles at least Trantor T130 and ProAudioSpectrum adapters.
The pas driver has consequently been removed.
This driver can be configured without interrupts.
Manpage to follow when PAS16 has been edited in.
Reviewed by: phk
Submitted by: Serge Vakulenko, <vak@cronyx.ru>
(Boot with the -D flag if you want symbols.)
Make it easier to extend `struct bootinfo' without losing either forwards
or backwards compatibility.
ddb_aout.c:
Get the symbol table from wherever the loader put it.
Nuke db_symtab[SYMTAB_SPACE].
boot.c:
Enable loading of symbols. Align them on a page boundary. Add printfs
about the symbol table sizes.
Pass the memory sizes to the kernel.
Fix initialization of `unit' (it got moved out of the loop).
Fix adding the bss size (it got moved inside an ifdef).
Initialize serial port when RB_SERIAL is toggled on.
Fix comments.
Clean up formatting of recently added code.
io.c:
Clean up formatting of recently added code.
netboot/main.c, machdep.c, wd.c:
Change names of bootinfo fields.
LINT:
Nuke SYMTAB_SPACE.
Fix comment about DODUMP.
Makefile.i386:
Nuke use of dbsym.
Exclude gcc symbols from kernel unless compiling with -g.
Remove unused macro.
Fix comments and formatting.
genassym.c:
Generate defines for some new bootinfo fields. Change names of old ones.
locore.s:
Copy only the valid part of the `struct bootinfo' passed by the loader.
Reserve space for symbol table, if any.
machdep.c:
Check the memory sizes passed by the loader, if any. Don't use them yet.
bootinfo.h:
Add a size field so that we can resolve some mismatches between the loader
bootinfo and the kernel boot info. The version number is not so good for
this because of historical botches and because it's harder to maintain.
Add memory size and symbol table fields. Change the names of everything.
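The idea behind the size field can be sketched like this: the receiver copies
only as many bytes as both sides know about, so older loaders and newer
kernels (or the reverse) can still talk to each other. The struct layout and
all names below are hypothetical and do not match the real `struct bootinfo';
the sketch also assumes 4-byte unsigned ints.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; field names are illustrative only. */
struct sketch_bootinfo {
    unsigned int bi_version;
    unsigned int bi_size;       /* bytes the loader actually filled in */
    unsigned int bi_memsize;
    unsigned int bi_symtab;     /* imagine this field was added later */
};

/*
 * Copy only the part of the loader's structure that is valid on both
 * sides.  Fields the loader did not provide stay zero.
 * Returns the number of bytes copied.
 */
static size_t
copy_bootinfo(struct sketch_bootinfo *dst, const void *src)
{
    unsigned int claimed;
    size_t n;

    memset(dst, 0, sizeof(*dst));
    /* bi_size sits at a fixed offset, so it can be read first. */
    memcpy(&claimed,
        (const char *)src + offsetof(struct sketch_bootinfo, bi_size),
        sizeof(claimed));
    n = claimed < sizeof(*dst) ? claimed : sizeof(*dst);
    memcpy(dst, src, n);
    return n;
}
```

A loader that only knows the first three fields reports a size of 12, and the
later field simply reads as zero on the receiving side.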
Hacks to save a few bytes:
asm.S, boot.c, boot2.S:
Replace `ouraddr' by `(BOOTSEG << 4)'.
boot.c:
Don't statically initialize `loadflags' to 0. Disable the "REDUNDANT"
code that skips the BIOS variables. Eliminate `total'. Combine some
more printfs.
boot.h, disk.c, io.c, table.c:
Move all statically initialized data to table.c.
io.c:
Don't put the A20 gate bits in a variable.
Moved various pmap 'bit' test/set functions back into real functions; gcc
generates better code at the expense of more of it. (pmap.c)
Fixed a deadlock problem with pv entry allocations (pmap.c)
Added a new, optional function 'pmap_prefault' that does clustered page
table preloading (pmap.c)
Changed the way that page tables are held onto (trap.c).
Submitted by: John Dyson
work (mi_switch() counted the last timeslice again but this didn't affect
the exiting process' rusage because the rusage has already been finalized).
Remove stale comment.
Put in the much shorter and cleaner version for the calibrate_cycle_counter
for the Pentium that Bruce suggested. Tested here on my Pentium and
it works okay.
sigreturn() sometimes failed for ordinary returns from signal handlers.
Failures of ordinary returns "can't happen" and are badly handled.
"Temporary" fix: allow users to corrupt PSL_RF. This is fairly
harmless. A correct fix would involve saving the old %eflags (and
perhaps the old segment registers) where the user can't get at them.
attempted to check for insecure and fatal eflags and segment
selectors, but missed many cases and got the IOPL check back to
front. The other syscalls didn't check at all.
sys_process.c, machdep.c:
Only allow PT_WRITE_U to write to the registers (ordinary and FP).
psl.h, locore.s, machdep.c:
Eliminate PSL_MBZ, PSL_MBO and PSL_USERCLR. We are not supposed
to assume anything about the reserved bits. Use PSL_USERCHANGE
and PSL_KERNEL instead. Rename PSL_USERSET to PSL_USER.
exception.s:
Define a private label for use by doreti when returning to user
mode fails.
machdep.c:
In syscalls, allow changing only the eflags that can be changed on
486's in user mode (no longer attempt to allow benign IOPL changes;
allow changing the nasty PSL_NT; don't allow changing the i586
bits).
Don't attempt to check all the cases involving invalid selectors
and %eip's. Just check for privilege violations and let the invalid
things cause a trap.
procfs_machdep.c:
Call the ptrace register functions to do all the work for reading
and writing ordinary registers and for single stepping.
trap.c:
Ignore traps caused by PSL_NT being set. Previously, users could
cause a fatal trap in user mode by setting PSL_NT and executing an
iret, and a fatal trap in kernel mode by setting PSL_NT and making
a syscall. PSL_NT was cleared too late and not in enough modes to
fix the problem.
Make all traps in user mode (except T_NMI) nonfatal.
Recover from traps caused by attempting to load invalid user
registers in doreti by restarting the traps so that they appear to
occur in user mode.
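The eflags policy for syscalls described above (only user-changeable bits may
differ from the old value) can be sketched as a mask comparison. The bit
values are the architectural i386 ones, but the exact membership of
SKETCH_USERCHANGE and the helper name are assumptions for illustration, not
the definitions from psl.h.

```c
#include <stdbool.h>

/* i386 %eflags bits (architectural values). */
#define FL_C    0x00000001u    /* carry */
#define FL_PF   0x00000004u    /* parity */
#define FL_AF   0x00000010u    /* auxiliary carry */
#define FL_Z    0x00000040u    /* zero */
#define FL_N    0x00000080u    /* sign */
#define FL_T    0x00000100u    /* trace */
#define FL_I    0x00000200u    /* interrupt enable */
#define FL_D    0x00000400u    /* direction */
#define FL_V    0x00000800u    /* overflow */
#define FL_IOPL 0x00003000u    /* i/o privilege level */
#define FL_NT   0x00004000u    /* nested task */
#define FL_AC   0x00040000u    /* alignment check (486 and later) */

/* Assumed set of flags a user process may change. */
#define SKETCH_USERCHANGE \
    (FL_C | FL_PF | FL_AF | FL_Z | FL_N | FL_T | FL_D | FL_V | FL_NT | FL_AC)

/* True if the new flags differ from the old ones only in allowed bits. */
static bool
eflags_change_ok(unsigned int newfl, unsigned int oldfl)
{
    return ((newfl ^ oldfl) & ~SKETCH_USERCHANGE) == 0;
}
```

Under this check an attempt to raise IOPL or to clear the interrupt flag is
rejected, while toggling the carry flag or PSL_NT is accepted.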
---
Fix bogons that I noticed while fixing the above:
psl.h:
Fix some comments.
Uniformize idempotency ifdef.
exception.s, machdep.c:
Remove rsvd[0-14]. rsvd0 hasn't been reserved since the 486 came
out. Replace rsvd0 by `align'. rsvd[0-11] used wrong (magic
non-unique) trap numbers. Replace rsvd[1-14] by rsvd.
locore.s:
Enable alignment check flag on 486's and 586's.
machdep.c:
Use a better type for kstack[].
Use TFREGP() to find the registers.
Reformat ptrace functions from SEF to something closer to KNF.
procfs_machdep.c:
The wrong pointer to the registers got fixed as a side effect.
Implement reading and writing of FP registers.
/proc/*/*regs now work (only) for processes that are in memory.
Clean up comments.
trap.c, trap.h:
Remove unused trap types.
unreachable case label in kdb_trap().
Use the correct case labels in kdb_trap() so that normal ddb entry doesn't
print a message.
Change all printf's to db_printf's. Now you can put a breakpoint at printf,
and ddb entry messages don't spam the syslog output.
Cosmetic:
Use ISPL() instead of magic numbers.
Don't compile the unused function kdb_kbd_trap().
Improve some asms.
Print the arg to Debugger().
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.
vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.
vfs_cluster.c, vfs_subr.c:
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.
vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c:
Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c, vm_map.c:
Dynamic kernel VM size; now we don't have to pre-allocate excessive numbers of
kernel PTs.
vm_glue.c:
Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h:
Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c:
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.
machdep.c:
Changes to better support the parameter values for the merged VM/buffer cache
scheme.
machdep.c, kern_exec.c, vm_glue.c:
Implemented a separate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c:
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.
Submitted by: John Dyson and David Greenman
shifting. Also correct the original code as Garrett noticed it in mail.
Leave the mishandled code in to use it later if future versions of gcc
are correct. The code was part of the calibrate_cyclecounter routine to
get the speed of the pentium chip.
Remove bogus input operands for fnsave(), fnstcw() and fnstsw().
Change all fwait's to fnop's. This might help avoid hardware bugs.
Wait after fninit with an fnop. This should be safer now.
Fix some spelling and formatting errors.
Use natural sizes for control and status words (u_short, promotes to int).
Don't clobber the SWI_CLOCK_MASK bits in npx0_imask when using IRQ13.
Set the devconf state correctly (always busy, if configured). Improve
code for npx_registerdev() a little (gcc can't keep id->id_unit in a
register for some reason). Don't register a nonexistent npx device.
Print a useful message in npxattach() again (delete references to errors
and not the whole message). Don't print "387 emulator" if there is no
emulator in the kernel.
Use %p for pointers in error messages.
Don't clobber the FPU state when there is an FPU exception. Just clear
the exception flags (after saving the flags as before). This allows
debuggers and SIGFPE handlers to look at the full exception state.
SIGFPE handlers should normally return via longjmp(), which restores a
good FPU state (as before). Returning from a SIGFPE handler may leave
the FPU in the wrong state (as before).
Clear the busy latch _after_ clearing the exception flags so that there
is less chance of getting a bogus h/w interrupt for a control operation.
Clear the saved exception status word when the next FPU instruction is
executed so that it doesn't stick around until the next exception.
Clear the busy latch after fnsave() in npxsave() in case it was set when
npxsave() was called.
- /sys/i386/i386/swapgeneric.c is just plain broke. But fear not, for I
have unbroken it. One thing that swapgeneric.c does is walk through the
list of configured devices searching for a boot device. The only easy
way to accomplish this in 2.0 is to use Garrett Wollman's kern_devconf
stuff. *BUT*, the head of the kern_devconf linked list (dc_list) is declared
static in /sys/kern/kern_devconf.c. This means that swapgeneric.c can't
see it at link time. I had to remove the 'static' keyword to get around
this little problem. I hope this doesn't break anything anywhere.
*Furthermore,* there's a small matter of making the call to setconf()
in swapgeneric.c disappear when 'config kernel swap generic' isn't used.
You could change /sbin/config to create a dummy setconf() function in
swapkernel.c, but that seems messy somehow. (It's also something of an
'it isn't broken, why are you fixing it' situation.) My solution was to
do what the NetBSD people did and put an #ifdef GENERIC around the call
to setconf(). If your kernel is called GENERIC or you define 'options
GENERIC,' then you can use 'config kernel swap generic' and it'll work.
That aside, the upshot is that: a) swapgeneric.c actually works, and
b) the -a boot flag now works as well. If you boot with -a, as in
"Boot: wd(0,a)/kernel -a" you will be presented with a 'root device?'
prompt after the autoconfig phase, at which point you can specify what
device you want mounted as root. Regrettably, you can't specify an NFS
filesystem. Yet. Three files are affected: /sys/i386/i386/swapgeneric.c,
/sys/i386/i386/autoconf.c and /sys/kern/kern_devconf.c.
Submitted by: wpaul
Move definition of `stat_imask' to clock.c.
clock.c:
Rename `rtcmask' to `stat_imask' and export it. Rename `clkmask' to
`clk_imask' for consistency.
Only calculate TIMER_DIV(hz) once.
Merge debugging and "garbage" code to produce debugging code and format the
output better.
Make writertc() static inline and use it everywhere. Now all accesses to
the clock registers go through rtcin() and writertc().
Move rtc initialization to cpu_initclocks().
Merge enablertclock() with cpu_initclocks() and remove enablertclock().
The extra entry point was just a leftover from 1.1.5.
Fix single-stepping of emulated FPU instructions.
Don't panic if an FPU instruction is attempted but there is no FPU
and no FPU emulator is configured.
Keep track of interrupt nesting level. It is normally 0
for syscalls and traps, but is fudged to 1 for their exit
processing in case they metamorphose into an interrupt
handler.
i386/genassym.c:
Remove support for the obsolete pcb_iml and pcb_cmap2.
Add support for pcb_inl.
i386/swtch.s:
Fudge the interrupt nesting level across context switches and in
the idle loop so that the work for preemptive context switches
gets counted as interrupt time, the work for voluntary context
switches gets counted mostly as system time (the part when
curproc == 0 gets counted as interrupt time), and only truly idle
time gets counted as idle time.
Remove obsolete support (commented out and otherwise) for pcb_iml.
Load curpcb just before curproc instead of just after so that
curpcb is always valid if curproc is. A few more changes like
this may fix tracing through context switches.
Remove obsolete function swtch_to_inactive().
include/cpu.h:
Use the new interrupt nesting level variable to implement a
non-fake CLF_INTR() so that accounting for the interrupt state
works.
You can use top, iostat or (best) an up to date systat to see
interrupt overheads. I see the expected huge interrupt overheads
for ISA devices (on a 486DX/33, about 55% for an IDE drive
transferring 1250K/sec and the same for a WD8013EBT network card
transferring 1100K/sec). The huge interrupt overheads for serial
devices are unfortunately normally invisible.
include/pcb.h:
Remove the obsolete pcb_iml and pcb_cmap2. Replace them by
padding to preserve binary compatibility.
Use part of the new padding for pcb_inl.
isa/icu.s:
isa/vector.s:
Keep track of interrupt nesting level.
of config so YOU MUST RECOMPILE CONFIG. Modifying config was the cleanest
solution to integrating this driver into the tree which will become more
obvious in the next commit.
was supposed to have already been made, but got botched somewhere.
Don't clobber the last page of memory (where the message buffer is). Some
BIOS don't gratuitously wipe it out on reboot.
Alphabetize.
Write all i/o functions in sleep so that we don't use anything from
NetBSD.
Restore the correct type of u_int for ports. This saves a whole cycle
per i/o on 486's.
Change `inline' back to __inline to avoid compiler warnings with
-Wreally-all.
Don't implement bdb() unless BDE_DEBUGGER is defined. Declare bdb_exists
outside the function to avoid hundreds of compiler warnings.
Let the compiler pick the register in asms if possible.
Implement ffs() using inline asm(). gcc provides a slightly different
one. It was broken in gcc-2.4.5 but works now. Declaring a correct
version inline ensures getting a correct version. FreeBSD-1.1.5 has
a slow inline version but FreeBSD-2.0 has a library version (which
probably never gets used).
Do inb() and outb() without using %edx for constant ports below 0x100.
Remove casts to the same type in queue functions.
Declare prototypes for everything implemented in i386/*.s and also for
everything that is normally implemented as an inline here (I don't
like the current complete dependency on gcc). Ifdef out the prototypes
that are declared elsewhere. There should be a separate header to
declare things implemented in i386/*.s, but then it would be harder
to override declarations with inlines.
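For reference, the semantics that an inline ffs() has to provide can be
written in plain C as below. This is a portable sketch for illustration only;
the commit uses inline asm, and the name sketch_ffs() is made up here.

```c
/*
 * Find first set: return the 1-based index of the least significant
 * set bit in mask, or 0 if no bit is set.
 */
static int
sketch_ffs(unsigned int mask)
{
    int bit;

    if (mask == 0)
        return 0;
    for (bit = 1; (mask & 1u) == 0; bit++)
        mask >>= 1;
    return bit;
}
```

Any inline or library replacement should agree with this on every input,
including the all-zero case.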
Uniformize idempotency ifdef.
with the current default exception (un)mask. There should be no such
processes unless you change the mask. Someday the mask should be
changed to the IEEE default of everything masked. The npx state
gets saved so that it can be checked and this may have the side effect
of fixing a bug that was reported for 1.1.5. (npx exceptions may
sometimes leak across exits and clobber another process. I can't see
how this can happen.)
Get some missing/wrong declarations from headers now that the headers
have them.
the following.
Move declarations to and from <machine/segments.h>. Make segment stuff
static if possible.
Remove unused (although initialized) global variables _default_ldt,
currentldt, _gsel_tss (rename the latter to the auto variable
gtel_tss).
Use "correct" and consistent types for interrupt handlers.
Remove a mailing address from the code.
Fix type mismatches found by adding prototypes.
Partly support BDE_DEBUGGER. Still broken by conflict with APM. Does
nothing if BDE_DEBUGGER is not defined.
Clean up prototypes and data declarations. Declare most of the segment
functions that are implemented in support.s. Make data private in
machdep.c if possible.
Parenthesize expressions in macros properly!
Uniformize idempotency ifdef.
to avoid compiler warnings.
Clean up prototypes: alphabetize; don't use redundant `extern' or
meaningless `extern inline'.
Uniformize idempotency ifdef.
Somebody should make a mib variable for it.
Just now it is pointless to dump the kernel, since we have nothing which
can read the dump.
Furthermore it should never be the default to dump.
options DODUMP
will enable dumps.
for all reasonable HZ's. HZ > 1000 doesn't work because of sloppy
conversions in hzto() (division by (tick / 1000) == 0). This was
fixed in 1.1.5.
Eliminate some extern declarations by including the appropriate header
files that now contain appropriate declarations.
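The HZ > 1000 problem mentioned above comes down to integer division. A
minimal sketch, assuming tick is the clock period in microseconds and using
hypothetical helper names:

```c
/* Clock period in microseconds for a given interrupt rate. */
static int
sketch_tick(int hz)
{
    return 1000000 / hz;
}

/*
 * The divisor the message refers to: tick / 1000.  For hz > 1000 the
 * tick is below 1000 usec, so integer division truncates this to 0 and
 * any later division by it fails.
 */
static int
sketch_ms_per_tick(int hz)
{
    return sketch_tick(hz) / 1000;
}
```

For example, at 1024 Hz the tick is 976 usec, and 976 / 1000 is 0.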
doesn't have to calculate it every call.
Rename `timer0_prescale' to `timer0_prescaler_count' and maintain it
correctly. Previously we lost a few 8253 cycles for every "prescaled"
clock interrupt, and the lossage grows rapidly at 16 KHz. Now we
only lose a few cycles for every standard clock interrupt.
Rename `*_divisor' to `*_max_count'.
Do the calculation of TIMER_DIV(rate) only once instead of 3 times each
time the rate is changed.
Don't allow preposterously large interrupt rates. Bug fixes elsewhere
should allow the system to survive rates that saturate the system, however.
Clean up declarations.
Include <machine/clock.h> to check our own declarations.
for it is incomplete and buggy. There is no problem unless Xintr0()
is reentered or should be reentered, but high clock interrupt
frequencies for pcaudio cause Xintr0() to be reentered (or clock
ticks to be lost when Xintr0() should have been reentered but
wasn't), and we lose little by delaying the call to softclock().
Move declarations related to the clock driver to clock.h.
Move declarations related to the npx driver to npx.h.
Clean up the remaining declarations.
I know that many of these entries are bogus and need to be revisited,
but let's get the tree working again for now and then do a pass through
looking at all the __FreeBSD__ entries, shall we?
Cosmetic.
Return from trap() if trap_fatal() returns. trap_fatal() isn't
fatal if you have ddb. Returning from trap() is usually the right
thing to do and much better than falling through.
Build a dummy frame at the top of tmpstk to help debuggers trace the stack
when the system is idle.
swtch.s: idle():
Initialize the frame pointer so that debuggers don't try to trace a bogus
stack.
Load the frame pointer, load the stack pointer and switch out the old
stack in the unique order that never leaves one of the pointers
invalid so that debuggers can trace idle(). Disabling interrupts
provides sufficient validity for normal operation, but debuggers use
(trace) traps.
Changed the fifth parameter to register_intr() from u_int mask into
u_int *maskptr in preparation for new features (shared interrupts and
removable devices, e.g. for PCMCIA).
of memory to work without running out of kernel VM (and increasing it to
even more than it is now (96MB) is out of the question). Changed bufpages
calculation to allocate a little less buffer cache (16% of mem-2MB instead
of 20%); this is simply a better figure for most systems.
text. Fixed rounding bug that caused the last page of kernel text to be
read/write instead of read-only. This is important now that tmpstk can
crash into it. Removed +4 bias of tmpstk because it screws up ddb's
ability to traceback correctly.
and all SCSI devices (except that it's not done quite the way I want). New
information added includes:
- A text description of the device
- A ``state''---unknown, unconfigured, idle, or busy
- A generic parent device (with support in the m.i. code)
- An interrupt mask type field (which will hopefully go away) so that
  ``doconfig'' can be written
This requires a new version of the `lsdev' program as well (next commit).
explanation. More doc needed, but not hard to do, if you want to.
A big hand to Martin Renters for the netboot program !
Anybody want to compete on who can "make world" in the shortest
amount of time ? I have 127 i486DX2/66 and 5 P60's I can use
now. And 3 times 66 Gb file servers to support it... :->
Anyway, NFS will be standard in the GENERIC kernel now, so that
people can use the bin-tarball to set up shop.
drivers have a chance to change their IRQ before it is checked.
This was implemented in revision 1.21 and broken in revision 1.26.
Drivers that can change their IRQ should probably be configured
with "irq ?".
else has been probed. This feature could go away again, if we can curb the
problem another way.
if_ed.c, syscons.c: Set the above flag. ed# because it needs it, syscons
because it looks stupid to "detect" the display you have already filled up
with text :-)
bt742a.c: Check bt_cmd() return-val during probe, thus failing on adaptec's.
Also silenced various printf's during the probe.
isa.c: Probe devices with the above flag set before the rest. Reduce the
number of "conflict" messages per device to one.
***
Please test the GENERIC-kernel now, if nobody can make it fail, GENERICAH
and GENERICBT have a finite and short life-expectancy...
***
scheme of things, so I've changed them to be more appropriate. Page ins/outs
are now associated with the pager that did them. Nuked v_fault as the
only fault of interest that wouldn't be already counted in v_trap is a VM
fault, and this is counted separately.
2) Implemented most of the remaining counters and corrected the counting of
some that were done wrong. They are all almost correct now...just a few
minor ones left to fix.
"APM" macro.
machdep.c: Made the APM-descriptors unconditional.
Bruce: if these still conflict with your debugger, please put in a reservation
for your debugger. These three desc. can be anywhere, as long as they are
contiguous, so just move them as needed.
pmap.c: tons of unused vars zapped, various other warnings silenced.
trap.c: unused vars zapped.
vm_machdep.c: A wrong argument, which by chance did the right thing, was
corrected.
that and when it does it will be done differently.
2. The kernel now does a frame setup on entry so it ``looks'' like a
real function call. This will be needed by future boot code and
debuggers.
3. Clean up stack offsets to all be in decimal and use %ebp when copying
parameters in from the boot code.
4. Implement version 1 of the uniform boot code passing mechanism with
support for kernelname passing and nfs_diskless structure passing.
5. Document the 3 different ways the kernel is called depending on what code
is calling it.
must #define NFS before including <sys/mount.h> to pick up some of
the definitions needed for struct diskless. Be sure to undef it after this
so you do not affect other code.
This is kinda sick, but it does the job. Problem found by davidg.
2. New detection code so we know what boot code called us.
3. Remove old DISKLESS support code and halt if we are called by that boot
code as it will NOT work with the new nfs_diskless structure.
This is really in preparation for new boot code and new diskless support.
Reviewed by: davidg
including files. vector.s sometimes left the data section misaligned
(depending on the configuration) so all the time-critical globals in icu.s
were sometimes misaligned.
cycles. While waiting there I added a lot of the extra ()'s I have, (I have
never used LISP to any extent). So I compiled the kernel with -Wall and
shut up a lot of "suggest you add ()'s", removed a bunch of unused var's
and added a couple of declarations here and there. Having a lap-top is
highly recommended. My kernel still runs, yell at me if your kernel breaks.
matter, but similar bogusness in npx.c causes compiling without -O to fail.
Use __volatile in all asms.
Parenthesize macro args.
Change the names of the macros to avoid namespace pollution.
Remove unnecessary "#ifdef __i386__".
Sort #defines.
Add comments.
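Why macro arguments need parentheses can be shown with a tiny example. The
macros below are made up for illustration and are not the ones touched by
this change.

```c
/* Argument not parenthesized: DOUBLE_BAD(a + b) expands to (a + b * 2). */
#define DOUBLE_BAD(x)   (x * 2)

/* Argument parenthesized: DOUBLE_GOOD(a + b) expands to ((a + b) * 2). */
#define DOUBLE_GOOD(x)  ((x) * 2)

static int
double_bad(int a, int b)
{
    return DOUBLE_BAD(a + b);
}

static int
double_good(int a, int b)
{
    return DOUBLE_GOOD(a + b);
}
```

With a = 1 and b = 2 the unparenthesized macro yields 5 rather than 6,
because the multiplication binds to b alone.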
have got the following:
Back out the changes in the previous revision. Function-like macros
were replaced by compound statements that work in fewer contexts.
Uniformize idempotency #ifdef.
Restore the simple leap year calculation as a macro and document it so
that it doesn't become complicated again. The simple version works
for all leap years covered by 32-bit time_t's. The complicated version
doesn't work for all leap years covered by 64-bit time_t's since among
other reasons, the solar system is not stable for long enough.
Fix declarations.
Nuke spinwait().
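The claim about the simple leap year calculation can be checked with a short
sketch. The macro name and helpers are hypothetical; the point is only that a
plain divisible-by-4 test agrees with the full Gregorian rule for every year
a signed 32-bit time_t can represent (late 1901 through early 2038), since
2000 is a leap year and 1900 and 2100 fall outside that range.

```c
/* Simple rule: sufficient for years 1901..2099. */
#define SKETCH_LEAPYEAR(y)  (((y) % 4) == 0)

/* Full Gregorian rule, for comparison. */
static int
gregorian_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Count the years in [from, to] where the two rules disagree. */
static int
count_mismatches(int from, int to)
{
    int y, n = 0;

    for (y = from; y <= to; y++)
        if (SKETCH_LEAPYEAR(y) != gregorian_leap(y))
            n++;
    return n;
}
```

The two rules first diverge at the century years 1900 and 2100.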
This code is mostly taken from the 1.1 port (which was in turn taken from
Dave Mills's kern.tar.Z example). A few significant differences:
1) ntp_gettime() is now a MIB variable rather than a system call. A few
fiddles are done in libc to make it behave the same.
2) mono_time does not participate in the PLL adjustments.
3) A new interface has been defined (in <machine/clock.h>) for doing
possibly machine-dependent things around the time of the clock update.
This is used in Pentium kernels to disable interrupts, set `time', and
reset the CPU cycle counter as quickly as possible to avoid jitter in
microtime(). Measurements show an apparent resolution of a bit more than
8.14usec, which is reasonable given system-call overhead.
Removed inb function since it's more correctly in pio.h
Copied write_eflags and read_eflags over from npx.c
(Some changes to the macros suggested by Bruce were not made at this
time since his suggestions probably apply to all the macros and
these inlined/macro definitions need a lot of cleaning up at some
point in the future.)
Reviewed by: Bruce
Fix from Bruce Evans. There were missing sets of parentheses:
1. The checks for the standard data selectors were botched, so %ss == 0
and probably %cs == 0 were allowed. A fix is enclosed. The checks
for the standard selectors could be omitted without losing anything
since the standard selectors pass the valid_ldt_sel() tests.
320x200 256col VGA. This is necessary for the iBCS stuff to work right.
(And we get the benefit of more video modes). Uses the videocard BIOS
to obtain mode tables.
Added a "green" saver, switches off the syncs for "green" monitors.
don't hard-code netisr values in icu.s, but rather, use an array of
function pointers and set them all up in machdep.c for statically-linked
protocol families. (This will eventually be done differently.)
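A table-driven dispatch of this kind might look roughly like the sketch
below. It is written in C for illustration (the real dispatch is in icu.s),
and every name in it, including the sample handler, is hypothetical.

```c
#define SKETCH_NETISR_MAX 32

typedef void (*sketch_netisr_fn)(void);

/* One slot per software network interrupt number. */
static sketch_netisr_fn sketch_netisrs[SKETCH_NETISR_MAX];
static unsigned int sketch_netisr_pending;    /* one bit per slot */

/* Sample handler so the sketch can be exercised. */
static int sketch_ip_calls;
static void
sketch_ipintr(void)
{
    sketch_ip_calls++;
}

/* Set up at boot time for each statically linked protocol family. */
static void
sketch_register(int num, sketch_netisr_fn fn)
{
    sketch_netisrs[num] = fn;
}

static void
sketch_schedule(int num)
{
    sketch_netisr_pending |= 1u << num;
}

/* Run every pending handler once; return how many were called. */
static int
sketch_dispatch(void)
{
    int n, ran = 0;

    for (n = 0; n < SKETCH_NETISR_MAX; n++) {
        if ((sketch_netisr_pending & (1u << n)) == 0)
            continue;
        sketch_netisr_pending &= ~(1u << n);
        if (sketch_netisrs[n] != 0) {
            sketch_netisrs[n]();
            ran++;
        }
    }
    return ran;
}
```

The dispatcher no longer needs to know which protocols exist; adding one is a
matter of filling in another slot.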
otherwise the machine will overflow the stack in a recursive fault loop
(causing the machine to spontaneously reboot because of the stack fault
that ultimately happens).
Submitted by: Inspired by Bruce Evans, but this change is different
than what he suggested.
if KERNEL is not defined. lib/msun/i387/*.S include asmacros.h to
get the definitions of ENTRY(), etc. This is bogus since asmacros.h
is only supposed to give definitions suitable for the kernel. The
current definitions for the kernel almost worked but are missing
the ".type" declarations. This caused the linker to print warnings
about doubtful relocations for almost anything linked to libm[sun].
Uniformize name and use of idempotence identifier.
the Mach/i386 version of the BSD/vax(?) <machine/psl.h>. The Mach
version has slightly better names for many macros but is now out of
date and little used. It was originally used even less (for spelling
PSL_T as EFL_TF in <machine/db_machdep.h>).
negation whenever we access memory between 640k and 1M.
Original code from NetBSD 1.0-BETA. The exact origins are unclear but
Theo de Raadt, Charles, and Michael V. may have contributed to it.
Submitted by: pst
1) if_ie.c:
Changed a printf and put a space in it. Formerly the "<3C507>"
confused syslog, which tried to interpret it as the priority of
the message.
2) isa_device.h:
Changed the iobase variable from short to u_short. EISA
addresses can go up to 0xf000, and the sign extension doesn't
look good in the probe output. Example:
ep1 at 0xffff8000-0xffff8000f is not good :-); I prefer
ep1 at 0x8000-0x8000f.
3) isa.c:
Changed a string constant from "probe" to "prob", since an "ed"
is appended to it later.
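The sign extension behind the isa_device.h change in item 2 can be reproduced
with a small sketch. It assumes 16-bit short and 32-bit int, and the function
names are made up; the real probe output code is not shown here.

```c
#include <stdio.h>

/* Signed field: short is sign-extended to int before being printed. */
static void
fmt_iobase_signed(char *buf, size_t len, short iobase)
{
    snprintf(buf, len, "0x%x", (unsigned int)(int)iobase);
}

/* Unsigned field: the value is zero-extended, so it prints as expected. */
static void
fmt_iobase_unsigned(char *buf, size_t len, unsigned short iobase)
{
    snprintf(buf, len, "0x%x", (unsigned int)iobase);
}
```

Feeding 0x8000 through the signed variant gives 0xffff8000, while the
unsigned variant gives 0x8000.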
with BOUNCE_BUFFERS. This is more intuitive, and is better for future
multiplatform support. Added BOUNCE_BUFFERS option to the GENERIC and
LINT kernel config files.
in your kernel config now).
2) Added ps ddb function from 1.1.5. Cleaned it up a bit and moved into its
own file.
3) Added \r handling in db_printf.
4) Added missing memory usage stats to statclock().
5) Added dummy function to pseudo_set so it will be emitted if there
are no other pseudo declarations.
Changed u_int_inb to just inb and deleted define.
The code generated is identical to that generated with the cast so
the problem was obviously fixed at some point after gcc 1.40.
hosing syscons. Does anyone know anything about this
or can we just delete it now?
/*
* This roundabout method of returning a u_char helps stop gcc-1.40 from
* generating unnecessary movzbl's.
*/
#ifdef disable_for_gcc-2_6_0
#define inb(port) ((u_char) u_int_inb(port))
#endif
static inline u_int
u_int_inb(u_int port)
{
u_char data;
/*
* We use %%dx and not %1 here because i/o is done at %dx and not at
* %edx, while gcc-2.2.2 generates inferior code (movw instead of movl)
* if we tell it to load (u_short) port.
*/
__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
return data;
}
machdep.c:
Changed printf's a little and call vfs_unmountall() if the sync was
successful.
cd9660_vfsops.c, ffs_vfsops.c, nfs_vfsops.c, lfs_vfsops.c:
Allow dismount of root FS. It is now disallowed at a higher level.
vfs_conf.c:
Removed unused rootfs global.
vfs_subr.c:
Added new routines vfs_unmountall and vfs_unmountroot. Filesystems
are now dismounted if the machine is properly rebooted.
ffs_vfsops.c:
Toggle clean bit at the appropriate places. Print warning if an
unclean FS is mounted.
ffs_vfsops.c, lfs_vfsops.c:
Fix bug in selecting proper flags for VOP_CLOSE().
vfs_syscalls.c:
Disallow dismounting root FS via umount syscall.
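The clean-bit handling mentioned for ffs_vfsops.c can be pictured roughly as follows. This is a toy model only: the struct, its fields and the function names are assumptions made for illustration, not the real FFS superblock or mount code.

```c
#include <stdio.h>

/* Hypothetical stand-in for the on-disk state of one file system. */
struct toy_fs {
	int clean;	/* 1 if the last dismount completed normally */
	int ronly;	/* 1 if currently mounted read-only */
	int was_clean;	/* state of the clean bit at mount time */
};

/* Mount: warn about an unclean FS; a writable mount clears the bit. */
static int
toy_mount(struct toy_fs *fs, int ronly)
{
	int warned = 0;

	if (!fs->clean) {
		printf("WARNING: file system was not properly dismounted\n");
		warned = 1;
	}
	fs->was_clean = fs->clean;
	fs->ronly = ronly;
	if (!ronly)
		fs->clean = 0;	/* stays 0 if the machine crashes */
	return warned;
}

/* Dismount: set the bit again, but only if it was set at mount time. */
static void
toy_unmount(struct toy_fs *fs)
{
	if (!fs->ronly && fs->was_clean)
		fs->clean = 1;
}
```

Because a proper reboot now dismounts everything, the bit is set again on the way down and the next mount stays quiet.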
2. Hack.
Hack is to define RCSID() to null macro so that new msun stuff
will compile. This does NOT belong here, and I DON'T want it to
stay, I just need to put this here for now to enable msun and we need
to talk about what our RCSID story is supposed to be. We talked about
supporting RCSID() one day, and everyone seemed to like the idea
reasonably well of making it a macro you could just no-op this way,
but we never did anything. Now I see that JTC's code has it and I'm
loath to remove it or do anything until we've discussed it some more.
Well, so how about it? What's our story vis-a-vis RCSID() going to
be?
Submitted by: jkh
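For readers unfamiliar with the idiom, a null RCSID() macro might look like the sketch below. The KEEP_RCSID switch, the variable name and the id string are invented for illustration; the msun sources use the macro from assembler files, so this C version only shows the principle.

```c
/*
 * Assumption: KEEP_RCSID is a made-up build switch. Without it the
 * macro expands to nothing, which is the "hack" described above.
 */
#ifdef KEEP_RCSID
#define RCSID(s)	static const char rcsid[] = s;
#else
#define RCSID(s)	/* no-op */
#endif

RCSID("$Id: example.c (hypothetical) $")

/* Ordinary code after the macro compiles the same either way. */
static int
rcsid_demo(void)
{
	return 1;
}
```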
- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.
/usr/src/sys/i386/isa/clock.c:
o Garrett's statclock changes.
o Wire xxxintr, not Vclk.
o Wire using register_intr(), not setidt().
/usr/src/sys/i386/isa/icu.s:
o Garrett's statclock changes.
o Removed unused variable high_imask.
o Fake int 8 for rtc as well as int 0 for clk. Required for kernel
profiling with statclock, harmless otherwise.
/usr/src/sys/i386/isa/isa.c:
o Allow isdp->id_irq and other things in *isdp to be changed by
probes. Changing interrupts later requires direct calls to
register_intr() and unregister_intr() and more care.
ALLOW_CONFLICT_* is brought over from 1.1.5, except
ALLOW_CONFLICT_IRQ is not supported. IRQ conflict checking is
delayed until after probing so that drivers can change the IRQ
to a free one; real conflicts require more cooperation between
drivers to handle.
o Too many details to list.
o This file requires splitting and a lot more work.
/usr/src/sys/i386/isa/isa_device.h:
o Declare more things more completely.
/usr/src/sys/i386/isa/sio.c:
o Prepare to register interrupt handlers as fast.
/usr/src/sys/i386/isa/vector.s:
o Generate entry code for 16 fast interrupt handlers and 16 normal
interrupt handlers. Changed some constants to variables:
# $unit is now intr_unit[intr]. Type is int. Someday it should
be a cookie suitable for the handler (e.g., a struct com_s for
sio).
# $handler is now intr_handler[intr].
# intrcnt_actv[id_num] is now *intr_countp[intr]. The indirection
is required to get a contiguous range of counters for vmstat
and so that the counts depend more on the driver than on the
interrupt number (drivers could take turns using an interrupt
and the counts would remain correct). There is a separate
counter for each device and for each stray interrupt. In
1.1.5, stray interrupt 7 clobbers the count for device 7 or
something worse if there is no device 7 :-(.
# mask is now intr_mask[intr] (was already indirect).
o Entry points are now _XintrI and _XfastintrI (I = intr = 0-15),
not _VdevU (U = unit).
o Removed BUILD_VECTORS stuff. There's a trace of it left for
the string table for vmstat but config now generates the
string in one piece because nothing more is required.
o Removed old handling of stray interrupts and older comments
about it.
Submitted by: Bruce Evans
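The per-interrupt tables described for vector.s can be pictured in C roughly as follows. The array names intr_handler, intr_unit and intr_countp come from the message above; everything else (the toy_ functions, the counter arrays, the handler type) is an assumption, and the real entry code is assembler.

```c
#define NINTR	16			/* intr = 0-15 */

typedef void toy_handler_t(int unit);

static unsigned long stray_count[NINTR];	/* one counter per stray intr */

static toy_handler_t *intr_handler[NINTR];
static int intr_unit[NINTR];
static unsigned long *intr_countp[NINTR];	/* indirection to the counter */

static int toy_last_unit = -1;

static void toy_stray(int intr) { (void)intr; }
static void toy_demo_handler(int unit) { toy_last_unit = unit; }

/* Point every interrupt at the stray handler and its own counter. */
static void
toy_init(void)
{
	int i;

	for (i = 0; i < NINTR; i++) {
		intr_handler[i] = toy_stray;
		intr_unit[i] = i;
		intr_countp[i] = &stray_count[i];
	}
}

/* Hypothetical registration: the counter belongs to the device. */
static void
toy_register(int intr, toy_handler_t *h, int unit, unsigned long *countp)
{
	intr_handler[intr] = h;
	intr_unit[intr] = unit;
	intr_countp[intr] = countp;
}

/* What each generated entry point boils down to. */
static void
toy_dispatch(int intr)
{
	(*intr_countp[intr])++;
	(*intr_handler[intr])(intr_unit[intr]);
}
```

Since the count is reached through a pointer, a stray interrupt 7 bumps its own counter and cannot clobber the count of some unrelated device.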
not provide the full accuracy of a randomized statistical clock, it does
provide greater accuracy than the previous method, while not significantly
increasing overhead. It also provides profiling support at 1024 Hz.
You must re-compile config before making a new kernel, or you will end
up with unresolved symbols.
Reviewed by: Bruce Evans said it worked for him.
Deleted the ifdef GPL_EMULATE case here and made the padding work for
both types of emulators so that there is no longer a need to recompile
ps and friends if you are using the GPL math emulator instead of the
normal one.
``changes'' are actually not changes at all, but CVS sometimes has trouble
telling the difference.
This also includes support for second-directory compiles. This is not
quite complete yet, as `config' doesn't yet do the right thing. You can
still make it work trivially, however, by doing the following:
rm /sys/compile
mkdir /usr/obj/sys/compile
ln -s /usr/obj/sys/compile /sys/compile
cd /sys/i386/conf
config MYKERNEL
cd ../../compile/MYKERNEL
ln -s /sys @
rm machine
ln -s @/i386/include machine
make depend
make
improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/
pmap_kremove. These routines allow fast mapping of pages for those
architectures that have "normal" MMUs. Also included is a fix to the
pageout daemon to properly check a queue end condition.
Submitted by: John Dyson
me:
1) TLB flush optimization that effectively eliminates half of all of the
TLB flushes. This works by only flushing the TLB when a page is "present"
in memory (i.e. the valid bit is set in the page table entry). See section
5.3.5 of the Intel 386 Programmer's Reference Manual.
2) The handling of "CMAP" has been improved to catch attempts at multiple
simultaneous use.
John:
1) Added pmap_qenter/pmap_qremove functions for fast mapping of pages into
the kernel. This is for future optimizations and support for the upcoming
merged VM/buffer cache.
Reviewed by: John Dyson
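The TLB optimization in item 1 above amounts to skipping the flush when the old entry was not valid, since such an entry cannot be cached in the TLB. A minimal sketch, assuming a flat 32-bit PTE; the toy_ names and the flush counter are hypothetical and stand in for the real pmap code:

```c
#include <stdint.h>

#define PG_V	0x001u		/* i386 PTE "present" (valid) bit */

static unsigned tlb_flushes;	/* stands in for the real flush */

static void
toy_tlbflush(void)
{
	tlb_flushes++;
}

/* Store a new PTE; flush only if the old one could have been cached. */
static void
toy_pte_store(uint32_t *pte, uint32_t newpte)
{
	uint32_t old = *pte;

	*pte = newpte;
	if (old & PG_V)
		toy_tlbflush();
}
```

Entering a mapping into an empty slot costs no flush; replacing a live mapping still does.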
From Bruce Evans:
fu[i]byte() checked the wrong register. This caused interesting behaviour
in the GPL math emulator. The emulator does not check the values returned
by fu*() or su*() (:-() and it interpreted the address of -12(%ebp) as
-1(%ebp). The same probably occurs for all signed 8-bit offsets from
registers.
I cleaned up the new bzero() a bit.
Vastly improved trap.c from me. This rewritten version has a variety of
features, among them: higher performance and much higher code quality.
support.s, cpufunc.h:
No longer use gs override to enforce range limits - compare directly
against VM_MAXUSER_ADDRESS instead. The old way caused problems in
preserving the gs selector...and this method is just as fast or faster.
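In C terms, the direct comparison looks something like the sketch below. The limit used here is an arbitrary illustrative value, not the kernel's real VM_MAXUSER_ADDRESS, and the function name is made up; the actual check lives in assembler in support.s.

```c
#include <stdint.h>

/* Assumption: placeholder limit chosen only for this example. */
#define TOY_VM_MAXUSER_ADDRESS	0xC0000000u

/* Return 1 if [addr, addr + len) lies entirely in user space. */
static int
toy_user_range_ok(uint32_t addr, uint32_t len)
{
	uint32_t end = addr + len;	/* unsigned, so it may wrap */

	if (end < addr)			/* wrapped past 4G */
		return 0;
	return end <= TOY_VM_MAXUSER_ADDRESS;	/* unsigned compare */
}
```

The comparison has to be unsigned: with a signed compare, addresses at or above 0x80000000 look negative and would pass.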
the NTP kernel PLL is disabled, and acquire_timer0() is enabled, thus
opening the door for microtime() (and hence gettimeofday()) to return
bogus timestamps. This option is necessary for the `pca' driver to
work, but is implemented to underscore the fact that accurate timekeeping
and the `pca' driver are incompatible at present. If someone writes a version
of microtime() that works when the `pca' driver is being used, this can get
junked.
1) check va before clearing the page clean flag. Not doing so was
causing the vnode pager error 5 messages when paging from
NFS. (pmap.c)
2) put back interrupt protection in idle_loop. Bruce didn't think
it was necessary, John insists that it is (and I agree). (swtch.s)
3) various improvements to the clustering code (vm_machdep.c). It's
now enabled/used by default.
4) bad disk blocks are now handled properly when doing clustered IOs.
(wd.c, vm_machdep.c)
5) bogus bad block handling fixed in wd.c.
6) algorithm improvements to the pageout/pagescan daemons. It's amazing
how well 4MB machines work now.
1) Removed all instances of disable_intr()/enable_intr() and changed
them back to splimp/splx. The previous method was done to improve
the performance, but Bruce's recent changes to inline spl* have
made this unnecessary.
2) Cleaned up vm_machdep.c considerably. Probably fixed a few bugs, too.
3) Added a new mechanism for collecting page statistics - now done by
a new system process "pagescan". Previously this was done by the
pageout daemon, but this proved to be impractical.
4) Improved the page usage statistics gathering mechanism - performance is
much improved in small memory machines.
5) Modified mbuf.h to enable the support for an external free routine when
using mbuf clusters. Added appropriate glue in various places to
allow this to work.
6) Adapted a suggested change to the NFS code from Yuval Yurom to take
advantage of #5.
7) Added fault/swap statistics support.
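The external free routine from item 5 is essentially a callback stored with the cluster. A rough sketch of the idea, with an invented structure and invented names (the real mbuf.h fields are not shown in this message):

```c
#include <stdlib.h>

/* Hypothetical descriptor for externally managed storage. */
struct toy_ext {
	char *buf;
	void (*ext_free)(char *buf, void *arg);	/* NULL: use free() */
	void *arg;
};

/* Example owner callback: just counts how often it was called. */
static void
toy_count_free(char *buf, void *arg)
{
	(void)buf;
	(*(int *)arg)++;
}

/* Release the storage through the owner's routine if one is set. */
static void
toy_ext_release(struct toy_ext *e)
{
	if (e->ext_free != NULL)
		e->ext_free(e->buf, e->arg);
	else
		free(e->buf);
	e->buf = NULL;
}
```

A subsystem such as NFS can then lend its own pages to the network code and get them back through the callback instead of having them copied.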
1) fixed some bugs related to the bounce buffer code
2) vnode pager now supports clustered pageouts
3) experimental code for clustering all I/O via a new "cldisksort"
4) added >16MB check to Bustek driver
5) made some experimental algorithmic changes to the pageout daemon
6) fixed bugs in truncating mapped files (esp when mapped via NFS)
7) reorganized vnode pager I/O code
list of changes, I've made the following additional changes:
1) i386/include/ipl.h renamed to spl.h as the name conflicts with the
file of the same name in i386/isa/ipl.h.
2) changed all use of *mask (i.e. netmask, biomask, ttymask, etc) to
*_imask (net_imask, etc).
3) changed vestige of splnet use in if_is to splimp.
4) got rid of "impmask" completely (Bruce had gotten rid of netmask),
and are now using net_imask instead.
5) dozens of minor cruft to glue in Bruce's changes.
These require changes I made to config(8) as well, and thus it must
be rebuilt.
-DG
from Bruce Evans:
sio:
o No diff is supplied. Remove the define of setsofttty(). I hope
that is enough.
*.s:
o i386/isa/debug.h no longer exists. The event counters became too
much trouble to maintain. All function call entry and exception
entry counters can be recovered by using profiling kernel (the new
profiling supports all entry points; however, it is too slow to
leave enabled all the time; it also). Only BDBTRAP() from debug.h
is now used. That is moved to exception.s. It might be worth
preserving SHOW_BITS() and calling it from _mcount() (if enabled).
o T_ASTFLT is now only set just before calling trap().
o All exception handlers set SWI_AST_MASK in cpl as soon as possible
after entry and arrange for _doreti to restore it atomically with
exiting. It is not possible to set it atomically with entering
the kernel, so it must be checked against the user mode bits in
the trap frame before committing to using it. There is no place
to store the old value of cpl for syscalls or traps, so there are
some complications restoring it.
Profiling stuff (mostly in *.s):
o Changes to kern/subr_mcount.c, gcc and gprof are not supplied yet.
o All interesting labels `foo' are renamed `_foo' and all
uninteresting labels `_bar' are renamed `bar'. A small change
to gprof allows ignoring labels not starting with underscores.
o MCOUNT_LABEL() is to provide names for counters for times spent
in exception handlers.
o FAKE_MCOUNT() is a version of MCOUNT() suitable for exception
handlers. Its arg is the pc where the exception occurred. The
new mcount() pretends that this was a call from that pc to a
suitable MCOUNT_LABEL().
o MEXITCOUNT is to turn off any timer started by MCOUNT().
/usr/src/sys/i386/i386/exception.s:
o The non-BDB BPTTRAP() macros were doing a sti even when interrupts
were disabled when the trap occurred. The (fixed) sti is
actually a no-op unless you have my changes to machdep.c that make
the debugger trap gates interrupt gates, but fixing that would
make the ifdefs messier. ddb seems to be unharmed by both
interrupts always disabled and always enabled (I had the branch in
the fix back to front for some time :-().
o There is no known pushal bug.
o tf_err can be left as garbage for syscalls.
/usr/src/sys/i386/i386/locore.s:
o Fix and update BDE_DEBUGGER support.
o ENTRY(btext) before initialization was dangerous.
o Warm boot shot was longer than intended.
/usr/src/sys/i386/i386/machdep.c:
o DON'T APPLY ALL OF THIS DIFF. It's what I'm using, but may require
other changes.
Use the following:
o Remove aston() and setsoftclock().
Maybe use the following:
o No netisr.h.
o Spelling fix.
o Delay to read the Rebooting message.
o Fix for vm system unmapping a reduced area of memory
after bounds_check_with_label() reduces the size of
a physical i/o for a partition boundary. A similar
fix is required in kern_physio.c.
o Correct use of __CONCAT. It never worked here for non-
ANSI cpp's. Is it time to drop support for non-ANSI?
o gdt_segs init. 0xffffffffUL is bogus because ssd_limit
is not 32 bits. The replacement may have the same
value :-), but is more natural.
o physmem was one page too low. Confusing variable names.
Don't use the following:
o Better numbers of buffers. Each 8K page requires up to
16 buffer headers. On my system, this results in 5576
buffers containing [up to] 2854912 bytes of memory.
The usual allocation of about 384 buffers only holds
192K of disk if you use it on an fs with a block size
of 512.
o gdt changes for bdb.
o *TGT -> *IDT changes for bdb.
o #ifdefed changes for bdb.
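The figures in the buffer-count item above can be checked with a few lines of arithmetic. The helper names are invented; the numbers are the ones quoted in the message.

```c
enum {
	TOY_PAGE_BYTES = 8192,	/* one 8K page */
	TOY_MIN_BLOCK = 512	/* smallest fs block size */
};

/* Worst case: every 512-byte block in a page needs its own header. */
static long
toy_headers_per_page(void)
{
	return TOY_PAGE_BYTES / TOY_MIN_BLOCK;
}

/* Bytes of disk data held by nbuf buffers of the given block size. */
static long
toy_bytes_held(long nbuf, long block)
{
	return nbuf * block;
}
```

8192 / 512 gives the 16 headers per page, 5576 * 512 gives 2854912 bytes, and 384 * 512 gives 196608 bytes, i.e. 192K.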
/usr/src/sys/i386/i386/microtime.s:
o Use the correct asm macros. I think asm.h was copied from Mach
just for microtime and isn't used now. It certainly doesn't
belong in <sys>. Various macros are also duplicated in
sys/i386/boot.h and libc/i386/*.h.
o Don't switch to and from the IRR; it is guaranteed to be selected
(default after ICU init and explicitly selected in isa.c too, and
never changed until the old microtime clobbered it).
/usr/src/sys/i386/i386/support.s:
o Non-essential changes (none related to spls or profiling).
o Removed slow loads of %gs again. The LDT support may require
not relying on %gs, but loading it is not the way to fix it!
Some places (copyin ...) forgot to load it. Loading it clobbers
the user %gs. trap() still loads it after certain types of
faults so that fuword() etc can rely on it without loading it
explicitly. Exception handlers don't restore it. If we want
to preserve the user %gs, then the fastest method is to not
touch it except for context switches. Comparing with
VM_MAXUSER_ADDRESS and branching takes only 2 or 4 cycles on
a 486, while loading %gs takes 9 cycles and using it takes
another.
o Fixed a signed branch to unsigned.
/usr/src/sys/i386/i386/swtch.s:
o Move spl0() outside of idle loop.
o Remove cli/sti from idle loop. sw1 does a cli, and in the
unlikely event of an interrupt occurring and whichqs becoming
zero, sw1 will just jump back to _idle.
o There's no spl0() function in asm any more, so use splz().
o swtch() doesn't need to be superaligned, at least with the
new mcounting.
o Fixed a signed branch to unsigned.
o Removed astoff().
/usr/src/sys/i386/i386/trap.c:
o The decentralized extern decls were inconsistent, of course.
o Fixed typo MATH_EMULTATE in comments.
o Removed unused variables.
o Old netmask is now impmask; print it instead. Perhaps we
should print some of the new masks.
o BTW, trap() should not print anything for normal debugger
traps.
/usr/src/sys/i386/include/asmacros.h:
o DON'T APPLY ALL OF THIS DIFF. Just use some of the null macros
as necessary.
/usr/src/sys/i386/include/cpu.h:
o CLKF_BASEPRI() changes since cpl == SWI_AST_MASK is now normal
while the kernel is running.
o Don't use var++ to set boolean variables. It fails after a mere
4G times :-) and is slower than storing a constant on [3-4]86s.
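The var++ problem is easier to see with a narrower type. This sketch uses an 8-bit flag so the wrap happens after 256 increments instead of 4G; the function names are made up for the example.

```c
#include <stdint.h>

/* "Set" a flag by incrementing it, the way the old code did. */
static uint8_t
toy_set_by_increment(unsigned times)
{
	uint8_t flag = 0;

	while (times-- > 0)
		flag++;		/* wraps to 0 on the 256th call */
	return flag;
}

/* Set the flag by storing a constant. */
static uint8_t
toy_set_by_store(unsigned times)
{
	uint8_t flag = 0;

	while (times-- > 0)
		flag = 1;	/* always true afterwards */
	return flag;
}
```

After exactly 256 "sets" the incremented flag reads as false again, while the stored constant stays true.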
/usr/src/sys/i386/include/cpufunc.h:
o DON'T APPLY ALL OF THIS DIFF. You need mainly the include of
<machine/ipl.h>. Unfortunately, <machine/ipl.h> is needed by
almost everything for the inlines.
/usr/src/sys/i386/include/ipl.h:
o New file. Defines spl inlines and SWI macros and declares most
variables related to hard and soft interrupt masks.
/usr/src/sys/i386/isa/icu.h:
o Moved definitions to <machine/ipl.h>
/usr/src/sys/i386/isa/icu.s:
o Software interrupts (SWIs) and delayed hardware interrupts (HWIs)
are now handled uniformly, and dispatching them from splx() is
more like dispatching them from _doreti. The dispatcher is
essentially (*handler[ffs(ipending & ~cpl)])().
o More care (not quite enough) is taken to avoid unbounded nesting
of interrupts.
o The interface to softclock() is changed so that a trap frame is
not required.
o Fast interrupt handlers are now handled more uniformly.
Configuration is still too early (new handlers would require
bits in <machine/ipl.h> and functions to vector.s).
o splnnn() and splx() are no longer here; they are inline functions
(could be macros for other compilers). splz() is the nontrivial
part of the old splx().
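A rough C picture of the dispatcher and of the splx()/splz() split described above. This is a sketch under several assumptions: the toy_ names, the 32-entry table and the subtraction of 1 from the 1-based ffs() result are choices made for this example, and the real code also raises cpl while a handler runs, which is omitted here.

```c
#include <strings.h>		/* ffs() */

static unsigned cpl;		/* currently masked interrupts */
static unsigned ipending;	/* interrupts waiting to be run */
static void (*handler[32])(void);

static int toy_ran[2];
static void toy_h0(void) { toy_ran[0]++; }
static void toy_h1(void) { toy_ran[1]++; }

/* The nontrivial part: run everything pending and not masked. */
static void
toy_splz(void)
{
	unsigned ready;

	while ((ready = ipending & ~cpl) != 0) {
		int bit = ffs((int)ready) - 1;	/* lowest set bit, 0-based */

		ipending &= ~(1u << bit);
		(*handler[bit])();
	}
}

/* The cheap inline part: only call out if something became runnable. */
static inline void
toy_splx(unsigned ipl)
{
	cpl = ipl;
	if (ipending & ~cpl)
		toy_splz();
}
```

Lowering the mask is then a store and a test in the common case, and the function call happens only when a delayed interrupt has to be delivered.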
/usr/src/sys/i386/isa/ipl.h:
o New file. Supposed to have only bus-dependent stuff. Perhaps
the h/w masks should be declared here.
/usr/src/sys/i386/isa/isa.c:
o DON'T APPLY ALL OF THIS DIFF. You need only things involving
*mask and *MASK and comments about them. netmask is now a pure
software mask. It works like the softclock mask.
/usr/src/sys/i386/isa/vector.s:
o Reorganize AUTO_EOI* macros.
o Option FAST_INTR_HANDLER_USERS_ES for people who don't trust
fastintr handlers.
o fastintr handlers need to metamorphose into ordinary interrupt
handlers if their SWI bit has become set. Previously, sio had
unintended latency for handling output completions and input
of SLIP framing characters because this was not done.
/usr/src/sys/net/netisr.h:
o The machine-dependent stuff is now imported from <machine/ipl.h>.
/usr/src/sys/sys/systm.h:
o DON'T APPLY ALL OF THIS DIFF. You need mainly the different
splx() prototype. The spl*() prototypes are duplicated as
inlines in <machine/ipl.h> but they need to be duplicated here
in case there are no inlines. I sent systm.h and cpufunc.h
to Garrett. We agree that spl0 should be replaced by splnone
and not the other way around like I've done.
/usr/src/sys/kern/kern_clock.c:
o splsoftclock() now lowers cpl so the direct call to softclock()
works as intended.
o softclock() interface changed to avoid passing the whole frame
(some machines may need another change for profile_tick()).
o profiling renamed _profiling to avoid ANSI namespace pollution.
(I had to improve the mcount() interface and may as well fix it.)
The GUPROF variant doesn't actually reference profiling here,
but the 'U' in GUPROF should mean to select the microtimer
mcount() and not change the interface.
1) A new mechanism has been added to prevent pages from being paged
out called "vm_page_hold". Similar to vm_page_wire, but
much lower overhead.
2) Scheduling algorithm has been changed to improve interactive
performance.
3) Paging algorithm improved.
4) Some vnode and swap pager bugs fixed.
Eliminates vm_fault overhead on process startup and
mmap referenced data for in-memory pages.
(process startup time using in-memory segments *much* faster)
2) Even more efficient pmap code. Code partially cleaned up.
More comments yet to follow.
(generally more efficient pte management)
3) Pageout clustering (in addition to the FreeBSD V1.1 pagein
clustering.)
(much faster paging performance on non-write behind disk
subsystems, slightly faster performance on other systems.)
4) Slightly changed vm_pageout code for more efficiency and
better statistics. Also, resist swapout a little more.
(less likely to pageout a recently used page)
5) Slight improvement to the page table page trap efficiency.
(generally faster system VM fault performance)
6) Defer creation of unnamed anonymous regions pager until needed.
(speeds up shared memory bss creation)
7) Remove possible deadlock from swap_pager initialization.
8) Enhanced procfs to provide "vminfo" about vm objects and user
pmaps.
9) Increased MCLSHIFT/MCLBYTES from 2K to 4K to improve net &
socket performance and to prepare for things to come.
John Dyson
dyson@implode.root.com
David Greenman
davidg@root.com