1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-15 10:17:20 +00:00
Commit Graph

2263 Commits

Author SHA1 Message Date
Pawel Jakub Dawidek
3a996d6e91 Do not allocate memory based on not-checked argument from userland.
It can be used to panic the kernel by giving too big value.
Fix it by moving allocation and size verification into kern_getfsstat().
This even simplifies kern_getfsstat() consumers, but destroys symmetry -
memory is allocated inside kern_getfsstat(), but has to be freed by the
caller.

Found by:	FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by:	Peter Holm <peter@holm.cc>
2005-06-11 14:58:20 +00:00
Pawel Jakub Dawidek
96c111da9b Fix copy&paste bug. 2005-06-11 11:46:32 +00:00
Alan Cox
1c245ae7d1 Introduce a procedure, pmap_page_init(), that initializes the
vm_page's machine-dependent fields.  Use this function in
vm_pageq_add_new_page() so that the vm_page's machine-dependent and
machine-independent fields are initialized at the same time.

Remove code from pmap_init() for initializing the vm_page's
machine-dependent fields.

Remove stale comments from pmap_init().

Eliminate the Boolean variable pmap_initialized from the alpha, amd64,
i386, and ia64 pmap implementations.  Its use is no longer required
because of the above changes and earlier changes that result in physical
memory that is being mapped at initialization time being mapped without
pv entries.

Tested by: cognet, kensmith, marcel
2005-06-10 03:33:36 +00:00
Joseph Koshy
f263522a45 MFP4:
- Implement sampling modes and logging support in hwpmc(4).

- Separate MI and MD parts of hwpmc(4) and allow sharing of
  PMC implementations across different architectures.
  Add support for P4 (EMT64) style PMCs to the amd64 code.

- New pmcstat(8) options: -E (exit time counts) -W (counts
  every context switch), -R (print log file).

- pmc(3) API changes, improve our ability to keep ABI compatibility
  in the future.  Add more 'alias' names for commonly used events.

- bug fixes & documentation.
2005-06-09 19:45:09 +00:00
Pawel Jakub Dawidek
13a82b9623 Avoid code duplication in serval places by introducing universal
kern_getfsstat() function.

Obtained from:	jhb
2005-06-09 17:44:46 +00:00
Maxim Sobolev
ded18ff2ab Regen after addition of linux_getpriority wrapper.
PR:		kern/81951
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
MFC after:	1 week
2005-06-08 20:47:30 +00:00
Maxim Sobolev
bc165ab0fe Properly convert FreeBSD priority values into Linux values in the
getpriority(2) syscall.

PR:		kern/81951
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
2005-06-08 20:41:28 +00:00
Wilko Bulte
fa9d6cfa34 Until someone who owns the various TGA-based cards has time to fix the
driver it is better to not include the driver in GENERIC as it panics
the system on probing a TGA.
2005-06-06 10:53:10 +00:00
Robert Watson
3984b2328c Rebuild generated system call definition files following the addition of
the audit event field to the syscalls.master file format.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:20:21 +00:00
Robert Watson
f3596e3370 Introduce a new field in the syscalls.master file format to hold the
audit event identifier associated with each system call, which will
be stored by makesyscalls.sh in the sy_auevent field of struct sysent.
For now, default the audit identifier on all system calls to AUE_NULL,
but in the near future, other BSM event identifiers will be used.  The
mapping of system calls to event identifiers is many:one due to
multiple system calls that map to the same end functionality across
compatibility wrappers, ABI wrappers, etc.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:09:18 +00:00
Marcel Moolenaar
470cd51ee6 Create nexus in configure_first() instead of in configure(). This
makes sure that sysinit tasks that run after configure_first(),
but before configure() have a nexus to hang devices off.
2005-05-29 23:44:22 +00:00
Marcel Moolenaar
0ceee7d758 o Call cninit_finish() in configure_final().
o  Remove unused and compiled-out code while here.
2005-05-29 22:42:27 +00:00
Robert Watson
2ec4389f6c White space normalization: use tabs instead of spaces before and after
the system call type field.
2005-05-29 21:06:56 +00:00
Yoshihiro Takahashi
d4fcf3cba5 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
Yoshihiro Takahashi
f7965374d4 Change the spkr_set_pitch() function to a macro to fix low level profiling. 2005-05-28 13:40:27 +00:00
Marcel Moolenaar
6fbeb2e223 For ISA DMA maps, the chipsets scatter/gather feature is used. As
such, the segments pointer in the DMA tag will always be NULL. In
bus_dmamap_load(), temporarily point the segments pointer in the
DMA tag to a local variable so that we don't dereference a NULL
pointer. Reset the segments pointer to NULL after calling the
callback function with it.

PR: alpha/30486
MFC after: 1 week
2005-05-25 07:25:12 +00:00
Pawel Jakub Dawidek
e92eb5805c Add missing jail.h include. 2005-05-22 22:23:37 +00:00
Pawel Jakub Dawidek
af6e6219e2 This code seems to be dead, but anyway:
- Don't leak fsid.
- Don't forget about prison_check_mount().
- Don't use additional variable when there is no need to.
2005-05-22 22:20:46 +00:00
Yoshihiro Takahashi
b22bf66063 - Move bus dependent defines to {isa,cbus}_dmareg.h.
- Use isa/isareg.h rather than <arch>/isa/isa.h.

Tested on: i386, pc98
2005-05-14 10:14:56 +00:00
Yoshihiro Takahashi
24072ca35b - Move timerreg.h to <arch>/include and split i8253 specific defines into
i8253reg.h, and add some defines to control a speaker.
- Move PPI related defines from i386/isa/spkr.c into ppireg.h and use them.
- Move IO_{PPI,TIMER} defines into ppireg.h and timerreg.h respectively.
- Use isa/isareg.h rather than <arch>/isa/isa.h.

Tested on: i386, pc98
2005-05-14 09:10:02 +00:00
David Xu
21fc316430 Change cpu_set_kse_upcall to more generic style, so we can reuse it
in other codes. Add cpu_set_user_tls, use it to tweak user register
and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall,
but since cpu_set_kse_upcall is also used by M:N threads which may
not need this feature, so I wrote a separated cpu_set_user_tls.
2005-04-23 02:32:32 +00:00
Marcel Moolenaar
ff7125a623 Add empty header (except of the multiple-inclusion protection) to
get hwpmc(4) to compile on this platform.
2005-04-20 18:44:53 +00:00
Warner Losh
06db52b609 Break out the definition of bus_space_{tag,handle}_t and a few other types
into _bus.h to help with name space polution from including all of bus.h.
In a few days, I'll commit changes to the MI code to take advantage of thse
sepration (after I've made sure that these changes don't break anything in
the main tree, I've tested in my trees, but you never know...).

Suggested by: bde (in 2002 or 2003 I think)
Reviewed in principle by: jhb
2005-04-18 21:45:34 +00:00
John Baldwin
aa9aa68d2f Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
John Baldwin
85a4f4d527 Fix another instance of TDP_OWEPREEMPT -> td_owepreempt.
Reported by:	tinderbox
2005-04-09 18:15:17 +00:00
John Baldwin
c6a37e8413 Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
John Baldwin
98df9218da - Change the vm_mmap() function to accept an objtype_t parameter specifying
the type of object represented by the handle argument.
- Allow vm_mmap() to map device memory via cdev objects in addition to
  vnodes and anonymous memory.  Note that mmaping a cdev directly does not
  currently perform any MAC checks like mapping a vnode does.
- Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the
  cdev the ioctl is acting on rather than trying to find a suitable vnode
  to map from.

Reviewed by:	alc, arch@
2005-04-01 20:00:11 +00:00
Dag-Erling Smørgrav
8987631f85 MFi386 (1.610): let TUNABLE_ULONG_FETCH handle the suffix. 2005-04-01 10:59:13 +00:00
John Baldwin
50b584201d Use a custom version of copyinuio() to implement osf1_{read,write}v() via
kern_{read,write}v().
2005-03-31 22:56:14 +00:00
Maxim Sobolev
6bcf003260 Add USB Communication Device Class Ethernet driver. Originally written for
FreeBSD based on aue(4) it was picked by OpenBSD, then from OpenBSD ported
to NetBSD and finally NetBSD version merged with original one goes into
FreeBSD.

Obtained from:  http://www.gank.org/freebsd/cdce/
                NetBSD
                OpenBSD
2005-03-22 14:52:40 +00:00
Murray Stokely
991f5121f0 Add a comment to note that pseudo-device bpf is required for DHCP.
This is mentioned in the Handbook but it is not as obvious to new
users why bpf is needed compared to the other largely self-explanatory
items in GENERIC.

PR:		conf/40855
MFC after:	1 week
2005-03-18 15:24:00 +00:00
Ian Dowse
60719a1a44 Split configure() into 3 separate steps like we do on other
architectures. This makes it possible to insert hooks before and
after the device attachment step.

Tested thanks to:	marcel
2005-03-18 09:45:43 +00:00
Warner Losh
d564c8537d Customize this for the alpha by removing pc98 defines (unused on alpha)
as well as saying that the alpha is wired up in a certain way.
2005-03-16 20:54:48 +00:00
Scott Long
5974e5c71c Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch.  This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date.  Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API.  sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.
2005-03-14 16:46:28 +00:00
Maxime Henrion
56a4dfb981 Fix a long-standing bug in alpha's implementation of busdma. We need to
OR the physical address with alpha_XXX_dmamap_or to get the DMA address,
like the name of the variable suggests.  However, while we were doing
this correctly in the alpha_XXX_dmamap() macro, the busdma code added
the variable to the physical address instead of or'ing it.  Fortunately
and if my math is not entirely wrong, you would need more than 128GB of
RAM and a device able to do DMA in 64bits to experience the bug.

Spotted by:	cognet
2005-03-12 02:43:50 +00:00
Scott Long
8bf0837c7a Remove dead code. 2005-03-07 02:18:52 +00:00
Maxim Sobolev
ecab0de7c1 Regen after addition of linux_nosys handler. 2005-03-07 00:23:58 +00:00
Maxim Sobolev
e3478fe000 Handle unimplemented syscall by instantly returning ENOSYS instead of sending
signal first and only then returning ENOSYS to match what real linux does.

PR:		kern/74302
Submitted by:	Travis Poppe <tlp@LiquidX.org>
2005-03-07 00:18:06 +00:00
Joerg Wunsch
a5f50ef9e4 netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild.  Extension to other compilers is supposed
to be possible, of course.

Submitted by:	netchild
Reviewed by:	various developers on arch@, some time ago
2005-03-02 21:33:29 +00:00
Wes Peters
95e2054492 Attempt to doff the pointy hat: implement 'hw.realmem' on remaining
architectures.  Pointed out by O'Brien, ScottL via email.

Reviewed by:	obrien (various)
2005-03-01 21:55:27 +00:00
Ruslan Ermilov
3971d2cf5e Use a common multi-inclusion protection, and add such a
protection to alpha/include/exec.h.
2005-02-19 21:16:48 +00:00
John Baldwin
fb72f180d5 - Implement osf1_emul_find() using kern_alternate_path(). This changes
the semantics in that the returned filename to use is now a kernel
  pointer rather than a user space pointer.  This required changing the
  arguments to the CHECKALT*() macros some and changing the various system
  calls that used pathnames to use the kern_foo() functions that can accept
  kernel space filename pointers instead of calling the system call
  directly.
- Use kern_open(), kern_stat(), kern_lstat(), kern_fstat(), kern_access(),
  kern_truncate(), kern_pathconf(), kern_execve(), kern_select(),
  kern_setitimer(), kern_getitimer(), kern_statfs(), and kern_fstatfs().

Silence on:	alpha@
2005-02-18 18:37:26 +00:00
John Baldwin
f4b3589bdb Use LCONVPATHEXIST() rather than CHECKALTEXIST() and use
exec_copyin_args(), kern_execve(), and exec_free_args() rather than
execve() to eliminate stackgap use from Alpha's linux_execve().

Silence on:	alpha@
2005-02-18 18:32:32 +00:00
Maxim Sobolev
1a88a252fd Backout previous change (disabling of security checks for signals delivered
in emulation layers), since it appears to be too broad.

Requested by:   rwatson
2005-02-13 17:37:20 +00:00
Maxim Sobolev
d8ff44b79f Split out kill(2) syscall service routine into user-level and kernel part, the
former is callable from user space and the latter from the kernel one. Make
kernel version take additional argument which tells if the respective call
should check for additional restrictions for sending signals to suid/sugid
applications or not.

Make all emulation layers using non-checked version, since signal numbers in
emulation layers can have different meaning that in native mode and such
protection can cause misbehaviour.

As a result remove LIBTHR from the signals allowed to be delivered to a
suid/sugid application.

Requested (sorta) by:	rwatson
MFC after:	2 weeks
2005-02-13 16:42:08 +00:00
Bernd Walter
d84def1440 Implement interrupt routing for DEC_KN20AA.
Tested by:	wilko
MFC after:	2 weeks
2005-02-10 00:35:31 +00:00
Poul-Henning Kamp
0c3c54da63 Since we are quite unlikely to ever face another platform which
uses the i8237 without trying to emulate the PC architecture move
the register definitions for the i8237 chip into the central include
file for the chip, except for the PC98 case which is magic.

Add new isa_dmatc() function which tells us as cheaply as possible
if the terminal count has been reached for a given channel.
2005-02-06 13:46:39 +00:00
Nate Lawson
3888a87205 Finish the job of sorting all includes and fix the build by including
malloc.h before proc.h on sparc64.  Noticed by das@

Compiled on:	alpha, amd64, i386, pc98, sparc64
2005-02-06 01:55:08 +00:00
Nate Lawson
a287c0ffaf Sort includes a little so that bus.h comes before cpu.h (for device_t). 2005-02-04 06:58:09 +00:00
Nate Lawson
4c4381e288 Add an implementation of cpu_est_clockrate(9). This function estimates the
current clock frequency for the given CPU id in units of Hz.
2005-02-04 05:32:56 +00:00
Bernd Walter
1e0b0c0f97 add cpu_idle support for 21066A based lca systems 2005-01-31 23:07:42 +00:00
Maxim Sobolev
610ecfe035 o Split out kernel part of execve(2) syscall into two parts: one that
copies arguments into the kernel space and one that operates
  completely in the kernel space;

o use kernel-only version of execve(2) to kill another stackgap in
  linuxlator/i386.

Obtained from:  DragonFlyBSD (partially)
MFC after:      2 weeks
2005-01-29 23:12:00 +00:00
Ruslan Ermilov
3e17be06d6 Hopefully unbreak modules build. 2005-01-29 21:43:34 +00:00
Scott Long
2f69affe36 Add bus_dmamap_load_mbuf_sg() to alpha. 2005-01-15 20:11:25 +00:00
John Baldwin
f4ef9cec40 - Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer
the last action of kern_exit().  Instead, it is a MD callout to cleanup
  per-process state during exit.
- Add notes of concern to Alpha and ia64 about the possible need to drop
  fp state in cpu_thread_exit() rather than in cpu_exit() since it is
  per-thread state rather than per-process.
2005-01-14 20:13:04 +00:00
Warner Losh
125f6d40bd These are no longer relevant. They are scripts for extracting hints
from 4.x kernel config files.  User's wishing to upgrade from 4.x to 6
will need to go through 5.x, or grab this script from there.  These
scripts will remain in RELENG_5...
2005-01-07 00:54:35 +00:00
Warner Losh
0027ba028a These appear to be unused in our tree, so remove them. 2005-01-05 20:50:31 +00:00
Warner Losh
f44fc746fb Begin all license/copyright comments with /*- or #- 2005-01-05 20:05:52 +00:00
Jun Kuriyama
6f4e528a8e o Use tab instead of spaces for puc(4) line.
o Use capitalized "Ethernet" for consistency.
2005-01-05 05:25:21 +00:00
Wilko Bulte
1538dc7743 - make machine model list more comprehensive, the whole Alpha family tree
should now be present
- clean up comment a bit

MFC after: 1 week
2005-01-01 16:11:53 +00:00
John Baldwin
a0ede505d3 Sync with i386 GENERIC some:
- Update comments to newer style (space after #)
- Bring across various comment updates.
- Add AHC_REG_PRETTY_PRINT, ADAPTIVE_GIANT, and rue(4).
2004-12-30 15:32:31 +00:00
Wilko Bulte
51391ce93b LCA is 21066 and 21068. Add EV7 (bloody optimist.. :) 2004-12-26 13:23:01 +00:00
Alan Cox
1f70d62298 Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
2004-12-23 20:16:11 +00:00
Alan Cox
85f5b24573 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
Marcel Moolenaar
bcc5241c43 Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
   FP registers fall in that category. Passing the new register value
   by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
   the packet will have the register value in target-byte order and
   in the memory representation (modulo the fact that bytes are sent
   as 2 printable hexadecimal numbers of course). We only need to
   decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.

Tested on: alpha, amd64, i386, ia64, sparc64
2004-12-01 06:40:35 +00:00
Peter Edwards
2909df6916 When required to negate the absoulte result of a division/remainder
operation (by subtracting the absolute result from 0), don't test
for overflow.

This avoids an arithmetic exception when dividing LONG_MIN by 1:
This is the only case that causes overflow, and the resulting value
is correct under 2's compliment arithmetic.

PR:		72024
Approved by:	dwmalone@
Obtained from:	NetBSD
MFC after:	4 days
2004-11-27 20:59:49 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
John Baldwin
3f40c36312 Fix a cpuid mismatch from the recent cpuid rototill in Alpha: boot_cpu_id
is a PAL ID, while PCPU_GET(cpuid) is a FreeBSD CPU ID.  The FreeBSD CPU
ID of the BSP is always zero, so use that to see which CPU should run the
full clock functions.
2004-11-23 22:11:53 +00:00
David Schultz
ab44ebf537 Remove UAREA_PAGES.
Reviewed by:	arch@
2004-11-20 02:29:50 +00:00
David Schultz
449835405d U areas are going away, so don't allocate them. It's worrisome that
mp_machdep.c was using UAREA_PAGES to allocate something that isn't a
U area, and that there seems to be an implicit assumption that the PCB
is just past the end of the kernel stack.

Reviewed by:	arch@
2004-11-20 02:29:36 +00:00
David Schultz
ff3fd2e764 user.h is included only to get pcb.h, so use the latter directly instead. 2004-11-20 02:28:14 +00:00
Wilko Bulte
4272a4898f Get in sync with reality: TurboLaser was never really well supported to
start with, so let it die in peace.  While there, remove Multia-class
as 486-like performance will not buy us much when 6.x arrives.
2004-11-09 22:24:47 +00:00
John Baldwin
1aafbc01f9 - Add a new MD per-CPU field for Alpha 'pal_id' which is the PAL ID
associated with each processor.  This ID is inferred from the index
  of the pcs structure in the hwprb.
- Give Alpha CPUs FreeBSD CPU IDs more like other architectures where the
  boot processor is always CPU 0 and the other processors are numbered
  1 ... N.  List active CPUs in the system in cpu_mp_announce() as well.

Silence on:	alpha@
2004-11-05 19:16:44 +00:00
Andre Oppermann
32672ba88d Reduce annoying SCSI probing delay from 15 to 5 seconds in all GENRIC kernels.
Discussed on:	-current
2004-11-02 20:57:20 +00:00
John Baldwin
d39d4a6e64 - Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant.  The variable
  can be examined and changed in ddb as '$lines'.  Setting the variable to
  0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
  newlines and carriage returns so that one can rub out content on the
  current line via '\r     \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
  the routine exits.
- Add some aliases to the simple pager to make it more compatible with
  more(1): 'e' and 'j' do a single line.  'd' does half a page, and
  'f' does a full page.

MFC after:	1 month
Inspired by:	kris
2004-11-01 22:15:15 +00:00
John Baldwin
c05c4140e1 Fix a typo so that this compiles. 2004-10-20 16:22:53 +00:00
Poul-Henning Kamp
95bc568977 Add new function ttyinitmode() which sets our systemwide default
modes on a tty structure.

Both the ".init" and the current settings are initialized allowing
the function to be used both at attach and open time.

The function takes an argument to decide if echoing should be enabled.
Echoing should not be enabled for regular physical serial ports
unless they are consoles, in which case they should be configured
by ttyconsolemode() instead.

Use the new function throughout.
2004-10-18 21:51:27 +00:00
Poul-Henning Kamp
256d6e16b0 Add missing flag to userland_sysctl() 2004-10-14 10:43:47 +00:00
John Baldwin
78c85e8dfc Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
  pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
  don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
  times it needs rather than calling getrusage() twice with associated
  stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
  for user, system, and interrupt time as well as a bintime of the total
  runtime.  A new p_rux field in struct proc replaces the same inline fields
  from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime).  A new p_crux
  field in struct proc contains the "raw" child time usage statistics.
  ruadd() has been changed to handle adding the associated rusage_ext
  structures as well as the values in rusage.  Effectively, the values in
  rusage_ext replace the ru_utime and ru_stime values in struct rusage.  These
  two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
  calculates appropriate timevals for user and system time as well as updating
  the rux_[isu]u fields of a passed in rusage_ext structure.  calcru() uses a
  copy of the process' p_rux structure to compute the timevals after updating
  the runtime appropriately if any of the threads in that process are
  currently executing.  It also now only locks sched_lock internally while
  doing the rux_runtime fixup.  calcru() now only requires the caller to
  hold the proc lock and calcru1() only requires the proc lock internally.
  calcru() also no longer allows callers to ask for an interrupt timeval
  since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
  calling calcru1() on p_crux.  Note that this means that any code that wants
  child times must now call this function rather than reading from p_cru
  directly.  This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
  in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
  proctree lock is used to protect the process group rather than the process
  group lock.  By holding this lock until the end of the function we now
  ensure that the process/thread that we pick to dump info about will no
  longer vanish while we are trying to output its info to the console.

Submitted by:	bde (mostly)
MFC after:	1 month
2004-10-05 18:51:11 +00:00
Alan Cox
8ceb3dcb60 The physical address stored in the vm_page is page aligned. There is no
need to mask off the page offset bits.  (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
2004-10-03 00:16:43 +00:00
Alan Cox
07b3303943 Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure.  The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
2004-10-02 07:34:58 +00:00
John Baldwin
76764432e4 - Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
  every 18 lines.  If you hit Enter, it will print another line and prompt
  again.  If you hit space it will output another page and then prompt.
  If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
  to print.  This is useful if your trace happens to walk back onto
  0xdeadc0de and gets stuck in an endless loop.

MFC after:	1 month
Tested on:	i386, alpha, sparc64
2004-09-20 19:05:32 +00:00
Alan Cox
de6c3db01f Simplify the reference counting of page table pages. Specifically, use
the page table page's wired count rather than its hold count to contain
the reference count.  My rationale for this change is based on several
factors:

1. The machine-independent and pmap layers used the same hold count field
   in subtly different ways.  The machine-independent layer uses the hold
   count to implement a form of ephemeral wiring that is used by pipes,
   physio, etc.  In other words, subsystems where we wish to temporarily
   block a page from being swapped out while it is mapped into the kernel's
   address space.  Such pages are never removed from the page queues.
   Instead, the page daemon recognizes a non-zero hold count to mean "hands
   off this page."  In contrast, page table pages are never in the page
   queues; they are wired from birth to death.  The hold count was being
   used as a kind of reference count, specifically, the number of valid
   page table entries within the page.  Not surprisingly, these two
   different uses imply different synchronization rules: in the machine-
   independent layer access to the hold count requires the page queues
   lock; whereas in the pmap layer the pmap lock is required.  Thus,
   continued use by the pmap layer of vm_page_unhold(), which asserts that
   the page queues lock is held, made no sense.

2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired
   count.  An unexpected wired count on a page table page was ignored and
   the underlying page leaked.

3. In a word, microoptimization.  Using the wired count exclusively, rather
   than a combination of the wired and hold counts, makes the code slightly
   smaller and faster.

Reviewed by: tegge@
2004-09-19 21:20:01 +00:00
Alan Cox
6134d96917 MFamd64/i386
Avoid recomputing PHYS_TO_VM_PAGE() unnecessarily in pmap_protect().
2004-09-19 05:34:49 +00:00
Alan Cox
8478ea241b Remove an outdated assertion from _pmap_allocpte(). (When vm_page_alloc()
succeeds, the page's queue field is unconditionally set to PQ_NONE by
vm_pageq_remove_nowakeup().)
2004-09-19 02:39:31 +00:00
Poul-Henning Kamp
216d5bb528 Allocate tty at attach time instead of open time. 2004-09-17 11:04:57 +00:00
Poul-Henning Kamp
c7076ea1b8 Be slightly less bogus in struct tty allocation. 2004-09-17 11:02:53 +00:00
Poul-Henning Kamp
7ce1979be6 Add new a function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
Alan Cox
fe8d8261ec Add nge. (I've used one for about a week in an XP1000.) 2004-09-11 07:26:50 +00:00
Marcel Moolenaar
7dafab2e78 The previous commit, roughly one and a half years ago removed the
branch prediction optimization for LINT, because the kernel was too
large. This commit now removes it altogether since it causes build
failures for GENERIC kernels and the various applicable trends are
such that one can expect that it these failure will cause more
problems than they're worth in the future. These trends include:
1. Alpha was demoted from tier 1 to tier 2 due to lack of active
   support. The number of people willing to fix build breakages
   is not likely to increase and those developers that do have the
   gumption to test MI changes on alpha are not likely to spend
   time fixing unexpected build failures first.
2. The kernel will only increase in size. Even though stripped-down
   kernels do link without problems now, compiler optimizations (like
   inlining) and new (non-optional) functionality will likely cause
   stripped-down kernels to break in the future as well.

So, with my asbestos suit on, get rid of potential problems before
they happen.

MT5 candidate.
2004-09-10 05:00:27 +00:00
Scott Long
50736a153b Fix a problem with tag->boundary inheritence that has existed since day one
and was propagated to nearly every platform.  The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum.  However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change.  The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)
2004-09-08 04:54:19 +00:00
Scott Long
444ba94513 Switch the default scheduler to 4BSD to match what will go into RELENG_5 soon.
It can be switched back once 5.3 is tested and released.  Also turn on
PREEMPTION as many of the stability problems with it have been fixed.

MT5: 3 days.
2004-09-07 22:37:43 +00:00
Poul-Henning Kamp
23385eb7dd Make the alpha timecounter preferable to the i8254. 2004-09-07 07:06:36 +00:00
Julian Elischer
ed062c8d66 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
Scott Long
9923b511ed Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
Julian Elischer
2630e4c90c Give setrunqueue() and sched_add() more of a clue as to
where they are coming from and what is expected from them.

MFC after:	2 days
2004-09-01 02:11:28 +00:00
Julian Elischer
5995adc206 Remove an unneeded argument..
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.

MFC after:	3 days
2004-08-31 07:34:54 +00:00
Julian Elischer
99e9dcb817 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
Wilko Bulte
0d86d31bba Add em(4) to Alpha. I had a couple running recently on Alpha and it
appeared to work fine.

Submitted by:	Konstantin Saurbier saurbier at mathematik uni-bielefeld de
2004-08-30 18:40:00 +00:00
Marcel Moolenaar
d4da990081 In alpha_pci_alloc_resource(), when allocating a memory resource,
do not set the virtual address to the bus address when the bus
doesn't have either of the PCI_RF_DENSE or PCI_RF_BWX flags set.
The TGA driver uses the virtual address to access the registers,
which on some machines can cause a memory management fault.  Map
the bus address as K0SEG virtual memory instead. Note that with
some hardware combinations involving the TGA2 adapter this change
merely results that the memory management fault is replaced by a
machine check.
2004-08-29 19:07:14 +00:00