freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-04 09:09:56 +00:00

Author	SHA1	Message	Date
Andriy Gapon	61548876b1	kdb_backtrace: use stack_print_ddb instead of stack_print This is a followup to r212964. stack_print call chain obtains linker sx lock and thus potentially may lead to a deadlock depending on a kind of a panic. stack_print_ddb doesn't acquire any locks and it doesn't use any facilities of ddb backend. Using stack_print_ddb outside of DDB ifdef required taking a number of helper functions from under it as well. It is a good idea to rename linker_ddb_* and stack_*_ddb functions to have 'unlocked' component in their name instead of 'ddb', because those functions do not use any DDB services, but instead they provide unlocked access to linker symbol information. The latter was previously needed only for DDB, hence the 'ddb' name component. Alternative is to ditch unlocked versions altogether after implementing proper panic handling: 1. stop other cpus upon a panic 2. make all non-spinlock lock operations (mutex, sx, rwlock) be a no-op when panicstr != NULL Suggested by: mdf Discussed with: attilio MFC after: 2 weeks	2010-09-22 06:45:07 +00:00
Alexander Motin	9dfc483c4a	If kernel built with DEVICE_POLLING, keep one CPU always in active state to handle it.	2010-09-22 05:32:37 +00:00
John Baldwin	a8103ae8ca	Comment nit, set TDF_NEEDRESCHED after the comment describing why it is done rather than before. MFC after: 1 week	2010-09-21 19:12:22 +00:00
Alexander Motin	bcb74c4c95	If new callout scheduled to another CPU and we are using global timer, there is high probability that timer is already programmed by some other CPU. Especially by one that registered this callout, and so active now.	2010-09-21 17:37:28 +00:00
Alexander Motin	afe41f2da7	Remember last kern.eventtimer.periodic value, explicitly set by user. If timer capabilities force us to change periodicity mode, try to restore it back later, as soon as the newly chosen timer is capable of it. Without this, timer change like HPET->RTC->HPET always results in enabling periodic mode.	2010-09-21 16:50:24 +00:00
Alan Cox	8f7f5a7f26	Fix exec_imgact_shell()'s handling of two error cases: (1) Previously, if the first line of a script exceeded MAXSHELLCMDLEN characters, then exec_imgact_shell() silently truncated the line and passed on the truncated interpreter name or argument. Now, exec_imgact_shell() will fail and return ENOEXEC, which is the commonly used errno among Unix variants for this type of error. (2) Previously, exec_imgact_shell()'s check on the length of the interpreter's name was ineffective. In other words, exec_imgact_shell() could not possibly fail and return ENAMETOOLONG. The reason being that the length of the interpreter name had to exceed MAXSHELLCMDLEN characters in order that ENAMETOOLONG be returned. But, the search for the end of the interpreter name stops after at most MAXSHELLCMDLEN - 2 characters are scanned. (In the end, this particular error is eventually discovered outside of exec_imgact_shell() and ENAMETOOLONG is returned. So, the real effect of this second change is that the error is detected earlier, in exec_imgact_shell().) Update the definition of MAXINTERP to the actual limit on the size of the interpreter name that has been in effect since r142453 (from 2005). In collaboration with: kib	2010-09-21 16:24:51 +00:00
Andriy Gapon	088acbb312	kdb_backtrace: stack(9)-based code to print backtrace without any backend The idea is to add KDB and KDB_TRACE options to GENERIC kernels on stable branches, so that at least the minimal information is produced for non-specific panics like traps on page faults. The GENERICs in stable branches seem to already include STACK option. Reviewed by: attilio MFC after: 2 weeks	2010-09-21 15:07:44 +00:00
Alexander Motin	95d23438dd	Until hardclock() and respectively tc_windup() called first time, system is running on "dummy" time counter. But to function properly in one-shot mode, event timer management code requires working time counter. Slow moving "dummy" time counter delays first hardclock() call by few seconds on my systems, even though timer interrupts were correctly kicking kernel. That causes few seconds delay during boot with one-shot mode enabled. To break this loop, explicitly call tc_windup() first time during initialization process to let it switch to some real time counter.	2010-09-21 08:02:02 +00:00
Edward Tomasz Napierala	4089cc8aa1	First step at adopting FreeBSD to support PSARC/2010/029. This makes acl_is_trivial_np(3) properly recognize the new trivial ACLs. From the user point of view, that means "ls -l" no longer shows plus signs for all the files when running ZFS v28.	2010-09-20 17:10:06 +00:00
Ed Schouten	d1817ed7f3	Just make callout devices and /dev/console force CLOCAL on open(). Instead of adding custom checks to wait for DCD on open(), just modify the termios structure to set CLOCAL. This means SIGHUP is no longer generated when losing DCD as well. Reviewed by: kib@ MFC after: 1 week	2010-09-19 16:35:42 +00:00
Ed Schouten	4b5d5046ab	Ignore DCD handling on /dev/console entirely. This makes /dev/console more fail-safe and prevents a potential console lock-up during boot. Discussed on: stable@ Tested by: koitsu@ MFC after: 1 week	2010-09-19 14:21:39 +00:00
Robert Watson	adb6aa9ab9	With reworking of the socket life cycle in 7.x, the need for a "sotryfree()" was eliminated: all references to sockets are explicitly managed by sorele() and the protocols. As such, garbage collect sotryfree(), and update sofree() comments to make the new world order more clear. MFC after: 3 days Reported by: Anuranjan Shukla <anshukla at juniper dot net>	2010-09-18 11:18:42 +00:00
Andriy Gapon	19b8a6dbc1	kern.sched.topology_spec sysctl: use step of 1 for group levels numeration This is just a cosmetic change for prettier output. 'indent' variable/parameter serves two purposes: it specifies whitespace indentation level and also implies cpu group level/depth. It would have been better to split those two uses, but for now just a simple change. MFC after: 1 week	2010-09-18 11:16:43 +00:00
Alexander Motin	8e860de4bf	When global timer used at SMP system, update nextevent field on BSP before sending IPI to other CPUs. Otherwise, other CPUs will try to honor stale value, programming timer for zero interval. If timer is fast enough, it caused extra interrupt before timer correctly reprogrammed by BSP.	2010-09-18 07:18:30 +00:00
Warner Losh	5ff4999243	By popular demand, kill all the non GIANT related interrupt messages. They are confusing and add little value. Reviewed by: jhb@	2010-09-17 16:05:25 +00:00
Matthew D Fleming	4e6571599b	Re-add r212370 now that the LOR in powerpc64 has been resolved: Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough SBUF_FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk (original patch)	2010-09-16 16:13:12 +00:00
Alexander Motin	9aff0c8ff7	Fix panic on NULL dereference possible after r212541.	2010-09-14 10:26:49 +00:00
Alexander Motin	0e18987383	Make kern_tc.c provide minimum frequency of tc_ticktock() calls, required to handle current timecounter wraps. Make kern_clocksource.c honor that requirement, scheduling sleeps on first CPU for no more than specified period. Allow other CPUs to sleep up to 1/4 second (for any case).	2010-09-14 08:48:06 +00:00
Alexander Motin	4763a8b8c1	Replace spin lock with the set of atomics. It is impractical for one tc_ticktock() call to wait for another's completion -- just skip it.	2010-09-14 04:57:30 +00:00
Alexander Motin	dd9595e7fa	Add some foot shooting protection by checking singlemul value correctness. Rephrase sysctls descriptions. Suggested by: edmaste	2010-09-14 04:48:04 +00:00
Matthew D Fleming	404a593e28	Revert r212370, as it causes a LOR on powerpc. powerpc does a few unexpected things in copyout(9) and so wiring the user buffer is not sufficient to perform a copyout(9) while holding a random mutex. Requested by: nwhitehorn	2010-09-13 18:48:23 +00:00
Andriy Gapon	b7d28b2e0b	bus_add_child: add specialized default implementation that calls panic If a kobj method doesn't have any explicitly provided default implementation, then it is auto-assigned kobj_error_method. kobj_error_method is proper only for methods that return error code, because it just returns ENXIO. So, in the case of unimplemented bus_add_child caller would get (device_t)ENXIO as a return value, which would cause the mistake to go unnoticed, because return value is typically checked for NULL. Thus, a specialized null_add_child is added. It would have sufficed for correctness to return NULL, but this type of mistake was deemed to be rare and serious enough to call panic instead. Watch out for this kind of problem with other kobj methods. Suggested by: jhb, imp MFC after: 2 weeks	2010-09-13 08:34:20 +00:00
Alexander Motin	a157e42516	Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fulfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this option is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instructions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerpc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.	2010-09-13 07:25:35 +00:00
Alexander Motin	90baf564d2	Do not print "frequency 0 Hz", when frequency is unknown.	2010-09-11 20:18:15 +00:00
Alexander Kabaev	eb262be333	Add missing pointer increment to sbuf_cat.	2010-09-11 19:42:50 +00:00
Konstantin Belousov	9a24dc0760	Protect mnt_syncer with the sync_mtx. This prevents a (rare) vnode leak when mount and update are executed in parallel. Encapsulate syncer vnode deallocation into the helper function vfs_deallocate_syncvnode(), to not externalize sync_mtx from vfs_subr.c. Found and reviewed by: jh (previous version of the patch) Tested by: pho MFC after: 3 weeks	2010-09-11 13:06:06 +00:00
Alexander Motin	b722ad008b	Merge some SCHED_ULE features to SCHED_4BSD: - Teach SCHED_4BSD to inform cpu_idle() about high sleep/wakeup rate to choose optimized handler. In case of x86 it is MONITOR/MWAIT. Also it will be needed to bypass forthcoming idle tick skipping logic to not consume resources on events rescheduling when it won't give any benefits. - Teach SCHED_4BSD to wake up idle CPUs without using IPI. In case of x86, when MONITOR/MWAIT is active, it require just single memory write. This doubles performance on some heavily switching test loads.	2010-09-11 07:08:22 +00:00
Jamie Gritton	f337198db0	Don't exit kern_jail_set without freeing options when enforce_statfs has an illegal value. MFC after: 3 days	2010-09-10 21:45:42 +00:00
Matthew D Fleming	4d369413e1	Replace sbuf_overflowed() with sbuf_error(), which returns any error code associated with overflow or with the drain function. While this function is not expected to be used often, it produces more information in the form of an errno than sbuf_overflowed() did.	2010-09-10 16:42:16 +00:00
Alexander Motin	9f9ad565a1	Do not IPI CPU that is already spinning for load. It doubles effect of spinning (comparing to MWAIT) on some heavily switching test loads.	2010-09-10 13:24:47 +00:00
Andriy Gapon	3d844eddb7	bus_add_child: change type of order parameter to u_int This reflects actual type used to store and compare child device orders. Change is mostly done via a Coccinelle (soon to be devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds. Followup to: r212213 MFC after: 10 days	2010-09-10 11:19:03 +00:00
Matthew D Fleming	dd67e2103c	Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk	2010-09-09 18:33:46 +00:00
Matthew D Fleming	4351ba272c	Add drain functionality to sbufs. The drain is a function that is called when the sbuf internal buffer is filled. For kernel sbufs with a drain, the internal buffer will never be expanded. For userland sbufs with a drain, the internal buffer may still be expanded by sbuf_[v]printf(3). Sbufs now have three basic uses: 1) static string manipulation. Overflow is marked. 2) dynamic string manipulation. Overflow triggers string growth. 3) drained string manipulation. Overflow triggers draining. In all cases the manipulation is 'safe' in that overflow is detected and managed. Reviewed by: phk (the previous version)	2010-09-09 17:49:18 +00:00
Matthew D Fleming	01f6f5fcd4	Refactor sbuf code so that most uses of sbuf_extend() are in a new sbuf_put_byte(). This makes it easier to add drain functionality when a buffer would overflow as there are fewer code points. Reviewed by: phk	2010-09-09 16:51:52 +00:00
Rui Paulo	d3555b6fc2	Fix two bugs in DTrace: * when the process exits, remove the associated USDT probes * when the process forks, duplicate the USDT probes. Sponsored by: The FreeBSD Foundation	2010-09-09 09:58:05 +00:00
Pawel Jakub Dawidek	4946fa6791	Remove VI_MOUNT flag from vnode on VFS_MOUNT() failure.	2010-09-09 07:55:13 +00:00
Pawel Jakub Dawidek	7443b79b81	Doing first mount and updating mount points are both handled by the same syscall and the same function, but are very different and share almost no code. To make it easier to read and analyze, split vfs_domount() into vfs_domount_first() and vfs_domount_update(). Reviewed by: kib	2010-09-08 21:00:53 +00:00
Pawel Jakub Dawidek	a34512e3f0	- Log all the problems in devfs_fixup(). - Correct error paths. The system will be useless on devfs_fixup() failure, so why bother? Maybe for the same reason why a dead body is washed and dressed in a nice suit before it is put into a coffin? Maybe system's last will is to panic without any locks held? Reviewed by: kib	2010-09-08 20:56:18 +00:00
Andriy Gapon	3b0620e06c	subr_bus: use hexadecimal representation for bit flags It seems that this format is more custom in our code, and it is more convenient too. Suggested by: jhb No objection: imp MFC after: 1 week	2010-09-08 17:35:06 +00:00
Michael Tuexen	049640c1f0	Implement correct handling of address parameter and sendinfo for SCTP send calls. MFC after: 4 weeks.	2010-09-05 20:13:07 +00:00
Alexander Motin	d89be9509f	Initialize buffer for case of empty string. Happens only on non-refactored platforms.	2010-09-05 06:16:04 +00:00
Andriy Gapon	ef3b7ba04f	struct device: widen type of flags and order fields to u_int Also change int -> u_int for order parameter in device_add_child_ordered. There should not be any ABI change as struct device is private to subr_bus.c and the API change should be compatible. To do: change int -> u_int for order parameter of bus_add_child method and its implementations. The change should also be API compatible, but is a bit more churn. Suggested by: imp, jhb MFC after: 1 week	2010-09-04 17:28:29 +00:00
Matthew D Fleming	181ff3d503	Use a better #if guard. Suggested by pluknet <pluknet at gmail dot com>.	2010-09-03 17:42:17 +00:00
Matthew D Fleming	c05dbe7a54	Style(9) fixes and eliminate the use of min().	2010-09-03 17:42:12 +00:00
Matthew D Fleming	969292fb1b	Fix user-space libsbuf build. Why isn't CTASSERT available to user-space?	2010-09-03 17:23:26 +00:00
Matthew D Fleming	f5a5dc5da8	Fix brain fart when converting an if statement into a KASSERT.	2010-09-03 16:12:39 +00:00
Matthew D Fleming	f4bafab8da	Use math rather than iteration when the desired sbuf size is larger than SBUF_MAXEXTENDSIZE.	2010-09-03 16:09:17 +00:00
Justin T. Gibbs	f03f7a0ca3	Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic. Add the BIO_ORDERED flag for struct bio and update bio clients to use it. The barrier semantics of bioq_insert_tail() were broken in two ways: o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio. o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice. sys/geom/sched/subr_disk.c: sys/kern/subr_disk.c: o Update last_offset in bioq_insert_tail(). o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active. o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to its next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one, no-op, loop iteration in the while loop that immediately follows. o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction. sys/sys/bio.h: Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio. sys/cam/ata/ata_da.c: sys/cam/scsi/scsi_da.c Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set. sys/cam/scsi/scsi_da.c Use an ordered tag when issuing a synchronize cache command. Wrap some lines to 80 columns. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c sys/geom/geom_io.c Mark bios with the BIO_FLUSH command as BIO_ORDERED. Sponsored by: Spectra Logic Corporation MFC after: 1 month	2010-09-02 19:40:28 +00:00
Matthew D Fleming	ba4932b5a2	Fix UP build. MFC after: 2 weeks	2010-09-02 16:23:05 +00:00
Matthew D Fleming	0f7a0ebd59	Fix a bug with sched_affinity() where it checks td_pinned of another thread in a racy manner, which can lead to attempting to migrate a thread that is pinned to a CPU. Instead, have sched_switch() determine which CPU a thread should run on if the current one is not allowed. KASSERT in sched_bind() that the thread is not yet pinned to a CPU. KASSERT in sched_switch() that only migratable threads or those moving due to a sched_bind() are changing CPUs. sched_affinity code came from jhb@. MFC after: 2 weeks	2010-09-01 20:32:47 +00:00


11861 Commits