/*-
 * Copyright (c) 2002-2006, Jeffrey Roberson <jeff@freebsd.org>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice unmodified, this list of conditions, and the following
 *    disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include "opt_hwpmc_hooks.h"
#include "opt_sched.h"

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kdb.h>
#include <sys/kernel.h>
#include <sys/ktr.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <sys/resource.h>
#include <sys/resourcevar.h>
#include <sys/sched.h>
#include <sys/smp.h>
#include <sys/sx.h>
#include <sys/sysctl.h>
#include <sys/sysproto.h>
#include <sys/turnstile.h>
#include <sys/umtx.h>
#include <sys/vmmeter.h>
#ifdef KTRACE
#include <sys/uio.h>
#include <sys/ktrace.h>
#endif

#ifdef HWPMC_HOOKS
#include <sys/pmckern.h>
#endif

#include <machine/cpu.h>
#include <machine/smp.h>

/* decay 95% of `p_pctcpu' in 60 seconds; see CCPU_SHIFT before changing */
/* XXX This is bogus compatibility crap for ps */
static fixpt_t  ccpu = 0.95122942450071400909 * FSCALE;        /* exp(-1/20) */
SYSCTL_INT(_kern, OID_AUTO, ccpu, CTLFLAG_RD, &ccpu, 0, "");

static void sched_setup(void *dummy);
SYSINIT(sched_setup, SI_SUB_RUN_QUEUE, SI_ORDER_FIRST, sched_setup, NULL)

static void sched_initticks(void *dummy);
SYSINIT(sched_initticks, SI_SUB_CLOCKS, SI_ORDER_THIRD, sched_initticks, NULL)

static SYSCTL_NODE(_kern, OID_AUTO, sched, CTLFLAG_RW, 0, "Scheduler");

SYSCTL_STRING(_kern_sched, OID_AUTO, name, CTLFLAG_RD, "ule", 0,
    "Scheduler name");

static int slice_min = 1;
SYSCTL_INT(_kern_sched, OID_AUTO, slice_min, CTLFLAG_RW, &slice_min, 0, "");

static int slice_max = 10;
SYSCTL_INT(_kern_sched, OID_AUTO, slice_max, CTLFLAG_RW, &slice_max, 0, "");

int realstathz;
int tickincr = 1 << 10;

/*
 * Thread scheduler specific section.
 */
struct td_sched {
        TAILQ_ENTRY(td_sched) ts_procq; /* (j/z) Run queue. */
        int             ts_flags;       /* (j) TSF_* flags. */
        struct thread   *ts_thread;     /* (*) Active associated thread. */
        fixpt_t         ts_pctcpu;      /* (j) %cpu during p_swtime. */
        u_char          ts_rqindex;     /* (j) Run queue index. */
        enum {
                TSS_THREAD = 0x0,       /* slaved to thread state */
                TSS_ONRUNQ
        } ts_state;                     /* (j) thread sched specific status. */
        int             ts_slptime;
        int             ts_slice;
        struct runq     *ts_runq;
        u_char          ts_cpu;         /* CPU that we have affinity for. */

        /* The following variables are only used for pctcpu calculation */
        int             ts_ltick;       /* Last tick that we were running on */
        int             ts_ftick;       /* First tick that we were running on */
        int             ts_ticks;       /* Tick count */

        /* originally from kg_sched */
        int     skg_slptime;            /* Number of ticks we vol. slept */
        int     skg_runtime;            /* Number of ticks we were running */
};
#define ts_assign               ts_procq.tqe_next
/* flags kept in ts_flags */
#define TSF_ASSIGNED    0x0001          /* Thread is being migrated. */
#define TSF_BOUND       0x0002          /* Thread can not migrate. */
#define TSF_XFERABLE    0x0004          /* Thread was added as transferable. */
#define TSF_HOLD        0x0008          /* Thread is temporarily bound. */
#define TSF_REMOVED     0x0010          /* Thread was removed while ASSIGNED */
#define TSF_INTERNAL    0x0020          /* Thread added due to migration. */
#define TSF_PREEMPTED   0x0040          /* Thread was preempted */
#define TSF_DIDRUN      0x2000          /* Thread actually ran. */
#define TSF_EXIT        0x4000          /* Thread is being killed. */

static struct td_sched td_sched0;

/*
 * The priority is primarily determined by the interactivity score.  Thus, we
 * give lower(better) priorities to threads that use less CPU.  The nice
 * value is then directly added to this to allow nice to have some effect
 * on latency.
 *
 * PRI_RANGE:   Total priority range for timeshare threads.
 * PRI_NRESV:   Number of nice values.
 * PRI_BASE:    The start of the dynamic range.
 */
#define SCHED_PRI_RANGE         (PRI_MAX_TIMESHARE - PRI_MIN_TIMESHARE + 1)
#define SCHED_PRI_NRESV         ((PRIO_MAX - PRIO_MIN) + 1)
#define SCHED_PRI_NHALF         (SCHED_PRI_NRESV / 2)
#define SCHED_PRI_BASE          (PRI_MIN_TIMESHARE)
#define SCHED_PRI_INTERACT(score)                                       \
    ((score) * SCHED_PRI_RANGE / SCHED_INTERACT_MAX)
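
/*
 * Example (editor's note): SCHED_PRI_INTERACT() maps an interactivity score
 * in [0, SCHED_INTERACT_MAX] linearly onto [0, SCHED_PRI_RANGE], so a score
 * of 0 yields the smallest (best) offset within the timeshare range and a
 * score of SCHED_INTERACT_MAX yields the largest; the nice value is then
 * added on top of this, as described in the comment above.
 */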

/*
 * These determine the interactivity of a process.
 *
 * SLP_RUN_MAX: Maximum amount of sleep time + run time we'll accumulate
 *              before throttling back.
 * SLP_RUN_FORK:        Maximum slp+run time to inherit at fork time.
 * INTERACT_MAX:        Maximum interactivity value.  Smaller is better.
 * INTERACT_THRESH:     Threshold for placement on the current runq.
 */
#define SCHED_SLP_RUN_MAX       ((hz * 5) << 10)
#define SCHED_SLP_RUN_FORK      ((hz / 2) << 10)
#define SCHED_INTERACT_MAX      (100)
#define SCHED_INTERACT_HALF     (SCHED_INTERACT_MAX / 2)
#define SCHED_INTERACT_THRESH   (30)
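
/*
 * Editor's note: the << 10 scaling above suggests that sleep and run time
 * are accumulated in a fixed-point unit of ticks << 10 (compare tickincr,
 * which starts at 1 << 10), so SLP_RUN_MAX corresponds to roughly five
 * seconds of combined sleep + run history.
 */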

/*
 * These parameters and macros determine the size of the time slice that is
 * granted to each thread.
 *
 * SLICE_MIN:   Minimum time slice granted, in units of ticks.
 * SLICE_MAX:   Maximum time slice granted.
 * SLICE_RANGE: Range of available time slices scaled by hz.
 * SLICE_SCALE: The number of slices granted per val in the range of [0, max].
 * SLICE_NICE:  Determine the amount of slice granted to a scaled nice.
 * SLICE_NTHRESH:       The nice cutoff point for slice assignment.
 */
#define SCHED_SLICE_MIN                 (slice_min)
#define SCHED_SLICE_MAX                 (slice_max)
#define SCHED_SLICE_INTERACTIVE         (slice_max)
#define SCHED_SLICE_NTHRESH             (SCHED_PRI_NHALF - 1)
#define SCHED_SLICE_RANGE               (SCHED_SLICE_MAX - SCHED_SLICE_MIN + 1)
#define SCHED_SLICE_SCALE(val, max)     (((val) * SCHED_SLICE_RANGE) / (max))
#define SCHED_SLICE_NICE(nice)                                          \
    (SCHED_SLICE_MAX - SCHED_SLICE_SCALE((nice), SCHED_SLICE_NTHRESH))
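
/*
 * Example (editor's note): with the defaults slice_min == 1 and
 * slice_max == 10, SCHED_SLICE_RANGE is 10, so SCHED_SLICE_NICE(0)
 * evaluates to the full 10-tick slice while larger relative nice values
 * receive linearly smaller slices.
 */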

/*
 * This macro determines whether or not the thread belongs on the current or
 * next run queue.
 */
#define SCHED_INTERACTIVE(td)   \
    (sched_interact_score(td) < SCHED_INTERACT_THRESH)
#define SCHED_CURR(td, ts)      \
    ((ts->ts_thread->td_flags & TDF_BORROWING) || \
    (ts->ts_flags & TSF_PREEMPTED) || SCHED_INTERACTIVE(td))
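
/*
 * Editor's note: SCHED_CURR() places a thread on the current queue when it
 * is borrowing priority (TDF_BORROWING), was preempted (TSF_PREEMPTED), or
 * scores as interactive; everything else goes on the next queue.
 */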

/*
 * Cpu percentage computation macros and defines.
 *
 * SCHED_CPU_TIME:      Number of seconds to average the cpu usage across.
 * SCHED_CPU_TICKS:     Number of hz ticks to average the cpu usage across.
 */
#define SCHED_CPU_TIME  10
#define SCHED_CPU_TICKS (hz * SCHED_CPU_TIME)

/*
 * tdq - per processor runqs and statistics.
 */
struct tdq {
        struct runq     tdq_idle;               /* Queue of IDLE threads. */
        struct runq     tdq_timeshare[2];       /* Run queues for !IDLE. */
        struct runq     *tdq_next;              /* Next timeshare queue. */
        struct runq     *tdq_curr;              /* Current queue. */
        int             tdq_load_timeshare;     /* Load for timeshare. */
        int             tdq_load;               /* Aggregate load. */
        short           tdq_nice[SCHED_PRI_NRESV]; /* threads in each nice bin. */
        short           tdq_nicemin;            /* Least nice. */
#ifdef SMP
        int             tdq_transferable;
        LIST_ENTRY(tdq) tdq_siblings;   /* Next in tdq group. */
        struct tdq_group *tdq_group;    /* Our processor group. */
        volatile struct td_sched *tdq_assigned; /* assigned by another CPU. */
#else
        int             tdq_sysload;            /* For loadavg, !ITHD load. */
#endif
};
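
/*
 * Editor's note: tdq_curr and tdq_next appear to select between the two
 * tdq_timeshare[] queues above; SCHED_CURR() decides which of the pair a
 * timeshare thread is enqueued on.
 */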

#ifdef SMP
/*
 * tdq groups are groups of processors which can cheaply share threads.  When
 * one processor in the group goes idle it will check the runqs of the other
 * processors in its group prior to halting and waiting for an interrupt.
 * These groups are suitable for SMT (Simultaneous Multi-Threading) and not
 * NUMA.  In a NUMA environment we'd want an idle bitmap per group and a two
 * tiered load balancer.
 */
struct tdq_group {
        int     tdg_cpus;               /* Count of CPUs in this tdq group. */
        cpumask_t tdg_cpumask;          /* Mask of cpus in this group. */
        cpumask_t tdg_idlemask;         /* Idle cpus in this group. */
        cpumask_t tdg_mask;             /* Bit mask for first cpu. */
        int     tdg_load;               /* Total load of this group. */
        int     tdg_transferable;       /* Transferable load of this group. */
        LIST_HEAD(, tdq) tdg_members;   /* Linked list of all members. */
};
#endif

/*
 * One thread queue per processor.
 */
#ifdef SMP
static cpumask_t tdq_idle;
static int tdg_maxid;
static struct tdq       tdq_cpu[MAXCPU];
static struct tdq_group tdq_groups[MAXCPU];
static int bal_tick;
static int gbal_tick;
static int balance_groups;

#define TDQ_SELF()      (&tdq_cpu[PCPU_GET(cpuid)])
#define TDQ_CPU(x)      (&tdq_cpu[(x)])
#define TDQ_ID(x)       ((x) - tdq_cpu)
#define TDQ_GROUP(x)    (&tdq_groups[(x)])
#else   /* !SMP */
static struct tdq       tdq_cpu;

#define TDQ_SELF()      (&tdq_cpu)
#define TDQ_CPU(x)      (&tdq_cpu)
#endif

static struct td_sched *sched_choose(void);     /* XXX Should be thread * */
static void sched_slice(struct td_sched *);
static void sched_priority(struct thread *);
static void sched_thread_priority(struct thread *, u_char);
static int sched_interact_score(struct thread *);
static void sched_interact_update(struct thread *);
static void sched_interact_fork(struct thread *);
static void sched_pctcpu_update(struct td_sched *);

/* Operations on per processor queues */
static struct td_sched * tdq_choose(struct tdq *);
static void tdq_setup(struct tdq *);
static void tdq_load_add(struct tdq *, struct td_sched *);
static void tdq_load_rem(struct tdq *, struct td_sched *);
static __inline void tdq_runq_add(struct tdq *, struct td_sched *, int);
static __inline void tdq_runq_rem(struct tdq *, struct td_sched *);
static void tdq_nice_add(struct tdq *, int);
static void tdq_nice_rem(struct tdq *, int);
void tdq_print(int cpu);

#ifdef SMP
static int tdq_transfer(struct tdq *, struct td_sched *, int);
static struct td_sched *runq_steal(struct runq *);
static void sched_balance(void);
static void sched_balance_groups(void);
static void sched_balance_group(struct tdq_group *);
static void sched_balance_pair(struct tdq *, struct tdq *);
static void tdq_move(struct tdq *, int);
static int tdq_idled(struct tdq *);
static void tdq_notify(struct td_sched *, int);
static void tdq_assign(struct tdq *);
static struct td_sched *tdq_steal(struct tdq *, int);
#define THREAD_CAN_MIGRATE(ts)                                  \
    ((ts)->ts_thread->td_pinned == 0 && ((ts)->ts_flags & TSF_BOUND) == 0)
#endif

void
tdq_print(int cpu)
{
        struct tdq *tdq;
        int i;

        tdq = TDQ_CPU(cpu);

        printf("tdq:\n");
        printf("\tload:           %d\n", tdq->tdq_load);
        printf("\tload TIMESHARE: %d\n", tdq->tdq_load_timeshare);
#ifdef SMP
        printf("\tload transferable: %d\n", tdq->tdq_transferable);
#endif
        printf("\tnicemin:\t%d\n", tdq->tdq_nicemin);
        printf("\tnice counts:\n");
        for (i = 0; i < SCHED_PRI_NRESV; i++)
                if (tdq->tdq_nice[i])
                        printf("\t\t%d = %d\n",
                            i - SCHED_PRI_NHALF, tdq->tdq_nice[i]);
}

static __inline void
tdq_runq_add(struct tdq *tdq, struct td_sched *ts, int flags)
{
#ifdef SMP
        if (THREAD_CAN_MIGRATE(ts)) {
                tdq->tdq_transferable++;
                tdq->tdq_group->tdg_transferable++;
                ts->ts_flags |= TSF_XFERABLE;
        }
#endif
        if (ts->ts_flags & TSF_PREEMPTED)
                flags |= SRQ_PREEMPTED;
        runq_add(ts->ts_runq, ts, flags);
}

static __inline void
tdq_runq_rem(struct tdq *tdq, struct td_sched *ts)
{
#ifdef SMP
        if (ts->ts_flags & TSF_XFERABLE) {
                tdq->tdq_transferable--;
                tdq->tdq_group->tdg_transferable--;
                ts->ts_flags &= ~TSF_XFERABLE;
        }
#endif
        runq_remove(ts->ts_runq, ts);
}

static void
tdq_load_add(struct tdq *tdq, struct td_sched *ts)
{
        int class;

        mtx_assert(&sched_lock, MA_OWNED);
        class = PRI_BASE(ts->ts_thread->td_pri_class);
        if (class == PRI_TIMESHARE)
                tdq->tdq_load_timeshare++;
        tdq->tdq_load++;
        CTR1(KTR_SCHED, "load: %d", tdq->tdq_load);
        if (class != PRI_ITHD && (ts->ts_thread->td_proc->p_flag & P_NOLOAD) == 0)
#ifdef SMP
                tdq->tdq_group->tdg_load++;
#else
                tdq->tdq_sysload++;
#endif
        if (ts->ts_thread->td_pri_class == PRI_TIMESHARE)
                tdq_nice_add(tdq, ts->ts_thread->td_proc->p_nice);
}

static void
tdq_load_rem(struct tdq *tdq, struct td_sched *ts)
{
        int class;

        mtx_assert(&sched_lock, MA_OWNED);
        class = PRI_BASE(ts->ts_thread->td_pri_class);
        if (class == PRI_TIMESHARE)
                tdq->tdq_load_timeshare--;
        if (class != PRI_ITHD && (ts->ts_thread->td_proc->p_flag & P_NOLOAD) == 0)
#ifdef SMP
                tdq->tdq_group->tdg_load--;
#else
                tdq->tdq_sysload--;
#endif
        tdq->tdq_load--;
        CTR1(KTR_SCHED, "load: %d", tdq->tdq_load);
        ts->ts_runq = NULL;
        if (ts->ts_thread->td_pri_class == PRI_TIMESHARE)
                tdq_nice_rem(tdq, ts->ts_thread->td_proc->p_nice);
}

static void
tdq_nice_add(struct tdq *tdq, int nice)
{
        mtx_assert(&sched_lock, MA_OWNED);
        /* Normalize to zero. */
        tdq->tdq_nice[nice + SCHED_PRI_NHALF]++;
        if (nice < tdq->tdq_nicemin || tdq->tdq_load_timeshare == 1)
                tdq->tdq_nicemin = nice;
}
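
/*
 * Editor's note: tdq_nice[] is a histogram of timeshare threads indexed by
 * nice value offset by SCHED_PRI_NHALF, and tdq_nicemin caches the smallest
 * nice currently present; tdq_nice_rem() below rescans the histogram only
 * when the last thread at the cached minimum leaves.
 */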

static void
tdq_nice_rem(struct tdq *tdq, int nice)
{
        int n;

        mtx_assert(&sched_lock, MA_OWNED);
        /* Normalize to zero. */
        n = nice + SCHED_PRI_NHALF;
        tdq->tdq_nice[n]--;
        KASSERT(tdq->tdq_nice[n] >= 0, ("Negative nice count."));

        /*
         * If this wasn't the smallest nice value or there are more in
         * this bucket we can just return.  Otherwise we have to recalculate
         * the smallest nice.
         */
        if (nice != tdq->tdq_nicemin ||
            tdq->tdq_nice[n] != 0 ||
            tdq->tdq_load_timeshare == 0)
                return;

        for (; n < SCHED_PRI_NRESV; n++)
                if (tdq->tdq_nice[n]) {
                        tdq->tdq_nicemin = n - SCHED_PRI_NHALF;
                        return;
                }
}

#ifdef SMP
/*
 * sched_balance is a simple CPU load balancing algorithm.  It operates by
 * finding the least loaded and most loaded cpu and equalizing their load
 * by migrating some processes.
 *
 * Dealing only with two CPUs at a time has two advantages.  Firstly, most
 * installations will only have 2 cpus.  Secondly, load balancing too much at
 * once can have an unpleasant effect on the system.  The scheduler rarely has
 * enough information to make perfect decisions.  So this algorithm chooses
 * simplicity and more gradual effects on load in larger systems.
 *
 * It could be improved by considering the priorities and slices assigned to
 * each task prior to balancing them.  There are many pathological cases with
 * any approach and so the semi random algorithm below may work as well as any.
 */
static void
sched_balance(void)
{
        struct tdq_group *high;
        struct tdq_group *low;
        struct tdq_group *tdg;
        int cnt;
        int i;

        bal_tick = ticks + (random() % (hz * 2));
        if (smp_started == 0)
                return;
        low = high = NULL;
        i = random() % (tdg_maxid + 1);
        for (cnt = 0; cnt <= tdg_maxid; cnt++) {
                tdg = TDQ_GROUP(i);
                /*
                 * Find the CPU with the highest load that has some
                 * threads to transfer.
                 */
                if ((high == NULL || tdg->tdg_load > high->tdg_load)
                    && tdg->tdg_transferable)
                        high = tdg;
                if (low == NULL || tdg->tdg_load < low->tdg_load)
                        low = tdg;
                if (++i > tdg_maxid)
                        i = 0;
        }
        if (low != NULL && high != NULL && high != low)
                sched_balance_pair(LIST_FIRST(&high->tdg_members),
                    LIST_FIRST(&low->tdg_members));
}
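
/*
 * Editor's note: the scan above starts at a random group index and wraps
 * around, so repeated balancing passes do not always favor the same group
 * when loads are equal; bal_tick schedules the next pass at a random point
 * within the next two seconds.
 */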

static void
sched_balance_groups(void)
{
        int i;

        gbal_tick = ticks + (random() % (hz * 2));
        mtx_assert(&sched_lock, MA_OWNED);
        if (smp_started)
                for (i = 0; i <= tdg_maxid; i++)
                        sched_balance_group(TDQ_GROUP(i));
}

static void
sched_balance_group(struct tdq_group *tdg)
{
        struct tdq *tdq;
        struct tdq *high;
        struct tdq *low;
        int load;

        if (tdg->tdg_transferable == 0)
                return;
        low = NULL;
        high = NULL;
        LIST_FOREACH(tdq, &tdg->tdg_members, tdq_siblings) {
                load = tdq->tdq_load;
                if (high == NULL || load > high->tdq_load)
                        high = tdq;
                if (low == NULL || load < low->tdq_load)
                        low = tdq;
        }
        if (high != NULL && low != NULL && high != low)
                sched_balance_pair(high, low);
}

static void
sched_balance_pair(struct tdq *high, struct tdq *low)
{
        int transferable;
        int high_load;
        int low_load;
        int move;
        int diff;
        int i;

        /*
         * If we're transferring within a group we have to use this specific
         * tdq's transferable count, otherwise we can steal from other members
         * of the group.
         */
        if (high->tdq_group == low->tdq_group) {
                transferable = high->tdq_transferable;
                high_load = high->tdq_load;
                low_load = low->tdq_load;
        } else {
                transferable = high->tdq_group->tdg_transferable;
                high_load = high->tdq_group->tdg_load;
                low_load = low->tdq_group->tdg_load;
        }
        if (transferable == 0)
                return;
        /*
         * Determine what the imbalance is and then adjust that to how many
         * threads we actually have to give up (transferable).
         */
        diff = high_load - low_load;
        move = diff / 2;
        if (diff & 0x1)
                move++;
        move = min(move, transferable);
        for (i = 0; i < move; i++)
                tdq_move(high, TDQ_ID(low));
        return;
}
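
/*
 * Example (editor's note): with high_load == 7 and low_load == 2 the
 * difference is 5, which rounds up to move == 3 threads, capped by the
 * transferable count before tdq_move() is invoked that many times.
 */
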
static void
tdq_move(struct tdq *from, int cpu)
{
        struct tdq *tdq;
        struct tdq *to;
        struct td_sched *ts;

        tdq = from;
        to = TDQ_CPU(cpu);
        ts = tdq_steal(tdq, 1);
        if (ts == NULL) {
                struct tdq_group *tdg;

                tdg = tdq->tdq_group;
                LIST_FOREACH(tdq, &tdg->tdg_members, tdq_siblings) {
                        if (tdq == from || tdq->tdq_transferable == 0)
                                continue;
                        ts = tdq_steal(tdq, 1);
                        break;
                }
                if (ts == NULL)
                        panic("tdq_move: No threads available with a "
                            "transferable count of %d\n",
                            tdg->tdg_transferable);
        }
        if (tdq == to)
                return;
        ts->ts_state = TSS_THREAD;
        tdq_runq_rem(tdq, ts);
        tdq_load_rem(tdq, ts);
        tdq_notify(ts, cpu);
}

static int
tdq_idled(struct tdq *tdq)
{
        struct tdq_group *tdg;
        struct tdq *steal;
        struct td_sched *ts;

        tdg = tdq->tdq_group;
        /*
         * If we're in a cpu group, try and steal threads from another cpu in
         * the group before idling.
         */
        if (tdg->tdg_cpus > 1 && tdg->tdg_transferable) {
                LIST_FOREACH(steal, &tdg->tdg_members, tdq_siblings) {
                        if (steal == tdq || steal->tdq_transferable == 0)
                                continue;
                        ts = tdq_steal(steal, 0);
                        if (ts == NULL)
                                continue;
                        ts->ts_state = TSS_THREAD;
                        tdq_runq_rem(steal, ts);
                        tdq_load_rem(steal, ts);
                        ts->ts_cpu = PCPU_GET(cpuid);
                        ts->ts_flags |= TSF_INTERNAL | TSF_HOLD;
                        sched_add(ts->ts_thread, SRQ_YIELDING);
                        return (0);
                }
        }
        /*
         * We only set the idled bit when all of the cpus in the group are
         * idle.  Otherwise we could get into a situation where a thread
         * bounces back and forth between two idle cores on separate physical
         * CPUs.
         */
        tdg->tdg_idlemask |= PCPU_GET(cpumask);
        if (tdg->tdg_idlemask != tdg->tdg_cpumask)
                return (1);
        atomic_set_int(&tdq_idle, tdg->tdg_mask);
        return (1);
}
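
/*
 * Editor's note: tdq_idled() returns 0 when it managed to steal work from a
 * sibling in the same group (so the caller should not halt) and 1 when the
 * CPU should idle; the group's bit in the global tdq_idle mask is only set
 * once every CPU in the group is marked idle in tdg_idlemask.
 */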

static void
tdq_assign(struct tdq *tdq)
{
        struct td_sched *nts;
        struct td_sched *ts;

        do {
|
2006-12-29 10:37:07 +00:00
|
|
|
*(volatile struct td_sched **)&ts = tdq->tdq_assigned;
|
|
|
|
} while(!atomic_cmpset_ptr((volatile uintptr_t *)&tdq->tdq_assigned,
|
2006-12-06 06:34:57 +00:00
|
|
|
(uintptr_t)ts, (uintptr_t)NULL));
|
|
|
|
for (; ts != NULL; ts = nts) {
|
|
|
|
nts = ts->ts_assign;
|
2006-12-29 10:37:07 +00:00
|
|
|
tdq->tdq_group->tdg_load--;
|
|
|
|
tdq->tdq_load--;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags &= ~TSF_ASSIGNED;
|
|
|
|
if (ts->ts_flags & TSF_REMOVED) {
|
|
|
|
ts->ts_flags &= ~TSF_REMOVED;
|
2005-07-31 15:11:21 +00:00
|
|
|
continue;
|
|
|
|
}
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags |= TSF_INTERNAL | TSF_HOLD;
|
|
|
|
sched_add(ts->ts_thread, SRQ_YIELDING);
|
2003-10-31 11:16:04 +00:00
|
|
|
}
|
|
|
|
}
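/*
 * Illustrative aside (not part of this file): a minimal user-space sketch
 * of the same hand-off pattern used by tdq_notify()/tdq_assign() above --
 * a CAS push on the sending side and a single atomic swap-to-NULL drain on
 * the receiving side.  C11 atomics stand in for the kernel's
 * atomic_cmpset_ptr(); the node type and list head here are hypothetical.
 */
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

struct node {
	int		 id;
	struct node	*next;
};

static _Atomic(struct node *) assigned = NULL;

/* Sender: link a node onto the shared list, as tdq_notify() does. */
static void
push(struct node *n)
{
	struct node *head;

	do {
		head = atomic_load(&assigned);
		n->next = head;
	} while (!atomic_compare_exchange_weak(&assigned, &head, n));
}

/* Receiver: take the entire list in one shot, as tdq_assign() does. */
static struct node *
drain(void)
{

	return (atomic_exchange(&assigned, NULL));
}

int
main(void)
{
	struct node a = { 1, NULL }, b = { 2, NULL };
	struct node *n, *next;

	push(&a);
	push(&b);
	/* Nodes come back LIFO; the scheduler re-adds each to a run queue. */
	for (n = drain(); n != NULL; n = next) {
		next = n->next;
		printf("drained node %d\n", n->id);
	}
	return (0);
}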
|
|
|
|
|
|
|
|
static void
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_notify(struct td_sched *ts, int cpu)
|
2003-10-31 11:16:04 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
2003-10-31 11:16:04 +00:00
|
|
|
struct thread *td;
|
|
|
|
struct pcpu *pcpu;
|
2004-12-26 22:56:08 +00:00
|
|
|
int class;
|
2004-08-10 07:52:21 +00:00
|
|
|
int prio;
|
2003-10-31 11:16:04 +00:00
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq = TDQ_CPU(cpu);
|
2004-12-26 22:56:08 +00:00
|
|
|
/* XXX */
|
2006-12-06 06:34:57 +00:00
|
|
|
class = PRI_BASE(ts->ts_thread->td_pri_class);
|
2004-12-26 22:56:08 +00:00
|
|
|
if ((class == PRI_TIMESHARE || class == PRI_REALTIME) &&
|
2006-12-29 10:37:07 +00:00
|
|
|
(tdq_idle & tdq->tdq_group->tdg_mask))
|
|
|
|
atomic_clear_int(&tdq_idle, tdq->tdq_group->tdg_mask);
|
|
|
|
tdq->tdq_group->tdg_load++;
|
|
|
|
tdq->tdq_load++;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_cpu = cpu;
|
|
|
|
ts->ts_flags |= TSF_ASSIGNED;
|
|
|
|
prio = ts->ts_thread->td_priority;
|
2003-10-31 11:16:04 +00:00
|
|
|
|
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Place a thread on another cpu's queue and force a resched.
|
2003-10-31 11:16:04 +00:00
|
|
|
*/
|
|
|
|
do {
|
2006-12-29 10:37:07 +00:00
|
|
|
*(volatile struct td_sched **)&ts->ts_assign = tdq->tdq_assigned;
|
|
|
|
} while(!atomic_cmpset_ptr((volatile uintptr_t *)&tdq->tdq_assigned,
|
2006-12-06 06:34:57 +00:00
|
|
|
(uintptr_t)ts->ts_assign, (uintptr_t)ts));
|
2004-08-10 07:52:21 +00:00
|
|
|
/*
|
|
|
|
* Without sched_lock we could lose a race where we set NEEDRESCHED
|
|
|
|
* on a thread that is switched out before the IPI is delivered. This
|
|
|
|
* would lead us to miss the resched. This will be a problem once
|
|
|
|
* sched_lock is pushed down.
|
|
|
|
*/
|
2003-10-31 11:16:04 +00:00
|
|
|
pcpu = pcpu_find(cpu);
|
|
|
|
td = pcpu->pc_curthread;
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_thread->td_priority < td->td_priority ||
|
2003-10-31 11:16:04 +00:00
|
|
|
td == pcpu->pc_idlethread) {
|
|
|
|
td->td_flags |= TDF_NEEDRESCHED;
|
|
|
|
ipi_selected(1 << cpu, IPI_AST);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
static struct td_sched *
|
2003-10-31 11:16:04 +00:00
|
|
|
runq_steal(struct runq *rq)
|
|
|
|
{
|
|
|
|
struct rqhead *rqh;
|
|
|
|
struct rqbits *rqb;
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-10-31 11:16:04 +00:00
|
|
|
int word;
|
|
|
|
int bit;
|
|
|
|
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
|
|
|
rqb = &rq->rq_status;
|
|
|
|
for (word = 0; word < RQB_LEN; word++) {
|
|
|
|
if (rqb->rqb_bits[word] == 0)
|
|
|
|
continue;
|
|
|
|
for (bit = 0; bit < RQB_BPW; bit++) {
|
2003-12-07 09:57:51 +00:00
|
|
|
if ((rqb->rqb_bits[word] & (1ul << bit)) == 0)
|
2003-10-31 11:16:04 +00:00
|
|
|
continue;
|
|
|
|
rqh = &rq->rq_queues[bit + (word << RQB_L2BPW)];
|
2006-12-06 06:34:57 +00:00
|
|
|
TAILQ_FOREACH(ts, rqh, ts_procq) {
|
|
|
|
if (THREAD_CAN_MIGRATE(ts))
|
|
|
|
return (ts);
|
2003-10-31 11:16:04 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (NULL);
|
|
|
|
}
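/*
 * Illustrative aside (not part of this file): the word/bit walk in
 * runq_steal() above is a generic bitmap-scan idiom.  A self-contained
 * sketch with hypothetical NWORDS/BPW constants and a stub predicate in
 * place of THREAD_CAN_MIGRATE():
 */
#include <stdio.h>

#define	NWORDS	2					/* stands in for RQB_LEN */
#define	BPW	(int)(sizeof(unsigned long) * 8)	/* stands in for RQB_BPW */

/* Return the index of the first set bit accepted by the predicate, or -1. */
static int
scan_first(unsigned long bits[], int (*pred)(int))
{
	int word, bit;

	for (word = 0; word < NWORDS; word++) {
		if (bits[word] == 0)
			continue;		/* Skip empty words cheaply. */
		for (bit = 0; bit < BPW; bit++) {
			if ((bits[word] & (1ul << bit)) == 0)
				continue;
			if (pred(word * BPW + bit))
				return (word * BPW + bit);
		}
	}
	return (-1);
}

static int
is_even(int idx)
{

	return ((idx & 1) == 0);
}

int
main(void)
{
	unsigned long bits[NWORDS] = { 0x0a, 0x01 };	/* bits 1, 3 and BPW set */

	/* Bits 1 and 3 fail the predicate; bit BPW is the first match. */
	printf("first match: %d\n", scan_first(bits, is_even));
	return (0);
}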
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
static struct td_sched *
|
|
|
|
tdq_steal(struct tdq *tdq, int stealidle)
|
2003-10-31 11:16:04 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-10-31 11:16:04 +00:00
|
|
|
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
|
|
|
* Steal from next first to try to get a non-interactive task that
|
|
|
|
* may not have run for a while.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
if ((ts = runq_steal(tdq->tdq_next)) != NULL)
|
2006-12-06 06:34:57 +00:00
|
|
|
return (ts);
|
2006-12-29 10:37:07 +00:00
|
|
|
if ((ts = runq_steal(tdq->tdq_curr)) != NULL)
|
2006-12-06 06:34:57 +00:00
|
|
|
return (ts);
|
2003-12-11 03:57:10 +00:00
|
|
|
if (stealidle)
|
2006-12-29 10:37:07 +00:00
|
|
|
return (runq_steal(&tdq->tdq_idle));
|
2003-12-11 03:57:10 +00:00
|
|
|
return (NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
int
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_transfer(struct tdq *tdq, struct td_sched *ts, int class)
|
2003-12-11 03:57:10 +00:00
|
|
|
{
|
2006-12-29 10:37:07 +00:00
|
|
|
struct tdq_group *ntdg;
|
|
|
|
struct tdq_group *tdg;
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *old;
|
2003-12-11 03:57:10 +00:00
|
|
|
int cpu;
|
2004-12-26 22:56:08 +00:00
|
|
|
int idx;
|
2003-12-11 03:57:10 +00:00
|
|
|
|
2003-12-20 14:03:14 +00:00
|
|
|
if (smp_started == 0)
|
|
|
|
return (0);
|
2003-12-11 03:57:10 +00:00
|
|
|
cpu = 0;
|
2003-12-20 14:03:14 +00:00
|
|
|
/*
|
2004-08-10 07:52:21 +00:00
|
|
|
* If our load exceeds a certain threshold we should attempt to
|
|
|
|
* reassign this thread. The first candidate is the cpu that
|
|
|
|
* originally ran the thread. If it is idle, assign it there,
|
|
|
|
* otherwise, pick an idle cpu.
|
|
|
|
*
|
2006-12-29 10:37:07 +00:00
|
|
|
* The threshold at which we start to reassign has a large impact
|
2003-12-20 14:03:14 +00:00
|
|
|
* on the overall performance of the system. Tuned too high and
|
|
|
|
* some CPUs may idle. Too low and there will be excess migration
|
2004-04-09 14:31:29 +00:00
|
|
|
* and context switches.
|
2003-12-20 14:03:14 +00:00
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
old = TDQ_CPU(ts->ts_cpu);
|
2006-12-29 10:37:07 +00:00
|
|
|
ntdg = old->tdq_group;
|
|
|
|
tdg = tdq->tdq_group;
|
2006-12-06 06:34:57 +00:00
|
|
|
if (tdq_idle) {
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq_idle & ntdg->tdg_mask) {
|
|
|
|
cpu = ffs(ntdg->tdg_idlemask);
|
2004-12-26 22:56:08 +00:00
|
|
|
if (cpu) {
|
|
|
|
CTR2(KTR_SCHED,
|
2006-12-06 06:34:57 +00:00
|
|
|
"tdq_transfer: %p found old cpu %X "
|
|
|
|
"in idlemask.", ts, cpu);
|
2004-08-10 07:52:21 +00:00
|
|
|
goto migrate;
|
2004-12-26 22:56:08 +00:00
|
|
|
}
|
2004-08-10 07:52:21 +00:00
|
|
|
}
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
|
|
|
* Multiple cpus could find this bit simultaneously
|
|
|
|
* but the race shouldn't be terrible.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
cpu = ffs(tdq_idle);
|
2004-12-26 22:56:08 +00:00
|
|
|
if (cpu) {
|
2006-12-06 06:34:57 +00:00
|
|
|
CTR2(KTR_SCHED, "tdq_transfer: %p found %X "
|
|
|
|
"in idlemask.", ts, cpu);
|
2004-12-26 22:56:08 +00:00
|
|
|
goto migrate;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
idx = 0;
|
|
|
|
#if 0
|
2006-12-29 10:37:07 +00:00
|
|
|
if (old->tdq_load < tdq->tdq_load) {
|
2006-12-06 06:34:57 +00:00
|
|
|
cpu = ts->ts_cpu + 1;
|
|
|
|
CTR2(KTR_SCHED, "tdq_transfer: %p old cpu %X "
|
|
|
|
"load less than ours.", ts, cpu);
|
2004-12-26 22:56:08 +00:00
|
|
|
goto migrate;
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* No new CPU was found, look for one with less load.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
for (idx = 0; idx <= tdg_maxid; idx++) {
|
|
|
|
ntdg = TDQ_GROUP(idx);
|
|
|
|
if (ntdg->tdg_load /*+ (ntdg->tdg_cpus * 2)*/ < tdg->tdg_load) {
|
|
|
|
cpu = ffs(ntdg->tdg_cpumask);
|
2006-12-06 06:34:57 +00:00
|
|
|
CTR2(KTR_SCHED, "tdq_transfer: %p cpu %X load less "
|
|
|
|
"than ours.", ts, cpu);
|
2004-08-10 07:52:21 +00:00
|
|
|
goto migrate;
|
2004-12-26 22:56:08 +00:00
|
|
|
}
|
2003-12-11 03:57:10 +00:00
|
|
|
}
|
2004-12-26 22:56:08 +00:00
|
|
|
#endif
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
|
|
|
* If another cpu in this group has idled, assign a thread over
|
|
|
|
* to them after checking to see if there are idled groups.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdg->tdg_idlemask) {
|
|
|
|
cpu = ffs(tdg->tdg_idlemask);
|
2004-12-26 22:56:08 +00:00
|
|
|
if (cpu) {
|
2006-12-06 06:34:57 +00:00
|
|
|
CTR2(KTR_SCHED, "tdq_transfer: %p cpu %X idle in "
|
|
|
|
"group.", ts, cpu);
|
2004-08-10 07:52:21 +00:00
|
|
|
goto migrate;
|
2004-12-26 22:56:08 +00:00
|
|
|
}
|
2003-12-11 03:57:10 +00:00
|
|
|
}
|
|
|
|
return (0);
|
2004-08-10 07:52:21 +00:00
|
|
|
migrate:
|
|
|
|
/*
|
|
|
|
* Now that we've found an idle CPU, migrate the thread.
|
|
|
|
*/
|
|
|
|
cpu--;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_runq = NULL;
|
|
|
|
tdq_notify(ts, cpu);
|
2004-08-10 07:52:21 +00:00
|
|
|
|
|
|
|
return (1);
|
2003-10-31 11:16:04 +00:00
|
|
|
}
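/*
 * Illustrative aside (not part of this file): the order tdq_transfer()
 * prefers targets in -- the thread's previous CPU if it is idle, otherwise
 * the lowest-numbered idle CPU -- reduced to a standalone sketch.  A plain
 * unsigned mask stands in for tdq_idle/tdg_idlemask, and ffs() is the
 * userland <strings.h> one rather than the kernel's.
 */
#include <strings.h>
#include <stdio.h>

/* Return a CPU to migrate to, or -1 when nothing is idle (no migration). */
static int
pick_idle_cpu(unsigned int idle_mask, int last_cpu)
{
	int cpu;

	if (idle_mask == 0)
		return (-1);
	if (idle_mask & (1u << last_cpu))
		return (last_cpu);		/* Previous CPU is idle. */
	cpu = ffs((int)idle_mask);		/* 1-based lowest set bit. */
	return (cpu - 1);
}

int
main(void)
{

	/* CPUs 2 and 5 are idle; the thread last ran on CPU 3. */
	printf("migrate to cpu %d\n", pick_idle_cpu((1u << 2) | (1u << 5), 3));
	return (0);
}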
|
2003-12-11 03:57:10 +00:00
|
|
|
|
2003-10-31 11:16:04 +00:00
|
|
|
#endif /* SMP */
|
2003-02-03 05:30:07 +00:00
|
|
|
|
2003-07-08 06:19:40 +00:00
|
|
|
/*
|
2003-10-31 11:16:04 +00:00
|
|
|
* Pick the highest priority task we have and return it.
|
2003-07-08 06:19:40 +00:00
|
|
|
*/
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
static struct td_sched *
|
|
|
|
tdq_choose(struct tdq *tdq)
|
2003-02-03 05:30:07 +00:00
|
|
|
{
|
|
|
|
struct runq *swap;
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2004-10-30 12:19:15 +00:00
|
|
|
int nice;
|
2003-02-03 05:30:07 +00:00
|
|
|
|
2003-06-08 00:47:33 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2003-04-11 03:47:14 +00:00
|
|
|
swap = NULL;
|
|
|
|
|
|
|
|
for (;;) {
|
2006-12-29 10:37:07 +00:00
|
|
|
ts = runq_choose(tdq->tdq_curr);
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts == NULL) {
|
2003-04-11 03:47:14 +00:00
|
|
|
/*
|
2004-07-02 19:09:50 +00:00
|
|
|
* We already swapped once and didn't get anywhere.
|
2003-04-11 03:47:14 +00:00
|
|
|
*/
|
|
|
|
if (swap)
|
|
|
|
break;
|
2006-12-29 10:37:07 +00:00
|
|
|
swap = tdq->tdq_curr;
|
|
|
|
tdq->tdq_curr = tdq->tdq_next;
|
|
|
|
tdq->tdq_next = swap;
|
2003-04-11 03:47:14 +00:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* If we encounter a slice of 0 the td_sched is in a
|
|
|
|
* TIMESHARE td_sched group and its nice was too far out
|
2003-04-11 03:47:14 +00:00
|
|
|
* of the range that receives slices.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
nice = ts->ts_thread->td_proc->p_nice + (0 - tdq->tdq_nicemin);
|
2005-09-22 01:19:37 +00:00
|
|
|
#if 0
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_slice == 0 || (nice > SCHED_SLICE_NTHRESH &&
|
|
|
|
ts->ts_thread->td_proc->p_nice != 0)) {
|
|
|
|
runq_remove(ts->ts_runq, ts);
|
|
|
|
sched_slice(ts);
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_next;
|
2006-12-06 06:34:57 +00:00
|
|
|
runq_add(ts->ts_runq, ts, 0);
|
2003-04-11 03:47:14 +00:00
|
|
|
continue;
|
|
|
|
}
|
2005-09-22 01:19:37 +00:00
|
|
|
#endif
|
2006-12-06 06:34:57 +00:00
|
|
|
return (ts);
|
2003-02-03 05:30:07 +00:00
|
|
|
}
|
|
|
|
|
2006-12-29 10:37:07 +00:00
|
|
|
return (runq_choose(&tdq->tdq_idle));
|
2003-04-02 06:46:43 +00:00
|
|
|
}
|
2003-01-29 07:00:51 +00:00
|
|
|
|
|
|
|
static void
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_setup(struct tdq *tdq)
|
2003-01-29 07:00:51 +00:00
|
|
|
{
|
2006-12-29 10:37:07 +00:00
|
|
|
runq_init(&tdq->tdq_timeshare[0]);
|
|
|
|
runq_init(&tdq->tdq_timeshare[1]);
|
|
|
|
runq_init(&tdq->tdq_idle);
|
|
|
|
tdq->tdq_curr = &tdq->tdq_timeshare[0];
|
|
|
|
tdq->tdq_next = &tdq->tdq_timeshare[1];
|
|
|
|
tdq->tdq_load = 0;
|
|
|
|
tdq->tdq_load_timeshare = 0;
|
2003-01-29 07:00:51 +00:00
|
|
|
}
|
|
|
|
|
2003-01-26 05:23:15 +00:00
|
|
|
static void
|
|
|
|
sched_setup(void *dummy)
|
|
|
|
{
|
2003-07-07 21:08:28 +00:00
|
|
|
#ifdef SMP
|
2003-01-26 05:23:15 +00:00
|
|
|
int i;
|
2003-07-07 21:08:28 +00:00
|
|
|
#endif
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2005-12-19 08:26:09 +00:00
|
|
|
/*
|
|
|
|
* To avoid divide-by-zero, we set realstathz to a dummy value
|
|
|
|
* in case sched_clock() is called before sched_initticks().
|
|
|
|
*/
|
|
|
|
realstathz = hz;
|
2003-06-28 06:04:47 +00:00
|
|
|
slice_min = (hz/100); /* 10ms */
|
|
|
|
slice_max = (hz/7); /* ~140ms */
|
2003-03-04 02:45:59 +00:00
|
|
|
|
2003-07-04 19:59:00 +00:00
|
|
|
#ifdef SMP
|
2003-12-12 07:33:51 +00:00
|
|
|
balance_groups = 0;
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Initialize the tdqs.
|
2003-12-11 03:57:10 +00:00
|
|
|
*/
|
|
|
|
for (i = 0; i < MAXCPU; i++) {
|
2006-12-29 12:55:32 +00:00
|
|
|
struct tdq *tdq;
|
2003-12-11 03:57:10 +00:00
|
|
|
|
2006-12-29 12:55:32 +00:00
|
|
|
tdq = &tdq_cpu[i];
|
|
|
|
tdq->tdq_assigned = NULL;
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_setup(&tdq_cpu[i]);
|
2003-12-11 03:57:10 +00:00
|
|
|
}
|
2003-07-04 19:59:00 +00:00
|
|
|
if (smp_topology == NULL) {
|
2006-12-29 10:37:07 +00:00
|
|
|
struct tdq_group *tdg;
|
2006-12-29 12:55:32 +00:00
|
|
|
struct tdq *tdq;
|
2004-12-26 22:56:08 +00:00
|
|
|
int cpus;
|
2003-12-11 03:57:10 +00:00
|
|
|
|
2004-12-26 22:56:08 +00:00
|
|
|
for (cpus = 0, i = 0; i < MAXCPU; i++) {
|
|
|
|
if (CPU_ABSENT(i))
|
|
|
|
continue;
|
2006-12-29 12:55:32 +00:00
|
|
|
tdq = &tdq_cpu[i];
|
2006-12-29 10:37:07 +00:00
|
|
|
tdg = &tdq_groups[cpus];
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Setup a tdq group with one member.
|
2003-12-11 03:57:10 +00:00
|
|
|
*/
|
2006-12-29 12:55:32 +00:00
|
|
|
tdq->tdq_transferable = 0;
|
|
|
|
tdq->tdq_group = tdg;
|
2006-12-29 10:37:07 +00:00
|
|
|
tdg->tdg_cpus = 1;
|
|
|
|
tdg->tdg_idlemask = 0;
|
|
|
|
tdg->tdg_cpumask = tdg->tdg_mask = 1 << i;
|
|
|
|
tdg->tdg_load = 0;
|
|
|
|
tdg->tdg_transferable = 0;
|
|
|
|
LIST_INIT(&tdg->tdg_members);
|
2006-12-29 12:55:32 +00:00
|
|
|
LIST_INSERT_HEAD(&tdg->tdg_members, tdq, tdq_siblings);
|
2004-12-26 22:56:08 +00:00
|
|
|
cpus++;
|
2003-07-04 19:59:00 +00:00
|
|
|
}
|
2006-12-29 10:37:07 +00:00
|
|
|
tdg_maxid = cpus - 1;
|
2003-07-04 19:59:00 +00:00
|
|
|
} else {
|
2006-12-29 10:37:07 +00:00
|
|
|
struct tdq_group *tdg;
|
2003-12-11 03:57:10 +00:00
|
|
|
struct cpu_group *cg;
|
2003-07-04 19:59:00 +00:00
|
|
|
int j;
|
2003-04-11 03:47:14 +00:00
|
|
|
|
2003-07-04 19:59:00 +00:00
|
|
|
for (i = 0; i < smp_topology->ct_count; i++) {
|
|
|
|
cg = &smp_topology->ct_group[i];
|
2006-12-29 10:37:07 +00:00
|
|
|
tdg = &tdq_groups[i];
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
|
|
|
* Initialize the group.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
tdg->tdg_idlemask = 0;
|
|
|
|
tdg->tdg_load = 0;
|
|
|
|
tdg->tdg_transferable = 0;
|
|
|
|
tdg->tdg_cpus = cg->cg_count;
|
|
|
|
tdg->tdg_cpumask = cg->cg_mask;
|
|
|
|
LIST_INIT(&tdg->tdg_members);
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
|
|
|
* Find all of the group members and add them.
|
|
|
|
*/
|
|
|
|
for (j = 0; j < MAXCPU; j++) {
|
|
|
|
if ((cg->cg_mask & (1 << j)) != 0) {
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdg->tdg_mask == 0)
|
|
|
|
tdg->tdg_mask = 1 << j;
|
|
|
|
tdq_cpu[j].tdq_transferable = 0;
|
|
|
|
tdq_cpu[j].tdq_group = tdg;
|
|
|
|
LIST_INSERT_HEAD(&tdg->tdg_members,
|
|
|
|
&tdq_cpu[j], tdq_siblings);
|
2003-12-11 03:57:10 +00:00
|
|
|
}
|
|
|
|
}
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdg->tdg_cpus > 1)
|
2003-12-12 07:33:51 +00:00
|
|
|
balance_groups = 1;
|
2003-07-04 19:59:00 +00:00
|
|
|
}
|
2006-12-29 10:37:07 +00:00
|
|
|
tdg_maxid = smp_topology->ct_count - 1;
|
2003-07-04 19:59:00 +00:00
|
|
|
}
|
2003-12-12 07:33:51 +00:00
|
|
|
/*
|
|
|
|
* Stagger the group and global load balancer so they do not
|
|
|
|
* interfere with each other.
|
|
|
|
*/
|
2004-06-02 05:46:48 +00:00
|
|
|
bal_tick = ticks + hz;
|
2003-12-12 07:33:51 +00:00
|
|
|
if (balance_groups)
|
2004-06-02 05:46:48 +00:00
|
|
|
gbal_tick = ticks + (hz / 2);
|
2003-07-04 19:59:00 +00:00
|
|
|
#else
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_setup(TDQ_SELF());
|
2003-06-09 00:39:09 +00:00
|
|
|
#endif
|
2003-07-04 19:59:00 +00:00
|
|
|
mtx_lock_spin(&sched_lock);
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_load_add(TDQ_SELF(), &td_sched0);
|
2003-07-04 19:59:00 +00:00
|
|
|
mtx_unlock_spin(&sched_lock);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
2005-12-19 08:26:09 +00:00
|
|
|
/* ARGSUSED */
|
|
|
|
static void
|
|
|
|
sched_initticks(void *dummy)
|
|
|
|
{
|
|
|
|
mtx_lock_spin(&sched_lock);
|
|
|
|
realstathz = stathz ? stathz : hz;
|
|
|
|
slice_min = (realstathz/100); /* 10ms */
|
|
|
|
slice_max = (realstathz/7); /* ~140ms */
|
|
|
|
|
|
|
|
tickincr = (hz << 10) / realstathz;
|
|
|
|
/*
|
|
|
|
* XXX This does not work for values of stathz that are much
|
|
|
|
* larger than hz.
|
|
|
|
*/
|
|
|
|
if (tickincr == 0)
|
|
|
|
tickincr = 1;
|
|
|
|
mtx_unlock_spin(&sched_lock);
|
|
|
|
}
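/*
 * Illustrative aside (not part of this file): a worked example of the
 * 10-bit fixed-point scale factor computed above.  With hz = 1000 and
 * stathz = 128 (example values only), tickincr = (1000 << 10) / 128 = 8000,
 * i.e. each stat clock tick counts as 8000 / 1024 ~= 7.8 hz ticks.  The
 * clamp to 1 can only fire when stathz exceeds hz << 10.
 */
#include <stdio.h>

int
main(void)
{
	int hz = 1000, stathz = 128;		/* example values */
	int tickincr;

	tickincr = (hz << 10) / stathz;		/* 1024000 / 128 = 8000 */
	if (tickincr == 0)			/* stathz would have to be > hz << 10 */
		tickincr = 1;
	printf("tickincr = %d (%.2f hz ticks per stat tick)\n",
	    tickincr, tickincr / 1024.0);
	return (0);
}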
|
|
|
|
|
|
|
|
|
2003-01-26 05:23:15 +00:00
|
|
|
/*
|
|
|
|
* Scale the scheduling priority according to the "interactivity" of this
|
|
|
|
* process.
|
|
|
|
*/
|
2003-04-11 03:47:14 +00:00
|
|
|
static void
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_priority(struct thread *td)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
|
|
|
int pri;
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
if (td->td_pri_class != PRI_TIMESHARE)
|
2003-04-11 03:47:14 +00:00
|
|
|
return;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
pri = SCHED_PRI_INTERACT(sched_interact_score(td));
|
2003-03-04 02:45:59 +00:00
|
|
|
pri += SCHED_PRI_BASE;
|
2006-10-26 21:42:22 +00:00
|
|
|
pri += td->td_proc->p_nice;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
|
|
|
if (pri > PRI_MAX_TIMESHARE)
|
|
|
|
pri = PRI_MAX_TIMESHARE;
|
|
|
|
else if (pri < PRI_MIN_TIMESHARE)
|
|
|
|
pri = PRI_MIN_TIMESHARE;
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_user_prio(td, pri);
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-04-11 03:47:14 +00:00
|
|
|
return;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Calculate a time slice based on the properties of the process
|
|
|
|
* and the runq that we're on. This is only for PRI_TIMESHARE threads.
|
2003-01-26 05:23:15 +00:00
|
|
|
*/
|
2003-04-02 06:46:43 +00:00
|
|
|
static void
|
2006-12-06 06:34:57 +00:00
|
|
|
sched_slice(struct td_sched *ts)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
2006-10-26 21:42:22 +00:00
|
|
|
struct thread *td;
|
2003-04-02 06:46:43 +00:00
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
td = ts->ts_thread;
|
|
|
|
tdq = TDQ_CPU(ts->ts_cpu);
|
2003-04-02 06:46:43 +00:00
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
if (td->td_flags & TDF_BORROWING) {
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_MIN;
|
2004-12-14 10:34:27 +00:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2003-04-02 06:46:43 +00:00
|
|
|
/*
|
|
|
|
* Rationale:
|
2006-12-06 06:34:57 +00:00
|
|
|
* Threads in interactive procs get a minimal slice so that we
|
2003-04-02 06:46:43 +00:00
|
|
|
* quickly notice if they abuse their advantage.
|
|
|
|
*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Threads in non-interactive procs are assigned a slice that is
|
|
|
|
* based on the proc's nice value relative to the least nice procs
|
2003-04-02 06:46:43 +00:00
|
|
|
* on the run queue for this cpu.
|
|
|
|
*
|
2006-12-06 06:34:57 +00:00
|
|
|
* If the thread is less nice than all others it gets the maximum
|
|
|
|
* slice and other threads will adjust their slice relative to
|
2003-04-02 06:46:43 +00:00
|
|
|
* this when they first expire.
|
|
|
|
*
|
|
|
|
* There is a 20 point window that starts relative to the least
|
2006-12-06 06:34:57 +00:00
|
|
|
* nice td_sched on the run queue. Slice size is determined by
|
|
|
|
* the td_sched distance from the least nice thread.
|
2003-04-02 06:46:43 +00:00
|
|
|
*
|
2006-12-06 06:34:57 +00:00
|
|
|
* If the td_sched is outside of the window it will get no slice
|
2003-11-02 04:10:15 +00:00
|
|
|
* and will be reevaluated each time it is selected on the
|
2006-12-06 06:34:57 +00:00
|
|
|
* run queue. The exception to this is nice 0 procs when
|
2003-11-02 04:10:15 +00:00
|
|
|
* a nice -20 is running. They are always granted a minimum
|
|
|
|
* slice.
|
2003-04-02 06:46:43 +00:00
|
|
|
*/
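	/*
	 * Illustrative example: if the least nice timeshare thread on
	 * this queue is at nice -5, a thread at nice +5 sits 10 points
	 * into the window and is assigned a proportionally smaller slice
	 * by SCHED_SLICE_NICE(); threads outside the window fall through
	 * to the minimum slice.
	 */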
|
2006-10-26 21:42:22 +00:00
|
|
|
if (!SCHED_INTERACTIVE(td)) {
|
2003-04-02 06:46:43 +00:00
|
|
|
int nice;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2006-12-29 10:37:07 +00:00
|
|
|
nice = td->td_proc->p_nice + (0 - tdq->tdq_nicemin);
|
|
|
|
if (tdq->tdq_load_timeshare == 0 ||
|
|
|
|
td->td_proc->p_nice < tdq->tdq_nicemin)
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_MAX;
|
2003-11-02 04:10:15 +00:00
|
|
|
else if (nice <= SCHED_SLICE_NTHRESH)
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_NICE(nice);
|
2006-10-26 21:42:22 +00:00
|
|
|
else if (td->td_proc->p_nice == 0)
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_MIN;
|
2003-04-02 06:46:43 +00:00
|
|
|
else
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_MIN; /* 0 */
|
2003-04-02 06:46:43 +00:00
|
|
|
} else
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_INTERACTIVE;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-04-02 06:46:43 +00:00
|
|
|
return;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
2003-11-02 03:36:33 +00:00
|
|
|
/*
|
|
|
|
* This routine enforces a maximum limit on the amount of scheduling history
|
|
|
|
* kept. It is called after either the slptime or runtime is adjusted.
|
|
|
|
* This routine will not operate correctly when slp or run times have been
|
|
|
|
* adjusted to more than double their maximum.
|
|
|
|
*/
|
2003-06-17 06:39:51 +00:00
|
|
|
static void
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_interact_update(struct thread *td)
|
2003-06-17 06:39:51 +00:00
|
|
|
{
|
2003-11-02 03:36:33 +00:00
|
|
|
int sum;
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
sum = td->td_sched->skg_runtime + td->td_sched->skg_slptime;
|
2003-11-02 03:36:33 +00:00
|
|
|
if (sum < SCHED_SLP_RUN_MAX)
|
|
|
|
return;
|
|
|
|
/*
|
|
|
|
* If we have exceeded by more than 1/5th then the algorithm below
|
|
|
|
* will not bring us back into range. Dividing by two here forces
|
2004-08-10 07:52:21 +00:00
|
|
|
* us into the range of [4/5 * SCHED_SLP_RUN_MAX, SCHED_SLP_RUN_MAX]
|
2003-11-02 03:36:33 +00:00
|
|
|
*/
|
2004-04-04 19:12:56 +00:00
|
|
|
if (sum > (SCHED_SLP_RUN_MAX / 5) * 6) {
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_runtime /= 2;
|
|
|
|
td->td_sched->skg_slptime /= 2;
|
2003-11-02 03:36:33 +00:00
|
|
|
return;
|
|
|
|
}
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_runtime = (td->td_sched->skg_runtime / 5) * 4;
|
|
|
|
td->td_sched->skg_slptime = (td->td_sched->skg_slptime / 5) * 4;
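	/*
	 * Scaling both values by 4/5 preserves the sleep/run ratio, and
	 * therefore the interactivity score, while trimming the amount of
	 * history kept; e.g. a 60/40 run/sleep split is still 60/40 after
	 * the adjustment.
	 */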
|
2003-11-02 03:36:33 +00:00
|
|
|
}
|
2003-10-27 06:47:05 +00:00
|
|
|
|
2003-11-02 03:36:33 +00:00
|
|
|
static void
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_interact_fork(struct thread *td)
|
2003-11-02 03:36:33 +00:00
|
|
|
{
|
|
|
|
int ratio;
|
|
|
|
int sum;
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
sum = td->td_sched->skg_runtime + td->td_sched->skg_slptime;
|
2003-11-02 03:36:33 +00:00
|
|
|
if (sum > SCHED_SLP_RUN_FORK) {
|
|
|
|
ratio = sum / SCHED_SLP_RUN_FORK;
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_runtime /= ratio;
|
|
|
|
td->td_sched->skg_slptime /= ratio;
|
2003-06-17 06:39:51 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2003-03-04 02:45:59 +00:00
|
|
|
static int
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_interact_score(struct thread *td)
|
2003-03-04 02:45:59 +00:00
|
|
|
{
|
2003-06-15 02:18:29 +00:00
|
|
|
int div;
|
2003-03-04 02:45:59 +00:00
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
if (td->td_sched->skg_runtime > td->td_sched->skg_slptime) {
|
|
|
|
div = max(1, td->td_sched->skg_runtime / SCHED_INTERACT_HALF);
|
2003-06-15 02:18:29 +00:00
|
|
|
return (SCHED_INTERACT_HALF +
|
2006-10-26 21:42:22 +00:00
|
|
|
(SCHED_INTERACT_HALF - (td->td_sched->skg_slptime / div)));
|
|
|
|
	} else if (td->td_sched->skg_slptime > td->td_sched->skg_runtime) {
|
|
|
|
div = max(1, td->td_sched->skg_slptime / SCHED_INTERACT_HALF);
|
|
|
|
return (td->td_sched->skg_runtime / div);
|
2003-03-04 02:45:59 +00:00
|
|
|
}
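	/*
	 * Net effect: threads that run more than they sleep score in the
	 * upper half of the interactivity range and threads that sleep
	 * more than they run score in the lower half, with the midpoint
	 * at SCHED_INTERACT_HALF.
	 */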
|
|
|
|
|
2003-06-15 02:18:29 +00:00
|
|
|
/*
|
|
|
|
* This can happen if slptime and runtime are 0.
|
|
|
|
*/
|
|
|
|
return (0);
|
2003-03-04 02:45:59 +00:00
|
|
|
|
|
|
|
}
|
|
|
|
|
2004-09-05 02:09:54 +00:00
|
|
|
/*
|
|
|
|
* Very early in the boot some setup of scheduler-specific
|
|
|
|
* parts of proc0 and of some scheduler resources needs to be done.
|
|
|
|
* Called from:
|
|
|
|
* proc0_init()
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
schedinit(void)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Set up the scheduler specific parts of proc0.
|
|
|
|
*/
|
|
|
|
proc0.p_sched = NULL; /* XXX */
|
2006-12-06 06:34:57 +00:00
|
|
|
thread0.td_sched = &td_sched0;
|
|
|
|
td_sched0.ts_thread = &thread0;
|
|
|
|
td_sched0.ts_state = TSS_THREAD;
|
2004-09-05 02:09:54 +00:00
|
|
|
}
|
|
|
|
|
2003-04-11 03:47:14 +00:00
|
|
|
/*
|
|
|
|
* This is only somewhat accurate since, given many processes of the same
|
|
|
|
* priority, they will switch when their slices run out, which will be
|
|
|
|
* at most SCHED_SLICE_MAX.
|
|
|
|
*/
|
2003-01-26 05:23:15 +00:00
|
|
|
int
|
|
|
|
sched_rr_interval(void)
|
|
|
|
{
|
|
|
|
return (SCHED_SLICE_MAX);
|
|
|
|
}
|
|
|
|
|
- Add static to local functions and data where it was missing.
- Add an IPI based mechanism for migrating kses. This mechanism is
broken down into several components. This is intended to reduce cache
thrashing by eliminating most cases where one cpu touches another's
run queues.
- kseq_notify() appends a kse to a lockless singly linked list and
conditionally sends an IPI to the target processor. Right now this is
protected by sched_lock but at some point I'd like to get rid of the
global lock. This is why I used something more complicated than a
standard queue.
- kseq_assign() processes our list of kses that have been assigned to us
by other processors. This simply calls sched_add() for each item on the
list after clearing the new KEF_ASSIGNED flag. This flag is used to
  indicate that we have been appended to the assigned queue but not
added to the run queue yet.
- In sched_add(), instead of adding a KSE to another processor's queue we
use kse_notify() so that we don't touch their queue. Also in sched_add(),
if KEF_ASSIGNED is already set return immediately. This can happen if
a thread is removed and readded so that the priority is recorded properly.
- In sched_rem() return immediately if KEF_ASSIGNED is set. All callers
  immediately readd simply to adjust priorities etc.
- In sched_choose(), if we're running an IDLE task or the per cpu idle thread
set our cpumask bit in 'kseq_idle' so that other processors may know that
we are idle. Before this, make a single pass through the run queues of
other processors so that we may find work more immediately if it is
available.
- In sched_runnable(), don't scan each processor's run queue, they will IPI
us if they have work for us to do.
- In sched_add(), if we're adding a thread that can be migrated and we have
plenty of work to do, try to migrate the thread to an idle kseq.
- Simplify the logic in sched_prio() and take the KEF_ASSIGNED flag into
consideration.
- No longer use kseq_choose() to steal threads, it can lose its last
argument.
- Create a new function runq_steal() which operates like runq_choose() but
skips threads based on some criteria. Currently it will not steal
PRI_ITHD threads. In the future this will be used for CPU binding.
- Create a kseq_steal() that checks each run queue with runq_steal(), use
kseq_steal() in the places where we used kseq_choose() to steal with
before.
2003-10-31 11:16:04 +00:00
|
|
|
static void
|
2006-12-06 06:34:57 +00:00
|
|
|
sched_pctcpu_update(struct td_sched *ts)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Adjust counters and watermark for pctcpu calc.
|
2003-06-15 02:18:29 +00:00
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_ltick > ticks - SCHED_CPU_TICKS) {
|
2003-09-20 02:05:58 +00:00
|
|
|
/*
|
|
|
|
* Shift the tick count out so that the divide doesn't
|
|
|
|
* round away our results.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_ticks <<= 10;
|
|
|
|
ts->ts_ticks = (ts->ts_ticks / (ticks - ts->ts_ftick)) *
|
2003-09-20 02:05:58 +00:00
|
|
|
SCHED_CPU_TICKS;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_ticks >>= 10;
|
2003-09-20 02:05:58 +00:00
|
|
|
} else
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_ticks = 0;
|
|
|
|
ts->ts_ltick = ticks;
|
|
|
|
ts->ts_ftick = ts->ts_ltick - SCHED_CPU_TICKS;
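	/*
	 * ts_ftick is pinned to trail ts_ltick by exactly SCHED_CPU_TICKS,
	 * which keeps the %cpu estimate confined to a sliding window of
	 * that many ticks.
	 */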
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2004-12-30 20:52:44 +00:00
|
|
|
sched_thread_priority(struct thread *td, u_char prio)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2004-12-26 00:15:33 +00:00
|
|
|
CTR6(KTR_SCHED, "sched_prio: %p(%s) prio %d newprio %d by %p(%s)",
|
|
|
|
td, td->td_proc->p_comm, td->td_priority, prio, curthread,
|
|
|
|
curthread->td_proc->p_comm);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
2003-01-26 05:23:15 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2004-12-30 20:52:44 +00:00
|
|
|
if (td->td_priority == prio)
|
|
|
|
return;
|
2003-01-26 05:23:15 +00:00
|
|
|
if (TD_ON_RUNQ(td)) {
|
2003-10-27 06:47:05 +00:00
|
|
|
/*
|
|
|
|
* If the priority has been elevated due to priority
|
|
|
|
* propagation, we may have to move ourselves to a new
|
|
|
|
* queue. We still call adjustrunqueue below in case it
|
|
|
|
* needs to fix things up.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (prio < td->td_priority && ts->ts_runq != NULL &&
|
|
|
|
(ts->ts_flags & TSF_ASSIGNED) == 0 &&
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq != TDQ_CPU(ts->ts_cpu)->tdq_curr) {
|
2006-12-06 06:34:57 +00:00
|
|
|
runq_remove(ts->ts_runq, ts);
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = TDQ_CPU(ts->ts_cpu)->tdq_curr;
|
2006-12-06 06:34:57 +00:00
|
|
|
runq_add(ts->ts_runq, ts, 0);
|
2003-10-27 06:47:05 +00:00
|
|
|
}
|
2004-08-12 07:56:33 +00:00
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Hold this td_sched on this cpu so that sched_prio() doesn't
|
2004-08-12 07:56:33 +00:00
|
|
|
* cause excessive migration. We only want migration to
|
|
|
|
* happen as the result of a wakeup.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags |= TSF_HOLD;
|
2003-08-26 11:33:15 +00:00
|
|
|
adjustrunqueue(td, prio);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags &= ~TSF_HOLD;
|
2003-10-27 06:47:05 +00:00
|
|
|
} else
|
2003-08-26 11:33:15 +00:00
|
|
|
td->td_priority = prio;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
2004-12-30 20:52:44 +00:00
|
|
|
/*
|
|
|
|
* Update a thread's priority when it is lent another thread's
|
|
|
|
* priority.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
sched_lend_prio(struct thread *td, u_char prio)
|
|
|
|
{
|
|
|
|
|
|
|
|
td->td_flags |= TDF_BORROWING;
|
|
|
|
sched_thread_priority(td, prio);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Restore a thread's priority when priority propagation is
|
|
|
|
* over. The prio argument is the minimum priority the thread
|
|
|
|
* needs to have to satisfy other possible priority lending
|
|
|
|
* requests. If the thread's regular priority is less
|
|
|
|
* important than prio, the thread will keep a priority boost
|
|
|
|
* of prio.
|
|
|
|
*/
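/*
 * Illustrative example (priority values are assumptions; lower is more
 * important): a timeshare thread with base priority 180 that was lent
 * priority 140 is passed prio == 200 once the last demanding waiter is
 * gone; since 200 >= 180 the borrow flag is cleared and the thread drops
 * back to 180.  Had a remaining waiter still required prio == 150, the
 * thread would keep borrowing at 150 via sched_lend_prio().
 */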
|
|
|
|
void
|
|
|
|
sched_unlend_prio(struct thread *td, u_char prio)
|
|
|
|
{
|
|
|
|
u_char base_pri;
|
|
|
|
|
|
|
|
if (td->td_base_pri >= PRI_MIN_TIMESHARE &&
|
|
|
|
td->td_base_pri <= PRI_MAX_TIMESHARE)
|
2006-10-26 21:42:22 +00:00
|
|
|
base_pri = td->td_user_pri;
|
2004-12-30 20:52:44 +00:00
|
|
|
else
|
|
|
|
base_pri = td->td_base_pri;
|
|
|
|
if (prio >= base_pri) {
|
2004-12-30 22:17:00 +00:00
|
|
|
td->td_flags &= ~TDF_BORROWING;
|
2004-12-30 20:52:44 +00:00
|
|
|
sched_thread_priority(td, base_pri);
|
|
|
|
} else
|
|
|
|
sched_lend_prio(td, prio);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_prio(struct thread *td, u_char prio)
|
|
|
|
{
|
|
|
|
u_char oldprio;
|
|
|
|
|
|
|
|
/* First, update the base priority. */
|
|
|
|
td->td_base_pri = prio;
|
|
|
|
|
|
|
|
/*
|
2004-12-30 22:17:00 +00:00
|
|
|
* If the thread is borrowing another thread's priority, don't
|
2004-12-30 20:52:44 +00:00
|
|
|
* ever lower the priority.
|
|
|
|
*/
|
|
|
|
if (td->td_flags & TDF_BORROWING && td->td_priority < prio)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* Change the real priority. */
|
|
|
|
oldprio = td->td_priority;
|
|
|
|
sched_thread_priority(td, prio);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the thread is on a turnstile, then let the turnstile update
|
|
|
|
* its state.
|
|
|
|
*/
|
|
|
|
if (TD_ON_LOCK(td) && oldprio != prio)
|
|
|
|
turnstile_adjust(td, oldprio);
|
|
|
|
}
|
2004-12-30 22:17:00 +00:00
|
|
|
|
2006-08-25 06:12:53 +00:00
|
|
|
void
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_user_prio(struct thread *td, u_char prio)
|
2006-08-25 06:12:53 +00:00
|
|
|
{
|
|
|
|
u_char oldprio;
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_base_user_pri = prio;
|
2006-12-06 06:55:59 +00:00
|
|
|
if (td->td_flags & TDF_UBORROWING && td->td_user_pri <= prio)
|
|
|
|
return;
|
2006-10-26 21:42:22 +00:00
|
|
|
oldprio = td->td_user_pri;
|
|
|
|
td->td_user_pri = prio;
|
2006-08-25 06:12:53 +00:00
|
|
|
|
|
|
|
if (TD_ON_UPILOCK(td) && oldprio != prio)
|
|
|
|
umtx_pi_adjust(td, oldprio);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_lend_user_prio(struct thread *td, u_char prio)
|
|
|
|
{
|
|
|
|
u_char oldprio;
|
|
|
|
|
|
|
|
td->td_flags |= TDF_UBORROWING;
|
|
|
|
|
2006-11-08 09:09:07 +00:00
|
|
|
oldprio = td->td_user_pri;
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_user_pri = prio;
|
2006-08-25 06:12:53 +00:00
|
|
|
|
|
|
|
if (TD_ON_UPILOCK(td) && oldprio != prio)
|
|
|
|
umtx_pi_adjust(td, oldprio);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_unlend_user_prio(struct thread *td, u_char prio)
|
|
|
|
{
|
|
|
|
u_char base_pri;
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
base_pri = td->td_base_user_pri;
|
2006-08-25 06:12:53 +00:00
|
|
|
if (prio >= base_pri) {
|
|
|
|
td->td_flags &= ~TDF_UBORROWING;
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_user_prio(td, base_pri);
|
2006-08-25 06:12:53 +00:00
|
|
|
} else
|
|
|
|
sched_lend_user_prio(td, prio);
|
|
|
|
}
|
|
|
|
|
2003-01-26 05:23:15 +00:00
|
|
|
void
|
2004-09-10 21:04:38 +00:00
|
|
|
sched_switch(struct thread *td, struct thread *newtd, int flags)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-29 12:55:32 +00:00
|
|
|
struct tdq *tdq;
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
2006-12-29 12:55:32 +00:00
|
|
|
tdq = TDQ_SELF();
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2004-08-12 07:56:33 +00:00
|
|
|
td->td_lastcpu = td->td_oncpu;
|
2003-04-10 17:35:44 +00:00
|
|
|
td->td_oncpu = NOCPU;
|
2004-07-16 21:04:55 +00:00
|
|
|
td->td_flags &= ~TDF_NEEDRESCHED;
|
2005-04-08 03:37:53 +00:00
|
|
|
td->td_owepreempt = 0;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-12-11 04:00:49 +00:00
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* If the thread has been assigned, it may be in the process of switching
|
2003-12-11 04:00:49 +00:00
|
|
|
* to the new cpu. This is the case in sched_bind().
|
|
|
|
*/
|
2004-12-26 22:56:08 +00:00
|
|
|
if (td == PCPU_GET(idlethread)) {
|
|
|
|
TD_SET_CAN_RUN(td);
|
2006-12-06 06:34:57 +00:00
|
|
|
} else if ((ts->ts_flags & TSF_ASSIGNED) == 0) {
|
2004-12-26 22:56:08 +00:00
|
|
|
/* We are ending our run so make our slot available again */
|
2006-12-29 12:55:32 +00:00
|
|
|
tdq_load_rem(tdq, ts);
|
2004-12-26 22:56:08 +00:00
|
|
|
if (TD_IS_RUNNING(td)) {
|
|
|
|
/*
|
|
|
|
* Don't allow the thread to migrate
|
|
|
|
* from a preemption.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags |= TSF_HOLD;
|
2004-12-26 22:56:08 +00:00
|
|
|
setrunqueue(td, (flags & SW_PREEMPT) ?
|
|
|
|
SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
|
|
|
|
SRQ_OURSELF|SRQ_YIELDING);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags &= ~TSF_HOLD;
|
2006-10-26 21:42:22 +00:00
|
|
|
}
|
2003-10-16 20:32:57 +00:00
|
|
|
}
|
2004-10-05 21:10:44 +00:00
|
|
|
if (newtd != NULL) {
|
2004-10-05 22:03:10 +00:00
|
|
|
/*
|
2005-06-07 02:59:16 +00:00
|
|
|
* If we bring in a thread, account for it as if it had been
|
|
|
|
* added to the run queue and then chosen.
|
2004-10-05 22:03:10 +00:00
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
newtd->td_sched->ts_flags |= TSF_DIDRUN;
|
2006-12-29 12:55:32 +00:00
|
|
|
newtd->td_sched->ts_runq = tdq->tdq_curr;
|
2004-10-05 22:14:02 +00:00
|
|
|
TD_SET_RUNNING(newtd);
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_load_add(TDQ_SELF(), newtd->td_sched);
|
2004-10-05 21:10:44 +00:00
|
|
|
} else
|
2004-08-10 07:52:21 +00:00
|
|
|
newtd = choosethread();
|
2005-04-19 04:01:25 +00:00
|
|
|
if (td != newtd) {
|
|
|
|
#ifdef HWPMC_HOOKS
|
|
|
|
if (PMC_PROC_IS_USING_PMCS(td->td_proc))
|
|
|
|
PMC_SWITCH_CONTEXT(td, PMC_FN_CSW_OUT);
|
|
|
|
#endif
|
2006-10-26 21:42:22 +00:00
|
|
|
|
2003-10-16 08:53:46 +00:00
|
|
|
cpu_switch(td, newtd);
|
2005-04-19 04:01:25 +00:00
|
|
|
#ifdef HWPMC_HOOKS
|
|
|
|
if (PMC_PROC_IS_USING_PMCS(td->td_proc))
|
|
|
|
PMC_SWITCH_CONTEXT(td, PMC_FN_CSW_IN);
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2003-10-16 08:53:46 +00:00
|
|
|
sched_lock.mtx_lock = (uintptr_t)td;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-04-10 17:35:44 +00:00
|
|
|
td->td_oncpu = PCPU_GET(cpuid);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2004-06-16 00:26:31 +00:00
|
|
|
sched_nice(struct proc *p, int nice)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-01-26 05:23:15 +00:00
|
|
|
struct thread *td;
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2004-06-16 00:26:31 +00:00
|
|
|
PROC_LOCK_ASSERT(p, MA_OWNED);
|
2003-04-22 20:50:38 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2003-04-11 03:47:14 +00:00
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* We need to adjust the nice counts for running threads.
|
2003-04-11 03:47:14 +00:00
|
|
|
*/
|
2006-10-26 21:42:22 +00:00
|
|
|
FOREACH_THREAD_IN_PROC(p, td) {
|
|
|
|
if (td->td_pri_class == PRI_TIMESHARE) {
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
|
|
|
if (ts->ts_runq == NULL)
|
2006-10-26 21:42:22 +00:00
|
|
|
continue;
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq = TDQ_CPU(ts->ts_cpu);
|
|
|
|
tdq_nice_rem(tdq, p->p_nice);
|
|
|
|
tdq_nice_add(tdq, nice);
|
2003-04-11 03:47:14 +00:00
|
|
|
}
|
2004-06-16 00:26:31 +00:00
|
|
|
}
|
|
|
|
p->p_nice = nice;
|
2006-10-26 21:42:22 +00:00
|
|
|
FOREACH_THREAD_IN_PROC(p, td) {
|
|
|
|
sched_priority(td);
|
|
|
|
td->td_flags |= TDF_NEEDRESCHED;
|
2004-06-16 00:26:31 +00:00
|
|
|
}
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
Switch the sleep/wakeup and condition variable implementations to use the
sleep queue interface:
- Sleep queues attempt to merge some of the benefits of both sleep queues
  and condition variables.  Having sleep queues in a hash table avoids
having to allocate a queue head for each wait channel. Thus, struct cv
has shrunk down to just a single char * pointer now. However, the
hash table does not hold threads directly, but queue heads. This means
that once you have located a queue in the hash bucket, you no longer have
to walk the rest of the hash chain looking for threads. Instead, you have
a list of all the threads sleeping on that wait channel.
- Outside of the sleepq code and the sleep/cv code the kernel no longer
differentiates between cv's and sleep/wakeup. For example, calls to
abortsleep() and cv_abort() are replaced with a call to sleepq_abort().
Thus, the TDF_CVWAITQ flag is removed. Also, calls to unsleep() and
cv_waitq_remove() have been replaced with calls to sleepq_remove().
- The sched_sleep() function no longer accepts a priority argument as
  sleeps no longer inherently bump the priority.  Instead, this is solely
  a property of msleep() which explicitly calls sched_prio() before
blocking.
- The TDF_ONSLEEPQ flag has been dropped as it was never used. The
associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been
dropped and replaced with a single explicit clearing of td_wchan.
TD_SET_ONSLEEPQ() would really have only made sense if it had taken
the wait channel and message as arguments anyway. Now that that only
happens in one place, a macro would be overkill.
2004-02-27 18:52:44 +00:00
|
|
|
sched_sleep(struct thread *td)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
td->td_sched->ts_slptime = ticks;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_wakeup(struct thread *td)
|
|
|
|
{
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
|
|
|
|
|
|
|
/*
|
2006-12-06 06:34:57 +00:00
|
|
|
* Let the thread know how long we slept for. This is because
|
|
|
|
* interactivity behavior is modeled per thread.
|
2003-01-26 05:23:15 +00:00
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (td->td_sched->ts_slptime) {
|
2003-04-11 03:47:14 +00:00
|
|
|
int hzticks;
|
2003-03-03 04:11:40 +00:00
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
hzticks = (ticks - td->td_sched->ts_slptime) << 10;
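		/*
		 * The sleep interval is shifted into the same 10-bit
		 * fixed-point scale used by tickincr, so skg_slptime and
		 * skg_runtime accumulate in comparable units.
		 */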
|
2003-11-02 03:36:33 +00:00
|
|
|
if (hzticks >= SCHED_SLP_RUN_MAX) {
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_slptime = SCHED_SLP_RUN_MAX;
|
|
|
|
td->td_sched->skg_runtime = 1;
|
2003-11-02 03:36:33 +00:00
|
|
|
} else {
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_slptime += hzticks;
|
|
|
|
sched_interact_update(td);
|
2003-11-02 03:36:33 +00:00
|
|
|
}
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_priority(td);
|
2006-12-06 06:34:57 +00:00
|
|
|
sched_slice(td->td_sched);
|
|
|
|
td->td_sched->ts_slptime = 0;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
2004-09-01 02:11:28 +00:00
|
|
|
setrunqueue(td, SRQ_BORING);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Penalize the parent for creating a new child and initialize the child's
|
|
|
|
* priority.
|
|
|
|
*/
|
|
|
|
void
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_fork(struct thread *td, struct thread *child)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
sched_fork_thread(td, child);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_fork_thread(struct thread *td, struct thread *child)
|
|
|
|
{
|
|
|
|
struct td_sched *ts;
|
|
|
|
struct td_sched *ts2;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
child->td_sched->skg_slptime = td->td_sched->skg_slptime;
|
|
|
|
child->td_sched->skg_runtime = td->td_sched->skg_runtime;
|
|
|
|
child->td_user_pri = td->td_user_pri;
|
2006-11-08 09:09:07 +00:00
|
|
|
child->td_base_user_pri = td->td_base_user_pri;
|
2003-11-02 03:36:33 +00:00
|
|
|
sched_interact_fork(child);
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_runtime += tickincr;
|
|
|
|
sched_interact_update(td);
|
2004-09-05 02:09:54 +00:00
|
|
|
|
|
|
|
sched_newthread(child);
|
2006-10-26 21:42:22 +00:00
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
|
|
|
ts2 = child->td_sched;
|
|
|
|
ts2->ts_slice = 1; /* Attempt to quickly learn interactivity. */
|
|
|
|
ts2->ts_cpu = ts->ts_cpu;
|
|
|
|
ts2->ts_runq = NULL;
|
2004-09-05 02:09:54 +00:00
|
|
|
|
|
|
|
/* Grab our parent's cpu estimation information. */
|
2006-12-06 06:34:57 +00:00
|
|
|
ts2->ts_ticks = ts->ts_ticks;
|
|
|
|
ts2->ts_ltick = ts->ts_ltick;
|
|
|
|
ts2->ts_ftick = ts->ts_ftick;
|
2003-04-11 03:47:14 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_class(struct thread *td, int class)
|
2003-04-11 03:47:14 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
|
|
|
struct td_sched *ts;
|
2003-11-02 10:56:48 +00:00
|
|
|
int nclass;
|
|
|
|
int oclass;
|
2003-04-11 03:47:14 +00:00
|
|
|
|
2003-04-23 18:51:05 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-10-26 21:42:22 +00:00
|
|
|
if (td->td_pri_class == class)
|
2003-04-11 03:47:14 +00:00
|
|
|
return;
|
|
|
|
|
2003-11-02 10:56:48 +00:00
|
|
|
nclass = PRI_BASE(class);
|
2006-10-26 21:42:22 +00:00
|
|
|
oclass = PRI_BASE(td->td_pri_class);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
|
|
|
if (!((ts->ts_state != TSS_ONRUNQ &&
|
|
|
|
ts->ts_state != TSS_THREAD) || ts->ts_runq == NULL)) {
|
|
|
|
tdq = TDQ_CPU(ts->ts_cpu);
|
2003-04-11 03:47:14 +00:00
|
|
|
|
2003-11-02 10:56:48 +00:00
|
|
|
#ifdef SMP
|
2006-12-06 06:34:57 +00:00
|
|
|
/*
|
|
|
|
* On SMP if we're on the RUNQ we must adjust the transferable
|
|
|
|
* count because we could be changing to or from an interrupt
|
|
|
|
* class.
|
|
|
|
*/
|
|
|
|
if (ts->ts_state == TSS_ONRUNQ) {
|
|
|
|
if (THREAD_CAN_MIGRATE(ts)) {
|
2006-12-29 10:37:07 +00:00
|
|
|
tdq->tdq_transferable--;
|
|
|
|
tdq->tdq_group->tdg_transferable--;
|
2006-12-06 06:34:57 +00:00
|
|
|
}
|
|
|
|
if (THREAD_CAN_MIGRATE(ts)) {
|
2006-12-29 10:37:07 +00:00
|
|
|
tdq->tdq_transferable++;
|
|
|
|
tdq->tdq_group->tdg_transferable++;
|
2006-12-06 06:34:57 +00:00
|
|
|
}
|
2003-11-15 07:32:07 +00:00
|
|
|
}
|
2006-10-26 21:42:22 +00:00
|
|
|
#endif
|
2006-12-06 06:34:57 +00:00
|
|
|
if (oclass == PRI_TIMESHARE) {
|
2006-12-29 10:37:07 +00:00
|
|
|
tdq->tdq_load_timeshare--;
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_nice_rem(tdq, td->td_proc->p_nice);
|
|
|
|
}
|
|
|
|
if (nclass == PRI_TIMESHARE) {
|
2006-12-29 10:37:07 +00:00
|
|
|
tdq->tdq_load_timeshare++;
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_nice_add(tdq, td->td_proc->p_nice);
|
|
|
|
}
|
2006-10-26 21:42:22 +00:00
|
|
|
}
|
2003-01-28 09:28:20 +00:00
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_pri_class = class;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Return some of the child's priority and interactivity to the parent.
|
|
|
|
*/
|
|
|
|
void
|
2006-12-06 06:55:59 +00:00
|
|
|
sched_exit(struct proc *p, struct thread *child)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:55:59 +00:00
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
CTR3(KTR_SCHED, "sched_exit: %p(%s) prio %d",
|
2006-12-06 06:55:59 +00:00
|
|
|
child, child->td_proc->p_comm, child->td_priority);
|
2006-10-26 21:42:22 +00:00
|
|
|
|
2006-12-06 06:55:59 +00:00
|
|
|
sched_exit_thread(FIRST_THREAD_IN_PROC(p), child);
|
2006-12-06 06:34:57 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2006-12-06 06:55:59 +00:00
|
|
|
sched_exit_thread(struct thread *td, struct thread *child)
|
2006-12-06 06:34:57 +00:00
|
|
|
{
|
2006-12-06 06:55:59 +00:00
|
|
|
CTR3(KTR_SCHED, "sched_exit_thread: %p(%s) prio %d",
|
|
|
|
child, child->td_proc->p_comm, child->td_priority);
|
|
|
|
|
|
|
|
td->td_sched->skg_runtime += child->td_sched->skg_runtime;
|
|
|
|
sched_interact_update(td);
|
|
|
|
tdq_load_rem(TDQ_CPU(child->td_sched->ts_cpu), child->td_sched);
|
2006-12-06 06:34:57 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_userret(struct thread *td)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* XXX we cheat slightly on the locking here to avoid locking in
|
|
|
|
* the usual case. Setting td_priority here is essentially an
|
|
|
|
* incomplete workaround for not setting it properly elsewhere.
|
|
|
|
* Now that some interrupt handlers are threads, not setting it
|
|
|
|
* properly elsewhere can clobber it in the window between setting
|
|
|
|
* it here and returning to user mode, so don't waste time setting
|
|
|
|
* it perfectly here.
|
|
|
|
*/
|
|
|
|
KASSERT((td->td_flags & TDF_BORROWING) == 0,
|
|
|
|
("thread with borrowed priority returning to userland"));
|
|
|
|
if (td->td_priority != td->td_user_pri) {
|
|
|
|
mtx_lock_spin(&sched_lock);
|
|
|
|
td->td_priority = td->td_user_pri;
|
|
|
|
td->td_base_pri = td->td_user_pri;
|
|
|
|
mtx_unlock_spin(&sched_lock);
|
|
|
|
}
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2003-10-16 08:39:15 +00:00
|
|
|
sched_clock(struct thread *td)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
|
|
|
struct td_sched *ts;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2004-06-02 05:46:48 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq = TDQ_SELF();
|
2004-06-02 05:46:48 +00:00
|
|
|
#ifdef SMP
|
2004-12-26 22:56:08 +00:00
|
|
|
if (ticks >= bal_tick)
|
2004-06-02 05:46:48 +00:00
|
|
|
sched_balance();
|
2004-12-26 22:56:08 +00:00
|
|
|
if (ticks >= gbal_tick && balance_groups)
|
2004-06-02 05:46:48 +00:00
|
|
|
sched_balance_groups();
|
2004-08-10 07:52:21 +00:00
|
|
|
/*
|
|
|
|
* We could have been assigned a non real-time thread without an
|
|
|
|
* IPI.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq->tdq_assigned)
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_assign(tdq); /* Potentially sets NEEDRESCHED */
|
2004-06-02 05:46:48 +00:00
|
|
|
#endif
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-01-29 07:00:51 +00:00
|
|
|
/* Adjust ticks for pctcpu */
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_ticks++;
|
|
|
|
ts->ts_ltick = ticks;
|
2003-04-03 00:29:28 +00:00
|
|
|
|
2003-01-28 09:30:17 +00:00
|
|
|
/* Go up to one second beyond our max and then trim back down */
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_ftick + SCHED_CPU_TICKS + hz < ts->ts_ltick)
|
|
|
|
sched_pctcpu_update(ts);
|
2003-01-28 09:30:17 +00:00
|
|
|
|
2003-05-02 06:18:55 +00:00
|
|
|
if (td->td_flags & TDF_IDLETD)
|
2003-01-26 05:23:15 +00:00
|
|
|
return;
|
2003-04-11 03:47:14 +00:00
|
|
|
/*
|
2006-10-26 21:42:22 +00:00
|
|
|
* We only run the slicing code for TIMESHARE threads.
|
2003-04-11 03:47:14 +00:00
|
|
|
*/
|
2006-10-26 21:42:22 +00:00
|
|
|
if (td->td_pri_class != PRI_TIMESHARE)
|
2003-10-27 06:47:05 +00:00
|
|
|
return;
|
2003-01-26 05:23:15 +00:00
|
|
|
/*
|
2006-10-26 21:42:22 +00:00
|
|
|
* We used a tick; charge it to the thread so that we can compute our
|
2003-04-11 03:47:14 +00:00
|
|
|
* interactivity.
|
2003-01-26 05:23:15 +00:00
|
|
|
*/
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_sched->skg_runtime += tickincr;
|
|
|
|
sched_interact_update(td);
|
2003-02-10 14:03:45 +00:00
|
|
|
|
2003-01-26 05:23:15 +00:00
|
|
|
/*
|
|
|
|
* We used up one time slice.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (--ts->ts_slice > 0)
|
2003-04-11 03:47:14 +00:00
|
|
|
return;
|
2003-01-26 05:23:15 +00:00
|
|
|
/*
|
2003-04-11 03:47:14 +00:00
|
|
|
* We're out of time, recompute priorities and requeue.
|
2003-01-26 05:23:15 +00:00
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_load_rem(tdq, ts);
|
2006-10-26 21:42:22 +00:00
|
|
|
sched_priority(td);
|
2006-12-06 06:34:57 +00:00
|
|
|
sched_slice(ts);
|
|
|
|
if (SCHED_CURR(td, ts))
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_curr;
|
2003-04-11 03:47:14 +00:00
|
|
|
else
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_next;
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_load_add(tdq, ts);
|
2003-04-11 03:47:14 +00:00
|
|
|
td->td_flags |= TDF_NEEDRESCHED;
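	/*
	 * The expired thread was pulled out of the load accounting while
	 * its priority and slice were recomputed, then re-added on either
	 * the current or next run queue depending on SCHED_CURR();
	 * NEEDRESCHED forces a reschedule on the way out of the clock
	 * interrupt.
	 */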
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
int
|
|
|
|
sched_runnable(void)
|
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
2003-06-08 00:47:33 +00:00
|
|
|
int load;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-06-08 00:47:33 +00:00
|
|
|
load = 1;
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq = TDQ_SELF();
|
2003-10-31 11:16:04 +00:00
|
|
|
#ifdef SMP
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq->tdq_assigned) {
|
2003-11-05 05:30:12 +00:00
|
|
|
mtx_lock_spin(&sched_lock);
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_assign(tdq);
|
2003-11-05 05:30:12 +00:00
|
|
|
mtx_unlock_spin(&sched_lock);
|
|
|
|
}
|
2003-10-31 11:16:04 +00:00
|
|
|
#endif
|
2003-10-27 06:47:05 +00:00
|
|
|
if ((curthread->td_flags & TDF_IDLETD) != 0) {
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq->tdq_load > 0)
|
2003-10-27 06:47:05 +00:00
|
|
|
goto out;
|
|
|
|
} else
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq->tdq_load - 1 > 0)
|
2003-10-27 06:47:05 +00:00
|
|
|
goto out;
|
2003-06-08 00:47:33 +00:00
|
|
|
load = 0;
|
|
|
|
out:
|
|
|
|
return (load);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *
|
2003-01-28 09:28:20 +00:00
|
|
|
sched_choose(void)
|
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
|
|
|
struct td_sched *ts;
|
2003-01-28 09:28:20 +00:00
|
|
|
|
2003-06-08 00:47:33 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq = TDQ_SELF();
|
2003-04-11 03:47:14 +00:00
|
|
|
#ifdef SMP
|
2003-12-11 03:57:10 +00:00
|
|
|
restart:
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq->tdq_assigned)
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_assign(tdq);
|
2003-04-11 03:47:14 +00:00
|
|
|
#endif
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = tdq_choose(tdq);
|
|
|
|
if (ts) {
|
2003-10-31 11:16:04 +00:00
|
|
|
#ifdef SMP
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_thread->td_pri_class == PRI_IDLE)
|
|
|
|
if (tdq_idled(tdq) == 0)
|
2003-12-11 03:57:10 +00:00
|
|
|
goto restart;
|
2003-10-31 11:16:04 +00:00
|
|
|
#endif
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_runq_rem(tdq, ts);
|
|
|
|
ts->ts_state = TSS_THREAD;
|
|
|
|
ts->ts_flags &= ~TSF_PREEMPTED;
|
|
|
|
return (ts);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
2003-01-28 09:28:20 +00:00
|
|
|
#ifdef SMP
|
2006-12-06 06:34:57 +00:00
|
|
|
if (tdq_idled(tdq) == 0)
|
2003-12-11 03:57:10 +00:00
|
|
|
goto restart;
|
2003-01-28 09:28:20 +00:00
|
|
|
#endif
|
2003-04-11 03:47:14 +00:00
|
|
|
return (NULL);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
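The 2003-10-31 commit message above describes kseq_notify() pushing a kse onto
a lockless singly linked list and kseq_assign() draining that list on the
target CPU.  The following is a minimal, hedged userland sketch of that
pattern using C11 <stdatomic.h>; every name in it (struct notify_entry,
struct cpu_queue, notify_push(), assign_drain()) is an invented stand-in
rather than the kernel's implementation, and the IPI is reduced to a printf.
Detaching the whole list with one atomic exchange is what lets the consumer
run without ever locking against producers, which is why the message calls it
"more complicated than a standard queue".

/*
 * Sketch of a lockless, singly linked notify list: producers push with a
 * CAS loop; the consumer detaches the whole list with an exchange.
 * Hypothetical names only; not the kernel's kseq_notify()/kseq_assign().
 */
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

struct notify_entry {
	struct notify_entry *next;
	int payload;			/* stands in for the kse/td_sched */
};

struct cpu_queue {
	_Atomic(struct notify_entry *) assigned;	/* lockless LIFO head */
};

/* Producer: push one entry; returns 1 if the list was empty (send an IPI). */
static int
notify_push(struct cpu_queue *q, struct notify_entry *e)
{
	struct notify_entry *head;

	do {
		head = atomic_load(&q->assigned);
		e->next = head;
	} while (!atomic_compare_exchange_weak(&q->assigned, &head, e));
	return (head == NULL);		/* first entry: target cpu needs a poke */
}

/* Consumer: detach the entire list at once and process it. */
static void
assign_drain(struct cpu_queue *q)
{
	struct notify_entry *e, *next;

	e = atomic_exchange(&q->assigned, NULL);
	for (; e != NULL; e = next) {
		next = e->next;
		printf("would sched_add() payload %d\n", e->payload);
	}
}

int
main(void)
{
	struct cpu_queue q = { .assigned = NULL };
	struct notify_entry a = { .payload = 1 }, b = { .payload = 2 };

	if (notify_push(&q, &a))
		printf("would send IPI\n");
	notify_push(&q, &b);
	assign_drain(&q);
	return (0);
}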
|
|
|
|
|
|
|
|
void
|
2004-09-01 02:11:28 +00:00
|
|
|
sched_add(struct thread *td, int flags)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
|
|
|
struct td_sched *ts;
|
2004-12-26 22:56:08 +00:00
|
|
|
int preemptive;
|
2004-08-10 07:52:21 +00:00
|
|
|
int canmigrate;
|
2003-10-31 11:16:04 +00:00
|
|
|
int class;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2004-12-26 00:15:33 +00:00
|
|
|
CTR5(KTR_SCHED, "sched_add: %p(%s) prio %d by %p(%s)",
|
|
|
|
td, td->td_proc->p_comm, td->td_priority, curthread,
|
|
|
|
curthread->td_proc->p_comm);
|
2003-10-31 11:16:04 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
2004-12-26 22:56:08 +00:00
|
|
|
canmigrate = 1;
|
|
|
|
preemptive = !(flags & SRQ_YIELDING);
|
2006-10-26 21:42:22 +00:00
|
|
|
class = PRI_BASE(td->td_pri_class);
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq = TDQ_SELF();
|
|
|
|
ts->ts_flags &= ~TSF_INTERNAL;
|
2004-12-26 22:56:08 +00:00
|
|
|
#ifdef SMP
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_flags & TSF_ASSIGNED) {
|
|
|
|
if (ts->ts_flags & TSF_REMOVED)
|
|
|
|
ts->ts_flags &= ~TSF_REMOVED;
|
2003-10-31 11:16:04 +00:00
|
|
|
return;
|
2004-12-13 13:09:33 +00:00
|
|
|
}
|
2006-12-06 06:34:57 +00:00
|
|
|
canmigrate = THREAD_CAN_MIGRATE(ts);
|
2005-08-19 11:51:41 +00:00
|
|
|
/*
|
|
|
|
* Don't migrate running threads here. Force the long term balancer
|
|
|
|
* to do it.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_flags & TSF_HOLD) {
|
|
|
|
ts->ts_flags &= ~TSF_HOLD;
|
2005-08-19 11:51:41 +00:00
|
|
|
canmigrate = 0;
|
|
|
|
}
|
2004-12-26 22:56:08 +00:00
|
|
|
#endif
|
2006-12-06 06:34:57 +00:00
|
|
|
KASSERT(ts->ts_state != TSS_ONRUNQ,
|
|
|
|
("sched_add: thread %p (%s) already in run queue", td,
|
2006-10-26 21:42:22 +00:00
|
|
|
td->td_proc->p_comm));
|
|
|
|
KASSERT(td->td_proc->p_sflag & PS_INMEM,
|
2003-02-03 05:30:07 +00:00
|
|
|
("sched_add: process swapped out"));
|
2006-12-06 06:34:57 +00:00
|
|
|
KASSERT(ts->ts_runq == NULL,
|
|
|
|
("sched_add: thread %p is still assigned to a run queue", td));
|
2005-08-08 14:20:10 +00:00
|
|
|
if (flags & SRQ_PREEMPTED)
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_flags |= TSF_PREEMPTED;
|
2003-10-31 11:16:04 +00:00
|
|
|
switch (class) {
|
2003-04-03 00:29:28 +00:00
|
|
|
case PRI_ITHD:
|
|
|
|
case PRI_REALTIME:
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_curr;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_MAX;
|
2004-12-26 22:56:08 +00:00
|
|
|
if (canmigrate)
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_cpu = PCPU_GET(cpuid);
|
2003-04-03 00:29:28 +00:00
|
|
|
break;
|
|
|
|
case PRI_TIMESHARE:
|
2006-12-06 06:34:57 +00:00
|
|
|
if (SCHED_CURR(td, ts))
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_curr;
|
2003-04-12 07:28:36 +00:00
|
|
|
else
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_next;
|
2003-04-11 03:47:14 +00:00
|
|
|
break;
|
2003-04-03 00:29:28 +00:00
|
|
|
case PRI_IDLE:
|
2003-04-11 03:47:14 +00:00
|
|
|
/*
|
|
|
|
* This is for priority propagation.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_thread->td_priority < PRI_MIN_IDLE)
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = tdq->tdq_curr;
|
2003-04-11 03:47:14 +00:00
|
|
|
else
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq = &tdq->tdq_idle;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_slice = SCHED_SLICE_MIN;
|
2003-04-11 03:47:14 +00:00
|
|
|
break;
|
|
|
|
default:
|
2003-11-02 03:36:33 +00:00
|
|
|
panic("Unknown pri class.");
|
2003-04-03 00:29:28 +00:00
|
|
|
break;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
2003-10-31 11:16:04 +00:00
|
|
|
#ifdef SMP
|
2004-08-10 07:52:21 +00:00
|
|
|
/*
|
|
|
|
* If this thread is pinned or bound, notify the target cpu.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (!canmigrate && ts->ts_cpu != PCPU_GET(cpuid)) {
|
|
|
|
ts->ts_runq = NULL;
|
|
|
|
tdq_notify(ts, ts->ts_cpu);
|
2003-12-11 03:57:10 +00:00
|
|
|
return;
|
|
|
|
}
|
2003-10-31 11:16:04 +00:00
|
|
|
/*
|
2003-12-20 14:03:14 +00:00
|
|
|
* If we had been idle, clear our bit in the group and potentially
|
|
|
|
* the global bitmap. If not, see if we should transfer this thread.
|
2003-10-31 11:16:04 +00:00
|
|
|
*/
|
2003-12-11 03:57:10 +00:00
|
|
|
if ((class == PRI_TIMESHARE || class == PRI_REALTIME) &&
|
2006-12-29 10:37:07 +00:00
|
|
|
(tdq->tdq_group->tdg_idlemask & PCPU_GET(cpumask)) != 0) {
|
2003-10-31 11:16:04 +00:00
|
|
|
/*
|
2003-12-11 03:57:10 +00:00
|
|
|
* Check to see if our group is unidling, and if so, remove it
|
|
|
|
* from the global idle mask.
|
2003-10-31 11:16:04 +00:00
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
if (tdq->tdq_group->tdg_idlemask ==
|
|
|
|
tdq->tdq_group->tdg_cpumask)
|
|
|
|
atomic_clear_int(&tdq_idle, tdq->tdq_group->tdg_mask);
|
2003-12-11 03:57:10 +00:00
|
|
|
/*
|
|
|
|
* Now remove ourselves from the group specific idle mask.
|
|
|
|
*/
|
2006-12-29 10:37:07 +00:00
|
|
|
tdq->tdq_group->tdg_idlemask &= ~PCPU_GET(cpumask);
|
|
|
|
} else if (canmigrate && tdq->tdq_load > 1 && class != PRI_ITHD)
|
2006-12-06 06:34:57 +00:00
|
|
|
if (tdq_transfer(tdq, ts, class))
|
2003-12-20 14:03:14 +00:00
|
|
|
return;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_cpu = PCPU_GET(cpuid);
|
2003-10-31 11:16:04 +00:00
|
|
|
#endif
|
2004-08-12 07:56:33 +00:00
|
|
|
if (td->td_priority < curthread->td_priority &&
|
2006-12-29 10:37:07 +00:00
|
|
|
ts->ts_runq == tdq->tdq_curr)
|
2004-08-12 07:56:33 +00:00
|
|
|
curthread->td_flags |= TDF_NEEDRESCHED;
|
2004-07-08 21:45:04 +00:00
|
|
|
if (preemptive && maybe_preempt(td))
|
2004-07-02 20:21:44 +00:00
|
|
|
return;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_state = TSS_ONRUNQ;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
tdq_runq_add(tdq, ts, flags);
|
|
|
|
tdq_load_add(tdq, ts);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
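The SMP portion of sched_add() above keeps a two-level idle bitmap: each CPU
owns a bit in its group's tdg_idlemask, and a group owns a bit in the global
tdq_idle mask only while every CPU in the group is idle, so adding work must
withdraw both advertisements in the right order.  Below is a hedged,
single-threaded sketch of that bookkeeping; struct grp, global_idle and
cpu_wakes_up() are invented names, and the atomic_clear_int() the kernel uses
on the global mask is replaced by a plain store for brevity.

/*
 * Two-level idle-mask bookkeeping sketch: the group's bit in the global
 * mask is set only while *all* CPUs in the group are idle.
 */
#include <stdio.h>

struct grp {
	unsigned cpumask;	/* all CPUs belonging to this group */
	unsigned idlemask;	/* subset of cpumask that is currently idle */
	unsigned mask;		/* this group's single bit in global_idle */
};

static unsigned global_idle;	/* one bit per fully idle group */

static void
cpu_wakes_up(struct grp *g, unsigned cpubit)
{
	/*
	 * If the whole group was idle it was advertised globally; withdraw
	 * that advertisement first, then clear our own per-group bit.
	 * (The kernel clears the global bit atomically; omitted here.)
	 */
	if (g->idlemask == g->cpumask)
		global_idle &= ~g->mask;
	g->idlemask &= ~cpubit;
}

int
main(void)
{
	struct grp g = { .cpumask = 0x3, .idlemask = 0x3, .mask = 0x1 };

	global_idle = g.mask;		/* both CPUs idle: group advertised */
	cpu_wakes_up(&g, 0x1);		/* CPU 0 gets work */
	printf("global_idle=%#x group idlemask=%#x\n", global_idle, g.idlemask);
	return (0);
}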
|
|
|
|
|
|
|
|
void
|
2003-10-16 08:39:15 +00:00
|
|
|
sched_rem(struct thread *td)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct tdq *tdq;
|
|
|
|
struct td_sched *ts;
|
2003-10-16 08:39:15 +00:00
|
|
|
|
2004-12-26 00:15:33 +00:00
|
|
|
CTR5(KTR_SCHED, "sched_rem: %p(%s) prio %d by %p(%s)",
|
|
|
|
td, td->td_proc->p_comm, td->td_priority, curthread,
|
|
|
|
curthread->td_proc->p_comm);
|
2004-12-26 22:56:08 +00:00
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
|
|
|
ts->ts_flags &= ~TSF_PREEMPTED;
|
|
|
|
if (ts->ts_flags & TSF_ASSIGNED) {
|
|
|
|
ts->ts_flags |= TSF_REMOVED;
|
2003-10-31 11:16:04 +00:00
|
|
|
return;
|
2004-12-13 13:09:33 +00:00
|
|
|
}
|
2006-12-06 06:34:57 +00:00
|
|
|
KASSERT((ts->ts_state == TSS_ONRUNQ),
|
|
|
|
("sched_rem: thread not on run queue"));
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_state = TSS_THREAD;
|
|
|
|
tdq = TDQ_CPU(ts->ts_cpu);
|
|
|
|
tdq_runq_rem(tdq, ts);
|
|
|
|
tdq_load_rem(tdq, ts);
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
fixpt_t
|
2003-10-16 08:39:15 +00:00
|
|
|
sched_pctcpu(struct thread *td)
|
2003-01-26 05:23:15 +00:00
|
|
|
{
|
|
|
|
fixpt_t pctcpu;
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-01-26 05:23:15 +00:00
|
|
|
|
|
|
|
pctcpu = 0;
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
|
|
|
if (ts == NULL)
|
2003-10-20 19:55:21 +00:00
|
|
|
return (0);
|
2003-01-26 05:23:15 +00:00
|
|
|
|
2003-06-08 00:47:33 +00:00
|
|
|
mtx_lock_spin(&sched_lock);
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_ticks) {
|
2003-01-26 05:23:15 +00:00
|
|
|
int rtick;
|
|
|
|
|
2003-06-15 02:18:29 +00:00
|
|
|
/*
|
|
|
|
* Don't update more frequently than twice a second. Allowing
|
|
|
|
* this causes the cpu usage to decay away too quickly due to
|
|
|
|
* rounding errors.
|
|
|
|
*/
|
2006-12-06 06:34:57 +00:00
|
|
|
if (ts->ts_ftick + SCHED_CPU_TICKS < ts->ts_ltick ||
|
|
|
|
ts->ts_ltick < (ticks - (hz / 2)))
|
|
|
|
sched_pctcpu_update(ts);
|
2003-01-26 05:23:15 +00:00
|
|
|
/* How many rticks per second? */
|
2006-12-06 06:34:57 +00:00
|
|
|
rtick = min(ts->ts_ticks / SCHED_CPU_TIME, SCHED_CPU_TICKS);
|
2003-02-02 08:24:32 +00:00
|
|
|
pctcpu = (FSCALE * ((FSCALE * rtick)/realstathz)) >> FSHIFT;
|
2003-01-26 05:23:15 +00:00
|
|
|
}
|
|
|
|
|
2006-12-06 06:34:57 +00:00
|
|
|
td->td_proc->p_swtime = ts->ts_ltick - ts->ts_ftick;
|
2003-04-22 19:48:25 +00:00
|
|
|
mtx_unlock_spin(&sched_lock);
|
2003-01-26 05:23:15 +00:00
|
|
|
|
|
|
|
return (pctcpu);
|
|
|
|
}
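The fixed-point expression in sched_pctcpu() above scales the ratio
rtick/realstathz into FSCALE units, i.e. the fraction of statclock ticks the
thread actually consumed.  A small worked example, assuming FreeBSD's
FSHIFT of 11 from <sys/param.h> and made-up tick counts:

/*
 * (FSCALE * ((FSCALE * rtick) / realstathz)) >> FSHIFT is roughly
 * FSCALE * rtick / realstathz: the runnable fraction with FSHIFT
 * fractional bits.  rtick and realstathz below are illustrative only.
 */
#include <stdio.h>

#define FSHIFT	11
#define FSCALE	(1 << FSHIFT)

int
main(void)
{
	int realstathz = 128;	/* statclock ticks per second (example) */
	int rtick = 32;		/* ticks this thread ran per second (example) */
	unsigned int pctcpu;

	pctcpu = (FSCALE * ((FSCALE * rtick) / realstathz)) >> FSHIFT;
	/* 32/128 of the CPU => 0.25 fixed point => FSCALE/4 == 512. */
	printf("pctcpu = %u (%.2f%% of one CPU)\n",
	    pctcpu, 100.0 * pctcpu / FSCALE);
	return (0);
}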
|
|
|
|
|
2003-11-04 07:45:41 +00:00
|
|
|
void
|
|
|
|
sched_bind(struct thread *td, int cpu)
|
|
|
|
{
|
2006-12-06 06:34:57 +00:00
|
|
|
struct td_sched *ts;
|
2003-11-04 07:45:41 +00:00
|
|
|
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
ts = td->td_sched;
|
|
|
|
ts->ts_flags |= TSF_BOUND;
|
2003-12-11 03:57:10 +00:00
|
|
|
#ifdef SMP
|
|
|
|
if (PCPU_GET(cpuid) == cpu)
|
2003-11-04 07:45:41 +00:00
|
|
|
return;
|
|
|
|
/* sched_rem without the runq_remove */
|
2006-12-06 06:34:57 +00:00
|
|
|
ts->ts_state = TSS_THREAD;
|
|
|
|
tdq_load_rem(TDQ_CPU(ts->ts_cpu), ts);
|
|
|
|
tdq_notify(ts, cpu);
|
2003-11-04 07:45:41 +00:00
|
|
|
/* When we return from mi_switch we'll be on the correct cpu. */
|
2004-07-03 16:57:51 +00:00
|
|
|
mi_switch(SW_VOL, NULL);
|
2003-11-04 07:45:41 +00:00
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
sched_unbind(struct thread *td)
|
|
|
|
{
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
td->td_sched->ts_flags &= ~TSF_BOUND;
|
2003-11-04 07:45:41 +00:00
|
|
|
}
|
|
|
|
|
2005-04-19 04:01:25 +00:00
|
|
|
int
|
|
|
|
sched_is_bound(struct thread *td)
|
|
|
|
{
|
|
|
|
mtx_assert(&sched_lock, MA_OWNED);
|
2006-12-06 06:34:57 +00:00
|
|
|
return (td->td_sched->ts_flags & TSF_BOUND);
|
2005-04-19 04:01:25 +00:00
|
|
|
}
|
|
|
|
|
2006-06-15 06:37:39 +00:00
|
|
|
void
|
|
|
|
sched_relinquish(struct thread *td)
|
|
|
|
{
|
|
|
|
mtx_lock_spin(&sched_lock);
|
2006-10-26 21:42:22 +00:00
|
|
|
if (td->td_pri_class == PRI_TIMESHARE)
|
2006-06-15 06:37:39 +00:00
|
|
|
sched_prio(td, PRI_MAX_TIMESHARE);
|
|
|
|
mi_switch(SW_VOL, NULL);
|
|
|
|
mtx_unlock_spin(&sched_lock);
|
|
|
|
}
|
|
|
|
|
2004-02-01 02:48:36 +00:00
|
|
|
int
|
|
|
|
sched_load(void)
|
|
|
|
{
|
|
|
|
#ifdef SMP
|
|
|
|
int total;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
total = 0;
|
2006-12-29 10:37:07 +00:00
|
|
|
for (i = 0; i <= tdg_maxid; i++)
|
|
|
|
total += TDQ_GROUP(i)->tdg_load;
|
2004-02-01 02:48:36 +00:00
|
|
|
return (total);
|
|
|
|
#else
|
2006-12-29 10:37:07 +00:00
|
|
|
return (TDQ_SELF()->tdq_sysload);
|
2004-02-01 02:48:36 +00:00
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2003-01-26 05:23:15 +00:00
|
|
|
int
|
|
|
|
sched_sizeof_proc(void)
|
|
|
|
{
|
|
|
|
return (sizeof(struct proc));
|
|
|
|
}
|
|
|
|
|
|
|
|
int
|
|
|
|
sched_sizeof_thread(void)
|
|
|
|
{
|
|
|
|
return (sizeof(struct thread) + sizeof(struct td_sched));
|
|
|
|
}
|
Add scheduler CORE, work I did half a year ago; recently I picked it
up again.  The scheduler is forked from ULE, but the algorithm to
detect an interactive process is almost completely different from
ULE's; it comes from the Linux paper "Understanding the Linux 2.6.8.1
CPU Scheduler", although I still use the same word "score" as a
priority boost, as in the ULE scheduler.
Briefly, the scheduler has the following characteristics:
1. A timesharing process's nice value is strictly respected; the
   timeslice and the interactivity-detection algorithm are based on
   the nice value.
2. Per-cpu scheduling queues and load balancing.
3. O(1) scheduling (a generic priority-bitmap sketch follows
   sched_tick() below).
4. Some cpu-affinity code in the wakeup path.
5. Support for POSIX SCHED_FIFO and SCHED_RR.
Unlike the 4BSD and ULE schedulers, which use the fuzzy RQ_PPQ
grouping, this scheduler uses 256 priority queues.  Unlike ULE, which
uses both pull and push, this scheduler uses only the pull method; the
main reason is to let the relatively idle cpu do the work.  However,
the whole scheduler is currently protected by the big sched_lock, so
the benefit is not visible; it can even be worse than nothing, because
all other cpus are locked out while we do the balancing work, a
problem the 4BSD scheduler does not have.
The scheduler does not support hyperthreading very well; in fact, it
does not distinguish between physical and logical CPUs.  This should
be improved in the future.  The scheduler has a priority-inversion
problem on MP machines; it is not good for realtime scheduling and can
cause realtime processes to starve.
As a result, MySQL super-smack seems to run better on my Pentium-D
machine when using libthr, on both UP and SMP kernels.
2006-06-13 13:12:56 +00:00
|
|
|
|
|
|
|
void
|
|
|
|
sched_tick(void)
|
|
|
|
{
|
|
|
|
}
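The "Add scheduler CORE" commit message above advertises O(1) scheduling over
256 strict priority queues.  The usual way to get O(1) selection is a bitmap
with one bit per queue, searched with find-first-set; the sketch below
illustrates only that generic idea, with invented names (struct rq256,
rq256_mark(), rq256_findbest()) that do not come from the CORE scheduler's
sources.

/*
 * 256 priority queues summarized by a bitmap: picking the best nonempty
 * queue scans at most 8 words and one ffs(), independent of thread count.
 */
#include <stdint.h>
#include <stdio.h>
#include <strings.h>		/* ffs() */

#define NQUEUES		256
#define WORDBITS	32

struct rq256 {
	uint32_t bits[NQUEUES / WORDBITS];	/* bit set => queue nonempty */
};

static void
rq256_mark(struct rq256 *rq, int pri)
{
	rq->bits[pri / WORDBITS] |= 1u << (pri % WORDBITS);
}

/* Return the lowest-numbered (best) nonempty priority, or -1 if none. */
static int
rq256_findbest(struct rq256 *rq)
{
	int w;

	for (w = 0; w < NQUEUES / WORDBITS; w++)	/* at most 8 words */
		if (rq->bits[w] != 0)
			return (w * WORDBITS + ffs(rq->bits[w]) - 1);
	return (-1);
}

int
main(void)
{
	struct rq256 rq = { { 0 } };

	rq256_mark(&rq, 200);
	rq256_mark(&rq, 37);
	printf("best runnable priority: %d\n", rq256_findbest(&rq));	/* 37 */
	return (0);
}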
|
2004-09-05 02:09:54 +00:00
|
|
|
#define KERN_SWITCH_INCLUDE 1
|
|
|
|
#include "kern/kern_switch.c"
|