1
0
mirror of https://git.FreeBSD.org/src.git synced 2025-01-20 15:43:16 +00:00
Commit Graph

11754 Commits

Author SHA1 Message Date
David Xu
8b873a2328 Remove unused functions. 2008-04-02 08:33:42 +00:00
David Xu
d6e0eb0a48 Replace function _umtx_op with _umtx_op_err, the later function directly
returns errno, because errno can be mucked by user's signal handler and
most of pthread api heavily depends on errno to be correct, this change
should improve stability of the thread library.
2008-04-02 07:41:25 +00:00
David Xu
8bf1a48cb3 Replace userland rwlock with a pure kernel based rwlock, the new
implementation does not switch pointers when it resumes waiters.

Asked by: jeff
2008-04-02 04:32:31 +00:00
David Xu
ad4a96ba13 Normally, we are often reading local time rather than setting time zone,
replace mutex with rwlock, this should eliminate lock contention in
most cases.
2008-04-01 06:56:11 +00:00
David Xu
18967c1918 Restore normal pthread_cond_signal path to avoid some obscure races. 2008-04-01 06:23:08 +00:00
David Xu
f5bc4f9930 return EAGAIN early rather than running bunch of code later, micro optimize
static branch prediction.
2008-04-01 00:21:49 +00:00
David Schultz
8087c515ab Remove a (bogus) remnant of debugging this on sparc64. 2008-03-31 13:11:45 +00:00
Konstantin Belousov
ba2983e5b3 Add the libc glue and headers definitions for the *at() syscalls.
Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:14:04 +00:00
Tim Kientzle
4b7d286a5b Include an extra byte for the trailing NUL. <sigh>
Pointy hat: Me
2008-03-31 06:24:39 +00:00
David Xu
5ab512bb8e Rewrite rwlock to user atomic operations to change rwlock state, this
eliminates internal mutex lock contention when most rwlock operations
are read.

Orignal patch provided by: jeff
2008-03-31 02:55:49 +00:00
David Schultz
074fb64d9a Add assembly versions of remquol() and remainderl(). 2008-03-30 21:21:53 +00:00
David Schultz
c7392feecc Hook remquol() and remainderl() up to the build. 2008-03-30 20:48:02 +00:00
David Schultz
a2e5f27559 Implement remainderl() as a wrapper around remquol(). The extra work
remquol() performs to compute the quotient is negligible.
2008-03-30 20:47:42 +00:00
David Schultz
cef56f9d6d Implement remquol() based on remquo(). 2008-03-30 20:47:26 +00:00
David Schultz
511dd36b32 Implement csqrtl(). 2008-03-30 20:07:15 +00:00
David Schultz
84c1c0a1ca Hook hypotl() and cabsl() up to the build. 2008-03-30 20:03:46 +00:00
David Schultz
01a13522ad Document hypotl().
Submitted by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2008-03-30 20:03:29 +00:00
David Schultz
a641fc76eb Alias hypotl() and cabsl() for platforms where long double is the same
as double.
2008-03-30 20:03:06 +00:00
David Schultz
2264157a42 Implement cabsl() in terms of hypotl().
Submitted by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2008-03-30 20:02:03 +00:00
David Schultz
d23166b015 Implement hypotl(). This is bde's conversion of fdlibm hypot(), with minor
fixes for ld128 by me.
2008-03-30 20:01:50 +00:00
Bruce Evans
42ee187c3c Use fabs[f]() instead of bit fiddling for setting absolute values.
This makes little difference in float precision, but in double
precision gives a speedup of about 30% on amd64 (A64 CPU) and i386
(A64).  This depends on fabs[f]() being inline and efficient.  The
bit fiddling (or any use of SET_HIGH_WORD(), which libm does too
much because it was best on old 32-bit machines) always causes
packing overheads and sometimes causes stalls in the packing, since
it operates on only part of a variable in the double precision case.
It apparently did cause stalls in a critical path here.
2008-03-30 18:07:12 +00:00
Bruce Evans
c0c7ddd3a8 Use the expression fabs(x+0.0)-fabs(y+0.0) instead of
fabs(x+0.0)+fabs(y+0.0) when mixing NaNs.  This improves
consistency of the result by making it harder for the compiler to reorder
the operands.  (FP addition is not necessarily commutative because the
order of operands makes a difference on some machines iff the operands are
both NaNs.)
2008-03-30 17:28:27 +00:00
Bruce Evans
f94997c8d7 Fix a missing mask in a hi+lo decomposition. Thus bug made the extra
precision in software useless, so hypotf() had some errors in the 1-2
ulp range unless there is extra precision in hardware (as happens on
i386).
2008-03-30 17:17:42 +00:00
Doug Rabson
ecc03b80f1 Don't call xdrrec_skiprecord in the non-blocking case. If
__xdrrec_getrec has returned TRUE, then we have a complete request in
the buffer - calling xdrrec_skiprecord is not necessary. In particular,
if there is another record already buffered on the stream,
xdrrec_skiprecord will discard both this request and the next
one, causing the call to xdr_callmsg to fail and the stream to be
closed.

Sponsored by:	Isilon Systems
2008-03-30 09:36:17 +00:00
Doug Rabson
7ea7cc4bab Don't assume that there is readable data on the stream after the
fragment header.
2008-03-30 09:35:04 +00:00
Ruslan Ermilov
dbdb679c6f Remove options MK_LIBKSE and DEFAULT_THREAD_LIB now that we no longer
build libkse.  This should fix WITHOUT_LIBTHR builds as a side effect.
2008-03-29 17:44:40 +00:00
David Schultz
a1af0d70da Include math.h for the fmaf() prototype. 2008-03-29 16:38:29 +00:00
David Schultz
ee0730e61e Fix some rather obscene code that has ambiguous if...if...else...
constructs in it.
2008-03-29 16:37:59 +00:00
David Schultz
838200ff96 Document modff() and modfl(). Technically, modff() and modfl()
live in libm, while modf() lives in libc due to historical
mistakes. I'm claiming in the manpage that they all live in libm,
since programmers should not rely on the mistake.
2008-03-29 16:19:35 +00:00
Jeff Roberson
d1317e00b8 - Add a man page for cpuset_getaffinity() and cpuset_setaffinity() and
hook it up to the build.

Reviewed by:	brueffer (skeleton and formatting assistance)
2008-03-29 10:26:29 +00:00
Jeff Roberson
329356f9f2 - Add a man page for cpuset(), cpuset_setid(), and cpuset_getid() and hook
it up to the build.

Reviewed by:	brueffer (skeleton and formatting assistance)
2008-03-29 10:06:30 +00:00
Paul Saab
6e7534b8c8 Add support to mincore for detecting whether a page is part of a
"super" page or not.

Reviewed by:	alc, ups
2008-03-28 04:29:27 +00:00
Ruslan Ermilov
cbdcc7cb91 Removed no longer existing CTL_MACHDEP defines.
Inspired by:	phk
2008-03-26 23:02:17 +00:00
Doug Rabson
dfdcada31e Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00
Christian Brueffer
662cac9f23 Fix some "in in" typos in comments.
PR:		121490
Submitted by:	Anatoly Borodin <anatoly.borodin@gmail.com>
Approved by:	rwatson (mentor), jkoshy
MFC after:	3 days
2008-03-26 07:32:08 +00:00
Ruslan Ermilov
5a9926445a Compile libthr with warnings.
(Somehow this file sneaked from initial commit.)
2008-03-25 15:33:00 +00:00
Ruslan Ermilov
e03efb02bc Compile libthr with warnings. 2008-03-25 13:28:12 +00:00
Ruslan Ermilov
7e0e78248e Fixed mis-implementation of pthread_mutex_get{spin,yield}loops_np().
Reviewed by:	davidxu
2008-03-25 09:48:10 +00:00
Jeff Roberson
fbb275f59d - Restore kse.h in this directory so other tools don't find it by mistake.
- Restore the ability to debug kse coredumps in 8.0.

Suggested by:	marcel
2008-03-23 09:38:11 +00:00
David Xu
9939a13667 Add POSIX pthread API pthread_getcpuclockid() to get a thread's cpu
time clock id.
2008-03-22 09:59:20 +00:00
David Xu
20b94d8035 Use linker set to collection all target operations. 2008-03-22 05:40:44 +00:00
Kai Wang
7a36fb79f9 Add MLINK for archive_write_close.
Approved by:	jkoshy(mentor), kientzle
2008-03-21 11:10:20 +00:00
David Xu
04a57d2c83 Resolve __error()'s PLT early so that it needs not to be resolved again,
otherwise rwlock is recursivly called when signal happens and the __error
was never resolved before.
2008-03-21 02:31:55 +00:00
Ruslan Ermilov
a1292a02d3 pthread_mutexattr_destroy() was accidentally broken in last revision,
unbreak it.  We should really start compiling this with warnings.
2008-03-20 11:47:08 +00:00
Dag-Erling Smørgrav
5092cf0569 s/wait/delta/ to avoid namespace collision.
MFC after:	2 weeks
2008-03-20 09:55:27 +00:00
David Xu
8c38215f50 Preserve application code's errno in rtld locking code, it attemps to keep
any case safe.
2008-03-20 09:35:44 +00:00
David Xu
48ebe2ebc4 Make pthread_mutexattr_settype to return error number directly and
conformant to POSIX specification.

Bug reported by: modelnine at modelnine dt org
2008-03-20 08:27:14 +00:00
David Xu
c8a4eae56f don't reduce new thread's refcount if current thread can not set cpuset
for it, since the new thread will reduce it by itself.
2008-03-19 09:33:07 +00:00
David Xu
519e8d87bb - Trim trailing spaces.
- Use a different sigmask variable name to avoid confusing.
2008-03-19 08:13:04 +00:00
David Xu
86a06c6000 if passed thread pointer is equal to current thread, pass -1 to kernel
to speed up searching.
2008-03-19 06:38:21 +00:00
Joseph Koshy
b23372cd8e Ensure that the section header table is written out in an order
consistent with the section indices returned to the application by
elf_ndxscn().

Submitted by:		kaiw
2008-03-19 06:06:34 +00:00
Joseph Koshy
df7d1e2023 Clarify that the ELF library only sets the sh_entsize field of a
section header entry if the application is not taking charge of ELF
object layout.

Update (c) years, and bump the manual page's date.

Submitted by:		kaiw
2008-03-19 05:07:49 +00:00
Maksim Yevmenkin
07f8cd18c6 Add mandatory "security description" SDP parameter to the PANU profile
Pointed-out by:	Iain Hibbert < plunky at rya-online dot net >
MFC after:	3 days
2008-03-19 00:06:30 +00:00
Maksim Yevmenkin
13040bc96b Add PSM and Load Factor SDP parameters to the BNEP based profiles
(NAP, GN and PANU). No reason to not to support them.

Separate SDP parameters data structures for the BNEP based profiles.

Generalize Service Availability SDP parameter creation.

Requested by:	Iain Hibbert < plunky at rya-online dot net >
MFC after:	3 days
2008-03-18 18:21:39 +00:00
David Xu
2ea1f90a18 - Copy signal mask out before THR_UNLOCK(), because THR_UNLOCK() may call
_thr_suspend_check() which messes sigmask saved in thread structure.
- Don't suspend a thread has force_exit set.
- In pthread_exit(), if there is a suspension flag set, wake up waiting-
  thread after setting PS_DEAD, this causes waiting-thread to break loop
  in suspend_common().
2008-03-18 02:06:51 +00:00
Antoine Brodin
59e7781613 Don't allocate the constant array "props" on the stack in wctype.
PR:		74743
Submitted by:	knut st. osmundsen
Approved by:	rwatson (mentor)
MFC after:	1 month
2008-03-17 18:22:23 +00:00
David Schultz
18798c64f0 scandir(3) previously used st_size to obtain an initial estimate
of the array length needed to store all the directory entries.
Although BSD has historically guaranteed that st_size is the size
of the directory file, POSIX does not, and more to the point, some
recent filesystems such as ZFS use st_size to mean something else.

The fix is to not stat the directory at all, set the initial
array size to 32 entries, and realloc it in powers of 2 if that
proves insufficient.

PR:	113668
2008-03-16 19:08:53 +00:00
David Xu
a9a11568ff Actually delete SIGCANCEL mask for suspended thread, so the signal will not
be masked when it is resumed.
2008-03-16 03:22:38 +00:00
Tim Kientzle
409e319377 Update a comment: the format bid only runs once per archive; it no
longer runs once per entry.
2008-03-15 11:09:16 +00:00
Tim Kientzle
845aa4ab0a Free up the entry objects allocated during this test. 2008-03-15 11:06:15 +00:00
Tim Kientzle
adfb462fea Release the buffers used for exercising the compress code. 2008-03-15 11:05:49 +00:00
Tim Kientzle
0b315cd9ae Remove the duplicate "archive_format" and "archive_format_name" fields
from the private archive_write structure and fix up all writers to use
the format fields in the base "archive" structure.  This error made it
impossible to query the format after setting up a writer because the
write format was stored in an inaccessible place.
2008-03-15 11:04:45 +00:00
Tim Kientzle
c43d294189 Correct a sign mismatch that only showed up on 64-bit systems.
Pointy hat: me
2008-03-15 11:02:47 +00:00
Tim Kientzle
3010219939 Refactor the mtree code a bit to make the layering clearer: Each
"file" is described by multiple "lines" each possibly containing
multiple "keywords."  Incorporate some additions from Joerg Sonnenberger
to handle linked files and correctly deal with backing files on disk.
2008-03-15 07:10:24 +00:00
Tim Kientzle
d7740aea75 FreeBSD does have fstat().
Correct the nasty typo this uncovers.
2008-03-15 04:20:50 +00:00
Tim Kientzle
eb971f9524 Testability is more important than standards conformance.
Disable the use of PaxHeader.<pid> for the fake pax extension pathname
until I can make the name here settable.  Otherwise, tests that try
to compare output to static pre-generated reference files break.
2008-03-15 03:49:18 +00:00
Tim Kientzle
24f55a5963 Ignore a few more common files. 2008-03-15 02:31:28 +00:00
Tim Kientzle
80334b7d22 Resolve a minor nit in SUS compliance by including the PID in the
fake directory name used for pax extended headers.
2008-03-15 02:30:42 +00:00
Tim Kientzle
cde1a05218 GC a reference to the defunct TESTFILES variable. 2008-03-15 02:22:08 +00:00
Tim Kientzle
60617bf578 A subtle point: "pax interchange format" mandates that all strings
(including pathname, gname, uname) be stored in UTF-8.  This usually
doesn't cause problems on FreeBSD because the "C" locale on FreeBSD
can convert any byte to Unicode/wchar_t and from there to UTF-8.  In
other locales (including the "C" locale on Linux which is really
ASCII), you can get into trouble with pathnames that cannot be
converted to UTF-8.

Libarchive's pax writer truncated pathnames and other strings at the
first nonconvertible character.  (ouch!)  Other archivers have worked
around this by storing unconvertible pathnames as raw binary, a
practice which has been sanctioned by the Austin group.  However,
libarchive's pax reader would segfault reading headers that weren't
proper UTF-8.  (ouch!)  Since bsdtar defaults to pax format, this
affects bsdtar rather heavily.

To correctly support the new "hdrcharset" header that is going into
SUS and to handle conversion failures in general, libarchive's pax reader
and writer have been overhauled fairly extensively.  They used to do
most of the pax header processing using wchar_t (Unicode); they now do
most of it using char so that common logic applies to either UTF-8 or
"binary" strings.

As a bonus, a number of extraneous conversions to/from wchar_t have
been eliminated, which should speed things up just a tad.

Thanks to: Bjoern Jacke for originally reporting this to me
Thanks to: Joerg Sonnenberger for noting a bad typo in my first draft of this
Thanks to: Gunnar Ritter for getting the standard fixed
MFC after: 5 days
2008-03-15 01:43:59 +00:00
Tim Kientzle
3a6aaff135 Ignore some built files. 2008-03-15 00:52:22 +00:00
Tim Kientzle
408a822432 Don't lie. If a string can't be converted to a wide (Unicode) string,
return a NULL instead of an incomplete string.  Expand the test coverage
to verify the correct behavior here.
2008-03-14 23:19:46 +00:00
Tim Kientzle
6c8f54e991 Don't advertise the default block size as a constant; don't
rely on a deprecated value to set the default.  This is also
related to a longer-term goal of setting the default block
size based on format and possibly other factors, which makes
it a bad idea to tie this to a published constant.
2008-03-14 23:09:02 +00:00
Tim Kientzle
8e4bc81237 New public functions archive_entry_copy_link() and archive_entry_copy_link_w()
override the currently set link value, whether that's a hardlink
or a symlink.  Plus documentation update and tests.
2008-03-14 23:00:53 +00:00
Tim Kientzle
1051e364aa Update some comments, comment out argument names to guard against
namespace problems.
2008-03-14 22:47:38 +00:00
Tim Kientzle
871e5c0326 Since "length" computes the length of a string and is used as an
argument to malloc(3), it should be size_t, not int.
2008-03-14 22:44:07 +00:00
Tim Kientzle
d6f37be734 Let archive_entry_clear() accept a NULL pointer and simply do nothing.
In particular, this allows archive_entry_free() to work correctly
for a NULL pointer, which makes it parallel with free(3).
2008-03-14 22:40:36 +00:00
Tim Kientzle
42d1f7b4ba Rework the versioning implementation and test to match the
new interface.  Mark the functions that are going away in
libarchive 3.0.

In particular, archive_version_string() now computes the
string rather than assuming that it will be created by the
build infrastructure.  Eventually, this will allow some
simplification of the build infrastructure.
2008-03-14 22:31:57 +00:00
Tim Kientzle
0349d719b1 Rework the versioning information, hopefully for the last time.
* There are now only two public version identifiers:  "number" is
   a single integer that combines Major/minor/release in a single
   value of the form Mmmmrrr.  This is easy to compare against for
   checking feature support.  "string" is a displayable text string
   of the form "libarchive M.mm.rr".
 * The number is present both as a macro (version of the installed header)
   and a function (version of the shared library).  The string form
   is available only as a function.
 * Retain the older version definitions for now, but mark them all
   as deprecated, to disappear in libarchive 3.0 (whenever that happens).
 * Rework the various deprecation conditionals to use ARCHIVE_VERSION_NUMBER.

An ancillary goal is to reduce the number of @...@ substitutions that
are required.  Someday, I might even be able to avoid build-time
processing of archive.h entirely.
2008-03-14 22:19:50 +00:00
Tim Kientzle
45943bfd93 Add a useful sprintf()-style wrapper around
archive_string_vsprintf().  (Which is built
on top of libarchive's internal resizable string
support.)
2008-03-14 22:00:09 +00:00
Tim Kientzle
7c5b1173a5 Support for writing 'compress' format, thanks to Joerg Sonnenberger. 2008-03-14 20:35:38 +00:00
Tim Kientzle
20347f62e6 A block in a tar file is 512 bytes. Period.
Remove the entirely pointless symbolic constant
and sizeof(unsigned char).  (The constant
here is doubly wrong, since not only does
it obscure a basic format constant, it was
never intended to be a tar-specific value,
so could conceivably be changed at some point
in the future.)
2008-03-14 20:32:20 +00:00
Joseph Koshy
bcbe65a85f - Document Pentium and Pentium MMX events.
- Update (c) years and the manual page's date.
2008-03-14 06:22:03 +00:00
Ruslan Ermilov
53bbf5aa35 Fix bugs in previous revision (missing comma, misspelled syscall name). 2008-03-13 10:33:24 +00:00
Ruslan Ermilov
517d383637 Remove trailing whitespace. 2008-03-13 10:26:17 +00:00
Ruslan Ermilov
c130b9f1af Add missing section number. 2008-03-13 10:25:30 +00:00
David Xu
9fbfd54e8e In file sem_timewait.3, remove reference to SYSV semphore in SEE ALSO
section, sync it with sem_wait.3.
2008-03-13 01:53:28 +00:00
Kai Wang
a739eb8374 Current 'ar' read support in libarchive can only handle a GNU/SVR4
filename table whose size is less than 65536 bytes.

The original intention was to not consume the filename table, so the
client will have a chance to look at it. To achieve that, the library
call decompressor->read_ahead to read(look ahead) but do not call
decompressor->consume to consume the data, thus a limit was raised
since read_ahead call can only look ahead at most BUFFER_SIZE(65536)
bytes at the moment, and you can not "look any further" before you
consume what you already "saw".

This commit will turn GNU/SVR4 filename table into "archive format
data", i.e., filename table will be consumed by libarchive, so the
65536-bytes limit will be gone, but client can no longer have access
to the content of filename table.

'ar' support test suite is changed accordingly. BSD ar(1) is not
affected by this change since it doesn't look at the filename table.

Reported by:	erwin
Discussed with:	jkoshy, kientzle
Reviewed by:	jkoshy, kientzle
Approved by:	jkoshy(mentor), kientzle
2008-03-12 21:10:26 +00:00
Joseph Koshy
484202faab Bring the behaviour of pmc_capabilities() and pmc_width() in line with
documentation: set 'errno' and return -1 in case of an error.

Update (c) years.
2008-03-12 15:51:32 +00:00
Joseph Koshy
ef4ba9be47 Describe return values from pmc_ncpu() and pmc_npmc() better. 2008-03-12 15:48:59 +00:00
Paolo Pisati
ab0fcfd00a -Don't pass down the entire pkt to ProtoAliasIn, ProtoAliasOut, FragmentIn
and FragmentOut.
-Axe the old PacketAlias API: it has been deprecated since 5.x.
2008-03-12 11:58:29 +00:00
Jeff Roberson
7d4cbc3607 - Remove kse syscall symbols and man pages. 2008-03-12 10:12:22 +00:00
Jeff Roberson
1e71e49d12 - Don't inspect the P_SA flag. It's being removed. 2008-03-12 10:00:33 +00:00
Jeff Roberson
34147e4308 - Remove libkse and related support code in libpthread from the build.
Don't remove the files yet.  Kernel support will be removed shortly.
2008-03-12 09:49:39 +00:00
Tim Kientzle
df4691b984 Portability: Eliminate the need for uudecode by incorporating
uudecode into the main test driver and invoking it just-in-time
within the various tests.

Also, incorporate a number of improvements to the main test support
code that have proven useful on other projects where I've used this
framework.
2008-03-12 05:12:23 +00:00
Tim Kientzle
0b4793efb7 Remove some unused fields from the private archive_read structure
(left over from when the unified read/write structure was copied
to form separate read and write structures) and eliminate the
pointless initialization of a couple of the unused fields.
2008-03-12 04:58:32 +00:00
Tim Kientzle
c2247d3995 Tighten up the semantics of acl_next() and xattr_next() when you
hit the end of the ACL or xattr list.

Thanks to: Jeff Johnson for pointing out the obvious typo
2008-03-12 04:47:37 +00:00
Tim Kientzle
826055b6a8 Typo, thanks to: Jeff Johnson.
MFC after: 3 days
2008-03-12 04:26:44 +00:00
David Xu
e54cc1f0d5 Add missing comma. 2008-03-12 02:37:31 +00:00
David Xu
1dd273df59 Add manual for function sem_timedwait().
Reviewed by: ru, deischen
2008-03-12 02:33:17 +00:00
David Xu
150b71918c If a thread is cancelled, it may have already consumed a umtx_wake,
check waiter and semphore counter to see if we may wake up next thread.
2008-03-11 03:26:47 +00:00
Maksim Yevmenkin
e096d1e4ba Add structures to hold SDP parameters for the NAP, GN and PANU profiles.
It should be mentioned that a somewhat similar patch was submitted by
Rako < rako29 at gmail dot com >

MFC after:	1 week
2008-03-11 00:08:40 +00:00
Joseph Koshy
c7f03ab040 Use .Fo/.Fc and .Xo/.Xc to bring the line widths below 79 columns.
Correct a typo [a misplaced comma].

Reviewed by:		ru
2008-03-10 14:45:29 +00:00
Joseph Koshy
80c4d6eba3 Use .Fo/.Fc and .Xo/.Xc to bring the line widths below 79 columns.
Reviewed by:		ru
2008-03-10 14:44:41 +00:00
Robert Watson
4813b6af4b Add reference to kldunloadf system call, which was previously not
mentioned in the kldunload(2) man page.

MFC after:	3 days
Spotted by:	rink
2008-03-10 09:54:13 +00:00
Antoine Brodin
e3ad7f6626 Introduce a new F_DUP2FD command to fcntl(2), for compatibility with
Solaris and AIX.
fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionnaly equivalent.
Document it.
Add some regression tests (identical to the dup2(2) regression tests).

PR:		120233
Submitted by:	Jukka Ukkonen
Approved by:	rwaston (mentor)
MFC after:	1 month
2008-03-08 22:02:21 +00:00
Antoine Brodin
6044f112a6 Merge changes from NetBSD on humanize_number.c, 1.8 -> 1.13
Significant changes:
- rev. 1.11: Use PRId64 instead of a cast to long long and %lld to print
an int64_t.
- rev. 1.12: Fix a bug that humanize_number() produces "1000" where it
should be "1.0G" or "1.0M".  The bug reported by Greg Troxel.

PR:		118461
PR:		102694
Approved by:	rwatson (mentor)
Obtained from:	NetBSD
MFC after:	1 month
2008-03-08 21:55:59 +00:00
Jason Evans
f2ec9c0c86 Remove stale #include <machine/atomic.h>, which as needed by lazy
deallocation.
2008-03-07 16:54:03 +00:00
Robert Watson
cee815cf77 Add __FBSDID() tags.
MFC after:	3 days
2008-03-07 15:25:56 +00:00
David Xu
8a18c0d3c8 Fix a bug when calculating remnant size. 2008-03-06 03:24:03 +00:00
David Xu
697b4b49be Don't report death event to debugger if it is a forced exit. 2008-03-06 02:07:18 +00:00
David Xu
70e79fbb0d Restore code setting new thread's scheduler parameters, I was thinking
that there might be starvations, but because we have already locked the
thread, the cpuset settings will always be done before the new thread
does real-world work.
2008-03-06 01:59:08 +00:00
David Xu
1cb51125aa Increase and decrease in_sigcancel_handler accordingly to avoid possible
error caused by nested SIGCANCEL stack, it is a bit complex.
2008-03-05 07:04:55 +00:00
David Xu
54dff16b26 Use cpuset defined in pthread_attr for newly created thread, for now,
we set scheduling parameters and cpu binding fully in userland, and
because default scheduling policy is SCHED_RR (time-sharing), we set
default sched_inherit to PTHREAD_SCHED_INHERIT, this saves a system
call.
2008-03-05 07:01:20 +00:00
David Xu
07bbb16640 Add more cpu affinity function's symbols. 2008-03-05 06:56:35 +00:00
David Xu
21845eb98d Check actual size of cpuset kernel is using and define underscore version
of API.
2008-03-05 06:55:48 +00:00
David Xu
76a9679f8e If a new thread is created, it inherits current thread's signal masks,
however if current thread is executing cancellation handler, signal
SIGCANCEL may have already been blocked, this is unexpected, unblock the
signal in new thread if this happens.

MFC after: 1 week
2008-03-04 04:28:59 +00:00
David Xu
54c9b47c2b Include cpuset.h, unbreak compiling. 2008-03-04 03:45:11 +00:00
David Xu
a759db946a implement pthread_attr_getaffinity_np and pthread_attr_setaffinity_np. 2008-03-04 03:03:24 +00:00
David Xu
57030e1071 Implement functions pthread_getaffinity_np and pthread_setaffinity_np to
get and set thread's cpu affinity mask.
2008-03-03 09:16:29 +00:00
Joseph Koshy
2a100d2353 - Fix an off-by-one bug in _libelf_insert_section(). [1]
- Update (c) years.

Submitted by:		kaiw [1]
2008-03-03 04:29:25 +00:00
David Schultz
3e13dd37ff 1 << 47 needs to be written 1ULL << 47. 2008-03-02 20:16:55 +00:00
Jeff Roberson
d7f687fc9b Add cpuset, an api for thread to cpu binding and cpu resource grouping
and assignment.
 - Add a reference to a struct cpuset in each thread that is inherited from
   the thread that created it.
 - Release the reference when the thread is destroyed.
 - Add prototypes for syscalls and macros for manipulating cpusets in
   sys/cpuset.h
 - Add syscalls to create, get, and set new numbered cpusets:
   cpuset(), cpuset_{get,set}id()
 - Add syscalls for getting and setting affinity masks for cpusets or
   individual threads: cpuid_{get,set}affinity()
 - Add types for the 'level' and 'which' parameters for the cpuset.  This
   will permit expansion of the api to cover cpu masks for other objects
   identifiable with an id_t integer.  For example, IRQs and Jails may be
   coming soon.
 - The root set 0 contains all valid cpus.  All thread initially belong to
   cpuset 1.  This permits migrating all threads off of certain cpus to
   reserve them for special applications.

Sponsored by:	Nokia
Discussed with:	arch, rwatson, brooks, davidxu, deischen
Reviewed by:	antoine
2008-03-02 07:39:22 +00:00
Joseph Koshy
48f9a2656a Translate the r_info field of ELF relocation records when converting
between 64 and 32 bit variants.

Submitted by:		kaiw
2008-03-02 06:33:10 +00:00
David Schultz
e43c8f6acc Hook up sqrtl() to the build. 2008-03-02 01:48:17 +00:00
David Schultz
c6f56f9f41 MD implementations of sqrtl(). 2008-03-02 01:48:08 +00:00
David Schultz
c6a4447b64 MI implementation of sqrtl(). This is very slow and should
be overridden when hardware sqrt is available.
2008-03-02 01:47:58 +00:00
Philip Paeps
db47316b5c Use the easily-greppable copyright notice template from
src/share/examples/mdoc/POSIX-copyright.

Requested by:	ru
2008-02-29 17:48:25 +00:00
Bruce Evans
a278d99026 Fix and improve some magic numbers for the "medium size" case.
e_rem_pio2.c:
This case goes up to about 2**20pi/2, but the comment about it said that
it goes up to about 2**19pi/2.

It went too far above 2**pi/2, giving a multiplier fn with 21 significant
bits in some cases.  This would be harmful except for a numerical
accident.  It happens that the terms of the approximation to pi/2,
when rounded to 33 bits so that multiplications by 20-bit fn's are
exact, happen to be rounded to 32 bits so multiplications by 21-bit
fn's are exact too, so the bug only complicates the error analysis (we
might lose a bit of accuracy but have bits to spare).

e_rem_pio2f.c:
The bogus comment in e_rem_pio2.c was copied and the code was changed
to be bug-for-bug compatible with it, except the limit was made 90
ulps smaller than necessary.  The approximation to pi/2 was not
modified except for discarding some of it.

The same rough error analysis that justifies the limit of 2**20pi/2
for double precision only justifies a limit of 2**18pi/2 for float
precision.  We depended on exhaustive testing to check the magic numbers
for float precision.  More exaustive testing shows that we can go up
to 2**28pi/2 using a 53+25 bit approximation to pi/2 for float precision,
with a the maximum error for cosf() and sinf() unchanged at 0.5009
ulps despite the maximum error in rem_pio2f being ~0.25 ulps.  Implement
this.
2008-02-28 16:22:36 +00:00
Sean Farley
7f08f0dd77 Replace the use of warnx() with direct output to stderr using _write().
This reduces the size of a statically-linked binary by approximately 100KB
in a trivial "return (0)" test application.  readelf -S was used to verify
that the .text section was reduced and that using strlen() saved a few
more bytes over using sizeof().  Since the section of code is only called
when environ is corrupt (program bug), I went with fewer bytes over fewer
cycles.

I made minor edits to the submitted patch to make the output resemble
warnx().

Submitted by:	kib bz
Approved by:	wes (mentor)
MFC after:	5 days
2008-02-28 04:09:08 +00:00
John Baldwin
fc9ab4f6da Add <limits.h> for SHRT_MAX.
Pointy hat to:	jhb
2008-02-27 21:25:19 +00:00
John Baldwin
c55d7e868a File descriptors are an int, but our stdio FILE object uses a short to hold
them.  Thus, any fd whose value is greater than SHRT_MAX is handled
incorrectly (the short value is sign-extended when converted to an int).
An unpleasant side effect is that if fopen() opens a file and gets a
backing fd that is greater than SHRT_MAX, fclose() will fail and the file
descriptor will be leaked.  Better handle this by fixing fopen(), fdopen(),
and freopen() to fail attempts to use a fd greater than SHRT_MAX with
EMFILE.

At some point in the future we should look at expanding the file descriptor
in FILE to an int, but that is a bit complicated due to ABI issues.

MFC after:	1 week
Discussed on:	arch
Reviewed by:	wollman
2008-02-27 19:02:02 +00:00
Tim Kientzle
e29c664a4c Spelling correction, thanks to Joerg Sonnenberger. 2008-02-27 06:16:41 +00:00
Tim Kientzle
a26e9253f6 Optimize skipping over Zip entries.
Thanks to: Dan Nelson, who sent me the patch
MFC after: 7 days
2008-02-27 06:05:59 +00:00
Garrett Wollman
6ca61b39bb stdio is currently limited to file descriptors not greater than
{SHRT_MAX}, so {STREAM_MAX} should be no greater than that.  (This
does not exactly meet the letter of POSIX but comes reasonably close
to it in spirit.)

MFC after:	14 days
2008-02-27 05:56:57 +00:00
Ruslan Ermilov
a059c409c2 Added the "restrict" type-qualifier to the readlink() prototype. 2008-02-26 20:33:52 +00:00
Tim Kientzle
35f4ae0981 Rename the archive_endian.h functions to avoid name clashes
with NetBSD's sys/endian.h file.

Pointed out by: Joerg Sonnenberger
2008-02-26 07:17:47 +00:00
Bruce Evans
e822ea5b2a Inline __ieee754__rem_pio2f(). On amd64 (A64) and i386 (A64), this
gives an average speedup of about 12 cycles or 17% for
9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a
small speeddown for |x| <= 9pi/4 (only 1-2 cycles average, but that
is 4%).

Inlining this is less likely to bust caches than inlining the float
version since it is much smaller (about 220 bytes text and rodata) and
has many fewer branches.  However, the float version was already large
due to its manual inlining of the branches and also the polynomial
evaluations.
2008-02-25 22:19:17 +00:00
Bruce Evans
c32951b16e Use a temporary array instead of the arg array y[] for calling
__kernel_rem_pio2().  This simplifies analysis of aliasing and thus
results in better code for the usual case where __kernel_rem_pio2()
is not called.  In particular, when __ieee854_rem_pio2[f]() is inlined,
it normally results in y[] being returned in registers.  I couldn't
get this to work using the restrict qualifier.

In float precision, this saves 2-3% in most cases on amd64 and i386
(A64) despite it not being inlined in float precision yet.  In double
precision, this has high variance, with an average gain of 2% for
amd64 and 0.7% for i386 (but a much larger gain for usual cases) and
some losses.
2008-02-25 18:28:58 +00:00
Bruce Evans
70d818a20e Change __ieee754_rem_pio2f() to return double instead of float so that
this function and its callers cosf(), sinf() and tanf() don't waste time
converting values from doubles to floats and back for |x| > 9pi/4.
All these functions were optimized a few years ago to mostly use doubles
internally and across the __kernel*() interfaces but not across the
__ieee754_rem_pio2f() interface.

This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4
on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf()
and sinf() in the upper range).  40 cycles is about 35% for |x| < 9pi/4
<= 2**19pi/2 and about 5% for |x| > 2**19pi/2.  The saving is much
larger on amd64 than on i386 since the conversions are not easy to
optimize except on i386 where some of them are automatic and others
are optimized invalidly.  amd64 is still about 10% slower in cosf()
and tanf() in the lower range due to conversion overhead.

This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying
the code).  It also avoids compiler bugs and/or additional slowness
in the conversions on (not yet supported) machines where double_t !=
double.
2008-02-25 13:33:20 +00:00
Christian Brueffer
636133e3dd Add missing words.
MFC after:	3 days
2008-02-25 13:03:18 +00:00
Bruce Evans
0d1564b6c7 Fix some off-by-1 errors.
e_rem_pio2.c:
Float and double precision didn't work because init_jk[] was 1 too small.
It needs to be 2 larger than you might expect, and 1 larger than it was
for these precisions, since its test for recomputing needs a margin of
47 bits (almost 2 24-bit units).

init_jk[] seems to be barely enough for extended and quad precisions.
This hasn't been completely verified.  Callers now get about 24 bits
of extra precision for float, and about 19 for double, but only about
8 for extended and quad.  8 is not enough for callers that want to
produce extra-precision results, but current callers have rounding
errors of at least 0.8 ulps, so another 1/2**8 ulps of error from the
reduction won't affect them much.

Add a comment about some of the magic for init_jk[].

e_rem_pio2.c:
Double precision worked in practice because of a compensating off-by-1
error here.  Extended precision was asked for, and it executed exactly
the same code as the unbroken double precision.

e_rem_pio2f.c:
Float precision worked in practice because of a compensating off-by-1
error here.  Double precision was asked for, and was almost needed,
since the cosf() and sinf() callers want to produce extra-precision
results, at least internally so that their error is only 0.5009 ulps.
However, the extra precision provided by unbroken float precision is
enough, and the double-precision code has extra overheads, so the
off-by-1 error cost about 5% in efficiency on amd64 and i386.
2008-02-25 11:43:20 +00:00
Rafal Jaworowski
56ae1bed48 Let PowerPC world optionally build with -msoft-float. For FPU-less PowerPC
variations (e500 currently), this provides a gcc-level FPU emulation and is an
alternative approach to the recently introduced kernel-level emulation
(FPU_EMU).

Approved by:	cognet (mentor)
MFp4:		e500
2008-02-24 19:22:53 +00:00
Bruce Evans
60a50c2585 Optimize the 9pi/2 < |x| <= 2**19pi/2 case some more by avoiding an
fabs(), a conditional branch, and sign adjustments of 3 variables for
x < 0 when the branch is taken.  In double precision, even when the
branch is perfectly predicted, this saves about 10 cycles or 10% on
amd64 (A64) and i386 (A64) for the negative half of the range, but
makes little difference for the positive half of the range.  In float
precision, it also saves about 4 cycles for the positive half of the
range on i386, and many more cycles in both halves on amd64 (28 in the
negative half and 11 in the positive half for tanf), but the amd64
times for float precision are anomalously slow so the larger
improvement is only a side effect.

Previous commits arranged for the x < 0 case to be handled simply:
- one part of the rounding method uses the magic number 0x1.8p52
  instead of the usual 0x1.0p52.  The latter is required for large |x|,
  but it doesn't work for negative x and we don't need it for large |x|.
- another part of the rounding method no longer needs to add `half'.
  It would have needed to add -half for negative x.
- removing the "quick check no cancellation" in the double precision
  case removed the need to take the absolute value of the quadrant
  number.

Add my noncopyright in e_rem_pio2.c
2008-02-23 12:53:21 +00:00
Bruce Evans
dbf10e45c4 Avoid using FP-to-integer conversion for !(amd64 || i386) too. Use the
FP-to-FP method to round to an integer on all arches, and convert this
to an int using FP-to-integer conversion iff irint() is not available.
This is cleaner and works well on at least ia64, where it saves 20-30
cycles or about 10% on average for 9Pi/4 < |x| <= 32pi/2 (should be
similar up to 2**19pi/2, but I only tested the smaller range).

After the previous commit to e_rem_pio2.c removed the "quick check no
cancellation" non-optimization, the result of the FP-to-integer
conversion is not needed so early, so using irint() became a much
smaller optimization than when it was committed.

An earlier commit message said that cos, cosf, sin and sinf were equally
fast on amd64 and i386 except for cos and sin on i386.  Actually, cos
and sin on amd64 are equally fast to cosf and sinf on i386 (~88 cycles),
while cosf and sinf on amd64 are not quite equally slow to cos and sin
on i386 (average 115 cycles with more variance).
2008-02-22 18:43:23 +00:00
Bruce Evans
7c1b5e7953 Remove the "quick check no cancellation" optimization for
9pi/2 < |x| < 32pi/2 since it is only a small or negative optimation
and it gets in the way of further optimizations.  It did one more
branch to avoid some integer operations and to use a different
dependency on previous results.  The branches are fairly predictable
so they are usually not a problem, so whether this is a good
optimization depends mainly on the timing for the previous results,
which is very machine-dependent.  On amd64 (A64), this "optimization"
is a pessimization of about 1 cycle or 1%; on ia64, it is an
optimization of about 2 cycles or 1%; on i386 (A64), it is an
optimization of about 5 cycles or 4%; on i386 (Celeron P2) it is an
optimization of about 4 cycles or 3% for cos but a pessimization of
about 5 cycles for sin and 1 cycle for tan.  I think the new i386
(A64) slowness is due to an pipeline stall due to an avoidable
load-store mismatch (so the old timing was better), and the i386
(Celeron) variance is due to its branch predictor not being too good.
2008-02-22 17:26:24 +00:00
Bruce Evans
43590b1517 Optimize the 9pi/2 < |x| <= 2**19pi/2 case on amd64 and i386 by avoiding
the the double to int conversion operation which is very slow on these
arches.  Assume that the current rounding mode is the default of
round-to-nearest and use rounding operations in this mode instead of
faking this mode using the round-towards-zero mode for conversion to
int.  Round the double to an integer as a double first and as an int
second since the double result is needed much earler.

Double rounding isn't a problem since we only need a rough approximation.
We didn't support other current rounding modes and produce much larger
errors than before if called in a non-default mode.

This saves an average about 10 cycles on amd64 (A64) and about 25 on
i386 (A64) for x in the above range.  In some cases the saving is over
25%.  Most cases with |x| < 1000pi now take about 88 cycles for cos
and sin (with certain CFLAGS, etc.), except on i386 where cos and sin
(but not cosf and sinf) are much slower at 111 and 121 cycles respectivly
due to the compiler only optimizing well for float precision.  A64
hardware cos and sin are slower at 105 cycles on i386 and 110 cycles
on amd64.
2008-02-22 15:55:14 +00:00
Bruce Evans
0ddfa46b44 Add an irint() function in inline asm for amd64 and i386. irint() is
the same as lrint() except it returns int instead of long.  Though the
extern lrint() is fairly fast on these arches, it still takes about
12 cycles longer than the inline version, and 12 cycles is a lot in
applications where [li]rint() is used to avoid slow conversions that
are only a couple of times slower.

This is only for internal use.  The libm versions of *rint*() should
also be inline, but that would take would take more header engineering.
Implementing irint() instead of lrint() also avoids a conflict with
the extern declaration of the latter.
2008-02-22 14:11:03 +00:00
Bruce Evans
f839bac29c Optimize the conversion to bits a little (by about 11 cycles or 16%
on i386 (A64), 5 cycles on amd64 (A64), and 3 cycles on ia64).  gcc
tends to generate very bad code for accessing floating point values
as bits except when the integer accesses have the same width as the
floating point values, and direct accesses to bit-fields (as is common
only for long double precision) always gives such accesses.  Use the
expsign access method, which is good for 80-bit long doubles and
hopefully no worse for 128-bit long doubles.  Now the generated code
is less bad.  There is still unnecessary copying of the arg on amd64
and i386 and mysterious extra slowness on amd64.
2008-02-22 11:59:05 +00:00
Bruce Evans
a7aa8cc980 Optimize the fixup for +-0 by using better classification for this case
and by using a table lookup to avoid a branch when this case occurs.
On i386, this saves 1-4 cycles out of about 64 for non-large args.
2008-02-22 10:04:53 +00:00