freebsd

mirror of https://git.FreeBSD.org/src.git synced 2025-01-20 15:43:16 +00:00

Author	SHA1	Message	Date
David Xu	8b873a2328	Remove unused functions.	2008-04-02 08:33:42 +00:00
David Xu	d6e0eb0a48	Replace function _umtx_op with _umtx_op_err; the latter function returns errno directly, because errno can be clobbered by a user's signal handler and most of the pthread API depends heavily on errno being correct. This change should improve the stability of the thread library.	2008-04-02 07:41:25 +00:00
David Xu	8bf1a48cb3	Replace userland rwlock with a pure kernel based rwlock, the new implementation does not switch pointers when it resumes waiters. Asked by: jeff	2008-04-02 04:32:31 +00:00
David Xu	ad4a96ba13	Normally, we are often reading local time rather than setting time zone, replace mutex with rwlock, this should eliminate lock contention in most cases.	2008-04-01 06:56:11 +00:00
David Xu	18967c1918	Restore normal pthread_cond_signal path to avoid some obscure races.	2008-04-01 06:23:08 +00:00
David Xu	f5bc4f9930	return EAGAIN early rather than running bunch of code later, micro optimize static branch prediction.	2008-04-01 00:21:49 +00:00
David Schultz	8087c515ab	Remove a (bogus) remnant of debugging this on sparc64.	2008-03-31 13:11:45 +00:00
Konstantin Belousov	ba2983e5b3	Add the libc glue and headers definitions for the *at() syscalls. Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:14:04 +00:00
Tim Kientzle	4b7d286a5b	Include an extra byte for the trailing NUL. <sigh> Pointy hat: Me	2008-03-31 06:24:39 +00:00
David Xu	5ab512bb8e	Rewrite rwlock to use atomic operations to change rwlock state; this eliminates internal mutex lock contention when most rwlock operations are reads. Original patch provided by: jeff	2008-03-31 02:55:49 +00:00
David Schultz	074fb64d9a	Add assembly versions of remquol() and remainderl().	2008-03-30 21:21:53 +00:00
David Schultz	c7392feecc	Hook remquol() and remainderl() up to the build.	2008-03-30 20:48:02 +00:00
David Schultz	a2e5f27559	Implement remainderl() as a wrapper around remquol(). The extra work remquol() performs to compute the quotient is negligible.	2008-03-30 20:47:42 +00:00
David Schultz	cef56f9d6d	Implement remquol() based on remquo().	2008-03-30 20:47:26 +00:00
David Schultz	511dd36b32	Implement csqrtl().	2008-03-30 20:07:15 +00:00
David Schultz	84c1c0a1ca	Hook hypotl() and cabsl() up to the build.	2008-03-30 20:03:46 +00:00
David Schultz	01a13522ad	Document hypotl(). Submitted by: Steve Kargl <sgk@troutmask.apl.washington.edu>	2008-03-30 20:03:29 +00:00
David Schultz	a641fc76eb	Alias hypotl() and cabsl() for platforms where long double is the same as double.	2008-03-30 20:03:06 +00:00
David Schultz	2264157a42	Implement cabsl() in terms of hypotl(). Submitted by: Steve Kargl <sgk@troutmask.apl.washington.edu>	2008-03-30 20:02:03 +00:00
David Schultz	d23166b015	Implement hypotl(). This is bde's conversion of fdlibm hypot(), with minor fixes for ld128 by me.	2008-03-30 20:01:50 +00:00
Bruce Evans	42ee187c3c	Use fabs[f]() instead of bit fiddling for setting absolute values. This makes little difference in float precision, but in double precision gives a speedup of about 30% on amd64 (A64 CPU) and i386 (A64). This depends on fabs[f]() being inline and efficient. The bit fiddling (or any use of SET_HIGH_WORD(), which libm does too much because it was best on old 32-bit machines) always causes packing overheads and sometimes causes stalls in the packing, since it operates on only part of a variable in the double precision case. It apparently did cause stalls in a critical path here.	2008-03-30 18:07:12 +00:00
Bruce Evans	c0c7ddd3a8	Use the expression fabs(x+0.0)-fabs(y+0.0) instead of fabs(x+0.0)+fabs(y+0.0) when mixing NaNs. This improves consistency of the result by making it harder for the compiler to reorder the operands. (FP addition is not necessarily commutative because the order of operands makes a difference on some machines iff the operands are both NaNs.)	2008-03-30 17:28:27 +00:00
Bruce Evans	f94997c8d7	Fix a missing mask in a hi+lo decomposition. This bug made the extra precision in software useless, so hypotf() had some errors in the 1-2 ulp range unless there is extra precision in hardware (as happens on i386).	2008-03-30 17:17:42 +00:00
Doug Rabson	ecc03b80f1	Don't call xdrrec_skiprecord in the non-blocking case. If __xdrrec_getrec has returned TRUE, then we have a complete request in the buffer - calling xdrrec_skiprecord is not necessary. In particular, if there is another record already buffered on the stream, xdrrec_skiprecord will discard both this request and the next one, causing the call to xdr_callmsg to fail and the stream to be closed. Sponsored by: Isilon Systems	2008-03-30 09:36:17 +00:00
Doug Rabson	7ea7cc4bab	Don't assume that there is readable data on the stream after the fragment header.	2008-03-30 09:35:04 +00:00
Ruslan Ermilov	dbdb679c6f	Remove options MK_LIBKSE and DEFAULT_THREAD_LIB now that we no longer build libkse. This should fix WITHOUT_LIBTHR builds as a side effect.	2008-03-29 17:44:40 +00:00
David Schultz	a1af0d70da	Include math.h for the fmaf() prototype.	2008-03-29 16:38:29 +00:00
David Schultz	ee0730e61e	Fix some rather obscene code that has ambiguous if...if...else... constructs in it.	2008-03-29 16:37:59 +00:00
David Schultz	838200ff96	Document modff() and modfl(). Technically, modff() and modfl() live in libm, while modf() lives in libc due to historical mistakes. I'm claiming in the manpage that they all live in libm, since programmers should not rely on the mistake.	2008-03-29 16:19:35 +00:00
Jeff Roberson	d1317e00b8	- Add a man page for cpuset_getaffinity() and cpuset_setaffinity() and hook it up to the build. Reviewed by: brueffer (skeleton and formatting assistance)	2008-03-29 10:26:29 +00:00
Jeff Roberson	329356f9f2	- Add a man page for cpuset(), cpuset_setid(), and cpuset_getid() and hook it up to the build. Reviewed by: brueffer (skeleton and formatting assistance)	2008-03-29 10:06:30 +00:00
Paul Saab	6e7534b8c8	Add support to mincore for detecting whether a page is part of a "super" page or not. Reviewed by: alc, ups	2008-03-28 04:29:27 +00:00
Ruslan Ermilov	cbdcc7cb91	Removed no longer existing CTL_MACHDEP defines. Inspired by: phk	2008-03-26 23:02:17 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Christian Brueffer	662cac9f23	Fix some "in in" typos in comments. PR: 121490 Submitted by: Anatoly Borodin <anatoly.borodin@gmail.com> Approved by: rwatson (mentor), jkoshy MFC after: 3 days	2008-03-26 07:32:08 +00:00
Ruslan Ermilov	5a9926445a	Compile libthr with warnings. (Somehow this file sneaked from initial commit.)	2008-03-25 15:33:00 +00:00
Ruslan Ermilov	e03efb02bc	Compile libthr with warnings.	2008-03-25 13:28:12 +00:00
Ruslan Ermilov	7e0e78248e	Fixed mis-implementation of pthread_mutex_get{spin,yield}loops_np(). Reviewed by: davidxu	2008-03-25 09:48:10 +00:00
Jeff Roberson	fbb275f59d	- Restore kse.h in this directory so other tools don't find it by mistake. - Restore the ability to debug kse coredumps in 8.0. Suggested by: marcel	2008-03-23 09:38:11 +00:00
David Xu	9939a13667	Add POSIX pthread API pthread_getcpuclockid() to get a thread's cpu time clock id.	2008-03-22 09:59:20 +00:00
David Xu	20b94d8035	Use a linker set to collect all target operations.	2008-03-22 05:40:44 +00:00
Kai Wang	7a36fb79f9	Add MLINK for archive_write_close. Approved by: jkoshy(mentor), kientzle	2008-03-21 11:10:20 +00:00
David Xu	04a57d2c83	Resolve __error()'s PLT early so that it need not be resolved again; otherwise rwlock is called recursively when a signal happens and __error was never resolved before.	2008-03-21 02:31:55 +00:00
Ruslan Ermilov	a1292a02d3	pthread_mutexattr_destroy() was accidentally broken in last revision, unbreak it. We should really start compiling this with warnings.	2008-03-20 11:47:08 +00:00
Dag-Erling Smørgrav	5092cf0569	s/wait/delta/ to avoid namespace collision. MFC after: 2 weeks	2008-03-20 09:55:27 +00:00
David Xu	8c38215f50	Preserve application code's errno in rtld locking code; it attempts to keep any case safe.	2008-03-20 09:35:44 +00:00
David Xu	48ebe2ebc4	Make pthread_mutexattr_settype to return error number directly and conformant to POSIX specification. Bug reported by: modelnine at modelnine dt org	2008-03-20 08:27:14 +00:00
David Xu	c8a4eae56f	don't reduce new thread's refcount if current thread can not set cpuset for it, since the new thread will reduce it by itself.	2008-03-19 09:33:07 +00:00
David Xu	519e8d87bb	- Trim trailing spaces. - Use a different sigmask variable name to avoid confusion.	2008-03-19 08:13:04 +00:00
David Xu	86a06c6000	if passed thread pointer is equal to current thread, pass -1 to kernel to speed up searching.	2008-03-19 06:38:21 +00:00
Joseph Koshy	b23372cd8e	Ensure that the section header table is written out in an order consistent with the section indices returned to the application by elf_ndxscn(). Submitted by: kaiw	2008-03-19 06:06:34 +00:00
Joseph Koshy	df7d1e2023	Clarify that the ELF library only sets the sh_entsize field of a section header entry if the application is not taking charge of ELF object layout. Update (c) years, and bump the manual page's date. Submitted by: kaiw	2008-03-19 05:07:49 +00:00
Maksim Yevmenkin	07f8cd18c6	Add mandatory "security description" SDP parameter to the PANU profile Pointed-out by: Iain Hibbert < plunky at rya-online dot net > MFC after: 3 days	2008-03-19 00:06:30 +00:00
Maksim Yevmenkin	13040bc96b	Add PSM and Load Factor SDP parameters to the BNEP based profiles (NAP, GN and PANU). No reason not to support them. Separate SDP parameters data structures for the BNEP based profiles. Generalize Service Availability SDP parameter creation. Requested by: Iain Hibbert < plunky at rya-online dot net > MFC after: 3 days	2008-03-18 18:21:39 +00:00
David Xu	2ea1f90a18	- Copy signal mask out before THR_UNLOCK(), because THR_UNLOCK() may call _thr_suspend_check() which messes up the sigmask saved in the thread structure. - Don't suspend a thread that has force_exit set. - In pthread_exit(), if there is a suspension flag set, wake up the waiting thread after setting PS_DEAD; this causes the waiting thread to break the loop in suspend_common().	2008-03-18 02:06:51 +00:00
Antoine Brodin	59e7781613	Don't allocate the constant array "props" on the stack in wctype. PR: 74743 Submitted by: knut st. osmundsen Approved by: rwatson (mentor) MFC after: 1 month	2008-03-17 18:22:23 +00:00
David Schultz	18798c64f0	scandir(3) previously used st_size to obtain an initial estimate of the array length needed to store all the directory entries. Although BSD has historically guaranteed that st_size is the size of the directory file, POSIX does not, and more to the point, some recent filesystems such as ZFS use st_size to mean something else. The fix is to not stat the directory at all, set the initial array size to 32 entries, and realloc it in powers of 2 if that proves insufficient. PR: 113668	2008-03-16 19:08:53 +00:00
David Xu	a9a11568ff	Actually delete SIGCANCEL mask for suspended thread, so the signal will not be masked when it is resumed.	2008-03-16 03:22:38 +00:00
Tim Kientzle	409e319377	Update a comment: the format bid only runs once per archive; it no longer runs once per entry.	2008-03-15 11:09:16 +00:00
Tim Kientzle	845aa4ab0a	Free up the entry objects allocated during this test.	2008-03-15 11:06:15 +00:00
Tim Kientzle	adfb462fea	Release the buffers used for exercising the compress code.	2008-03-15 11:05:49 +00:00
Tim Kientzle	0b315cd9ae	Remove the duplicate "archive_format" and "archive_format_name" fields from the private archive_write structure and fix up all writers to use the format fields in the base "archive" structure. This error made it impossible to query the format after setting up a writer because the write format was stored in an inaccessible place.	2008-03-15 11:04:45 +00:00
Tim Kientzle	c43d294189	Correct a sign mismatch that only showed up on 64-bit systems. Pointy hat: me	2008-03-15 11:02:47 +00:00
Tim Kientzle	3010219939	Refactor the mtree code a bit to make the layering clearer: Each "file" is described by multiple "lines" each possibly containing multiple "keywords." Incorporate some additions from Joerg Sonnenberger to handle linked files and correctly deal with backing files on disk.	2008-03-15 07:10:24 +00:00
Tim Kientzle	d7740aea75	FreeBSD does have fstat(). Correct the nasty typo this uncovers.	2008-03-15 04:20:50 +00:00
Tim Kientzle	eb971f9524	Testability is more important than standards conformance. Disable the use of PaxHeader.<pid> for the fake pax extension pathname until I can make the name here settable. Otherwise, tests that try to compare output to static pre-generated reference files break.	2008-03-15 03:49:18 +00:00
Tim Kientzle	24f55a5963	Ignore a few more common files.	2008-03-15 02:31:28 +00:00
Tim Kientzle	80334b7d22	Resolve a minor nit in SUS compliance by including the PID in the fake directory name used for pax extended headers.	2008-03-15 02:30:42 +00:00
Tim Kientzle	cde1a05218	GC a reference to the defunct TESTFILES variable.	2008-03-15 02:22:08 +00:00
Tim Kientzle	60617bf578	A subtle point: "pax interchange format" mandates that all strings (including pathname, gname, uname) be stored in UTF-8. This usually doesn't cause problems on FreeBSD because the "C" locale on FreeBSD can convert any byte to Unicode/wchar_t and from there to UTF-8. In other locales (including the "C" locale on Linux which is really ASCII), you can get into trouble with pathnames that cannot be converted to UTF-8. Libarchive's pax writer truncated pathnames and other strings at the first nonconvertible character. (ouch!) Other archivers have worked around this by storing unconvertible pathnames as raw binary, a practice which has been sanctioned by the Austin group. However, libarchive's pax reader would segfault reading headers that weren't proper UTF-8. (ouch!) Since bsdtar defaults to pax format, this affects bsdtar rather heavily. To correctly support the new "hdrcharset" header that is going into SUS and to handle conversion failures in general, libarchive's pax reader and writer have been overhauled fairly extensively. They used to do most of the pax header processing using wchar_t (Unicode); they now do most of it using char so that common logic applies to either UTF-8 or "binary" strings. As a bonus, a number of extraneous conversions to/from wchar_t have been eliminated, which should speed things up just a tad. Thanks to: Bjoern Jacke for originally reporting this to me Thanks to: Joerg Sonnenberger for noting a bad typo in my first draft of this Thanks to: Gunnar Ritter for getting the standard fixed MFC after: 5 days	2008-03-15 01:43:59 +00:00
Tim Kientzle	3a6aaff135	Ignore some built files.	2008-03-15 00:52:22 +00:00
Tim Kientzle	408a822432	Don't lie. If a string can't be converted to a wide (Unicode) string, return a NULL instead of an incomplete string. Expand the test coverage to verify the correct behavior here.	2008-03-14 23:19:46 +00:00
Tim Kientzle	6c8f54e991	Don't advertise the default block size as a constant; don't rely on a deprecated value to set the default. This is also related to a longer-term goal of setting the default block size based on format and possibly other factors, which makes it a bad idea to tie this to a published constant.	2008-03-14 23:09:02 +00:00
Tim Kientzle	8e4bc81237	New public functions archive_entry_copy_link() and archive_entry_copy_link_w() override the currently set link value, whether that's a hardlink or a symlink. Plus documentation update and tests.	2008-03-14 23:00:53 +00:00
Tim Kientzle	1051e364aa	Update some comments, comment out argument names to guard against namespace problems.	2008-03-14 22:47:38 +00:00
Tim Kientzle	871e5c0326	Since "length" computes the length of a string and is used as an argument to malloc(3), it should be size_t, not int.	2008-03-14 22:44:07 +00:00
Tim Kientzle	d6f37be734	Let archive_entry_clear() accept a NULL pointer and simply do nothing. In particular, this allows archive_entry_free() to work correctly for a NULL pointer, which makes it parallel with free(3).	2008-03-14 22:40:36 +00:00
Tim Kientzle	42d1f7b4ba	Rework the versioning implementation and test to match the new interface. Mark the functions that are going away in libarchive 3.0. In particular, archive_version_string() now computes the string rather than assuming that it will be created by the build infrastructure. Eventually, this will allow some simplification of the build infrastructure.	2008-03-14 22:31:57 +00:00
Tim Kientzle	0349d719b1	Rework the versioning information, hopefully for the last time. * There are now only two public version identifiers: "number" is a single integer that combines Major/minor/release in a single value of the form Mmmmrrr. This is easy to compare against for checking feature support. "string" is a displayable text string of the form "libarchive M.mm.rr". * The number is present both as a macro (version of the installed header) and a function (version of the shared library). The string form is available only as a function. * Retain the older version definitions for now, but mark them all as deprecated, to disappear in libarchive 3.0 (whenever that happens). * Rework the various deprecation conditionals to use ARCHIVE_VERSION_NUMBER. An ancillary goal is to reduce the number of @...@ substitutions that are required. Someday, I might even be able to avoid build-time processing of archive.h entirely.	2008-03-14 22:19:50 +00:00
Tim Kientzle	45943bfd93	Add a useful sprintf()-style wrapper around archive_string_vsprintf(). (Which is built on top of libarchive's internal resizable string support.)	2008-03-14 22:00:09 +00:00
Tim Kientzle	7c5b1173a5	Support for writing 'compress' format, thanks to Joerg Sonnenberger.	2008-03-14 20:35:38 +00:00
Tim Kientzle	20347f62e6	A block in a tar file is 512 bytes. Period. Remove the entirely pointless symbolic constant and sizeof(unsigned char). (The constant here is doubly wrong, since not only does it obscure a basic format constant, it was never intended to be a tar-specific value, so could conceivably be changed at some point in the future.)	2008-03-14 20:32:20 +00:00
Joseph Koshy	bcbe65a85f	- Document Pentium and Pentium MMX events. - Update (c) years and the manual page's date.	2008-03-14 06:22:03 +00:00
Ruslan Ermilov	53bbf5aa35	Fix bugs in previous revision (missing comma, misspelled syscall name).	2008-03-13 10:33:24 +00:00
Ruslan Ermilov	517d383637	Remove trailing whitespace.	2008-03-13 10:26:17 +00:00
Ruslan Ermilov	c130b9f1af	Add missing section number.	2008-03-13 10:25:30 +00:00
David Xu	9fbfd54e8e	In file sem_timedwait.3, remove the reference to SYSV semaphores in the SEE ALSO section to sync it with sem_wait.3.	2008-03-13 01:53:28 +00:00
Kai Wang	a739eb8374	Current 'ar' read support in libarchive can only handle a GNU/SVR4 filename table whose size is less than 65536 bytes. The original intention was to not consume the filename table, so the client will have a chance to look at it. To achieve that, the library calls decompressor->read_ahead to read (look ahead) but does not call decompressor->consume to consume the data. This imposes a limit, since a read_ahead call can only look ahead at most BUFFER_SIZE (65536) bytes at the moment, and you cannot "look any further" before you consume what you already "saw". This commit turns the GNU/SVR4 filename table into "archive format data", i.e., the filename table will be consumed by libarchive, so the 65536-byte limit will be gone, but the client can no longer access the content of the filename table. The 'ar' support test suite is changed accordingly. BSD ar(1) is not affected by this change since it doesn't look at the filename table. Reported by: erwin Discussed with: jkoshy, kientzle Reviewed by: jkoshy, kientzle Approved by: jkoshy(mentor), kientzle	2008-03-12 21:10:26 +00:00
Joseph Koshy	484202faab	Bring the behaviour of pmc_capabilities() and pmc_width() in line with documentation: set 'errno' and return -1 in case of an error. Update (c) years.	2008-03-12 15:51:32 +00:00
Joseph Koshy	ef4ba9be47	Describe return values from pmc_ncpu() and pmc_npmc() better.	2008-03-12 15:48:59 +00:00
Paolo Pisati	ab0fcfd00a	-Don't pass down the entire pkt to ProtoAliasIn, ProtoAliasOut, FragmentIn and FragmentOut. -Axe the old PacketAlias API: it has been deprecated since 5.x.	2008-03-12 11:58:29 +00:00
Jeff Roberson	7d4cbc3607	- Remove kse syscall symbols and man pages.	2008-03-12 10:12:22 +00:00
Jeff Roberson	1e71e49d12	- Don't inspect the P_SA flag. It's being removed.	2008-03-12 10:00:33 +00:00
Jeff Roberson	34147e4308	- Remove libkse and related support code in libpthread from the build. Don't remove the files yet. Kernel support will be removed shortly.	2008-03-12 09:49:39 +00:00
Tim Kientzle	df4691b984	Portability: Eliminate the need for uudecode by incorporating uudecode into the main test driver and invoking it just-in-time within the various tests. Also, incorporate a number of improvements to the main test support code that have proven useful on other projects where I've used this framework.	2008-03-12 05:12:23 +00:00
Tim Kientzle	0b4793efb7	Remove some unused fields from the private archive_read structure (left over from when the unified read/write structure was copied to form separate read and write structures) and eliminate the pointless initialization of a couple of the unused fields.	2008-03-12 04:58:32 +00:00
Tim Kientzle	c2247d3995	Tighten up the semantics of acl_next() and xattr_next() when you hit the end of the ACL or xattr list. Thanks to: Jeff Johnson for pointing out the obvious typo	2008-03-12 04:47:37 +00:00
Tim Kientzle	826055b6a8	Typo, thanks to: Jeff Johnson. MFC after: 3 days	2008-03-12 04:26:44 +00:00
David Xu	e54cc1f0d5	Add missing comma.	2008-03-12 02:37:31 +00:00
David Xu	1dd273df59	Add manual for function sem_timedwait(). Reviewed by: ru, deischen	2008-03-12 02:33:17 +00:00
David Xu	150b71918c	If a thread is cancelled, it may have already consumed a umtx_wake; check the waiter and semaphore counters to see if we may wake up the next thread.	2008-03-11 03:26:47 +00:00
Maksim Yevmenkin	e096d1e4ba	Add structures to hold SDP parameters for the NAP, GN and PANU profiles. It should be mentioned that a somewhat similar patch was submitted by Rako < rako29 at gmail dot com > MFC after: 1 week	2008-03-11 00:08:40 +00:00
Joseph Koshy	c7f03ab040	Use .Fo/.Fc and .Xo/.Xc to bring the line widths below 79 columns. Correct a typo [a misplaced comma]. Reviewed by: ru	2008-03-10 14:45:29 +00:00
Joseph Koshy	80c4d6eba3	Use .Fo/.Fc and .Xo/.Xc to bring the line widths below 79 columns. Reviewed by: ru	2008-03-10 14:44:41 +00:00
Robert Watson	4813b6af4b	Add reference to kldunloadf system call, which was previously not mentioned in the kldunload(2) man page. MFC after: 3 days Spotted by: rink	2008-03-10 09:54:13 +00:00
Antoine Brodin	e3ad7f6626	Introduce a new F_DUP2FD command to fcntl(2), for compatibility with Solaris and AIX. fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionally equivalent. Document it. Add some regression tests (identical to the dup2(2) regression tests). PR: 120233 Submitted by: Jukka Ukkonen Approved by: rwatson (mentor) MFC after: 1 month	2008-03-08 22:02:21 +00:00
Antoine Brodin	6044f112a6	Merge changes from NetBSD on humanize_number.c, 1.8 -> 1.13 Significant changes: - rev. 1.11: Use PRId64 instead of a cast to long long and %lld to print an int64_t. - rev. 1.12: Fix a bug that humanize_number() produces "1000" where it should be "1.0G" or "1.0M". The bug reported by Greg Troxel. PR: 118461 PR: 102694 Approved by: rwatson (mentor) Obtained from: NetBSD MFC after: 1 month	2008-03-08 21:55:59 +00:00
Jason Evans	f2ec9c0c86	Remove stale #include <machine/atomic.h>, which was needed by lazy deallocation.	2008-03-07 16:54:03 +00:00
Robert Watson	cee815cf77	Add __FBSDID() tags. MFC after: 3 days	2008-03-07 15:25:56 +00:00
David Xu	8a18c0d3c8	Fix a bug when calculating remnant size.	2008-03-06 03:24:03 +00:00
David Xu	697b4b49be	Don't report death event to debugger if it is a forced exit.	2008-03-06 02:07:18 +00:00
David Xu	70e79fbb0d	Restore code setting new thread's scheduler parameters, I was thinking that there might be starvations, but because we have already locked the thread, the cpuset settings will always be done before the new thread does real-world work.	2008-03-06 01:59:08 +00:00
David Xu	1cb51125aa	Increase and decrease in_sigcancel_handler accordingly to avoid possible error caused by nested SIGCANCEL stack, it is a bit complex.	2008-03-05 07:04:55 +00:00
David Xu	54dff16b26	Use cpuset defined in pthread_attr for newly created thread, for now, we set scheduling parameters and cpu binding fully in userland, and because default scheduling policy is SCHED_RR (time-sharing), we set default sched_inherit to PTHREAD_SCHED_INHERIT, this saves a system call.	2008-03-05 07:01:20 +00:00
David Xu	07bbb16640	Add more cpu affinity function's symbols.	2008-03-05 06:56:35 +00:00
David Xu	21845eb98d	Check actual size of cpuset kernel is using and define underscore version of API.	2008-03-05 06:55:48 +00:00
David Xu	76a9679f8e	If a new thread is created, it inherits current thread's signal masks, however if current thread is executing cancellation handler, signal SIGCANCEL may have already been blocked, this is unexpected, unblock the signal in new thread if this happens. MFC after: 1 week	2008-03-04 04:28:59 +00:00
David Xu	54c9b47c2b	Include cpuset.h, unbreak compiling.	2008-03-04 03:45:11 +00:00
David Xu	a759db946a	implement pthread_attr_getaffinity_np and pthread_attr_setaffinity_np.	2008-03-04 03:03:24 +00:00
David Xu	57030e1071	Implement functions pthread_getaffinity_np and pthread_setaffinity_np to get and set thread's cpu affinity mask.	2008-03-03 09:16:29 +00:00
Joseph Koshy	2a100d2353	- Fix an off-by-one bug in _libelf_insert_section(). [1] - Update (c) years. Submitted by: kaiw [1]	2008-03-03 04:29:25 +00:00
David Schultz	3e13dd37ff	1 << 47 needs to be written 1ULL << 47.	2008-03-02 20:16:55 +00:00
Jeff Roberson	d7f687fc9b	Add cpuset, an api for thread to cpu binding and cpu resource grouping and assignment. - Add a reference to a struct cpuset in each thread that is inherited from the thread that created it. - Release the reference when the thread is destroyed. - Add prototypes for syscalls and macros for manipulating cpusets in sys/cpuset.h - Add syscalls to create, get, and set new numbered cpusets: cpuset(), cpuset_{get,set}id() - Add syscalls for getting and setting affinity masks for cpusets or individual threads: cpuset_{get,set}affinity() - Add types for the 'level' and 'which' parameters for the cpuset. This will permit expansion of the api to cover cpu masks for other objects identifiable with an id_t integer. For example, IRQs and Jails may be coming soon. - The root set 0 contains all valid cpus. All threads initially belong to cpuset 1. This permits migrating all threads off of certain cpus to reserve them for special applications. Sponsored by: Nokia Discussed with: arch, rwatson, brooks, davidxu, deischen Reviewed by: antoine	2008-03-02 07:39:22 +00:00
Joseph Koshy	48f9a2656a	Translate the r_info field of ELF relocation records when converting between 64 and 32 bit variants. Submitted by: kaiw	2008-03-02 06:33:10 +00:00
David Schultz	e43c8f6acc	Hook up sqrtl() to the build.	2008-03-02 01:48:17 +00:00
David Schultz	c6f56f9f41	MD implementations of sqrtl().	2008-03-02 01:48:08 +00:00
David Schultz	c6a4447b64	MI implementation of sqrtl(). This is very slow and should be overridden when hardware sqrt is available.	2008-03-02 01:47:58 +00:00
Philip Paeps	db47316b5c	Use the easily-greppable copyright notice template from src/share/examples/mdoc/POSIX-copyright. Requested by: ru	2008-02-29 17:48:25 +00:00
Bruce Evans	a278d99026	Fix and improve some magic numbers for the "medium size" case. e_rem_pio2.c: This case goes up to about 2**20pi/2, but the comment about it said that it goes up to about 2**19pi/2. It went too far above 2**20pi/2, giving a multiplier fn with 21 significant bits in some cases. This would be harmful except for a numerical accident. It happens that the terms of the approximation to pi/2, when rounded to 33 bits so that multiplications by 20-bit fn's are exact, happen to be rounded to 32 bits so multiplications by 21-bit fn's are exact too, so the bug only complicates the error analysis (we might lose a bit of accuracy but have bits to spare). e_rem_pio2f.c: The bogus comment in e_rem_pio2.c was copied and the code was changed to be bug-for-bug compatible with it, except the limit was made 90 ulps smaller than necessary. The approximation to pi/2 was not modified except for discarding some of it. The same rough error analysis that justifies the limit of 2**20pi/2 for double precision only justifies a limit of 2**18pi/2 for float precision. We depended on exhaustive testing to check the magic numbers for float precision. More exhaustive testing shows that we can go up to 2**28pi/2 using a 53+25 bit approximation to pi/2 for float precision, with the maximum error for cosf() and sinf() unchanged at 0.5009 ulps despite the maximum error in rem_pio2f being ~0.25 ulps. Implement this.	2008-02-28 16:22:36 +00:00
Sean Farley	7f08f0dd77	Replace the use of warnx() with direct output to stderr using _write(). This reduces the size of a statically-linked binary by approximately 100KB in a trivial "return (0)" test application. readelf -S was used to verify that the .text section was reduced and that using strlen() saved a few more bytes over using sizeof(). Since the section of code is only called when environ is corrupt (program bug), I went with fewer bytes over fewer cycles. I made minor edits to the submitted patch to make the output resemble warnx(). Submitted by: kib bz Approved by: wes (mentor) MFC after: 5 days	2008-02-28 04:09:08 +00:00
John Baldwin	fc9ab4f6da	Add <limits.h> for SHRT_MAX. Pointy hat to: jhb	2008-02-27 21:25:19 +00:00
John Baldwin	c55d7e868a	File descriptors are an int, but our stdio FILE object uses a short to hold them. Thus, any fd whose value is greater than SHRT_MAX is handled incorrectly (the short value is sign-extended when converted to an int). An unpleasant side effect is that if fopen() opens a file and gets a backing fd that is greater than SHRT_MAX, fclose() will fail and the file descriptor will be leaked. Better handle this by fixing fopen(), fdopen(), and freopen() to fail attempts to use a fd greater than SHRT_MAX with EMFILE. At some point in the future we should look at expanding the file descriptor in FILE to an int, but that is a bit complicated due to ABI issues. MFC after: 1 week Discussed on: arch Reviewed by: wollman	2008-02-27 19:02:02 +00:00
Tim Kientzle	e29c664a4c	Spelling correction, thanks to Joerg Sonnenberger.	2008-02-27 06:16:41 +00:00
Tim Kientzle	a26e9253f6	Optimize skipping over Zip entries. Thanks to: Dan Nelson, who sent me the patch MFC after: 7 days	2008-02-27 06:05:59 +00:00
Garrett Wollman	6ca61b39bb	stdio is currently limited to file descriptors not greater than {SHRT_MAX}, so {STREAM_MAX} should be no greater than that. (This does not exactly meet the letter of POSIX but comes reasonably close to it in spirit.) MFC after: 14 days	2008-02-27 05:56:57 +00:00
Ruslan Ermilov	a059c409c2	Added the "restrict" type-qualifier to the readlink() prototype.	2008-02-26 20:33:52 +00:00
Tim Kientzle	35f4ae0981	Rename the archive_endian.h functions to avoid name clashes with NetBSD's sys/endian.h file. Pointed out by: Joerg Sonnenberger	2008-02-26 07:17:47 +00:00
Bruce Evans	e822ea5b2a	Inline __ieee754_rem_pio2f(). On amd64 (A64) and i386 (A64), this gives an average speedup of about 12 cycles or 17% for 9pi/4 < |x| <= 2**19pi/2 and a smaller speedup for larger x, and a small speeddown for |x| <= 9pi/4 (only 1-2 cycles average, but that is 4%). Inlining this is less likely to bust caches than inlining the float version since it is much smaller (about 220 bytes text and rodata) and has many fewer branches. However, the float version was already large due to its manual inlining of the branches and also the polynomial evaluations.	2008-02-25 22:19:17 +00:00
Bruce Evans	c32951b16e	Use a temporary array instead of the arg array y[] for calling __kernel_rem_pio2(). This simplifies analysis of aliasing and thus results in better code for the usual case where __kernel_rem_pio2() is not called. In particular, when __ieee754_rem_pio2[f]() is inlined, it normally results in y[] being returned in registers. I couldn't get this to work using the restrict qualifier. In float precision, this saves 2-3% in most cases on amd64 and i386 (A64) despite it not being inlined in float precision yet. In double precision, this has high variance, with an average gain of 2% for amd64 and 0.7% for i386 (but a much larger gain for usual cases) and some losses.	2008-02-25 18:28:58 +00:00
Bruce Evans	70d818a20e	Change __ieee754_rem_pio2f() to return double instead of float so that this function and its callers cosf(), sinf() and tanf() don't waste time converting values from doubles to floats and back for |x| > 9pi/4. All these functions were optimized a few years ago to mostly use doubles internally and across the __kernel() interfaces but not across the __ieee754_rem_pio2f() interface. This saves about 40 cycles in cosf(), sinf() and tanf() for |x| > 9pi/4 on amd64 (A64), and about 20 cycles on i386 (A64) (except for cosf() and sinf() in the upper range). 40 cycles is about 35% for 9pi/4 < |x| <= 2**19pi/2 and about 5% for |x| > 2**19pi/2. The saving is much larger on amd64 than on i386 since the conversions are not easy to optimize except on i386 where some of them are automatic and others are optimized invalidly. amd64 is still about 10% slower in cosf() and tanf() in the lower range due to conversion overhead. This also gives a tiny speedup for |x| <= 9pi/4 on amd64 (by simplifying the code). It also avoids compiler bugs and/or additional slowness in the conversions on (not yet supported) machines where double_t != double.	2008-02-25 13:33:20 +00:00
Christian Brueffer	636133e3dd	Add missing words. MFC after: 3 days	2008-02-25 13:03:18 +00:00
Bruce Evans	0d1564b6c7	Fix some off-by-1 errors. e_rem_pio2.c: Float and double precision didn't work because init_jk[] was 1 too small. It needs to be 2 larger than you might expect, and 1 larger than it was for these precisions, since its test for recomputing needs a margin of 47 bits (almost 2 24-bit units). init_jk[] seems to be barely enough for extended and quad precisions. This hasn't been completely verified. Callers now get about 24 bits of extra precision for float, and about 19 for double, but only about 8 for extended and quad. 8 is not enough for callers that want to produce extra-precision results, but current callers have rounding errors of at least 0.8 ulps, so another 1/2**8 ulps of error from the reduction won't affect them much. Add a comment about some of the magic for init_jk[]. e_rem_pio2.c: Double precision worked in practice because of a compensating off-by-1 error here. Extended precision was asked for, and it executed exactly the same code as the unbroken double precision. e_rem_pio2f.c: Float precision worked in practice because of a compensating off-by-1 error here. Double precision was asked for, and was almost needed, since the cosf() and sinf() callers want to produce extra-precision results, at least internally so that their error is only 0.5009 ulps. However, the extra precision provided by unbroken float precision is enough, and the double-precision code has extra overheads, so the off-by-1 error cost about 5% in efficiency on amd64 and i386.	2008-02-25 11:43:20 +00:00
Rafal Jaworowski	56ae1bed48	Let PowerPC world optionally build with -msoft-float. For FPU-less PowerPC variations (e500 currently), this provides a gcc-level FPU emulation and is an alternative approach to the recently introduced kernel-level emulation (FPU_EMU). Approved by: cognet (mentor) MFp4: e500	2008-02-24 19:22:53 +00:00
Bruce Evans	60a50c2585	Optimize the 9pi/2 < |x| <= 2**19pi/2 case some more by avoiding an fabs(), a conditional branch, and sign adjustments of 3 variables for x < 0 when the branch is taken. In double precision, even when the branch is perfectly predicted, this saves about 10 cycles or 10% on amd64 (A64) and i386 (A64) for the negative half of the range, but makes little difference for the positive half of the range. In float precision, it also saves about 4 cycles for the positive half of the range on i386, and many more cycles in both halves on amd64 (28 in the negative half and 11 in the positive half for tanf), but the amd64 times for float precision are anomalously slow so the larger improvement is only a side effect. Previous commits arranged for the x < 0 case to be handled simply: - one part of the rounding method uses the magic number 0x1.8p52 instead of the usual 0x1.0p52. The latter is required for large |x|, but it doesn't work for negative x and we don't need it for large |x|. - another part of the rounding method no longer needs to add `half'. It would have needed to add -half for negative x. - removing the "quick check no cancellation" in the double precision case removed the need to take the absolute value of the quadrant number. Add my noncopyright in e_rem_pio2.c	2008-02-23 12:53:21 +00:00
Bruce Evans	dbf10e45c4	Avoid using FP-to-integer conversion for !(amd64 || i386) too. Use the FP-to-FP method to round to an integer on all arches, and convert this to an int using FP-to-integer conversion iff irint() is not available. This is cleaner and works well on at least ia64, where it saves 20-30 cycles or about 10% on average for 9pi/4 < |x| <= 32pi/2 (should be similar up to 2**19pi/2, but I only tested the smaller range). After the previous commit to e_rem_pio2.c removed the "quick check no cancellation" non-optimization, the result of the FP-to-integer conversion is not needed so early, so using irint() became a much smaller optimization than when it was committed. An earlier commit message said that cos, cosf, sin and sinf were equally fast on amd64 and i386 except for cos and sin on i386. Actually, cos and sin on amd64 are equally fast to cosf and sinf on i386 (~88 cycles), while cosf and sinf on amd64 are not quite equally slow to cos and sin on i386 (average 115 cycles with more variance).	2008-02-22 18:43:23 +00:00
Bruce Evans	7c1b5e7953	Remove the "quick check no cancellation" optimization for 9pi/2 < |x| < 32pi/2 since it is only a small or negative optimization and it gets in the way of further optimizations. It did one more branch to avoid some integer operations and to use a different dependency on previous results. The branches are fairly predictable so they are usually not a problem, so whether this is a good optimization depends mainly on the timing for the previous results, which is very machine-dependent. On amd64 (A64), this "optimization" is a pessimization of about 1 cycle or 1%; on ia64, it is an optimization of about 2 cycles or 1%; on i386 (A64), it is an optimization of about 5 cycles or 4%; on i386 (Celeron P2) it is an optimization of about 4 cycles or 3% for cos but a pessimization of about 5 cycles for sin and 1 cycle for tan. I think the new i386 (A64) slowness is due to a pipeline stall due to an avoidable load-store mismatch (so the old timing was better), and the i386 (Celeron) variance is due to its branch predictor not being too good.	2008-02-22 17:26:24 +00:00
Bruce Evans	43590b1517	Optimize the 9pi/2 < |x| <= 2**19pi/2 case on amd64 and i386 by avoiding the double to int conversion operation which is very slow on these arches. Assume that the current rounding mode is the default of round-to-nearest and use rounding operations in this mode instead of faking this mode using the round-towards-zero mode for conversion to int. Round the double to an integer as a double first and as an int second since the double result is needed much earlier. Double rounding isn't a problem since we only need a rough approximation. We didn't support other current rounding modes and produce much larger errors than before if called in a non-default mode. This saves an average of about 10 cycles on amd64 (A64) and about 25 on i386 (A64) for x in the above range. In some cases the saving is over 25%. Most cases with |x| < 1000pi now take about 88 cycles for cos and sin (with certain CFLAGS, etc.), except on i386 where cos and sin (but not cosf and sinf) are much slower at 111 and 121 cycles respectively due to the compiler only optimizing well for float precision. A64 hardware cos and sin are slower at 105 cycles on i386 and 110 cycles on amd64.	2008-02-22 15:55:14 +00:00
Bruce Evans	0ddfa46b44	Add an irint() function in inline asm for amd64 and i386. irint() is the same as lrint() except it returns int instead of long. Though the extern lrint() is fairly fast on these arches, it still takes about 12 cycles longer than the inline version, and 12 cycles is a lot in applications where [li]rint() is used to avoid slow conversions that are only a couple of times slower. This is only for internal use. The libm versions of rint() should also be inline, but that would take more header engineering. Implementing irint() instead of lrint() also avoids a conflict with the extern declaration of the latter.	2008-02-22 14:11:03 +00:00
Bruce Evans	f839bac29c	Optimize the conversion to bits a little (by about 11 cycles or 16% on i386 (A64), 5 cycles on amd64 (A64), and 3 cycles on ia64). gcc tends to generate very bad code for accessing floating point values as bits except when the integer accesses have the same width as the floating point values, and direct accesses to bit-fields (as is common only for long double precision) always give such accesses. Use the expsign access method, which is good for 80-bit long doubles and hopefully no worse for 128-bit long doubles. Now the generated code is less bad. There is still unnecessary copying of the arg on amd64 and i386 and mysterious extra slowness on amd64.	2008-02-22 11:59:05 +00:00
Bruce Evans	a7aa8cc980	Optimize the fixup for +-0 by using better classification for this case and by using a table lookup to avoid a branch when this case occurs. On i386, this saves 1-4 cycles out of about 64 for non-large args.	2008-02-22 10:04:53 +00:00

11754 Commits