freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-24 11:29:10 +00:00

Author	SHA1	Message	Date
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. 
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Robert Watson	237fdd787b	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
Pawel Jakub Dawidek	2b1c6615bc	Fix mmap(2) on ZFS after some changes in VM subsystem. Submitted by: alc Reported by: kris (originally) and many others Tested with: fsx MFC after: 1 week	2008-03-15 23:23:04 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	628f51d275	Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead of spreading bogus stubs all around: - VN_LOCK_AREC() allows lock recursion for a specified vnode - VN_LOCK_ASHARE() allows lock sharing for a specified vnode In FFS land: - BUF_AREC() allows lock recursion for a specified buffer lock - BUF_NOREC() disallows recursion for a specified buffer lock Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.	2008-02-24 16:38:58 +00:00
Pawel Jakub Dawidek	79bc018dd7	- Reduce how much ZFS caches by default. This is another change to mitigate 'kmem_map too small panics'. - Print two warnings if there is not enough memory and not enough address space. - Improve comment.	2008-01-24 11:24:16 +00:00
Pawel Jakub Dawidek	44ce1efd91	Change type of kmem_used() and kmem_size() functions to uint64_t, so it doesn't overflow in arc.c in this check: if (kmem_used() > (kmem_size() * 4) / 5) return (1); With this bug ZFS almost doesn't cache. Only 32bit machines are affected that have vm.kmem_size set to values >=1GB. Reported by: David Taylor <davidt@yadt.co.uk>	2008-01-24 11:21:54 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unnecessary extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so the version bump and manpage update will be committed later. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and, in particular, removes an annoying dependency, helping the next lockmgr() cleanup. The KPI, obviously, changes. Manpage and FreeBSD_version will be updated through further commits. As a side note, the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
John Birrell	35a04710d7	Remove some compatibility stuff that we now get from the Solaris header.	2007-11-29 00:15:08 +00:00
John Birrell	b468fe2bce	* Check endianness the FreeBSD way. * Use LBOLT rather than lbolt to avoid a clash with a FreeBSD global variable.	2007-11-28 22:16:00 +00:00
John Birrell	9587fed572	Fix a prototype definition.	2007-11-28 22:13:28 +00:00
John Birrell	da9085a1c0	Check endianness the FreeBSD way.	2007-11-28 22:12:21 +00:00
John Birrell	47b288c152	Include an extra header to get this to compile cleanly.	2007-11-28 22:11:39 +00:00
John Birrell	57438287ab	Add more OpenSolaris compatibility headers.	2007-11-28 21:50:40 +00:00
John Birrell	eca148b637	Remove an extern that is defined elsewhere.	2007-11-28 21:50:05 +00:00
John Birrell	edadde229a	Add compatibility cruft moved from under _SOLARIS_C_SOURCE in sys/types.h	2007-11-28 21:49:16 +00:00
John Birrell	35ba7f225f	Remove a typedef which was just a hack to avoid including vmem.h. That typedef breaks other Solaris code.	2007-11-28 21:48:25 +00:00
John Birrell	773f4e3849	Add a missing volatile so that the code compiles cleanly.	2007-11-28 21:47:09 +00:00
John Birrell	4fc8feafc7	Rename the definition of lbolt to LBOLT to avoid a clash with a global variable in FreeBSD. Until now lbolt in sys/proc.h has been #ifdef'ed out based on _SOLARIS_C_SOURCE, but that is going away now.	2007-11-28 21:44:17 +00:00
Pawel Jakub Dawidek	4d4daf5901	Warn if kmem_map size is set to less than 512MB. Previous warning was a bit pointless, because default is set to something around 300MB and also insufficient. MFC after: 3 days	2007-11-07 14:44:31 +00:00
Pawel Jakub Dawidek	232a80f675	Remove unused header. MFC after: 3 days	2007-11-05 22:18:34 +00:00
Pawel Jakub Dawidek	a33b7a8f5f	If setting a state to anything but open state, close access to vdev. This fixes replacing drive in place, eg. zpool replace tank da1 da1. Before it complained that device is already open. MFC after: 1 week	2007-11-05 21:30:48 +00:00
Pawel Jakub Dawidek	171eb887e9	Remove "zfs:" prefix from lock and condvar names and also skip non-letter characters (mostly "&"). Because top(1) shows only first six characters of wait channel, without this change we saw only one meaningful character. Requested by: kris & others MFC after: 1 week	2007-11-05 18:40:55 +00:00
Ulf Lilleengen	6509baf851	- Add sysctl for sizeof(znode_t), which will be used by fstat(1). Approved by: pjd (mentor)	2007-11-02 00:35:05 +00:00
Pawel Jakub Dawidek	ef2d58b58f	Call zil_commit() (if ZIL is not disabled) after every non-read request (BIO_WRITE and BIO_FLUSH) as it is done in Solaris. The difference is that Solaris calls it only for sync requests, but we can't tell in GEOM if the request is sync or async, so we do it for every request. MFC after: 1 week	2007-11-01 11:04:21 +00:00
Pawel Jakub Dawidek	4f2398ea17	- Move crfree() outside MNT_ILOCK()/MNT_IUNLOCK() to eliminate a LOR: 1st 0xc4cea568 struct mount mtx (struct mount mtx) @ /usr/src/sys/modules/zfs/../../compat/opensolaris/kern/opensolaris_vfs.c:209 2nd 0xc3ee9010 sleep mtxpool (sleep mtxpool) @ /usr/src/sys/kern/kern_resource.c:1266 - Move crdup() outside MNT_ILOCK()/MNT_IUNLOCK(), as it can sleep. Reported by: Olli Hauer <ohauer@gmx.de> MFC after: 3 days	2007-11-01 08:58:29 +00:00
Julian Elischer	3745c395ec	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.	2007-10-20 23:23:23 +00:00
Andrew Thompson	1fe1be1535	ZFS_LOG adds a newline by itself. Pointed out by: pjd	2007-10-14 16:14:32 +00:00
Andrew Thompson	9528621759	Print the ZFS ereport to the console if vfs.zfs.debug is set, to help diagnose problems with zfs-on-root since devd isn't running yet. Reviewed by: pjd	2007-10-14 07:58:50 +00:00
Pawel Jakub Dawidek	e8bd23b460	Fix lock leak leading to the 'System call <name> returning with 1 locks held' panic. Reported by: kris Approved by: re (kensmith)	2007-10-04 17:51:59 +00:00
Pawel Jakub Dawidek	a95a61fc19	Now that we have CDDLed code in the tree, add CDDL license. Discussed with: core Approved by: re (kensmith)	2007-09-23 07:04:50 +00:00
Pawel Jakub Dawidek	a3c8c2e60f	Reduce the limit of vnodes on i386 when ZFS is loaded to 3/4 of the original value, so we don't run out of KVA. The default vnodes limit fits better for UFS, but ZFS allocates more file system specific memory for a vnode than UFS. Don't touch the vnodes limit if we detect it was tuned by the system administrator, and restore the original value when ZFS is unloaded. This isn't a final fix, but before we implement something better, this will help to stabilize ZFS under heavy load on i386. Approved by: re (bmah)	2007-09-10 19:58:14 +00:00
Pawel Jakub Dawidek	ef0ffc1c6f	After dfr@ vnode leak fix, we can allow ARC to consume more memory. Tested by: kris Approved by: re (bmah)	2007-09-10 18:12:27 +00:00
Pawel Jakub Dawidek	6bc581fcf0	Use CTLFLAG_RDTUN for tunable sysctls. Approved by: re (bmah)	2007-09-01 06:23:42 +00:00
Pawel Jakub Dawidek	70eaa4219c	Some ZFS threads need a stack larger than the default 8kB, so use 16kB of alternate stack if the default is smaller than 16kB. Approved by: re (rwatson)	2007-08-16 20:33:20 +00:00
Pawel Jakub Dawidek	aa222db26f	Update assertion after revision 1.23. Reviewed by: dfr Approved by: re (rwatson)	2007-07-24 15:00:43 +00:00
Doug Rabson	2dc26b36c8	Correct a reference-counting mistake in the ZFS code which led to abnormal memory usage and pessimal cache performance. Reviewed by: pjd Approved by: re (rwatson)	2007-07-09 09:03:49 +00:00
Doug Rabson	7761242694	In zfs_vget, if we fail to translate an inode number to the corresponding vnode, make sure we return an error code to the caller. Reviewed by: pjd Approved by: re	2007-06-27 12:00:24 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Marcel Moolenaar	6d63683c41	Add my copyright. Requested by: pjd@	2007-06-08 16:20:03 +00:00
Pawel Jakub Dawidek	3b7917d766	- Reduce number of atomic operations needed to be implemented in asm by implementing some of them using existing ones. - Allow to compile ZFS on all archs and use atomic operations surrounded by global mutex on archs we don't have or can't have all atomic operations needed by ZFS.	2007-06-08 12:35:47 +00:00
Pawel Jakub Dawidek	083c4dd695	Missing atomic operations for ZFS/ia64. Submitted by: marcel	2007-06-08 12:26:30 +00:00
David Malone	041b706b2f	Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int. Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples. In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.	2007-06-04 18:25:08 +00:00
Pawel Jakub Dawidek	b166b92692	Reimplement traverse() helper function: 1. Pass locking flags to VFS_ROOT(). 2. Check v_mountedhere while the vnode is locked. 3. Always return locked vnode on success. Change 1 fixes problem reported by Stephen M. Rumble - after zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode unconditionally and traverse() didn't pass the right lock type to VFS_ROOT(). The result was that the kernel panicked when the .zfs/ directory was accessed via NFS.	2007-06-04 11:31:46 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Pawel Jakub Dawidek	5750956634	Adjust va_mask for setattr. FreeBSD doesn't have va_mask, so we initialize it based on individual fields being set. This doesn't work for setattr replay, because va_type is set there, so we add AT_TYPE flag to va_mask, which won't be accepted by zfs_setattr(). Reported by: kris	2007-05-28 02:37:43 +00:00
Pawel Jakub Dawidek	a906fff9c5	Because we allocate componentname structures on stack, bzero() them before use just in case.	2007-05-28 00:26:20 +00:00
Pawel Jakub Dawidek	0d99488ded	There are too many false positive LORs reported by WITNESS, so when ZFS debug is turned off, initialize locks with NOWITNESS flag. At some point I'll get back to them, we would probably need BLESSING functionality, which is currently turned off by default.	2007-05-26 21:37:14 +00:00
Pawel Jakub Dawidek	fbd08bbe6a	DNLC_NO_VNODE can't be NULL. Reported by: ru	2007-05-24 13:44:45 +00:00
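
Commit 44ce1efd91 above describes an overflow in the check `kmem_used() > (kmem_size() * 4) / 5` on 32-bit machines. The following is a minimal sketch of that arithmetic only, not the code from arc.c: the helper names `threshold32`, `threshold64`, `needs_reclaim32` and `needs_reclaim64` are hypothetical, and `uint32_t` stands in for whatever 32-bit-wide type the functions returned before the change.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the "4/5 of kmem_size" threshold.
 * They only illustrate the arithmetic; they are not FreeBSD functions.
 */

/* Threshold computed with a 32-bit type: the multiplication can wrap. */
static uint32_t
threshold32(uint32_t kmem_size)
{
	uint32_t scaled = kmem_size * 4u;	/* wraps modulo 2^32 */

	return (scaled / 5);
}

/* Same threshold computed with a 64-bit type: no wrap for these sizes. */
static uint64_t
threshold64(uint64_t kmem_size)
{
	return ((kmem_size * 4) / 5);
}

/* Returns 1 when usage exceeds the threshold, mirroring the quoted check. */
static int
needs_reclaim32(uint32_t used, uint32_t size)
{
	return (used > threshold32(size));
}

static int
needs_reclaim64(uint64_t used, uint64_t size)
{
	return (used > threshold64(size));
}
```

With a size of exactly 1GB (2^30 bytes), the 32-bit product is 2^32, which wraps to 0, so the threshold becomes 0 and any non-zero usage exceeds it. That matches the commit's note that affected machines almost never cached. The 64-bit version yields a threshold of 858993459 bytes for the same input.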

125 Commits