freebsd

mirror of https://git.FreeBSD.org/src.git synced 2024-12-30 12:04:07 +00:00

Author	SHA1	Message	Date
Martin Matuska	62c55b6d08	Fix unable to remove a file over NFS after hitting refquota limit OpenSolaris onnv-revision: 8890:8c2bd5f17bf2 Obtained from: OpenSolaris (Bug ID 6798878) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:18:29 +00:00
John Baldwin	3aa6d94e0c	Update several places that iterate over CPUs to use CPU_FOREACH().	2010-06-11 18:46:34 +00:00
Martin Matuska	711bf9bcf1	Fix freeing space after deleting large files with holes. OpenSolaris onnv revision: 9950:78fc41aa9bc5 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6792701) MFC after: 3 days	2010-06-03 11:08:46 +00:00
Martin Matuska	dc5d34e454	Fix ZIL close when doing zfs rollback or zfs receive on a mounted dataset. The fix is a partial import and merge of OpenSolaris onnv revisions 8227:f7d7be9b1f56 and 9292:e112194b5b73 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6798298) MFC after: 3 days	2010-06-01 08:43:46 +00:00
Pawel Jakub Dawidek	510ec358c5	Fix a bug where resilver is not started automatically on pool import or load. If disk was missing on pool load or import and on next pool load or import it was present, resilver wasn't started automatically and ZFS reported all disks as ONLINE and healthy. Then, when another disk died, pool became inaccessible, because if it was 2-way mirror or RAIDZ1 two vdevs were out of sync. To fix the problem, start resilver automatically on pool load or import. Obtained from: OpenSolaris MFC after: 3 days	2010-05-31 23:17:45 +00:00
Pawel Jakub Dawidek	b1c7417cd8	Fix panic when reading label from provider with non power of 2 sector size. Reported by: James R. Van Artsdalen <james-freebsd-fs2@jrv.org> MFC after: 3 days	2010-05-31 23:11:43 +00:00
Martin Matuska	dd85b12982	Remove kstat.zfs.arcstats.l2_write_bytes_written The arcstats.l2_write_bytes_written kstat counter introduced in r205231 duplicated the vendor's arcstats.l2_write_bytes counter imported in r208373 (OpenSolaris revision 8582:df9361868dbe) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-05-23 21:16:34 +00:00
Martin Matuska	5b170d55ae	Fix zfs receive temporarily changing unchanged stream properties. Fix possible panic with zfs_enable_datasets. OpenSolaris onnv revision: 8536:33bd5de3260e Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6748561, 6757075) MFC after: 3 days	2010-05-23 21:02:43 +00:00
Pawel Jakub Dawidek	4e8c7af455	Create UMA zones unconditionally. MFC after: 3 days	2010-05-23 19:10:06 +00:00
Pawel Jakub Dawidek	a95add4cf8	Remove ZIO_USE_UMA from arc.c as well. MFC after: 3 days	2010-05-23 18:42:33 +00:00
Konstantin Belousov	afe1a68827	Reorganize syscall entry and leave handling. Extend struct sysvec with three new elements: sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-dependent (this might be reconsidered after all architectures are converted). sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value. sv_syscallnames - the table of syscall names. Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval(). The new functions syscallenter(9) and syscallret(9) are provided that use sv_syscall pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers. Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and sets up return frame. The end of syscall bookkeeping is done by syscallret(). Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls. The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation. Reviewed by: jhb, marcel, marius, nwhitehorn, stas Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips) MFC after: 1 month	2010-05-23 18:32:02 +00:00
Martin Matuska	55a381515b	Fix kernel panic when calling spa_tryimport() on a corrupted pool. OpenSolaris onnv revision: 8680:005fe27123ba Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6786321) MFC after: 1 day	2010-05-23 10:13:11 +00:00
Martin Matuska	e3fffd1a9f	Fix mutex_exit misorder that can cause a kernel panic. OpenSolaris onnv revision: 8667:5c308a17eb7c Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6795440) MFC after: 1 day	2010-05-23 10:08:05 +00:00
Martin Matuska	7838815ebb	Update L2ARC code and fix several bugs. - improve ARC memory consumption (Bug ID 6488341) - ARC/L2ARC metadata accounting (Bug ID 6748019) - L2ARC turbo warmup (Bug ID 6748023) - kstats for ARC content (Bug ID 6748023) - kstats for evicted bytes from ARC by L2ARC state (Bug ID 6871680) - fix panic on i386 systems (Bug ID 6821260) OpenSolaris onnv revisions: 8582:df9361868dbe, 8628:97dcded6e556, 9215:7c4584f76b47, 9274:a10f8bd993c1, 10357:29060492b29d OpenSolaris Bug IDs: 6748019, 6748023, 6748030, 6488341, 6798268, 6821260, 6790261, 6871680 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (multiple bug IDs) MFC after: 3 days	2010-05-21 09:52:49 +00:00
Martin Matuska	370227d241	Reorder some already introduced locking variables. OpenSolaris onnv revision: 8214:d7abf7c1f1c1 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6747934) MFC after: 3 days	2010-05-21 09:35:28 +00:00
Martin Matuska	911e1f9b1d	Fix stack overflow in zfs send. OpenSolaris onnv-revision: 8012:8ea30813950f Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6765626) MFC after: 3 days	2010-05-21 08:55:18 +00:00
Martin Matuska	8b2bc083b9	Fix: vdev_reopen() can lead to failed allocations OpenSolaris onnv-revision: 7980:589f37f25048 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6764914) MFC after: 3 days	2010-05-21 08:50:34 +00:00
Pawel Jakub Dawidek	2b3d97b81d	Fix userland build by making io_task available only for the kernel and by providing taskq_dispatch_safe() macro. MFC after: 1 week	2010-05-16 19:44:08 +00:00
Pawel Jakub Dawidek	ed3c664257	Allow to configure UMA usage for ZIO data via loader and turn it on by default for amd64. On i386 I saw performance degradation when UMA was used, but for amd64 it should help. MFC after: 3 days	2010-05-16 15:14:59 +00:00
Pawel Jakub Dawidek	cfb3e98d37	Add task structure to zio and use it instead of allocating one. This eliminates the only place where we can sleep when calling zio_interrupt(). As a side-effect this can actually improve performance a little as we allocate one less thing for every I/O. Prodded by: kib MFC after: 1 week	2010-05-16 15:12:34 +00:00
Pawel Jakub Dawidek	ea478cb1da	The whole point of having dedicated worker thread for each leaf VDEV was to avoid calling zio_interrupt() from geom_up thread context. It turns out that when provider is forcibly removed from the system and we kill worker thread there can still be some ZIOs pending. To complete pending ZIOs when there is no worker thread anymore we still have to call zio_interrupt() from geom_up context. To avoid this race just remove use of worker threads altogether. This should be more or less fine, because I also thought that zio_interrupt() does more work, but it only makes small UMA allocation with M_WAITOK. It also saves one context switch per I/O request. PR: kern/145339 Reported by: Alex Bakhtin <Alex.Bakhtin@gmail.com> MFC after: 1 week	2010-05-16 11:56:42 +00:00
Martin Matuska	ee56d88b76	Fix deadlock between zfs_dirent_lock and zfs_rmdir OpenSolaris onnv revision: 11321:506b7043a14c Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6847615) MFC after: 3 days	2010-05-16 07:46:03 +00:00
Martin Matuska	db708a6e2c	Fix performance problem with ZFS prefetch caching [1] Add statistics for ZFS prefetch (sysctl kstat.zfs.misc.zfetchstats) Partial import of OpenSolaris onnv revision 10474:0e96dd3b905a Reported by: jhell@dataix.net (private e-mail) [1] Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6859997, 6868951) MFC after: 3 days	2010-05-16 07:16:28 +00:00
Martin Matuska	bef629c14d	Fix ZIL-related panic on zfs rollback. OpenSolaris onnv-revision: 8746:e1d96ca6808c Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6796377) MFC after: 1 week	2010-05-13 20:55:58 +00:00
Martin Matuska	c43d127a9a	Import OpenSolaris revision 7837:001de5627df3 It includes the following changes: - parallel reads in traversal code (Bug ID 6333409) - faster traversal for zfs send (Bug ID 6418042) - traversal code cleanup (Bug ID 6725675) - fix for two scrub related bugs (Bug ID 6729696, 6730101) - fix assertion in dbuf_verify (Bug ID 6752226) - fix panic during zfs send with i/o errors (Bug ID 6577985) - replace P2CROSS with P2BOUNDARY (Bug ID 6725680) List of OpenSolaris Bug IDs: 6333409, 6418042, 6757112, 6725668, 6725675, 6725680, 6725698, 6729696, 6730101, 6752226, 6577985, 6755042 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 1 week	2010-05-13 20:32:56 +00:00
Edward Tomasz Napierala	4e28b70950	Add missing check to prevent local users from panicking the kernel by trying to set malformed ACL. MFC after: 3 days	2010-05-13 15:31:00 +00:00
Martin Matuska	f2d1218cbe	Fix possible hang when replaying large truncations. OpenSolaris onnv revision: 7904:6a124a4ca9c5 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6761624) MFC after: 3 days	2010-05-12 09:51:57 +00:00
Pawel Jakub Dawidek	c60c36a745	I added vfs_lowvnodes event, but it was only used for a short while and now it is totally unused. Remove it. MFC after: 3 days	2010-05-11 22:46:36 +00:00
Pawel Jakub Dawidek	8423e00b36	Even though r203504 eliminates taste traffic provoked by vdev_geom.c, ZFS still likes to open all vdevs, close them and open them again, which in turn provokes taste traffic anyway. I don't know of any clean way to fix it, so do it the hard way - if we can't open provider for writing just retry 5 times with 0.5 pauses. This should eliminate accidental races caused by other classes tasting providers created on top of our vdevs. MFC after: 3 days Reported by: James R. Van Artsdalen <james-freebsd-fs2@jrv.org> Reported by: Yuri Pankov <yuri.pankov@gmail.com>	2010-05-11 22:29:00 +00:00
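
The retry approach described in the 8423e00b36 message can be sketched in plain userland C. This is an illustration only, not the code from vdev_geom.c: `open_with_retry`, `fake_provider`, `fake_open`, and `fake_pause` are hypothetical names, the error value 16 merely mimics EBUSY, and the pause is a callback so the caller decides how long to wait.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a provider that stays busy for a few attempts. */
struct fake_provider {
	int busy_for;	/* number of opens that will still fail */
	int attempts;	/* opens tried so far */
	int pauses;	/* pauses taken so far */
};

/* Returns 0 on success or a nonzero errno-style value (16 mimics EBUSY). */
static int
fake_open(void *ctx)
{
	struct fake_provider *p = ctx;

	p->attempts++;
	if (p->busy_for > 0) {
		p->busy_for--;
		return (16);
	}
	return (0);
}

/* A real caller would sleep here; the sketch only counts the pauses. */
static void
fake_pause(void *ctx)
{
	struct fake_provider *p = ctx;

	p->pauses++;
}

/*
 * Try to open up to max_tries times, pausing between failed attempts.
 * Returns 0 as soon as one attempt succeeds, otherwise the last error.
 */
static int
open_with_retry(int (*try_open)(void *), void (*pause_fn)(void *),
    void *ctx, int max_tries)
{
	int error = -1;

	assert(try_open != NULL && pause_fn != NULL && max_tries > 0);
	for (int i = 0; i < max_tries; i++) {
		error = try_open(ctx);
		if (error == 0)
			break;
		if (i + 1 < max_tries)
			pause_fn(ctx);	/* no pause after the final attempt */
	}
	return (error);
}
```

With `max_tries` set to 5, a provider that is released within the first four failed attempts is opened successfully; one that stays busy longer still fails with the last error.
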
Pawel Jakub Dawidek	204b20d932	Add missing new line characters to the warnings. MFC after: 3 days	2010-05-11 22:23:35 +00:00
Martin Matuska	8c04b2242e	Fix failed assertion on destroying datasets from an older pool version. OpenSolaris onnv revision: 9390:887948510f80 PR: kern/146471 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6826861) MFC after: 3 days	2010-05-11 09:26:46 +00:00
Martin Matuska	431905576e	Fix possible panic with zfs destroy. OpenSolaris onnv revision: 8779:f164e0e90508 PR: kern/146471 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6784924) MFC after: 3 days	2010-05-11 09:23:46 +00:00
Martin Matuska	bb8b966850	Fix zfs rename (may occasionally fail with dataset busy). OpenSolaris onnv revision: 8517:41a0783dde17 PR: kern/146471 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6784757) MFC after: 3 days	2010-05-11 09:19:41 +00:00
Martin Matuska	dbbd1505bf	Fix endianness bug in ZFS intent log (ZIL). OpenSolaris onnv revision: 8109:6147a1bdd359 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6760048) MFC after: 3 days	2010-05-11 07:25:13 +00:00
Edward Tomasz Napierala	dc510c105f	Enforce RLIMIT_FSIZE in ZFS. Reviewed by: pjd@	2010-05-07 14:30:21 +00:00
Marius Strobl	626b7c61f8	- Fix broken symlinks on cross platform zfs send/recv. [1] - Enable zfs_ace_byteswap() on FreeBSD as it works just fine (tested between amd64 and sparc64 in both directions by Michael Moll). PR: 146272 Approved by: mm, pjd Obtained from: OpenSolaris (onnv rev. 8283:1ca59f393041; Bug ID 6764193) [1] MFC after: 3 days	2010-05-05 22:15:20 +00:00
Martin Matuska	d75554ec04	Introduce hardforce export option (-F) for "zpool export". When exporting with this flag, zpool.cache remains untouched. OpenSolaris onnv revision: 8211:32722be6ad3b Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID: 6775357)	2010-05-05 18:22:29 +00:00
Martin Matuska	7d4daf9a10	Speed up ZFS list operation with objset prefetching. Partial import of OpenSolaris onnv revisions: 8415:8809e849f63e, 10474:0e96dd3b905a PR: kern/146297 Submitted by: myself Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6386929, 6755389, 6847118) MFC after: 2 weeks	2010-05-04 17:40:24 +00:00
Martin Matuska	77a7f64749	Fix deadlock during zfs receive. OpenSolaris onnv revision: 9299:8809e849f63e PR: kern/146296 Submitted by: myself Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6783818, 6826836) MFC after: 1 week	2010-05-04 17:30:07 +00:00
Martin Matuska	df04ddbaa6	Add sysctl and loader tunable vfs.zfs.txg.write_limit_override. This tunable improves fine-tuning of ZFS write throttling. PR: kern/146108 Suggested by: Nikolay Denev <ndenev at gmail.com> Approved by: pjd, delphij (mentor) MFC after: 2 weeks	2010-05-01 20:44:37 +00:00
Martin Matuska	9ccdc9600e	Change description of tunable group vfs.zfs.txg to be more understandable. Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-05-01 19:53:15 +00:00
Martin Matuska	d8665eb1f6	Fix improper pool write throughput calculation. OpenSolaris onnv revision: 9366:17553395a745 PR: kern/146108 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris, Bug ID 6817339 MFC after: 2 weeks	2010-04-30 07:48:29 +00:00
Pawel Jakub Dawidek	b19b0de471	Backport fix for 'zfs_znode_dmu_init: existing znode for dbuf' panic from OpenSolaris. PR: kern/144402 Reported by: Alex Bakhtin <alex.bakhtin@gmail.com> Tested by: Alex Bakhtin <alex.bakhtin@gmail.com> Obtained from: OpenSolaris, Bug ID 6895088 MFC after: 3 days	2010-04-28 18:29:48 +00:00
Pawel Jakub Dawidek	7af9c09a61	Allow to modify directory's content even if the ZFS_NOUNLINK (SF_NOUNLINK, sunlnk) flag is set. We only deny directory's removal or rename. PR: kern/143343 Reported by: marck MFC after: 3 days	2010-04-22 18:47:23 +00:00
Rui Paulo	ff569d8436	Rename the cyclic global variable lapic_cyclic_clock_func to just cyclic_clock_func. This will make more sense when we start developing non x86 cyclic version.	2010-04-20 17:03:30 +00:00
Rui Paulo	b527a2a4dc	The amd64 version of the cyclic dtrace module is a verbatim copy of the i386 version, so instead having a copy of the same file, use Makefile foo to include the i386 version on amd64.	2010-04-20 16:30:17 +00:00
Xin LI	0e568ab25c	Partially MFp4 #176265 by pjd@: - Properly initialize and destroy system_taskq. - Add a dummy implementation of taskq_create_proc(). Note: We do not currently use system_taskq in ZFS so this is mostly a no-op at this time. Proper system_taskq initialization is required by newer ZFS code. Ok'ed by: pjd MFC after: 2 weeks	2010-04-19 09:03:36 +00:00
Pawel Jakub Dawidek	bd7226a572	Restore previous order.	2010-04-18 12:43:33 +00:00
Pawel Jakub Dawidek	224329fb6b	Style fixes.	2010-04-18 12:36:53 +00:00
Pawel Jakub Dawidek	eb998be67d	Add missing list and lock destruction.	2010-04-18 12:27:07 +00:00
Pawel Jakub Dawidek	ad3cb80827	Extend locks scope to match OpenSolaris.	2010-04-18 12:25:40 +00:00
Pawel Jakub Dawidek	57a81a8bbc	Remove racy assertion. Obtained from: OpenSolaris	2010-04-18 12:21:52 +00:00
Pawel Jakub Dawidek	5195ca2307	Set ARC_L2_WRITING on L2ARC header creation. Obtained from: OpenSolaris	2010-04-18 12:20:33 +00:00
Pawel Jakub Dawidek	40c7da090f	Fix 3-way deadlock that can happen because of ZFS and vnode lock order reversal. thread0 (vfs_fhtovp) thread1 (vop_getattr) thread2 (zfs_recv) -------------------- --------------------- ------------------ vn_lock rrw_enter_read rrw_enter_write (hangs) rrw_enter_read (hangs) vn_lock (hangs) Submitted by: Attila Nagy <bra@fsn.hu> MFC after: 3 days	2010-04-15 16:40:54 +00:00
Pawel Jakub Dawidek	58804a192e	The same code is used to import and to create pool. The order of operations is the following: 1. Try to open vdev by remembered path and guid. 2. If 1 failed, try to find vdev which guid matches and ignore the path. 3. If 2 failed this means either that the vdev we're looking for is gone or that pool is being created and vdev doesn't contain proper guid yet. To be able to handle pool creation we open vdev by path anyway. Because of 3 it is possible that we open wrong vdev on import which can lead to confusions. The solution for this is to check spa_load_state. On pool creation it will be equal to SPA_LOAD_NONE and we can open vdev only by path immediately and if it is not equal to SPA_LOAD_NONE we first open by path+guid and when that fails, we open by guid. We no longer open wrong vdev on import. MFC after: 2 weeks	2010-03-19 20:14:27 +00:00
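
The lookup order that the 58804a192e message describes can be sketched as follows. This is a simplified userland model, not the FreeBSD implementation: `sketch_provider`, `sketch_find`, `sketch_open_vdev`, and the `sketch_load_state` enum are hypothetical stand-ins, with `SKETCH_LOAD_NONE` playing the role of SPA_LOAD_NONE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for spa_load_state. */
enum sketch_load_state { SKETCH_LOAD_NONE, SKETCH_LOAD_OPEN };

/* Hypothetical provider record: a device path plus the guid in its label. */
struct sketch_provider {
	const char *path;
	uint64_t guid;
};

/* Return the first provider matching the enabled criteria, or NULL. */
static const struct sketch_provider *
sketch_find(const struct sketch_provider *provs, size_t n,
    const char *path, int use_path, uint64_t guid, int use_guid)
{
	for (size_t i = 0; i < n; i++) {
		if (use_path && strcmp(provs[i].path, path) != 0)
			continue;
		if (use_guid && provs[i].guid != guid)
			continue;
		return (&provs[i]);
	}
	return (NULL);
}

static const struct sketch_provider *
sketch_open_vdev(const struct sketch_provider *provs, size_t n,
    const char *path, uint64_t guid, enum sketch_load_state state)
{
	const struct sketch_provider *p;

	assert(path != NULL);
	/* Pool creation: no valid guid on disk yet, so open by path only. */
	if (state == SKETCH_LOAD_NONE)
		return (sketch_find(provs, n, path, 1, 0, 0));
	/* Import or load: path+guid first, then guid alone, never path alone. */
	p = sketch_find(provs, n, path, 1, guid, 1);
	if (p == NULL)
		p = sketch_find(provs, n, path, 0, guid, 1);
	return (p);
}
```

The point of the design is that a path-only match is accepted only while a pool is being created, so an import cannot pick up an unrelated device that happens to sit at the remembered path.
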
Kip Macy	e577b0b2e3	- cache line align arcs_lock array (h/t Marius Nuennerich) - fix ARCS_LOCK_PAD to use architecture defined CACHE_LINE_SIZE - cache line align buf_hash_table ht_locks array MFC after: 7 days	2010-03-17 21:10:09 +00:00
Kip Macy	07c5b1686e	use CACHE_LINE_SIZE instead of hardcoding 128 for lock pad pointed out by Marius Nuennerich and jhb@	2010-03-17 20:00:22 +00:00
Kip Macy	285738b6ad	- reduce contention by breaking up ARC state locks in to 16 for data and 16 for metadata - export L2ARC tunables as sysctls - add several kstats to track L2ARC state more precisely - avoid holding a contended lock when atomically incrementing a contended counter (no lock protection needed for atomics)	2010-03-16 22:17:21 +00:00
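
The lock splitting in 285738b6ad, together with the cache-line alignment from the two follow-up commits, is a form of lock striping. A minimal sketch under stated assumptions: the names are hypothetical, a single array of 16 locks is shown (the commit uses separate sets for data and metadata), the cache line is assumed to be 64 bytes rather than taken from CACHE_LINE_SIZE, and a C11 atomic spin flag stands in for a kernel mutex.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE	64	/* assumed; real code uses CACHE_LINE_SIZE */
#define SKETCH_NUM_LOCKS	16

/* Aligning the member pads each lock out to its own cache line. */
struct sketch_lock {
	alignas(SKETCH_CACHE_LINE) atomic_bool held;
};

/* Static storage is zero-initialized, so every lock starts out free. */
static struct sketch_lock sketch_locks[SKETCH_NUM_LOCKS];

/* Map a key (for example a buffer hash) to one of the lock stripes. */
static size_t
sketch_lock_index(uint64_t key)
{
	return ((size_t)(key % SKETCH_NUM_LOCKS));
}

/* Returns true if the stripe for this key was free and is now held. */
static bool
sketch_trylock(uint64_t key)
{
	return (!atomic_exchange(&sketch_locks[sketch_lock_index(key)].held,
	    true));
}

static void
sketch_lock(uint64_t key)
{
	while (!sketch_trylock(key))
		;	/* spin; a kernel mutex would block instead */
}

static void
sketch_unlock(uint64_t key)
{
	atomic_store(&sketch_locks[sketch_lock_index(key)].held, false);
}
```

Keys that map to different stripes no longer contend with each other, and the padding keeps two stripes from sharing a cache line.
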
Kip Macy	03af82ac5e	fix compilation under ZIO_USE_UMA	2010-03-13 21:52:21 +00:00
Kip Macy	181c6ae3f0	Don't bottleneck on acquiring the stream locks - this avoids a massive drop off in throughput with large numbers of simultaneous reads MFC after: 7 days	2010-03-13 21:41:52 +00:00
Pawel Jakub Dawidek	3a98b0c4df	Remove bogus assertion. Reported by: Johan Ström <johan@stromnet.se> Obtained from: OpenSolaris, Bug ID 6827260 MFC after: 1 week	2010-03-12 12:07:21 +00:00
Pawel Jakub Dawidek	5b2e8d582f	Remove racy assertion. Reported by: Attila Nagy <bra@fsn.hu> Obtained from: OpenSolaris, Bug ID 6827260 MFC after: 1 week	2010-03-06 20:03:26 +00:00
Marcel Moolenaar	d49ebec408	Use mf and not mf.a. The latter doesn't force memory ordering and applies to sequential memory.	2010-02-22 01:24:34 +00:00
Pawel Jakub Dawidek	251294bca9	Don't set f_bsize to recordsize. It might confuse some software (like squid). Submitted by: Alexander Zagrebin <alexz@visp.ru> MFC after: 2 weeks	2010-02-19 20:18:16 +00:00
Pawel Jakub Dawidek	bbd388268e	Add tunable and sysctl to skip hostid check on pool import.	2010-02-18 22:31:43 +00:00
Xin LI	d0d444da07	Remove two files that are not needed by FreeBSD. Approved by: pjd MFC after: 2 weeks	2010-02-05 23:17:59 +00:00
Pawel Jakub Dawidek	9d3f36a309	Open provider for writing when we find the right one. Opening too many providers for writing provokes huge traffic related to taste events sent by GEOM on close. This can lead to various problems with opening GEOM providers that are created on top of other GEOM providers. Reported by: Kurt Touet <ktouet@gmail.com>, mr Tested by: mr, Baginski Darren <kickbsd@ya.ru> MFC after: 2 weeks	2010-02-04 21:11:44 +00:00
Xin LI	63243c5c71	On FreeBSD, time_t is 64-bit for all platforms except i386 and powerpc, where the type is 32-bit. ZFS can handle 64-bit timestamp internally but zfs_setattr() would check if the time value can fit, we change the checking macros to match 64-bit timestamp if the platform supports it. This change has some downsides like, while you can import zfs on 32-bit platforms, the timestamp would overflow if they are out of the range. This fixes the Y2.038K issue on platforms using 64-bit timestamps. Reviewed by: pjd MFC after: 1 month	2010-01-25 07:52:54 +00:00
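
The kind of range check that the 63243c5c71 message refers to can be illustrated as follows. This is a hedged sketch, not the macro used in zfs_setattr(): `sketch_time_fits` is a hypothetical helper, and the width of time_t is passed as a parameter so that both cases can be shown on any machine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Report whether a timestamp in seconds is representable when time_t
 * is time_t_bits wide. A 64-bit time_t accepts every int64_t value;
 * a 32-bit time_t only accepts the signed 32-bit range.
 */
static int
sketch_time_fits(int64_t sec, unsigned time_t_bits)
{
	assert(time_t_bits == 32 || time_t_bits == 64);
	if (time_t_bits == 64)
		return (1);
	return (sec >= INT32_MIN && sec <= INT32_MAX);
}
```

The first second after the 2038 rollover, 2147483648, is rejected for a 32-bit time_t and accepted for a 64-bit one.
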
Xin LI	51b1ec310c	Report ZFS filesystem version instead of the zpool version when we say it. Reported by: Yuri Pankov (on -fs@) Submitted by: delphij Approved by: pjd MFC after: 1 week	2010-01-11 23:15:11 +00:00
Xin LI	017b01f662	Re-apply onnv-gate revisions 7994 and 8986 (corresponds to FreeBSD revision 200726 and 200727). It looks like that the two revisions were not applied in the right sequence, I found this when comparing with the OpenSolaris code. MFC after: 3 days Reviewed by: mm@	2010-01-07 20:10:22 +00:00
Xin LI	03ec6213ee	Instead of assuming all vdevs are healthy, check the newest vdev label for each vdev's status. Booting from a degraded vdev should now be more robust. Submitted by: Matt Reimer <mattjreimer at gmail.com> Sponsored by: VPOP Technologies, Inc. MFC after: 2 weeks	2010-01-06 23:09:23 +00:00
Pawel Jakub Dawidek	49335ba853	Teach the (gpt)zfsboot and zfsloader raidz code to use its buffers more efficiently. Before this patch, in the worst case memory use would increase exponentially on the number of drives in the raidz vdev. Submitted by: Matt Reimer <mattjreimer@gmail.com> Sponsored by: VPOP Technologies, Inc. Silence from: dfr	2010-01-06 22:39:40 +00:00
Xin LI	1ee5de4482	Reduce diff against OpenSolaris - move Giant acquire/release to zfs_znode.c. As a side effect this also eliminates two potential Giant leaks. Approved by: pjd MFC after: 1 month	2010-01-02 23:38:03 +00:00
Xin LI	9189129097	Apply OpenSolaris revision 8012 which brings our zpool to version 14, making it possible for zpools created on OpenSolaris 2009.06 be used on FreeBSD. PR: kern/141800 Submitted by: mm Reviewed by: pjd, trasz Obtained from: OpenSolaris MFC after: 2 weeks	2009-12-28 22:15:11 +00:00
Xin LI	dd0c145752	Apply fix for Solaris bug 6462803: zfs snapshot -r failed because filesystem was busy (onnv revision 8989) Submitted by: mm Approved by: pjd Obtained from: OpenSolaris MFC after: 2 weeks	2009-12-19 11:49:20 +00:00
Xin LI	24a41d7ec6	Apply fix for Solaris bug 6801979: zfs recv can fail with E2BIG (onnv revision 8986) Requested by: mm Submitted by: pjd Obtained from: OpenSolaris MFC after: 2 weeks	2009-12-19 11:47:22 +00:00
Xin LI	775f802393	Apply fix Solaris bug 6462803 zfs snapshot -r failed because filesystem was busy. Submitted by: mm Approved by: pjd MFC after: 2 weeks	2009-12-19 11:43:39 +00:00
Konstantin Belousov	88f2d72947	Change VOP_FSYNC for zfs vnode from VOP_PANIC to zfs_freebsd_fsync(), both to not panic when fsync(2) is called for fifo on zfs filedescriptor, and to actually fsync fifo inode to permanent storage. PR: kern/141177 Reviewed by: pjd MFC after: 1 week	2009-12-05 20:36:42 +00:00
Pawel Jakub Dawidek	dfb903e852	We have to eventually look for provider without checking guid as this is needed for attaching when there is no metadata yet. Before r200125 the order of looking for providers was wrong. It was: 1. Find provider by name. 2. Find provider by guid. 3. Find provider by name and guid. Where it should have been: 1. Find provider by name and guid. 2. Find provider by guid. 3. Find provider by name. MFC after: 1 week	2009-12-05 20:16:28 +00:00
Pawel Jakub Dawidek	6468cb2ce0	Fix deadlock when ZVOLs are present and we are replacing dead component or calling scrub when pool is in a degraded state. It will try to taste ZVOLs, which will lead to deadlock, as ZVOL will try to acquire the same locks as replace/scrub is holding already. We can't simply skip provider based on their GEOM class, because ZVOL can have providers built on top of it and we need to skip those as well. We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give us positive answer and we have to skip those providers. This way we remove possibility to create ZFS pools on top of ZVOLs, but it is not very useful anyway. I believe deadlock is still possible in some very complex situations like when we have MD provider on top of UFS file on top of ZVOL. When we try to replace dead component in the pool mentioned ZVOL is based on, there might be a deadlock when ZFS will try to taste MD provider. There is no easy way to detect that, but it isn't very common. MFC after: 1 week	2009-12-05 14:33:11 +00:00
Pawel Jakub Dawidek	ccba826977	Always check guid when opening by path, because we may end up with provider that does have the same name, but only by accident. MFC after: 1 week	2009-12-05 14:24:22 +00:00
Pawel Jakub Dawidek	29c8c85594	Avoid using additional variable for storing an error if we are not going to do anything with it.	2009-12-05 14:21:42 +00:00
Paul Saab	e44920da1a	Correct another case of not doing 64bit math. This allows mine and other raidz2 volumes to boot. Submitted by: Matt Reimer <mattjreimer@gmail.com>	2009-11-13 02:50:50 +00:00
Pawel Jakub Dawidek	fd9ee28bfc	Be careful which vattr fields are set during setattr replay. Without this fix strange things can appear after unclean shutdown like files with mode set to 07777. Reported by: des MFC after: 3 days	2009-11-10 22:27:33 +00:00
Pawel Jakub Dawidek	56697614cc	Avoid passing invalid mountpoint to getnewvnode(). Reported by: rwatson Tested by: rwatson MFC after: 3 days	2009-11-10 22:25:46 +00:00
Pawel Jakub Dawidek	fd66267ffb	- zfs_zaccess() can handle VAPPEND too, so map V_APPEND to VAPPEND and call zfs_access() instead of vaccess() in this case as well. - If VADMIN is specified with another V* flag (unlikely) call both zfs_access() and vaccess() after splitting V* flags. This fixes "dirtying snapshot!" panic. PR: kern/139806 Reported by: Carl Chave <carl@chave.us> In co-operation with: jh MFC after: 3 days	2009-10-30 23:33:06 +00:00
Robert Noland	14c436e101	Correct some issues with zfs boot. - Teach it to read gang blocks. (essentially untested) If you see "ZFS: gang block detected!", please let me know, so we can either remove the printf if it works, or fix it if it doesn't. - If multiple partitions exist on a disk, probe them all. We also need to reset dsk->start to 0 to read the right sector here. - With GPT, we can have 128 partitions. - If the bootfs property has ever been set on a pool it seems that it never goes away. zpool won't allow you to add to the pool with the bootfs property set. However, if you clear the property back to default we end up getting 0 for the object number and read a bogus block pointer and fail to boot. - Fix some error printfs. The printf in the loader is only capable of c,s and u formats. - Teach printf how to display %llu Reviewed by: dfr, jhb MFC after: 2 weeks	2009-10-23 18:44:53 +00:00
Pawel Jakub Dawidek	c217b20ef6	Allow file system owner to modify system flags if securelevel permits. MFC after: 3 days	2009-10-08 16:05:17 +00:00
Pawel Jakub Dawidek	68c53ef849	File system owner is when uid matches and jail matches. MFC after: 3 days	2009-10-08 16:03:19 +00:00
Pawel Jakub Dawidek	3a6c0cbf26	On FreeBSD it is enough to report provider removal when orphan event is received, we don't have to do it on every ENXIO error in I/O path. Solaris has no GEOM so they have to handle it in a less clean way. MFC after: 3 days	2009-10-07 20:56:15 +00:00
Pawel Jakub Dawidek	2ada529a14	Fix white-spaces. MFC after: 3 days	2009-10-07 20:54:07 +00:00
Pawel Jakub Dawidek	c0103003c0	Fix situation where Mac OS X NFS client creates a file and when it tries to set ownership and mode in the same setattr operation, the mode was overwritten by secpolicy_vnode_setattr(). PR: kern/118320 Submitted by: Mark Thompson <info-gentoo@mark.thompson.bz> MFC after: 3 days	2009-10-07 12:38:19 +00:00
Kip Macy	e6b112e274	Prevent paging pressure from draining arc too much - always drain arc if above arc_c_max - never drain arc if arc is below arc_c_max MFC after: 3 days	2009-10-06 21:40:50 +00:00
Xin LI	6f62807611	Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old format ZFS, as defined in the manual page. Submitted by: pjd (response of my original patch but bugs are mine) MFC after: 3 days	2009-10-01 18:58:26 +00:00
Pawel Jakub Dawidek	ab711589df	Handle cases where virtual (GFS) vnodes are referenced when doing forced unmount. In that case we cannot depend on the proper order of invalidating vnodes, so we have to free resources when we have a chance. PR: kern/139062 Reported by: trasz MFC after: 3 days	2009-09-26 00:10:45 +00:00
Pawel Jakub Dawidek	a0b238644a	On lookup error VFS expects *vpp to be set to NULL, be sure to do that. MFC after: 3 days	2009-09-26 00:08:44 +00:00
Pawel Jakub Dawidek	a99aaff645	Use traverse() function to find and return mount point's vnode instead of covered vnode when snapshot is already mounted. MFC after: 3 days	2009-09-26 00:07:14 +00:00
Pawel Jakub Dawidek	1aba32d9b4	- Don't depend on value returned by gfs_*_inactive(), it doesn't work well with forced unmounts when GFS vnodes are referenced. - Make other preparations to GFS for forced unmounts. PR: kern/139062 Reported by: trasz MFC after: 3 days	2009-09-26 00:04:30 +00:00
Pawel Jakub Dawidek	86758476b4	Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to be a bit weak and OpenSolaris also switched to fletcher4. PR: kern/139072 Reported by: Daniel Grund <bugs@dgrund.de> MFC after: 3 days	2009-09-25 18:19:50 +00:00
Pawel Jakub Dawidek	ad8294cf98	Before calling vflush(FORCECLOSE) mark file system as unmounted so the following vnops will fail. This is very important, because without this change vnode could be reclaimed at any point, even if we increased usecount. The only way to ensure that vnode won't be reclaimed was to lock it, which would be very hard to do in ZFS without changing a lot of code. With this change simply increasing usecount is enough to be sure vnode won't be reclaimed from under us. To be precise it can still be reclaimed but we won't be able to see it, because every try to enter ZFS through VFS will result in EIO. The only function that cannot return EIO, because it is needed for vflush() is zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks z_teardown_lock and never returns EIO. MFC after: 3 days	2009-09-24 15:56:26 +00:00
Pawel Jakub Dawidek	ab9bbf4a2b	Close race in zfs_zget(). We have to increase usecount first and then check for VI_DOOMED flag. Before this change vnode could be reclaimed between checking for the flag and increasing usecount. MFC after: 3 days	2009-09-24 15:49:15 +00:00
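
The ordering fix described in ab9bbf4a2b (take the reference first, then check the doomed flag) can be sketched with C11 atomics. This is a userland illustration only: `sketch_node` and `sketch_node_get` are hypothetical names, and the sketch assumes that whoever reclaims a node marks it doomed before tearing it down and leaves referenced nodes alone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode with a use count and a doomed flag. */
struct sketch_node {
	atomic_int usecount;
	atomic_bool doomed;
};

/*
 * Take a reference before looking at the flag. Checking the flag first
 * would leave a window in which the node is reclaimed between the check
 * and the increment.
 */
static struct sketch_node *
sketch_node_get(struct sketch_node *n)
{
	assert(n != NULL);
	atomic_fetch_add(&n->usecount, 1);
	if (atomic_load(&n->doomed)) {
		/* Too late: drop the reference we just took. */
		atomic_fetch_sub(&n->usecount, 1);
		return (NULL);
	}
	return (n);
}
```

A caller that receives NULL has to look the object up again or report an error, since the node it found is already on its way out.
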
Edward Tomasz Napierala	c40502ccd0	In VOP_SETACL(9) and VOP_GETACL(9), specifying wrong ACL type should result in EINVAL, not EOPNOTSUPP.	2009-09-23 15:09:34 +00:00
Pawel Jakub Dawidek	eb03c3cdfb	Restore BSD behaviour - when creating new directory entry use parent directory gid to set group ownership and not process gid. This was overlooked during v6 -> v13 switch. PR: kern/139076 Reported by: Sean Winn <sean@gothic.net.au> MFC after: 3 days	2009-09-23 09:18:16 +00:00
Pawel Jakub Dawidek	c4be11d7fc	Purge namecache in the same place OpenSolaris does.	2009-09-20 13:28:29 +00:00
Pawel Jakub Dawidek	5469543c92	Purge file system namecache when receiving incremental stream and rolling back to it. MFC after: 3 days	2009-09-17 15:14:28 +00:00
Pawel Jakub Dawidek	3282c51713	Purge namecache for the file system being rolled back, so it doesn't point at invalid vnodes after the rollback resulting in EIO errors when trying to access files which are in the namecache. Reported by: des MFC after: 3 days	2009-09-17 14:58:21 +00:00
Pawel Jakub Dawidek	95f08808b6	Forced unmounts work just fine in my tests under heavy load. There might still be a problem, but it isn't worth a warning.	2009-09-15 11:42:08 +00:00
Pawel Jakub Dawidek	a4e6b460d3	We believe ZFS is ready for production use. Remove a warning about it being experimental. :)	2009-09-15 11:34:53 +00:00
Pawel Jakub Dawidek	63e1d3df27	- Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular df(1) and mount(8) output. This is a bit similar to OpenSolaris and follows ZFS route of not listing snapshots by default with 'zfs list' command. - Add UPDATING entry to note that ZFS snapshots are no longer visible in mount(8) and df(1) output by default. Reviewed by: kib MFC after: 3 days	2009-09-14 21:10:40 +00:00
Pawel Jakub Dawidek	85c171b2e1	Support both cases: when snapshot is already mounted and when it is not yet mounted. MFC after: 3 days	2009-09-13 21:40:36 +00:00
Pawel Jakub Dawidek	8a2c4db0fe	Add missing \n. Reported by: marck	2009-09-13 17:30:56 +00:00
Pawel Jakub Dawidek	7746b6461d	Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories by just returning EOPNOTSUPP. This will allow NFS server to fall back to regular READDIR. Note that converting inode number to snapshot's vnode is an expensive operation. Snapshots are stored in AVL tree, but based on their names, not inode numbers, so to convert inode to snapshot vnode we have to iterate over all snapshots. This is not a problem in OpenSolaris, because in their READDIRPLUS implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on d_fileno as we do. PR: kern/125149 Reported by: Weldon Godfrey <wgodfrey@ena.com> Analysis by: Jaakko Heinonen <jh@saunalahti.fi> MFC after: 3 days	2009-09-13 16:05:20 +00:00
Pawel Jakub Dawidek	7b4a12379b	When zfs.ko is compiled with debug, make sure that znode and vnode point at each other. MFC after: 3 days	2009-09-13 10:33:51 +00:00
Pawel Jakub Dawidek	33a0ef82f2	Extend scope of the z_teardown_lock lock for consistency and "just in case". MFC after: 3 days	2009-09-13 10:29:51 +00:00
Pawel Jakub Dawidek	7dae3c4faf	Be sure not to overflow struct fid. MFC after: 3 days	2009-09-13 10:25:33 +00:00
Pawel Jakub Dawidek	f53901193d	There is a bug where mze_insert() can trigger an assert() of inserting the same entry twice. This bug is not fixed yet, but leads to a situation where the kernel will panic when trying to access a corrupted directory. Until the bug is properly fixed, try to recover from it and log that it happened. Reported by: marck OpenSolaris bug: 6709336 MFC after: 3 days	2009-09-13 10:12:29 +00:00
Pawel Jakub Dawidek	f5516e3d1d	- Protect reclaim with z_teardown_inactive_lock. - Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if z_dbuf field is NULL - this might happen in case of rollback or forced unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete(). - On forced unmount wait for all znodes to be destroyed - destruction can be done asynchronously via zfs_reclaim_complete(). MFC after: 1 week	2009-09-12 19:53:31 +00:00
Pawel Jakub Dawidek	2a8e7dad33	Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain NULL, but also can point to dead vnode, take that into account. PR: kern/132068 Reported by: Edward Fisk <7ogcg7g02@sneakemail.com>, kris Fix based on patch from: Jaakko Heinonen <jh@saunalahti.fi> MFC after: 1 week	2009-09-12 19:27:54 +00:00
Pawel Jakub Dawidek	3770996142	Only log successful commands! Without this fix we log even unsuccessful commands executed by unprivileged users. Action is not really taken, but it is logged to pool history, which might be confusing. Reported by: Denis Ahrens <denis@h3q.com> MFC after: 3 days	2009-09-08 16:40:08 +00:00
Pawel Jakub Dawidek	d6b8039292	We don't export individual snapshots, so mnt_export field in snapshot's mount point is NULL. That's why when we try to access snapshots over NFS use mnt_export field from the parent file system. MFC after: 1 week	2009-09-08 15:57:03 +00:00
Pawel Jakub Dawidek	f148fd9a4a	When we automatically mount snapshot we want to return vnode of the mount point from the lookup and not covered vnode. This is one of the fixes for using .zfs/ over NFS. MFC after: 1 week	2009-09-08 15:51:40 +00:00
Pawel Jakub Dawidek	2391003912	On FreeBSD we don't have to look for snapshot's mount point, because fhtovp method is already called with proper mount point. MFC after: 1 week	2009-09-08 15:42:55 +00:00
Pawel Jakub Dawidek	6f8e88e1da	Call ZFS_EXIT() after locking the vnode. MFC after: 1 week	2009-09-08 15:37:01 +00:00
Konstantin Belousov	211ddddce7	Lock Giant around vn_open_cred(). Remove innocent unnecessary call to NDFREE(). Reported by: marcel Reviewed and tested by: pjd MFC after: 3 days	2009-09-08 09:17:34 +00:00
Pawel Jakub Dawidek	1ea3566294	Fix reference count leak for a case where snapshot's mount point is updated. Such situation is not supported. This problem was triggered by something like this: # zpool create tank da0 # zfs snapshot tank@snap # cd /tank/.zfs/snapshot/snap (this will mount the snapshot) # cd # mount -u nosuid /tank/.zfs/snapshot/snap (refcount leak) # zpool export tank cannot export 'tank': pool is busy MFC after: 1 week	2009-09-08 08:54:15 +00:00
Pawel Jakub Dawidek	28e449adf2	If we have to use avl_find(), optimize a bit and use avl_insert() instead of avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()). Fix similar case in the code that is currently commented out.	2009-09-07 21:58:54 +00:00
Pawel Jakub Dawidek	3f6043a57d	When snapshot mount point is busy (for example we are still in it) we will fail to unmount it, but it won't be removed from the tree, so in that case there is no need to reinsert it. This fixes a panic reproducible in the following steps: # zfs create tank/foo # zfs snapshot tank/foo@snap # cd /tank/foo/.zfs/snapshot/snap # umount /tank/foo panic: avl_find() succeeded inside avl_add() Reported by: trasz MFC after: 3 days	2009-09-07 21:46:51 +00:00
Edward Tomasz Napierala	343775c0b4	Enable NFSv4 ACL support in ZFS. Reviewed by: pjd	2009-09-07 19:43:13 +00:00
Pawel Jakub Dawidek	08780916dd	Defer thread start until we set priority. Reviewed by: kib MFC after: 3 days	2009-09-07 19:22:44 +00:00
Pawel Jakub Dawidek	c739b7b22b	Don't recheck ownership on update mount. This will eliminate LOR between vfs_busy() and mount mutex. We check ownership in vfs_domount() anyway. Noticed by: kib Reviewed by: kib MFC after: 1 week	2009-09-07 18:54:55 +00:00
Pawel Jakub Dawidek	2ff6f0f89a	- Avoid holding mutex around M_WAITOK allocations. - Add locking for mnt_opt field. MFC after: 1 week	2009-09-07 18:23:26 +00:00
Edward Tomasz Napierala	900b1670c4	Prevent the line from wrapping.	2009-09-07 16:56:41 +00:00
Pawel Jakub Dawidek	841bcfea21	Changing provider size is not really supported by GEOM, but doing so when provider is closed should be ok. When administrator requests to change ZVOL size do it immediately if ZVOL is closed or do it on last ZVOL close. PR: kern/136942 Requested by: Bernard Buri <bsd@ask-us.at> MFC after: 1 week	2009-09-07 14:16:50 +00:00
Pawel Jakub Dawidek	5e65224daf	bzero() on-stack argument, so mutex_init() won't misinterpret that the lock is already initialized if we have some garbage on the stack. PR: kern/135480 Reported by: Emil Mikulic <emikulic@gmail.com> MFC after: 3 days	2009-09-07 11:38:43 +00:00
Edward Tomasz Napierala	a41422a93e	Improve wording. Discussed with: pjd, cperciva, rink, wkoszek and des, in order of appearance.	2009-09-05 15:08:58 +00:00
Pawel Jakub Dawidek	26d0605727	Backport the 'dirtying dbuf' panic fix from newer ZFS version. Reported by: Thomas Backman <serenity@exscape.org> MFC after: 1 week	2009-08-31 16:27:00 +00:00
Pawel Jakub Dawidek	575c1d371c	Add missing mountpoint vnode locking. This fixes panic on assertion with DEBUG_VFS_LOCKS and vfs.usermount=1 when regular user tries to mount dataset owned by him. MFC after: 1 week	2009-08-30 21:03:40 +00:00
Pawel Jakub Dawidek	5d5535163a	- Hide ZFS kernel threads under zfskern process. - Use better (shorter) threads names: 'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00' 'vdev:worker da0' -> 'vdev da0'	2009-08-23 11:33:46 +00:00
Pawel Jakub Dawidek	4ec2b0e7ce	Set priority of vdev_geom threads and zvol threads to PRIBIO.	2009-08-23 11:27:08 +00:00
Pawel Jakub Dawidek	1869987e42	- Give minclsyspri and maxclsyspri real values (consulted with kmacy). - Honour 'pri' argument for thread_create().	2009-08-23 11:22:46 +00:00
Pawel Jakub Dawidek	35ae9291c2	Our libc doesn't implement control method for XDR (only kernel does) and it will always return failure. Fix this by bringing userland implementation of xdrmem_control() back. This allows 'zpool import' to work again. Reported by: Thomas Backman <serenity@exscape.org> Reviewed by: kmacy Approved by: re (kib)	2009-08-20 00:05:29 +00:00
Pawel Jakub Dawidek	8e9fd65fbf	getcwd() (when __getcwd() fails) works by stating current directory, going up (..), calling readdir and looking for previous directory inode. In case of .zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it won't be visible in readdir output. Fix this by implementing VPTOCNP for snapshot directories, so __getcwd() doesn't fail and getcwd() doesn't have to use readdir method. This fixes /bin/pwd from within .zfs/snapshot/<name>/. Suggested by: kib Approved by: re (rwatson)	2009-08-17 10:00:18 +00:00
Pawel Jakub Dawidek	8461b0f043	Manage asynchronous vnode release just like Solaris. Discussed with: kmacy Approved by: re (kib)	2009-08-17 09:48:34 +00:00
Pawel Jakub Dawidek	e35eb914f4	- Reduce z_teardown_lock lock scope a bit. - The error variable is int, not bool. - Convert spaces to tabs where needed. Approved by: re (kib)	2009-08-17 09:28:15 +00:00
Pawel Jakub Dawidek	0330a5dc10	If z_buf is NULL, we should free znode immediately. Noticed by: avg Approved by: re (kib)	2009-08-17 09:25:37 +00:00
Pawel Jakub Dawidek	d83cfc37a4	- We need to recycle vnode instead of freeing znode. Submitted by: avg - Add missing vnode interlock unlock. - Remove redundant znode locking. Approved by: re (kib)	2009-08-17 09:21:39 +00:00
Pawel Jakub Dawidek	f820bc079f	Fix panic in zfs recv code. The last vnode (mountpoint's vnode) can have 0 usecount. Reported by: Thomas Backman <serenity@exscape.org> Approved by: re (kib)	2009-08-17 09:13:22 +00:00
Pawel Jakub Dawidek	159ef108e1	Remove OpenSolaris taskq port (it performs very poorly in our kernel) and replace it with wrappers around our taskqueue(9). To make it possible implement taskqueue_member() function which returns 1 if the given thread was created by the given taskqueue. Approved by: re (kib)	2009-08-17 09:01:20 +00:00
Pawel Jakub Dawidek	fddc954016	- Fix a race where /dev/zfs control device is created before ZFS is fully initialized. Also destroy /dev/zfs before doing other deinitializations. - Initialization through taskq is no longer needed and there is a race where one of the zpool/zfs command loads zfs.ko and tries to do some work immediately, but /dev/zfs is not there yet. Reported by: pav Approved by: re (kib)	2009-08-17 08:36:41 +00:00
Pawel Jakub Dawidek	830940567b	Remove files that are no longer used. Discussed with: kmacy Approved by: re (kib)	2009-08-17 08:03:02 +00:00
Marcel Moolenaar	538e86713d	Fix misalignment in nvpair_native_embedded() caused by the compiler replacing the bzero(). See also revision 195627, which fixed the misalignment in nvpair_native_embedded_array(). Approved by: re (kensmith)	2009-08-16 01:48:46 +00:00
Edward Tomasz Napierala	abd370a36b	Remove CDDL warning. Approved by: re (kib), core	2009-08-13 12:28:30 +00:00
Pawel Jakub Dawidek	abd8353f5d	We don't support ephemeral IDs in FreeBSD and without this fix ZFS can panic in zfs_fuid_create_cred() when userid is negative. It is converted to unsigned value which makes the IS_EPHEMERAL() macro incorrectly report that this is ephemeral ID. The most reasonable solution for now is to always report that the given ID is not ephemeral. PR: kern/132337 Submitted by: Matthew West <freebsd@r.zeeb.org> Tested by: Thomas Backman <serenity@exscape.org>, Michael Reifenberger <mike@reifenberger.com> Approved by: re (kib) MFC after: 2 weeks	2009-07-27 14:52:34 +00:00
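The sign-conversion problem described in abd8353f5d can be illustrated with a small userland sketch. It assumes the ephemeral check is a comparison against a maximum uid of 2^31 - 1; that threshold and the names `SKETCH_MAXUID`, `sketch_is_ephemeral_naive`, and `sketch_is_ephemeral_fixed` are hypothetical and are not taken from the commit message.

```c
#include <stdint.h>

/* Assumed upper bound for ordinary (non-ephemeral) IDs; hypothetical. */
#define SKETCH_MAXUID 2147483647U

/*
 * Naive check: any ID above the bound is treated as ephemeral.
 * A negative uid converted to an unsigned type lands above the bound.
 */
static int
sketch_is_ephemeral_naive(uint64_t id)
{
	return (id > SKETCH_MAXUID);
}

/*
 * Behaviour described in the commit message: always report that the
 * given ID is not ephemeral.
 */
static int
sketch_is_ephemeral_fixed(uint64_t id)
{
	(void)id;
	return (0);
}
```

For example, a uid of -2 stored in a 32-bit unsigned type becomes 4294967294, which the naive check reports as ephemeral, while an ordinary uid such as 1001 is not affected.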
Edward Tomasz Napierala	d2ceff236a	Fix extattr_list_file(2) on ZFS in case the attribute directory doesn't exist and user doesn't have write access to the file. Without this fix, it returns bogus value instead of 0. For some reason this didn't manifest on my kernel compiled with -O0. PR: kern/136601 Submitted by: Jaakko Heinonen <jh at saunalahti dot fi> Approved by: re (kib)	2009-07-22 15:15:58 +00:00
Edward Tomasz Napierala	65588fd503	Fix permission handling for extended attributes in ZFS. Without this change, ZFS uses SunOS Alternate Data Streams semantics - each EA has its own permissions, which are set at EA creation time and - unlike SunOS - invisible to the user and impossible to change. From the user point of view, it's just broken: sometimes access is granted when it shouldn't be, sometimes it's denied when it shouldn't be. This patch makes it behave just like UFS, i.e. depend on current file permissions. Also, it fixes returned error codes (ENOATTR instead of ENOENT) and makes listextattr(2) return 0 instead of EPERM where there is no EA directory (i.e. the file never had any EA). Reviewed by: pjd (idea, not actual code) Approved by: re (kib)	2009-07-20 19:16:42 +00:00
Andriy Gapon	b064b6d1cd	dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds Currently dtrace_gethrtime uses formula similar to the following for converting TSC ticks to nanoseconds: rdtsc() * 10^9 / tsc_freq The dividend overflows 64-bit type and wraps-around every 2^64/10^9 = 18446744073 ticks which is just a few seconds on modern machines. Now we instead use precalculated scaling factor of 10^9*2^N/tsc_freq < 2^32 and perform TSC value multiplication separately for each 32-bit half. This avoids the overflow of the dividend described above. The idea is taken from OpenSolaris. This has an added feature of always scaling TSC with invariant value regardless of TSC frequency changes. Thus the timestamps will not be accurate if TSC actually changes, but they are always proportional to TSC ticks and thus monotonic. This should be much better than current formula which produces wildly different non-monotonic results when tsc_freq changes. Also drop write-only 'cp' variable from amd64 dtrace_gethrtime_init() to make it identical to the i386 twin. PR: kern/127441 Tested by: Thomas Backman <serenity@exscape.org> Reviewed by: jhb Discussed with: current@, bde, gnn Silence from: jb Approved by: re (gnn) MFC after: 1 week	2009-07-15 17:07:39 +00:00
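The scaling scheme described in b064b6d1cd can be sketched in userland C as follows. This is a minimal illustration under stated assumptions, not the committed code: the shift N = 28 and the names `sketch_naive_ns`, `sketch_make_scale`, and `sketch_scaled_ns` are chosen for the example only.

```c
#include <stdint.h>

#define SKETCH_NANOSEC	1000000000ULL
#define SKETCH_SHIFT	28	/* assumed value of N; hypothetical */

/* Old formula: the product tsc * 10^9 wraps after about 2^64 / 10^9 ticks. */
static uint64_t
sketch_naive_ns(uint64_t tsc, uint64_t tsc_freq)
{
	return (tsc * SKETCH_NANOSEC / tsc_freq);
}

/*
 * Precalculated factor 10^9 * 2^N / tsc_freq. With N = 28 it fits in
 * 32 bits as long as tsc_freq is above 62.5 MHz.
 */
static uint32_t
sketch_make_scale(uint64_t tsc_freq)
{
	return ((uint32_t)((SKETCH_NANOSEC << SKETCH_SHIFT) / tsc_freq));
}

/*
 * Multiply each 32-bit half of the TSC value separately:
 *   tsc * scale / 2^N = hi * scale * 2^(32 - N) + lo * scale / 2^N
 * Each partial product is 32 x 32 bits, so it cannot overflow 64 bits.
 */
static uint64_t
sketch_scaled_ns(uint64_t tsc, uint32_t scale)
{
	uint32_t lo = (uint32_t)tsc;
	uint32_t hi = (uint32_t)(tsc >> 32);

	return ((((uint64_t)lo * scale) >> SKETCH_SHIFT) +
	    (((uint64_t)hi * scale) << (32 - SKETCH_SHIFT)));
}
```

With an assumed 2 GHz TSC the factor is 2^27, and 40000000000 ticks (20 seconds) convert to exactly 20000000000 ns, whereas the naive formula has already wrapped at that point.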
Konstantin Belousov	f33a947b56	Add new msleep(9) flag PBDY that shall be specified together with PCATCH, to indicate that thread shall not be stopped upon receipt of SIGSTOP until it reaches the kernel->usermode boundary. Also change thread_single(SINGLE_NO_EXIT) to only stop threads at the user boundary unconditionally. Tested by: pho Reviewed by: jhb Approved by: re (kensmith)	2009-07-14 22:52:46 +00:00
Marcel Moolenaar	dd8a461e83	In nvpair_native_embedded_array(), meaningless pointers are zeroed. The programmer was aware that alignment was not guaranteed in the packed structure and used bzero() to NULL out the pointers. However, on ia64, the compiler is quite aggressive in finding ILP and calls to bzero() are often replaced by simple assignments (i.e. stores). Especially when the width or size in question corresponds with a store instruction (i.e. st1, st2, st4 or st8). The problem here is not a compiler bug. The address of the memory to zero-out was given by '&packed->nvl_priv' and given the type of the 'packed' pointer the compiler could assume proper alignment for the replacement of bzero() with an 8-byte wide store to be valid. The problem is with the programmer. The programmer knew that the address did not have the alignment guarantees needed for a regular assignment, but failed to inform the compiler of that fact. In fact, the programmer told the compiler the opposite: alignment is guaranteed. The fix is to avoid using a pointer of type "nvlist_t *" and instead use a "char *" pointer as the basis for calculating the address. This tells the compiler that only 1-byte alignment can be assumed and the compiler will either keep the bzero() call or instead replace it with a sequence of byte-wise stores. Both are valid. Approved by: re (kib)	2009-07-11 22:43:20 +00:00
Andriy Gapon	f340e9fe71	dtrace/amd64: fix virtual address checks On amd64 KERNBASE/kernbase does not mean start of kernel memory. This should fix a KASSERT panic in dtrace_copycheck when copyin*() is used in D program. Also make checks for user memory a bit stricter. Reported by: Thomas Backman <serenity@exscape.org> Submitted by: wxs (kaddr part) Tested by: Thomas Backman (prototype), wxs Reviewed by: alc (concept), jhb, current@ Approved by: jb (concept) MFC after: 2 weeks PR: kern/134408	2009-06-24 16:03:57 +00:00
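The idea behind the fix in dd8a461e83 can be sketched as follows: compute the address of the field from a byte pointer plus an offset, so the compiler may assume only 1-byte alignment. The structure layout and the names `sketch_nvlist` and `sketch_zero_priv` are hypothetical stand-ins, not the real nvlist definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the packed list header. */
struct sketch_nvlist {
	int32_t		version;
	uint32_t	flag;
	uint64_t	priv;	/* pointer-sized field to be zeroed */
};

/*
 * Zero the 'priv' field of a header that may sit at an unaligned
 * address inside a packed buffer. Because the address is derived from
 * a char pointer, the compiler cannot assume 8-byte alignment and must
 * either keep the call or emit byte-wise stores.
 */
static void
sketch_zero_priv(char *packed)
{
	memset(packed + offsetof(struct sketch_nvlist, priv), 0,
	    sizeof(uint64_t));
}
```

Calling `sketch_zero_priv()` on a buffer that starts at an odd address clears exactly the eight bytes of the field and leaves the neighbouring bytes untouched.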
Konstantin Belousov	a18a95db4a	O_NOFOLLOW shall be in flags, not in cmode. Noted by: bde	2009-06-22 10:08:48 +00:00
Konstantin Belousov	e0c161b89c	Add another flags argument to vn_open_cred. Use it to specify that some vn_open_cred invocations shall not audit namei path. In particular, specify VN_OPEN_NOAUDIT for dotdot lookup performed by default implementation of vop_vptocnp, and for the open done for core file. vn_fullpath is called from the audit code, and vn_open there need to disable audit to avoid infinite recursion. Core file is created on return to user mode, that, in particular, happens during syscall return. The creation of the core file is audited by direct calls, and we do not want to overwrite audit information for syscall. Reported, reviewed and tested by: rwatson	2009-06-21 13:41:32 +00:00
Jamie Gritton	c1f192193d	Rename the host-related prison fields to be the same as the host.* parameters they represent, and the variables they replaced, instead of abbreviated versions of them. Approved by: bz (mentor)	2009-06-13 15:39:12 +00:00
Kip Macy	f0c6b798a3	pjd has requested that I keep the tunable as zfs_prefetch_disable to minimize gratuitous differences with Opensolaris' ZFS Sorry for the churn	2009-06-11 22:24:08 +00:00
Kip Macy	e4e5e663e0	check against prefetch_enable	2009-06-11 09:51:21 +00:00
Kip Macy	3fa5485637	use default policy for enabling prefetching unless the TUNABLE is set	2009-06-10 21:05:37 +00:00
Kip Macy	107b659450	As far as I can tell systems that have less than 4GB are more often hurt by prefetching than helped. On i386 systems and systems with less than 4GB, prefetch is now disabled by default. I've added a prefetch enable tunable, to enable prefetching for those systems. The prefetch disable tunable will continue to unconditionally disable prefetching.	2009-06-10 01:21:32 +00:00
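The default policy described in 107b659450 can be written down as a small decision function. This is only a sketch of the commit message as worded; the function name `sketch_prefetch_active`, its parameters, and the exact 4 GB threshold constant are assumptions for illustration, and the neighbouring commits adjust the tunable naming afterwards.

```c
#include <stdint.h>

#define SKETCH_4GB	(4ULL * 1024 * 1024 * 1024)	/* assumed threshold */

/*
 * Returns 1 if prefetch would be active, 0 otherwise.
 *  - the disable tunable always wins,
 *  - the enable tunable turns prefetch on for small or i386 systems,
 *  - otherwise prefetch is on only for non-i386 systems with >= 4GB.
 */
static int
sketch_prefetch_active(int is_i386, uint64_t physmem_bytes,
    int enable_tunable, int disable_tunable)
{
	if (disable_tunable)
		return (0);
	if (enable_tunable)
		return (1);
	return (!is_i386 && physmem_bytes >= SKETCH_4GB);
}
```

Under these assumptions a 2 GB amd64 machine has prefetch off unless the enable tunable is set, and setting the disable tunable turns it off regardless of memory size.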
Paul Saab	a6d545d8ed	Support shared vnode locks for write operations when the offset is provided on filesystems that support it. This really improves mysql + innodb performance on ZFS. Reviewed by: jhb, kmacy, jeffr	2009-06-04 16:18:07 +00:00
Doug Rabson	8be608b58c	Allow the bootfs property to be set for raidz pools on FreeBSD. Reviewed by: pjd	2009-05-31 11:59:32 +00:00
Kip Macy	762169b50a	fix xdrmem_control to be safe in an if statement fix zfs to depend on krpc remove xdr from zfs makefile Submitted by: dchagin@freebsd.org	2009-05-30 22:23:58 +00:00
Kip Macy	139ccddec0	work around snapshot shutdown race reported by Henri Hennebert	2009-05-30 19:26:35 +00:00
Jamie Gritton	76ca6f88da	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
Attilio Rao	1ae1c2a3bd	Reverse the logic for ADAPTIVE_SX option and enable it by default. Introduce for this operation the reverse NO_ADAPTIVE_SX option. The flag SX_ADAPTIVESPIN to be passed to sx_init_flags(9) gets suppressed and the new flag, offering the reversed logic, SX_NOADAPTIVE is added. Additionally, implement adaptive spinning for sx held in shared mode. The spinning limit can be handled through sysctls in order to be tuned while the code doesn't reach the release, after which time they should probably be dropped. This change has been made necessary by recent benchmarks where it does improve concurrency of workloads in presence of high contention (ie. ZFS). KPI breakage is documented by __FreeBSD_version bumping, manpage and UPDATING updates. Requested by: jeff, kmacy Reviewed by: jeff Tested by: pho	2009-05-29 01:49:27 +00:00
Kip Macy	c334d2d544	MFdevbranch 192944 - add FreeBSD implementation of xdrmem_control needed by zfs - have zfs define xdr_ops using FreeBSD's definition - remove solaris xdr files from zfs compile	2009-05-28 08:18:12 +00:00
Stacey Son	a5aedd68b4	Add the OpenSolaris dtrace lockstat provider. The lockstat provider adds probes for mutexes, reader/writer and shared/exclusive locks to gather contention statistics and other locking information for dtrace scripts, the lockstat(1M) command and other potential consumers. Reviewed by: attilio jhb jb Approved by: gnn (mentor)	2009-05-26 20:28:22 +00:00
Edward Tomasz Napierala	b7014134a7	Change license to more bori^Wadul^Wcanonical. Submitted by: rwatson@	2009-05-26 11:42:06 +00:00
Edward Tomasz Napierala	0970b4bae0	MFp4 changes necessary for NFSv4 ACLs support in ZFS. This is mostly about removing a few #ifdefs and providing compatibility wrappers and VOP implementations to get and set an ACL; ZFS does ACL enforcement all by itself. Note that the VOPs are ifdefed out for now, so this change should be a no-op. Reviewed by: pjd	2009-05-26 08:21:59 +00:00
Edward Tomasz Napierala	4076aa37dc	Don't allow non-owner to set SUID bit on a file. It doesn't make any difference now, but in NFSv4 ACLs, there is write_acl permission, which also affects mode changes. Reviewed by: pjd	2009-05-24 19:21:49 +00:00
Edward Tomasz Napierala	194f4d42de	Fix comment.	2009-05-24 15:48:48 +00:00
Dag-Erling Smørgrav	bba5cfd28b	Unexpand $FreeBSD$.	2009-05-23 16:01:58 +00:00
Dag-Erling Smørgrav	6feca53bed	Remove svn:keywords on a file that had fbsd:nokeywords (though I don't understand the reason for the latter)	2009-05-23 16:00:16 +00:00
Kip Macy	e95d34711b	- back out direct map hack - it is no longer needed	2009-05-19 01:14:37 +00:00
Kip Macy	0fe5460dbd	set createtxg prop name PR: bin/130105	2009-05-17 04:04:25 +00:00
Kip Macy	ea41c77517	SAVESTART implies SAVENAME	2009-05-17 01:31:28 +00:00
Kip Macy	2e9c90d55b	enable adaptive spinning on zfs locks	2009-05-16 23:56:45 +00:00
Kip Macy	be08aa8b59	- allow forced unmounts - don't assume snapshot was auto-mounted	2009-05-16 20:33:13 +00:00
Kip Macy	71bc1ce36e	only use direct map if system has more than 2GB	2009-05-16 20:09:07 +00:00
Kip Macy	32237d8492	apply band-aid to x86_64 systems with more physical memory than kmem by allocating from the direct map	2009-05-16 19:17:15 +00:00
Doug Rabson	e1899ef6c8	Add support for booting from raidz1 and raidz2 pools.	2009-05-16 10:48:20 +00:00
Attilio Rao	dfd233edd5	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
Kip Macy	469ef3e563	rename xdr support files to avoid conflicts when linking in to the kernel	2009-05-11 04:18:58 +00:00
Kip Macy	8569258bf8	- rename atomic.S and crc32.c to avoid collisions when linking zfs in to the kernel - update Makefile - ifdef out acl_{alloc, free}, they aren't used by zfs and conflict with existing in-kernel routines	2009-05-09 01:45:55 +00:00
Marko Zec	29b02909eb	Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)	2009-05-08 14:11:06 +00:00
Kip Macy	a6827463ad	don't call vn_rele_async_fini in the !_KERNEL case	2009-05-07 23:34:41 +00:00
Kip Macy	c20fd07777	move VN_RELE_ASYNC to the compatibility layer with the rest of the VN_* defines	2009-05-07 23:02:15 +00:00
Kip Macy	6ef1a81d6e	avoid LOR and gratuitous extra lock acquisitions by moving user_evict list buffers to a temporary list	2009-05-07 21:51:13 +00:00
Kip Macy	77d0162c70	Allow the VM to provide backpressure on the ARC cache as it does on Solaris.	2009-05-07 20:57:06 +00:00
Kip Macy	62fa227ccd	Asynchronously release vnodes to avoid blocking on range locks when calling back in to zfs. This is based on a fix that went in to opensolaris on March 9th. However, it uses a dedicated thread instead of a Solaris taskq to avoid doing a blocking memory allocation with the vnode interlock held. This fixes a long-time deadlock in ZFS. This is not, strictly speaking, an LOR. The spa_zio thread releases a vnode, this calls in to vn_reclaim which in turn needs to acquire range locks to sync dirty data out to disk. The range locks are already held by a user-level process that is waiting on a condition variable for a spa_zio thread to signal it. The process could not be signalled because the spa_zio thread could not proceed. The nature of this problem was not apparent due to ZFS locks opting out of witness which meant that DDB did not know about the locks that were held by ZFS. Reviewed by: pjd MFC after: 7 days	2009-05-07 20:28:06 +00:00
Jamie Gritton	b38ff370e4	Introduce the extensible jail framework, using the same "name=value" interface as nmount(2). Three new system calls are added: * jail_set, to create jails and change the parameters of existing jails. This replaces jail(2). * jail_get, to read the parameters of existing jails. This replaces the security.jail.list sysctl. * jail_remove to kill off a jail's processes and remove the jail. Most jail parameters may now be changed after creation, and jails may be set to exist without any attached processes. The current jail(2) system call still exists, though it is now a stub to jail_set(2). Approved by: bz (mentor)	2009-04-29 21:14:15 +00:00
Robert Watson	885868cd8f	Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon	2009-04-10 10:52:19 +00:00
Andrew Thompson	853a10a581	Revert r190676,190677 The geom and CAM changes for root_hold are the wrong solution for USB design quirks. Requested by: scottl	2009-04-10 04:08:34 +00:00
Andrew Thompson	626fc9fe3d	Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called in situations where sleeping isn't allowed.	2009-04-03 19:46:12 +00:00
Robert Watson	455f3aa24f	Move dtnfsclient.c in the cddl tree to nfs_kdtrace.c in the nfsclient directory, since it's under a BSD license, and this keeps NFS internals- aware tracing parts close to NFS. MFC after: 1 month Suggested by: jhb	2009-03-25 17:47:22 +00:00
Robert Watson	10263f0832	Add DTrace probes to the NFS access and attribute caches. Access cache events are: nfsclient:accesscache:flush:done nfsclient:accesscache:get:hit nfsclient:accesscache:get:miss nfsclient:accesscache:load:done They pass the vnode, uid, and requested or loaded access mode (if any); the load event may also report a load error if the RPC fails. The attribute cache events are: nfsclient:attrcache:flush:done nfsclient:attrcache:get:hit nfsclient:attrcache:get:miss nfsclient:attrcache:load:done They pass the vnode, optionally the vattr if one is present (hit or load), and in the case of a load event, also a possible RPC error. MFC after: 1 month Sponsored by: Google, Inc.	2009-03-24 17:14:34 +00:00
Robert Watson	47294818f9	Add dtnfsclient, a first cut at an NFSv2/v3 client request DTrace provider. The NFS client exposes 'start' and 'done' probes for NFSv2 and NFSv3 RPCs when using the new RPC implementation, passing in the vnode, mbuf chain, credential, and NFSv2 or NFSv3 procedure number. For 'done' probes, the error number is also available. Probes are named in the following way: ... nfsclient:nfs2:write:start nfsclient:nfs2:write:done ... nfsclient:nfs3:access:start nfsclient:nfs3:access:done ... Access to the unmarshalled arguments is not easily available at this point in the stack, but the passed probe arguments are sufficient to do a lot of interesting things in practice. Technically, these probes may cover multiple RPC retransmits, and even transactions if the transaction ID changes as a result of authentication failure or a jukebox error from the server, but usefully capture the intent of a single NFS request, such as access, getattr, write, etc. Typical use might involve profiling RPC latency by system call, number of RPCs, how often a getattr leads to a call to access, when failed access control checks occur, etc. More detailed RPC information might best be provided by adding a krpc provider. It would also be useful to add NFS client probes for events such as the access cache or attribute cache satisfying requests without an RPC. Sponsored by: Google, Inc. MFC after: 1 month	2009-03-22 22:07:52 +00:00
John Baldwin	9fca7a854c	The zfs_get_xattrdir() function is used to find the extended attribute directory for a znode. When the directory already exists, it returns a referenced but unlocked vnode. When a directory does not yet exist, it calls zfs_make_xattrdir() to create a new one. zfs_make_xattrdir() returns the vnode both referenced and locked and zfs_get_xattrdir() was leaking this vnode lock to its callers. Fix this by dropping the vnode lock if zfs_make_xattrdir() successfully creates a new extended attribute directory. Reviewed by: pjd	2009-03-18 16:19:44 +00:00
John Baldwin	33fc362512	Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors. - When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set. - Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT. - Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set. - Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock. - Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFO's require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.) - Enable extended shared operations on UFS, cd9660, and UDF. Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month	2009-03-11 14:13:47 +00:00
Jamie Gritton	f86bce5ed0	Extend the "vfsopt" mount options for more general use. Make struct vfsopt and the vfs_buildopts function public, and add some new fields to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and vfs_opterror. Further extend the interface to allow reading options from the kernel in addition to sending them to the kernel, with vfs_setopt and related functions. While this allows the "name=value" option interface to be used for more than just FS mounts (planned use is for jails), it retains the current "vfsopt" name and <sys/mount.h> requirement. Approved by: bz (mentor)	2009-03-02 23:26:30 +00:00
Ed Schouten	802cb57e34	Add memmove() to the kernel, making the kernel compile with Clang. When copying big structures, LLVM generates calls to memmove(), because it may not be able to figure out whether structures overlap. This caused linker errors to occur. memmove() is now implemented using bcopy(). Ideally it would be the other way around, but that can be solved in the future. On ARM we don't add anything, because it already has memmove(). Discussed on: arch@ Reviewed by: rdivacky	2009-02-28 16:21:25 +00:00
John Baldwin	ea77ff0a15	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
Ed Schouten	a4611ab612	Last step of splitting up minor and unit numbers: remove minor(). Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() aren't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.	2009-01-28 17:57:16 +00:00
Warner Losh	78bc7eec0d	Put the MIPS support back in after it was removed in r185029.	2008-12-04 16:31:08 +00:00
Pawel Jakub Dawidek	35a15332f3	MFp4: Remove assertion that is no longer valid - we now use VOP_CLOSE() in more places (ie vdev_file.c).	2008-11-29 12:32:42 +00:00
Edward Tomasz Napierala	38cc5da78e	MFp4: We don't support TX_CREATE_ACL_ATTR nor TX_MKDIR_ACL_ATTR; code found in zfs_replay.c will panic if it encounters transactions of this type. Make sure we don't put these into the ZIL. Approved by: rwatson (mentor), pjd	2008-11-25 23:05:46 +00:00
Pawel Jakub Dawidek	ad35ee04f4	Fix locking (file descriptor table and Giant around VFS). Most submitted by: kib Reviewed by: kib	2008-11-25 21:14:00 +00:00
Ganbold Tsagaankhuu	79dae0aa0b	Remove unused variable. Found with: Coverity Prevent(tm) CID: 3669,3671 Approved by: jb	2008-11-25 19:25:54 +00:00
Pawel Jakub Dawidek	83080c1ece	Don't use PRIV_ROOT. Here we check if user can share ZFS file system, so PRIV_NFS_DAEMON seems best choice. Discussed with: rwatson	2008-11-23 20:14:19 +00:00
Pawel Jakub Dawidek	bcfbcdca9c	IFp4: Don't rely on disk IDs and always use vdev guids, which means always look up for components by reading metadata. This might be slower when there are big number of disks in the system, but is definitely more reliable.	2008-11-22 13:33:06 +00:00
Pawel Jakub Dawidek	74303ba55c	IFp4: Finish implementation of chflags(2) for ZFS. While doing this I found that zfs_access() can only handle VREAD, VWRITE and VEXEC, for the rest we need to use vaccess(9).	2008-11-22 13:24:44 +00:00
Pawel Jakub Dawidek	5189bf22c0	IFp4: Don't free pathname too soon, debugging code is still using it.	2008-11-22 13:22:24 +00:00
Doug Rabson	786895f6ba	Add definitions for ZFS pool version 13.	2008-11-21 09:10:35 +00:00
Doug Rabson	0d16312b46	Some zfsboot fixes from Norikatsu Shigemura: 1. zfsboot2 (boot2) doesn't %d (printf), so change %d to %u. 2. chase new zpool versioning as SPA_VERSION. Obtained from: sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h Submitted by: nork	2008-11-19 16:59:19 +00:00
Pawel Jakub Dawidek	1ba4a712dd	Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This brings a huge amount of changes, I'll enumerate only user-visible changes: - Delegated Administration Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content. - slog Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one. - chflags(2) Not all the flags are supported. This still needs work. - ZFSBoot Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes Before, if a write request failed, the system panicked. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots. - Sparse volumes ZVOLs that don't reserve space in the pool. - External attributes Compatible with extattr(2). - NFSv4-ACLs Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris	2008-11-17 20:49:29 +00:00
Edward Tomasz Napierala	4bdaada206	Require write access on a directory being moved from one parent directory to another in ZFS. Approved by: rwatson (mentor), pjd	2008-11-08 19:56:32 +00:00
Edward Tomasz Napierala	36d227d9ed	Backoff the last patch. It was overly restrictive - we want to check for write permission on target only when moving the target between two directories. Approved by: rwatson (mentor)	2008-11-06 22:28:04 +00:00
Edward Tomasz Napierala	b92eda309d	Change ZFS behaviour to match UFS: when moving (rename(2)) a subdirectory from one parent directory to another, in addition to the usual access checks one also needs write access to the subdirectory being moved. Approved by: rwatson (mentor), pjd	2008-11-06 19:17:58 +00:00
Craig Rodrigues	6a73ed4f46	Remove definition of KMEM_DEBUG accidentally brought in by latest DTrace import. Noticed by: thompsa	2008-11-05 20:32:13 +00:00
Craig Rodrigues	f5a97d1bcb	Merge latest DTrace changes from Perforce.	2008-11-05 19:39:11 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Attilio Rao	0d7935fd01	Remove the struct thread unuseful argument from bufobj interface. In particular following functions KPI results modified: - bufobj_invalbuf() - bufsync() and BO_SYNC() "virtual method" of the buffer objects set. Main consumers of bufobj functions are affected by this change too and, in particular, functions which changed their KPI are: - vinvalbuf() - g_vfs_close() Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit. As a side note, please consider just temporary the 'curthread' argument passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-10 21:23:50 +00:00
John Birrell	fd4cdfbf46	Disable use of the user credentials until there is code to set the levels that DTrace uses. This fixes a bug that would have affected kernels built with MAC and all kernels built after the mpsafetty integration. The bug will be apparent in RELENG7 on MAC kernels. Reported by: kan	2008-09-27 17:52:48 +00:00
Ed Schouten	6bfa9a2d66	Replace all calls to minor() with dev2unit(). After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere. This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware. Reviewed by: kib	2008-09-27 08:51:18 +00:00
Ed Schouten	d3ce832719	Remove unit2minor() use from kernel code. When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macro's. The unit2minor() and minor2unit() macro's were no-ops. We'd better not remove these four macro's from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit(). Reviewed by: kib	2008-09-26 14:19:52 +00:00
Warner Losh	6e1a9d1739	Mips needs the same treatment for atomic_or_8 as the other RISCy architectures.	2008-09-18 19:57:06 +00:00
Pawel Jakub Dawidek	062ea27ee4	Add missing ZFS_EXIT(). PR: kern/124899 Submitted by: Masakazu Asama <m-asama@ginzado.ne.jp>	2008-09-15 11:27:25 +00:00
Edward Tomasz Napierala	dfa7fd1d70	Remove VSVTX, VSGID and VSUID. This should be a no-op, as VSVTX == S_ISVTX, VSGID == S_ISGID and VSUID == S_ISUID. Approved by: rwatson (mentor)	2008-09-10 13:16:41 +00:00
Pawel Jakub Dawidek	1b856fa491	Initialize vp, so we don't call VOP_UNLOCK() with NULL vnode pointer. Confirmed by: marcus	2008-09-07 07:55:12 +00:00
Pawel Jakub Dawidek	433751bb50	Lock vnode exclusively around insmntque().	2008-09-06 17:24:07 +00:00
Pawel Jakub Dawidek	7fa1f32a7e	Catch up after last insmntque() changes: - The vnode has to be locked exclusively before calling insmntque(). - Until I find a way to handle insmntque() failures use VV_FORCEINSMQ flag to force insmntque() to always succeed. Reported by: kris, trasz, des, others Suggested by: kib Tested by: trasz	2008-09-05 07:00:40 +00:00
Attilio Rao	59d4932531	Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions. Manpages are updated accordingly. Tested by: Diego Sardina <siarodx at gmail dot com>	2008-08-31 14:26:08 +00:00
Scott Long	a25cb00747	Ensure that the padding calculation doesn't return a negative value. Submitted by: kib Approved by: jb	2008-08-29 15:55:49 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Warner Losh	e6b3a7a9c1	Add MIPS support. Reviewed by: jb@	2008-08-23 04:58:11 +00:00
John Birrell	ac80559536	Add calls to callout_drain() to ensure the callouts are flushed before we free memory from underneath them. This fixes an occasional panic I've been seeing in softclock() where a bad pointer would be encountered when pushing DTrace hard.	2008-08-19 21:28:58 +00:00
Pawel Jakub Dawidek	37876323b1	We want to use LBOLT instead of lbolt on FreeBSD. I've this already fixed in p4, but the fix was never integrated into HEAD. Reported by: ed	2008-07-21 14:35:48 +00:00
Pawel Jakub Dawidek	28814ddbe8	We want to check new options given, not the current ones. This fixes 'zpool import -o <mntopt> <name>' not working properly.	2008-07-21 09:45:44 +00:00
Ed Schouten	3f7eea97fd	Remove the $FreeBSD$ tag again, now I know fbsd:nokeywords exists. Requested by: pjd Approved by: philip (mentor)	2008-06-12 08:53:54 +00:00
Ed Schouten	0f03ce1bb8	Turn dev2unit(), minor(), unit2minor() and minor2unit() into macro's. Now that we got rid of the minor-to-unit conversion and the constraints on device minor numbers, we can convert the functions that operate on minor and unit numbers to simple macro's. The unit2minor() and minor2unit() macro's are now no-ops. The ZFS code also defined a macro named `minor'. Change the ZFS code to use umajor() and uminor() here, as it is the correct approach to do this. Also add $FreeBSD$ to keep SVN happy. Approved by: philip (mentor), pjd	2008-06-12 08:30:54 +00:00
Ed Schouten	29d4cb241b	Don't enforce unique device minor number policy anymore. Except for the case where we use the cloner library (clone_create() and friends), there is no reason to enforce a unique device minor number policy. There are various drivers in the source tree that allocate unr pools and such to provide minor numbers, without using them themselves. Because we still need to support unique device minor numbers for the cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's that are used in combination with the cloner library should be marked with this flag to make the cloning work. This means drivers can now freely use si_drv0 to store their own flags and state, making it effectively the same as si_drv1 and si_drv2. We still keep the minor() and dev2unit() routines around to make drivers happy. The NTFS code also used the minor number in its hash table. We should not do this anymore. If the si_drv0 field would be changed, it would no longer end up in the same list. Approved by: philip (mentor)	2008-06-11 18:55:19 +00:00
John Birrell	4ca07625aa	Merge a recent change from the OpenSolaris source tree. (Don't ask for a vendor import of this yet, we're in the early days of svn) Instead of using cyclic timers to call the state clean and deadman callbacks, use a callout on FreeBSD to avoid the deadlock on FreeBSD due to trying to send interprocessor interrupts with interrupts disabled. Reported by: ps, jhb, peter, thompsa	2008-06-01 01:46:37 +00:00
Pawel Jakub Dawidek	ed5a2ac45c	Fix namespace collision after src/sys/sys/file.h:1.78.	2008-05-25 22:34:17 +00:00


593 Commits