1
0
mirror of https://git.FreeBSD.org/src.git synced 2024-12-20 11:11:24 +00:00
Commit Graph

1607 Commits

Author SHA1 Message Date
Ulf Lilleengen
de02b15928 - Check flag with the bitwise operator, not the logical operator.
Submitted by:	arundel
MFC after:	1 week
2010-10-01 06:12:13 +00:00
Andrey V. Elsukov
b1da166ef1 Some schemes can allocate memory for internal purposes but when
GEOM does withering this memory doesn't freed. Add G_PART_DESTROY
call to g_part_wither. Also add missed g_free() call to G_PART_READ
method for MBR and PC98 schemes.

Submitted by:	jh (previous version)
Reviewed by:	pjd
Approved by:	kib (mentor)
2010-09-25 18:27:29 +00:00
Pawel Jakub Dawidek
f95168e08d Change g_eli_debug to int, so one can turn off any GELI output by setting
kern.geom.eli.debug sysctl to -1.

MFC after:	2 weeks
2010-09-25 10:32:04 +00:00
Pawel Jakub Dawidek
350e8df8de Ignore errors from BIO_FLUSH. It might confuse users that provider wasn't
really killed. What we really care about are write errors only.

MFC after:	2 weeks
2010-09-25 10:31:05 +00:00
Pawel Jakub Dawidek
cec283baf4 Allow to configure GPT attributes. It shouldn't be allowed to set bootfailed
attribute (it should be allowed only to unset it), but for test purposes it
might be useful, so the current code allows it.

Reviewed by:	arch@ (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>)
MFC after:	2 weeks
2010-09-24 19:33:47 +00:00
Pawel Jakub Dawidek
9839c97b4d Update copyright years.
MFC after:	1 week
2010-09-23 12:02:08 +00:00
Pawel Jakub Dawidek
9a5a1d1e1e Add support for AES-XTS. This will be the default now.
MFC after:	1 week
2010-09-23 11:58:36 +00:00
Pawel Jakub Dawidek
c6a26d4c88 Implement switching of data encryption key every 2^20 blocks.
This ensures the same encryption key won't be used for more than
2^20 blocks (sectors). This will be the default now.

MFC after:	1 week
2010-09-23 11:49:47 +00:00
Pawel Jakub Dawidek
1f0fb66f30 Make the code similar to the code in g_eli_integrity.c.
MFC after:	1 week
2010-09-23 11:23:10 +00:00
Pawel Jakub Dawidek
b35bfe7e10 Define default overwrite count, so that userland can use it.
MFC after:	1 week
2010-09-23 11:19:48 +00:00
Pawel Jakub Dawidek
5e6dce4bf0 When trashing metadata, flush after each write.
MFC after:	1 week
2010-09-23 10:43:37 +00:00
Brian Somers
0f81f3046d Support attaching version 4 metadata
Reviewed by:	pjd
2010-09-19 10:45:53 +00:00
Alexander Motin
659f684ea0 Add support for dumping kernel to gconcat.
Dumping goes to the component, where dump partition begins.
2010-09-16 17:24:25 +00:00
Pawel Jakub Dawidek
2738b715ea Change message when setting or unsetting attribute less confusing.
Before:

	ada0 has <attrib> set

After:

	<attrib> set on ada0

MFC after:	2 weeks
2010-09-15 21:15:00 +00:00
Pawel Jakub Dawidek
2f4e9a099b Make the message that informs about bootcode being written to disk less
confusing.

Note there is still no information about 'partcode' being written to disk
(gpart bootcode -p <partcode> <disk>).

Maybe in the future all the messages printed by gpart(8) on success could be
hidden under -v?

PR:		bin/150239
Reported by:	Roddi <roddi@me.com>
Submitted by:	arundel
MFC after:	2 weeks
2010-09-15 20:59:13 +00:00
Pawel Jakub Dawidek
8107ecf892 - Change all places where G_TYPE_ASCNUM is used to G_TYPE_NUMBER.
It turns out the new type wasn't really needed.
- Reorganize code a little bit.
2010-09-14 16:21:13 +00:00
Pawel Jakub Dawidek
b312136354 Simplify the code a bit. 2010-09-14 11:42:07 +00:00
Pawel Jakub Dawidek
946e2f3595 - Remove gc_argname field. It was introduced for gpart(8), but if I
understand everything correctly, we don't really need it.
- Provide default numeric value as strings. This allows to simplify
  a lot of code.
- Bump version number.
2010-09-13 13:48:18 +00:00
Pawel Jakub Dawidek
a478ea7490 - Allow to specify value as const pointers.
- Make optional string values always an empty string.
2010-09-13 08:56:07 +00:00
Justin T. Gibbs
f03f7a0ca3 Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic.
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.

The barrier semantics of bioq_insert_tail() were broken in two ways:

 o In bioq_disksort(), an added bio could be inserted at the head of
   the queue, even when a barrier was present, if the sort key for
   the new entry was less than that of the last queued barrier bio.

 o The last_offset used to generate the sort key for newly queued bios
   did not stay at the position of the barrier until either the
   barrier was de-queued, or a new barrier (which updates last_offset)
   was queued.  When a barrier is in effect, we know that the disk
   will pass through the barrier position just before the
   "blocked bios" are released, so using the barrier's offset for
   last_offset is the optimal choice.

sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
	o Update last_offset in bioq_insert_tail().

	o Only update last_offset in bioq_remove() if the removed bio is
	  at the head of the queue (typically due to a call via
	  bioq_takefirst()) and no barrier is active.

	o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
	  set prev to the barrier and cur to it's next element.  Now that
	  last_offset is kept at the barrier position, this change isn't
	  strictly necessary, but since we have to take a decision branch
	  anyway, it does avoid one, no-op, loop iteration in the while
	  loop that immediately follows.

	o In bioq_disksort(), bypass the normal sort for bios with the
	  BIO_ORDERED attribute and instead insert them into the queue
	  with bioq_insert_tail().  bioq_insert_tail() not only gives
	  the desired command order during insertion, but also provides
	  barrier semantics so that commands disksorted in the future
	  cannot pass the just enqueued transaction.

sys/sys/bio.h:
	Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.

sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
	Use an ordered command for SCSI/ATA-NCQ commands issued in
	response to bios with the BIO_ORDERED flag set.

sys/cam/scsi/scsi_da.c
	Use an ordered tag when issuing a synchronize cache command.

	Wrap some lines to 80 columns.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
	Mark bios with the BIO_FLUSH command as BIO_ORDERED.

Sponsored by:	Spectra Logic Corporation
MFC after:	1 month
2010-09-02 19:40:28 +00:00
Pawel Jakub Dawidek
efb46508ce Correct offset conversion to little endian. It was implemented in version 2,
but because of a bug it was a no-op, so we were still using offsets in native
byte order for the host. Do it properly this time, bump version to 4 and set
the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4.

MFC after:	2 weeks
2010-08-28 08:30:20 +00:00
Alexander Motin
3d7cfb15f5 Remove bintime_cmp() function, unused since r200086.
MFC after:	1 week
2010-08-18 15:38:10 +00:00
Andrey V. Elsukov
d02dc4cd41 Check that gsp is not NULL before access. It can be NULL
for some cases.

Approved by:	kib (mentor)
MFC after:	1 week
2010-08-03 11:21:17 +00:00
Andrey V. Elsukov
a45f4c6e2c Check that table is not NULL before access, it can be NULL
for some cases.

Approved by:	mav (mentor)
MFC after:	2 weeks
2010-08-03 09:10:48 +00:00
Andrey V. Elsukov
a80f05bb73 Forward ioctl requests to original geom.
PR:		148540
Silence from:	luigi
Reviewed by:	pjd
Approved by:	mav (mentor)
MFC after:	2 weeks
2010-08-02 10:30:49 +00:00
Andrey V. Elsukov
b6d4028166 Release access for consumers that are opened, but will be destroyed
indirectly by orphan method.

PR:		148688
Silence from:	marcel
Approved by:	mav (mentor)
MFC after: 	2 weeks
2010-08-02 10:26:15 +00:00
Alexander Motin
8edcf69406 Export PCI IDs of ATA/SATA controllers through CAM and ata(4) layers to
GEOM. This information needed for proper soft-RAID's on-disk metadata
reading and writing.
2010-07-25 15:43:52 +00:00
Andrey V. Elsukov
733a9e2783 Prevent access after free to table entry in case when
user deletes partition that not yet created (changes doesn't
committed to disk).

PR:		148687
Approved by:	mav (mentor)
MFC after:	7 days
2010-07-23 06:30:01 +00:00
Ruslan Ermilov
cf1457e4fd Fixed cache size decoding read from a label.
PR:		kern/144732
Submitted by:	Eugene Grosbein
MFC after:	3 days
2010-07-14 08:22:00 +00:00
Rui Paulo
c6b2b6fce6 Add NTFS partition type to GEOM_MBR. 2010-06-26 13:20:40 +00:00
Pawel Jakub Dawidek
2aa15ffdab 'unit' can be negative, so use signed type for it.
Found by:	Coverity Prevent
CID:		3731
MFC after:	3 days
2010-06-14 21:58:55 +00:00
Pawel Jakub Dawidek
15725379d0 BIO_DELETE contains range we want to delete and doesn't provide any useful
data, so there is no need to copy it to userland.

MFC after:	3 days
2010-06-14 21:56:24 +00:00
Andriy Gapon
1bdfff2252 fix a few cases where a string is passed via format argument instead of
via %s

Most of the cases looked harmless, but this is done for the sake of
correctness.  In one case it even allowed to drop an intermediate buffer.

Found by:	clang
MFC after:	2 week
2010-06-11 19:27:21 +00:00
Edward Tomasz Napierala
7ce513a52a Untangle g_print_bio(), silencing Coverity.
Found with:	Coverity Prevent
CID:		3566, 3567
2010-06-10 17:49:36 +00:00
Matt Jacob
59ccfe8176 Try and narrow the gap in which you act on an event that has been canceled.
Obtained from:	Jaako Heinonen
MFC after:	1 month
2010-06-08 22:40:02 +00:00
Edward Tomasz Napierala
c01eb2f36b Make sure not to pass NULL to g_orphan_provider().
Found with:	Coverity Prevent
CID:		3411
2010-06-05 08:00:52 +00:00
Marius Strobl
36066952e5 Don't leak memory on destruction.
Reviewed by:	marcel
MFC after:	3 days
2010-06-02 17:17:11 +00:00
Andriy Gapon
56b3acd001 g_label: fix possible NULL pointer dereference
in case glabel debug level is >= 1 and gp->provider list is empty
for some reason

Found by:	clang static analyzer
MFC after:	4 days
2010-05-31 09:10:39 +00:00
Marius Strobl
785c3f7ea4 Fix some whitespace nits. 2010-05-24 17:33:02 +00:00
Nathan Whitehorn
0532c3a5a5 Teach gpart about bootcode on APM. 2010-05-16 22:21:33 +00:00
Matt Jacob
87e7f7be89 Yet another potential dereference of a dead provider.
Sponsored by:   Panasas
MFC after:	1 week
2010-05-14 21:27:39 +00:00
Matt Jacob
1371a457d9 Make sure to check that the active provider pointer points to something before
dereferencing the pointer.

Sponsored by:   Pansas
MFC after:	1 week
2010-05-14 16:56:18 +00:00
Jaakko Heinonen
3535526b15 - Don't return EAGAIN from gv_unload(). It was used to work around the
deadlock fixed in r207671.
- Wait for worker process to exit at class unload. The worker process
  was not guaranteed to exit before the linker unloaded the module.
- Use 0 as the worker process exit status instead of ENXIO and style
  the NOTREACHED comment.

Reviewed by:	lulf
X-MFC after:	r207671
2010-05-10 19:12:23 +00:00
Jaakko Heinonen
5a279fc5fc In g_zero_destroy_geom(), return 0 instead of EBUSY in the success case.
EBUSY was probably used as a workaround for the deadlock fixed in r207671.

Approved by:	pjd
X-MFC after:	r207671
2010-05-10 19:08:53 +00:00
Ulf Lilleengen
42a9ad6697 - Remove obsolete flags.
MFC after:	1 week
2010-05-08 16:19:17 +00:00
Jaakko Heinonen
9061251f9a Fix deadlock between GEOM class unloading and withering. Withering can't
proceed while g_unload_class() blocks the event thread. Fix this by not
running g_unload_class() as a GEOM event and dropping the topology lock
when withering needs to proceed.

PR:		kern/139847
Silence on:	freebsd-geom
2010-05-05 18:53:24 +00:00
Marcel Moolenaar
c74f160cb0 Re-calculate a geometry when reprobing as well.
PR:		kern/145452
Reported by:	"Andrey V. Elsukov" <bu7cher@yandex.ru>
2010-04-25 01:56:39 +00:00
Marcel Moolenaar
6f702278e6 Fix undo for schemes that have internal partitions. Internal partitions
do not constitute user-visible or active partitions and as such should
not prevent undoing pending operations.

While here, initialize the last usable sector for the placeholder geom
based on the null scheme, created to allow undoing the destruction of
a scheme. This gives consistent output with "gpart show".

Based on a patch from:	"Andrey V. Elsukov" <bu7cher@yandex.ru>
2010-04-25 00:54:11 +00:00
Marcel Moolenaar
3f71c319f4 Implement the resize verb and add support for resizing partitions
for all schemes but EBR. Quality work by Andrey!

Submitted by:	"Andrey V. Elsukov" <bu7cher@yandex.ru>
2010-04-23 03:11:39 +00:00
Jaakko Heinonen
002d1d1c38 Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. Don't
assert that the topology lock is held when g_valid_obj() is called from
debugger.

MFC after:	1 week
2010-04-19 20:07:35 +00:00
Pawel Jakub Dawidek
31c4cef715 Use lower priority for GELI worker threads. This improves system
responsiveness under heavy GELI load.

MFC after:	3 days
2010-04-15 16:34:06 +00:00
Andriy Gapon
2a842317eb g_io_check: respond to zero pp->mediasize with ENXIO
Previsouly this condition was reported with EIO by bio_offset > mediasize
check.
Perhaps that check should be extended to bio_offset+bio_length > mediasize.

MFC after:	1 week
2010-04-15 08:39:56 +00:00
Luigi Rizzo
83f8218814 fix copyright format, as requested by Joel Dahl 2010-04-13 09:56:17 +00:00
Luigi Rizzo
c36cf6fbbc make code compile with KTR 2010-04-13 09:53:08 +00:00
Luigi Rizzo
1831a90ac5 Bring in geom_sched, support for scheduling disk I/O requests
in a device independent manner. Also include an example anticipatory
scheduler, gsched_rr, which gives very nice performance improvements
in presence of competing random access patterns.

This is joint work with Fabio Checconi, developed last year
and presented at BSDCan 2009. You can find details in the
README file or at

http://info.iet.unipi.it/~luigi/geom_sched/
2010-04-12 16:37:45 +00:00
Andriy Gapon
8f128ff559 g_vfs_open: allow only one mount per device vnode
In other words, deny multiple read-only mounts of the same device.
Shared read-only mounts should theoretically be possible, but,
unfortunately, can not be implemented correctly using current
buffer cache code/interface and results in an eventual system crash.
Also, using nullfs seems to be a more efficient way to achieve the same
goal.

This gets us back to where we were before GEOM and where other BSDs are.

Submitted by:	pjd (idea for checking for shared mounting)
Discussed with:	phk, pjd
Silence from:	fs@, geom@
MFC after:	2 weeks
2010-04-03 08:53:53 +00:00
Andriy Gapon
1b4bc5f851 bo_bsize: revert r205860 and take an alternative approch in getblk
In r205860 I missed the fact that there is code that strongly assumes
that devvp bo_bsize is equal to underlying provider's sectorsize.
In those places it is hard to obtain the sectorsize in an alternative
way if devvp bo_bsize is set to something else.
So, I am reverting bo_bsize assigment in g_vfs_open.
Instead, in getblk I use DEV_BSIZE block size for b_offset calculation
if vp is a disk vp as reported by vn_isdisk.  This should coinside with
vp being a devvp.

Reported by:	Mykola Dzham <i@levsha.me>
Tested by:	Mykola Dzham <i@levsha.me>
Pointyhat to:	avg
MFC after:	2 weeks
X-ToDo:		convert bread(devvp) in all fs to use bo_bsize-d blocks
2010-04-02 15:12:31 +00:00
Andriy Gapon
0c04f06072 g_vfs_open: correctly set devvp.v_bufobj.bo_bsize to DEV_BSIZE
Because of how breadn -> bufstrategy -> g_vfs_strategy are currently
implemented, bread on devvp always expects DEV_BSIZE block size.
Thus, devvp bo_bsize must always be DEV_BSIZE irrespective of media
properties or filesystem implementation details.

Reviewed by:	mckusick
MFC after:	2 weeks
2010-03-29 20:34:25 +00:00
Matt Jacob
2b4969ff9e Change how multipath labels are created and managed. This makes it easier
to support various storage boxes which really aren't active-active.

We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.

A usage implication is that you should specificy the currently active
storage path as the first provider.

Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).

Sponsored by:	Panasas
MFC after:	1 month
2010-03-29 18:04:06 +00:00
Alexander Motin
a5be8eb530 Do not fetch precise time of request start when stats collection disabled.
Reviewed by:	pjd, phk
2010-03-24 18:04:25 +00:00
Matt Jacob
b5dce617d8 Add 'rotate' and 'getactive' verbs to provide some control and information
about what the currently active path is.

Sponsored by:	Panasas
MFC after:	1 month
2010-03-21 15:02:47 +00:00
Jaakko Heinonen
a41aa4a789 Escape characters unsafe for XML output in GEOM class, instance and
provider names.

- Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced
  with '?'. Those characters are disallowed in XML.
- '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are
  replaced with XML numeric character reference.

If the kern.geom.confxml sysctl provides invalid XML, libgeom
geom_xml2tree() fails and utilities using it do not work. Unsafe
characters are common in msdosfs and cd9660 labels.

PR:		kern/104389
Submitted by:	Doug Steinwand (original version)
Reviewed by:	pjd
Discussed on:	freebsd-geom
MFC after:	3 weeks
2010-03-20 16:16:13 +00:00
Pawel Jakub Dawidek
b0990a1dae Simplify loops. 2010-03-18 13:11:43 +00:00
Ulf Lilleengen
77d2a01ea8 - Set missing flag when initiating a plex rebuild with the rebuildparity
command.
- Check if plex is already syncing or rebuilding before initiating a parity
  rebuild or check.
2010-03-08 21:16:28 +00:00
Pawel Jakub Dawidek
32115b105a Please welcome HAST - Highly Avalable Storage.
HAST allows to transparently store data on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.

HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There in no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.

For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.

Sponsored by:	FreeBSD Foundation
Sponsored by:	OMCnet Internet Service GmbH
Sponsored by:	TransIP BV
2010-02-18 23:16:19 +00:00
Pawel Jakub Dawidek
12f35a615a - Style fixes.
- Prefer strlcpy() over strncpy().
2010-02-18 22:29:35 +00:00
Pawel Jakub Dawidek
f24bf7522d Correct comment. 2010-02-18 22:28:12 +00:00
Pawel Jakub Dawidek
e5131ab452 Log attach just like we log detach. 2010-02-18 22:27:38 +00:00
Oleksandr Tymoshenko
45a7687f90 - Give geom_redboot taste of flash/spi. Now there is another provider
of redboot partitions. This patch was missed during merge from
    projects/mips.
2010-02-03 01:12:19 +00:00
Xin LI
38907b4cc7 Prevent NULL deference by checking return value of
gctl_get_asciiparam.

MFC after:	2 weeks
2010-02-02 22:25:22 +00:00
Marcel Moolenaar
cd18ad8347 Export the UUID of the partition in the XML. The partition UUID is used
by EFI's device path to identify a partition. In order for FreeBSD to
add EFI boot options, proper device paths need to be constructed.
2010-01-30 23:13:19 +00:00
Ivan Voras
49e232f2c9 Go through with write_metadata() non-error-handling and make it return "void".
This is mostly to avoid dead variable assignment warning by LLVM.
No functional change.

Pointed out by:	trasz
Approved by:	gnn (mentor)
2010-01-25 20:51:40 +00:00
Edward Tomasz Napierala
fdf64c5752 Remove unneeded variables.
Found with:	clang
2010-01-25 17:00:21 +00:00
Edward Tomasz Napierala
1373012510 Remove pointless assignment.
Found with:	clang
2010-01-25 16:58:58 +00:00
Edward Tomasz Napierala
dc9098605e Remove some pointless variable assignments.
Found with:	clang
2010-01-25 16:55:30 +00:00
Edward Tomasz Napierala
0a36cb97a8 Remove unused variable.
Found with:	clang
2010-01-25 16:10:22 +00:00
Xin LI
35daa28f30 Expose stripe offset and stripe size through libgeom and geom(8) userland
utilities.

Reviewed by:	pjd, mav (earlier version)
2010-01-17 06:20:30 +00:00
Edward Tomasz Napierala
b3f9d8c804 Add gmountver, disk mount verification GEOM class.
Note that due to e.g. write throttling ('wdrain'), it can stall all the disk
I/O instead of just the device it's configured for.  Using it for removable
media is therefore not a good idea.

Reviewed by:	pjd (earlier version)
2010-01-16 09:52:49 +00:00
Alexander Motin
0c8fd0c8ac Change the way in which zero stripesize is handled. Instead of reporting
zero stripeoffset in such case (as if device has no stripes), report offset
from the beginning of the media (as if device has single infinite stripe).

This gives partitioning tools information, required to guess better
partition alignment, in case if hardware doesn't report it's stripe size.
For example, it should give disklabel info about odd offset made by fdisk.
2010-01-06 13:14:37 +00:00
Alexander Motin
8de5811320 Move wakeup() out of mutex to reduce contention. 2010-01-05 10:52:21 +00:00
Alexander Motin
86de0ca52c Move wakeup() out of mutex to reduce contention. 2010-01-05 10:30:56 +00:00
Alexander Motin
06b215fd3a Slightly optimize XOR calculation. 2010-01-05 02:06:05 +00:00
Marcel Moolenaar
665bb830e2 Properly return the UUID represented by the alias.
PR:		142174
Submitted by:	Przemyslaw Laczynski <torindel@gmail.com>
Pointy hat to:	rpaulo
2010-01-02 01:02:59 +00:00
Alexander Motin
0d883b11e3 Call wakeup() only for the first request on the queue. 2009-12-30 17:23:27 +00:00
Antoine Brodin
13e403fdea (S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.
Fix some wrong usages.
Note: this does not affect generated binaries as this argument is not used.

PR:		137213
Submitted by:	Eygene Ryabinkin (initial version)
MFC after:	1 month
2009-12-28 22:56:30 +00:00
Alexander Motin
1c80ec0a6b Add BIO_DELETE support to ada(4):
- For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by
ACS-2 specification working draft.
- For CompactFlash use CFA ERASE command, same as ad(4) does.

With this patch, `newfs -E /dev/ada1` was able to restore write speed of
my heavily weared OCZ Vertex SSD (firmware 1.4) up to the initial level
for the most part of it's capacity. Previous 1.3 firmware, even reportiong
TRIM capabilty bit set, was not working, reporting ABORT error for every
DSM command.

I have no idea whether it is normal, but for some reason it takes 200ms
to handle any TRIM command on this drive, that was making delete extremely
slow. But TRIM command is able to accept long list of LBAs and the length of
that list seems doesn't affect it's execution time. Implemented request
clusting algorithm allowed me to rise delete rate up to reasonable numbers,
when many parallel DELETE requests running.
2009-12-28 20:08:01 +00:00
Alexander Motin
5f9b1143ac Make geom_concat to passthrough stripe parameters of the first component,
hoping that rest will fit.
2009-12-24 14:32:21 +00:00
Alexander Motin
113d8e5046 As soon as geom_raid3 reports it's own stripe as sector size, report largest
underlying provider's stripe, multiplied by number of data disks in array,
due to transformation done, as array stripe.
2009-12-24 13:38:02 +00:00
Alexander Motin
92f60381d9 As soon as mirror has no own stripes, report largest stripe of unrerlying
components, hoping others fit, if they are not equal.
2009-12-24 12:17:22 +00:00
Alexander Motin
8b30323843 Add two disk ioctls, giving user-level tools information about disk/array
stripe (optimal access block) size and offset.
2009-12-24 11:05:23 +00:00
Alexander Motin
f00919d2fc Make geom_stripe report it's stripe size to upper layers. 2009-12-24 10:43:44 +00:00
Alexander Motin
d4060fa67d Make graid3 fallback to malloc() when component request size is bigger
then maximal prepared UMA zone size. This fixes crash with MAXPHYS > 128K.
2009-12-21 23:31:03 +00:00
Rui Paulo
33f7a4124d Add Microsoft and NetBSD partition types handling. 2009-12-14 20:26:27 +00:00
Rui Paulo
f13174303d Simplify partition type parsing by using a data-oriented model.
While there add more Apple and Linux partition types.
2009-12-14 20:04:06 +00:00
Alexander Motin
891852cc12 Change 'load' balancing mode algorithm:
- Instead of measuring last request execution time for each drive and
choosing one with smallest time, use averaged number of requests, running
on each drive. This information is more accurate and timely. It allows to
distribute load between drives in more even and predictable way.
- For each drive track offset of the last submitted request. If new request
offset matches previous one or close for some drive, prefer that drive.
It allows to significantly speedup simultaneous sequential reads.

PR:		kern/113885
Reviewed by:	sobomax
2009-12-03 21:47:51 +00:00
Edward Tomasz Napierala
3ce9ca8947 Provide a set of sysctls and tunables to disable device node creation
for specific "kinds" of disk labels - for example, GPT UUIDs.  Reason
for this is that sometimes, other GEOM classes attach to these device
nodes instead of the proper ones - e.g. they attach to /dev/gptid/XXX
instead of /dev/ada0p2, which is annoying.

Reviewed by:	pjd (earlier version)
MFC after:	1 month
2009-11-28 11:57:43 +00:00
Rui Paulo
f9d551f7df Add a missing check for Apple HFS partitions.
MFC after:	1 week
2009-11-12 19:30:49 +00:00
Robert Noland
a59a131093 We need to allocate space for the header in the create path also.
This fixes a null pointer dereference with "gpart create -s GPT" after
the previous commit.

Reported by:	Yuri Pankov
Pointyhat to:	me
MFC after:	1 week
2009-11-12 16:28:39 +00:00
Robert Noland
1c2dee3cc9 Fix handling of GPT headers when size is > 92 bytes.
It is valid for an on-disk GPT header to report a header size which is
greater than 92 bytes.  Previously, we would read in the sector and copy
only the 92 bytes that we know how to deal with before calculating the
checksum for comparison.  This meant that when we did the checksum, we
overshot the buffer and took in random memory, so the checksum would fail.

We now determine the size of the header and allocate enough space to
preserve the entire on-disk contents.  This allows us to be correctly
calculate the checksum and be able to modify and write the header back
to the disk, while preserving data that we might not understand.

Reported by:	Kris Weston
Approved by:	marcel@
MFC after:	2 weeks
2009-11-07 17:29:03 +00:00
Robert Noland
e80d42dda2 Set the active flag in the PMBR when we install bootcode on a GPT
partitioned disk.  Some BIOS require this to be set before they will
boot the device.

Approved by:	marcel
MFC after:	2 weeks
2009-10-14 19:24:01 +00:00
Pawel Jakub Dawidek
f8727e71d7 If provider is open for writing when we taste it, skip it for classes that
depend on on-disk metadata. This was we won't attach to providers that are used
by other classes. For example we don't want to configure partitions on da0 if
it is part of gmirror, what we really want is partitions on mirror/foo.

During regular work it works like this: if provider is open for writing a class
receives the spoiled event from GEOM and detaches, once provider is closed the
taste event is send again and class can rediscover its metadata if it is still
there.  This doesn't work that way when new class arrives, because GEOM gives
all existing providers for it to taste, also those open for writing. Classes
have to decided on their own if they want to deal with such providers (eg.
geom_dev) or not (classes modified by this commit).

Reported by:	des, Oliver Lehmann <lehmann@ans-netz.de>
Tested by:	des, Oliver Lehmann <lehmann@ans-netz.de>
Discussed with:	phk, marcel
Reviewed by:	marcel
MFC after:	3 days
2009-10-09 09:42:22 +00:00
Ulf Lilleengen
a8a3cd7d9d - Improve error message consistency and wording. 2009-10-05 08:44:31 +00:00
Marcel Moolenaar
b61808630d The first 96 bytes may not be zeroes. It can contain trivial boot
code that merely emits an error and waits for a key press before
rebooting. The error being that extended partitions are not
bootable. The origin is presumed to be Windows 2000; Windows XP
does not do this...

For now, ignore the first 96 bytes when checking that the EBR is
(for the most part) all zeroes.

Tested by:	Mario Lobo <mlobo@digiart.art.br>
MFC after:	1 week
2009-09-28 23:52:47 +00:00
Marcel Moolenaar
87f4470620 Don't create more partitions than can fit in the table by checking
that the index is within bounds.
2009-09-24 06:00:49 +00:00
Edward Tomasz Napierala
bb3fd7ff4f Remove unused variable. 2009-09-08 17:20:17 +00:00
Alexander Motin
18e42503ed Do not check proper request alignment here in geom_dev in production.
It will be checked any way later by g_io_check() in g_io_schedule_down().
It is only needed here to not trigger panic from additional check, when
INVARIANTS enabled. So cover it with #ifdef INVARIANTS. It saves two
64bit divisions per request.
2009-09-08 05:46:38 +00:00
Alexander Motin
7fc019af65 MFp4:
Remove msleep() timeout from g_io_schedule_up/down(). It works fine
without it, saving few percents of CPU on high request rates without
need to rearm callout twice per request.
2009-09-06 19:33:13 +00:00
Pawel Jakub Dawidek
b740e905a4 Add support for changing providers priority.
Submitted by:	Mel Flynn
2009-09-06 06:52:06 +00:00
Alexander Motin
af582ea7af Remove artificial MAX_IO_SIZE constant, equal to DFLTPHYS * 2. Use MAXPHYS
instead. It is NULL change for GENERIC kernel, but allows 'fast' mode to
work on systems with increased MAXPHYS.
2009-09-04 19:20:46 +00:00
Pawel Jakub Dawidek
e93f5e4d25 Simplify g_disk_ident_adjust() function and allow any printable character
in serial number.

Discussed with:	trasz
Obtained from:	Wheel Sp. z o.o. (http://www.wheel.pl)
2009-09-04 09:39:06 +00:00
Pawel Jakub Dawidek
07a93e6b3c There's no need for checking result of M_WAITOK allocation. 2009-08-27 08:40:51 +00:00
Pawel Jakub Dawidek
c16ce31b31 Fix an obvious topology lock leak.
MFC after:	3 days
2009-08-27 08:28:34 +00:00
Marcel Moolenaar
8530137252 The start of the EFI GPT partition in the PMBR can always be represented
by CHS addressing. Don't define these fields as 0xff, but rather define
them correctly. This prevents boot problems on PCs where GPT is being
used.

PR:		115406
Submitted by:	Kent Hauser <kent@khauser.net>
Approved by:	re (kib)
2009-08-17 16:16:46 +00:00
Ulf Lilleengen
b79cac0f92 - Fix the issue with read access count modification on RAID-5 plexes properly.
If the access counts were not increased and decreased in equal numbers by
  gvinum consumers, the read access count would be inconsistent with the write
  access count. Instead, modify the read access count with the write access
  count directly to prevent any inconsistencies.

Approved by:	re (kib)
2009-07-18 11:12:48 +00:00
Marcel Moolenaar
f43b57e32a Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c
is invalid because the ioctl happens without prior open. The ioctl
got introduced to provide backward compatibility for extended
partitions, but it ended up not being used because it didn't work
as expected. Since there are no consumers of the ioctl and the
implementation is broken, the best fix is to remove the code
entirely.

Spotted by:	phk
Approved by:	re (kensmith)
2009-07-08 05:56:14 +00:00
Edward Tomasz Napierala
8edfe76ab5 Fix a panic which (reportedly) can happen when unmounting a filesystem
with I/O requests in flight on kernels compiled with "options INVARIANTS".
Also, make it obvious it's not right to call g_valid_obj() (and macros
using it, e.g. G_VALID_CONSUMER()) without topology lock held.

Approved by:	re (kib)
Reported by:	pho
2009-07-01 20:16:29 +00:00
Edward Tomasz Napierala
fb231f3627 Make gjournal work with kernel compiled with "options DIAGNOSTIC".
Previously, it would panic immediately.

Reviewed by:	pjd
Approved by:	re (kib)
2009-06-30 14:34:06 +00:00
Ulf Lilleengen
ac2a008e69 - Apply the same naming rules of LVM names as done in the LVM code itself.
PR:		kern/135874
2009-06-24 22:09:30 +00:00
John Hay
65a4957806 Do not stop the loop when an empty or deleted directory entry is found.
Rather just skip over it.
2009-06-24 06:42:13 +00:00
Ivan Voras
63f4d880e0 Fix tabs, slightly improve comments.
Approved by:	gnn (mentor) (original)
Noticed by:	stas
2009-06-18 11:12:11 +00:00
Ivan Voras
452f657cb9 Add support for labels derived from GPT metadata.
Approved by:	gnn (mentor)
Reviewed by:	pjd
PR:		128398
Submitted by:	Marius Nuennerich < marius at nuenneri.ch >
2009-06-13 00:27:03 +00:00
Luigi Rizzo
6231f75bcf As discussed in the devsummit, introduce two fields in the
struct bio to store classification information, and a hook
for classifier functions that can be called by g_io_request().

This code is from Fabio Checconi as part of his GSOC work.
2009-06-11 09:55:26 +00:00
Pawel Jakub Dawidek
cb9b72ce4a Simplify. 2009-06-05 23:35:43 +00:00
Doug Barton
8b3bfb0509 Crank the debug level necessary to display the "Label foo is removed"
and "Label for provider ..." messages up from 0 to 1.
2009-05-30 22:31:52 +00:00
Jamie Gritton
76ca6f88da Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex.  Jails may
have their own host information, or they may inherit it from the
parent/system.  The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL.  The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by:	bz (mentor)
2009-05-29 21:27:12 +00:00
Ulf Lilleengen
4147dd02cd - Unbreak 64 bit platforms by casting off_t to intmax. 2009-05-26 14:15:06 +00:00
Ulf Lilleengen
6d66da20b7 - Fix wrong print on BIO_DONE.
- Use db_printf instead of printf. While here, apply this to other ddb commands
  as well.

Pointed out by:		pjd
2009-05-26 10:03:44 +00:00
Ulf Lilleengen
bf7d2c1797 - Add 'show bio' DDB command.
MFC after:	3 weeks
2009-05-26 07:29:17 +00:00
Edward Tomasz Napierala
916cd41c47 Check return value of gctl_get_asciiparam().
Found with:	Coverity Prevent(tm)
CID:		1118
2009-05-12 16:59:50 +00:00
Attilio Rao
dfd233edd5 Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
Ulf Lilleengen
d8d015cddc - Split up the BIO queue into a queue for new and one for completed requests.
This is necessary for two reasons:
  1) In order to avoid collisions with the use of a BIOs flags set by a consumer
     or a provider
  2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags was
     available, so the consumer flags of a BIO had to be misused in order to
     support enough flags. The new queue makes it possible to recycle the
     GV_BIO_DONE flag into GV_BIO_GROW.
  As a consequence, gvinum will now work with any other GEOM class under it or
  on top of it.

- Use bio_pflags for storing internal flags on downgoing BIOs, as the requests
  appear to come from a consumer of a gvinum volume. Use bio_cflags only for
  cloned BIOs.
- Move gv_post_bio to be used internally for maintenance requests.
- Remove some cases where flags where set without need.

PR:		kern/133604
2009-05-06 19:34:32 +00:00
Ulf Lilleengen
41944888fe - Fix a case where a RAID5 volume would think that it is supposed to grow a new
subdisk after a parity rebuild.
2009-05-06 19:18:19 +00:00
Ulf Lilleengen
11c4adc49e - Check if any plexes are doing internal maintenance before removing them. 2009-05-06 19:06:28 +00:00
Ulf Lilleengen
5a0fa8531c - Add forgotten KASSERT. 2009-05-06 18:37:32 +00:00
Ulf Lilleengen
1d8dfc60f4 - Fix a bug where the bio_data field of the wrong BIO is freed if an error
occurs when doing a RAID5 request.
2009-05-06 18:27:28 +00:00
Ulf Lilleengen
451b95f489 - GV_BIO_RETRY is not used, and it is actually impossible with more than 8
values for bio_cflags/bio_pflags.
2009-05-06 18:24:56 +00:00
Ulf Lilleengen
040272465d - Split the queue mutex into one for the event queue and one for the BIO queue,
as they do not really relate and to prepare for an additional queue to be
  covered by the BIO queue mutex.
- Implement wrappers for fetching the next element from the event queue as well
  as for putting a new element into the BIO queue.
2009-05-06 18:21:48 +00:00
Ulf Lilleengen
ad75dd77e0 - Make the gvinum softc invisible to userland, as it is not needed. 2009-05-04 17:30:20 +00:00
Ulf Lilleengen
697ab8be86 - Remove assertion of topology lock remaining from 7.x gvinum. It is not needed,
as the renaming only changes internal gvinum names and will not alter the geom
  topology.
- The topology lock was not held when calling g_wither_geom after renaming.
2009-04-18 16:36:27 +00:00
Marcel Moolenaar
cce94b6583 Precision '*' expects an int and strlen() returns a size_t.
Compensate.
2009-04-16 05:52:47 +00:00
Marcel Moolenaar
6ad9a99f21 Add a compat option to the EBR scheme that controls the
naming of the partitions (GEOM_PART_EBR_COMPAT).  When
compatibility is enabled, changes to the partitioning are
disallowed.

Remove the device name aliasing added previously to provide
backward compatibility, but which in practice doesn't give
us anything.

Enable compatibility on amd64 and i386.
2009-04-15 22:38:22 +00:00
Ulf Lilleengen
1de45ea74d - Move out allocation part of different gvinum objects into its own routine and
make use of it in the gvinum userland code.
2009-04-10 08:50:14 +00:00
Andrew Thompson
853a10a581 Revert r190676,190677
The geom and CAM changes for root_hold are the wrong solution for USB design
quirks.

Requested by:	scottl
2009-04-10 04:08:34 +00:00
Marcel Moolenaar
94fe30d0ca Don't use hexadecimal in the EBR partition names, because 'a'..'f'
are more commonly known as BSD partition names.

Discussed with: ivoras@
2009-04-08 16:18:16 +00:00
Andrew Thompson
31da42bab2 Add interleaving root hold tokens from the CAM probe to disk_create and geom
provider tasting. This is needed for disk attachments that happen after threads
are running in the boot process.

Tested by:	rnoland
2009-04-03 19:49:33 +00:00
Andrew Thompson
626fc9fe3d Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called
in situations where sleeping isnt allowed.
2009-04-03 19:46:12 +00:00
Marcel Moolenaar
daca55549f The 9 bytes immediately prior to the partition table can contain
signatures or disk serial numbers. Don't assume those to be zero
in all cases. This fixes a false negative.

Tested by: avatar@mmlab.cse.yzu.edu.tw
2009-04-03 05:54:49 +00:00
Marcel Moolenaar
c146965cd1 Sharpen the saw:
o  PC98 uses 32-bit block numbers. Limit the scheme to 2^32-1
   blocks when the media is larger. The 32-bit block numbers
   are implicit (16-bit cylinder * 8-bit head * 8-bit sector).
2009-03-30 01:03:58 +00:00
Marcel Moolenaar
6154e492ec Sharpen the saw:
o  MBR uses 32-bit block numbers. Limit the scheme to 2^32-1
   blocks when the media is larger.
2009-03-30 00:53:46 +00:00
Marcel Moolenaar
f5f875ed84 Sharpen the saw:
o  EBR uses 32-bit block numbers. Limit the scheme to 2^32-1
   blocks when the media is larger.
o  Calculate the number of entries based on the rounded media
   size, rather than the raw media size.
2009-03-30 00:48:42 +00:00