that I've been working on but put off committing until after the
RELENG_7 branch, including:
* New manpages: cpio.5 mtree.5
* New archive_entry_strmode()
* New archive_entry_link_resolver()
* New read support: mtree format
* Internal API change: read format auction only runs once
* Running the auction only once allowed simplifying a lot of bid logic.
* Cpio robustness: search for next header after a sync error
* Support device nodes on ISO9660 images
* Eliminate a lot of unnecessary copies for uncompressed archives
* Corrected handling of new GNU --sparse --posix formats
* Correctly handle a zero-byte write to a compressed archive
* Fixed memory leaks
Many of these improvements were motivated by the upcoming bsdcpio
front-end.
There have also been extensive improvements to the libarchive_test
test harness, which I'll commit separately.
of libarchive being used. I've been taking advantage of this
with a recent round of updates to libarchive_test so that it
can test older and newer versions of the library.
Approved by: re (Ken Smith)
* "compression_program" support uses an external program
* Portability: no longer uses "struct stat" as a primary
data interchange structure internally
* Part of the above: refactor archive_entry to separate
out copy_stat() and stat() functions
* More complete tests for archive_entry
* Finish archive_entry_clone()
* Isolate major()/minor()/makedev() in archive_entry; remove
these from everywhere else.
* Bug fix: properly handle decompression look-ahead at end-of-data
* Bug fixes to 'ar' support
* Fix memory leak in ZIP reader
* Portability: better timegm() emulation in iso9660 reader
* New write_disk flags to suppress auto dir creation and not
overwrite newer files (for future cpio front-end)
* Simplify trailing-'/' fixup when writing tar and pax
* Test enhancements: fix various compiler warnings, improve
portability, add lots of new tests.
* Documentation: document new functions, first draft of
libarchive_internals.3
MFC after: 14 days
Thanks to: Joerg Sonnenberger (compression_program)
Thanks to: Kai Wang (ar)
Thanks to: Colin Percival (many small fixes)
Thanks to: Many others who sent me various patches and problem reports.
* use "AR_GNU" as the format name instead of AR_SVR4 (it's what everyone is going to call it anyway)
* Simplify numeric parsing to unsigned (none of the numeric values should ever be negative); don't run off end of numeric fields.
* Finish parsing the common header fields before the next I/O request (which might dump the contents)
* Be smarter about format guessing and trimming filenames.
* Most of the magic values are only used in one place, so just inline them.
* Many more comments.
* Be smarter about handling damaged entries; return something reasonable.
* Call it a "filename table" instead of a "string table"
* Update tests.
Enable selection of 'ar', 'arbsd', and 'argnu' formats by name
(this allows bsdtar to create ar format archives).
The 'ar' writer still needs some work; it should reject
entries that aren't regular files and should probably also
strip leading paths from filenames.
* The ACL formatter was mis-formatting entries which had a
user/group ID but no name. Make the parser tolerant of
these, so that old archives can be correctly restored;
fix the formatter to generate correct entries.
* Fix overwrite detection by introducing a new "FAILED" return
code that indicates the current entry cannot be continued
but the archive as a whole is still sound.
* Header cleanup: Remove some unused headers, add some that
are required with new Linux systems.
* libarchive_test program exercises many of the core features
* Refactored old "read_extract" into new "archive_write_disk", which
uses archive_write methods to put entries onto disk. In particular,
you can now use archive_write_disk to create objects on disk
without having an archive available.
* Pushed some security checks from bsdtar down into libarchive, where
they can be better optimized.
* Rearchitected the logic for creating objects on disk to reduce
the number of system calls. Several common cases now use a
minimum number of system calls.
* Virtualized some internal interfaces to provide a clearer separation
of read and write handling and make it simpler to override key
methods.
* New "empty" format reader.
* Corrected return types (this ABI breakage required the "2.0" version bump)
* Many bug fixes.
a vanilla 2-clause BSD license, but somehow some confusing
extra verbage get copied from somewhere.
Also, update the copyright dates to 2007 for all of the files.
Prompted by: several questions about what those extra words really mean
wrap this within #if/#else/#endif so that it will only take effect once
ARCHIVE_API_VERSION is increased (which should happen on HEAD some time
between now and when RELENG_7 is branched).
* If write block size is zero, don't block at all.
This supports the unusual requirement of applications
that need "no-delay" writes.
* Expose _write_finish_entry() to give such applications more
control over write boundaries. (Normal applications do not
need this, as entries are completed automatically.)
* Correct the type of write callbacks; this is a minor API
change that does not affect the ABI.
* Correct the error handling in _write_next_header() around
completing the previous entry.
* Correct the documentation for block-size markers: Remove
docs for the long-defunct _read_set_block_size(); document
all of the write block size manipulators.
MFC after: 14 days
* Expose functions for setting the "skip file" dev/ino information
* Expose functions for setting/querying the block size on reads
* Correctly propagate errors out of archive_read_close/archive_write_close
* Update manpage with information about new functions
increases performance when extracting a single entry from a large
uncompressed archive, especially on slow devices such as USB hard
drives.
Requires a number of changes:
* New archive_read_open2() supports a 'skip' client function
* Old archive_read_open() is implemented as a wrapper now, to
continue supporting the old API/ABI.
* _read_open_fd and _read_open_file sprout new 'skip' functions.
* compression layer gets a new 'skip' operation.
* compression_none passes skip requests through to client.
* compression_{gzip,bzip2,compress} simply ignore skip requests.
Thanks to: Benjamin Lutz, who designed and implemented the whole thing.
I'm just committing it. ;-)
TODO: Need to update the documentation a little bit.
This commit implements storing/reading POSIX.1e-style extended
attribute information in "pax" format archives. An outline of the
storage format is in the tar.5 manpage. The archive_read_extract()
function has code to restore those archives to disk for Linux; FreeBSD
implementation is forthcoming.
Many thanks to Jaakko Heinonen for finding flaws in earlier
proposals and doing the bulk of the coding in this work.
archiver for Fourth Edition through Sixth Edition Unix; it was
replaced by tar in Seventh Edition. (First Edition through
Third Edition used "tap.")
Unfortunately, tp was not so very standard; there were a
few different variants. The code here attempts to support
what I believe were the most common variants.
tp support is not yet enabled by archive_read_support_format_all(),
as I'm not yet entirely comfortable with the detection
heuristics. People interested in experimenting can
add archive_read_support_format_tp() just after any calls
to archive_read_support_format_all() in bsdtar to see how
well this works.
TODO: tp format is roughly similar in structure to dump/restore
archive formats used by many systems. It should be possible
to generalize this code to handle many dump/restore variants.
Format detection heuristics are going to be rough, though.
Thanks to: Warren Toomey, whose very basic tp extraction programs
and documentation made this possible.
systems (or on FreeBSD systems when using ports).
2) Overhaul the versioning logic. In particular,
SHLIB_MAJOR number is now computed as "major+minor",
which ensures library versions are the same for
the FreeBSD build system and the portable
libtool/autoconf/automake build system.
Only supports "deflate" and "none" compression for now.
Also, add a few clarifications to the archive_read.3 manpage as
requested by William Dean DeVries.
This seems to be able to extract a TOC and extract files from
the couple of ISO images I've tested it with.
Treat this as experimental proof-of-concept code for the
moment. There are still a bunch of debug messages (there
are a few oddities in ISO9660 that I haven't yet figured
out how to handle), a lot of bugs to be addressed (this
code leaks memory very badly), and a lot of missing features (no
Rockridge support, in particular). I'd appreciate
feedback from anyone who understands ISO9660 format
better than I do. ;-)
Suggested by: Robert Watson
probably means it also requires a .so version
bump. Defer it until I finish some related
work on cleaning up error returns throughout
the library.
Thanks to: Conrad J. Sabatier
and close it) and "finish" (destroy the object) functions. For backwards
compat and simplicity, have "finish" invoke "close" transparently if needed.
This allows clients to close the archive and check end-of-operation
statistics before destroying the object.
is present for FreeBSD. If you "make distfile" on FreeBSD, you will
soon have a tar.gz file suitable for deploying to other systems
(complete with the expected "configure" script, etc). This latter
relies (at least for now) on the GNU auto??? tools. (I like autoconf
okay, but someday I hope to write a custom Makefile.in and dispense
with automake, which is somewhat odious.)
As part of this, I've cleaned up some of the conditional
compilation options, added make-foo to construct archive.h dynamically
(it now contains some version constants), and added some useful
informational files.
archive_version: Returns a text string, e.g., "libarchive 1.00.000"
archive_api_version: Returns the SHLIB major version
archive_api_feature: Returns a feature number useful for answering
questions such as "Is this recent enough to do XXX?"
The last two also have macros defined in archive.h, so you can compare
the compile-time and run-time environments. (In particular, you can
compare ARCHIVE_API_VERSION to archive_api_version() to detect library
version mismatches.)
With these in hand, it will soon be time to turn on the
shared-library version of libarchive... stay tuned.
* Rename some variables/functions/etc to try to make things clearer.
* Add separate flags to control fflag/acl restore
* Collect metadata restore into a single function for clarity
* Propagate errors in metadata restore back out to the client
* Fix some places where errors were being returned when they
shouldn't and vice-versa
* Modes are now always restored; ARCHIVE_EXTRACT_PERM just controls
whether or not umask is obeyed.
* Restore suid/sgid bits only if user/group matches archive
* Cache the last stat results to try to reduce the number of stat calls
This change also pointed out one API deficiency: the
archive_read_data_into_XXX functions were originally defined to return
the total bytes read. This is, of course, ambiguous when dealing with
non-contiguous files. Change it to just return a status value.
* New read_data_block is both sparse-file aware and uses zero-copy semantics
* Push read_data_block down into specific formats (opens door to
various encoded entry bodies, such as zip or gtar -S)
* Reimplement read_data, read_data_skip, read_data_into_fd in terms
of new read_data_block.
* Update documentation
It's unfortunate that I couldn't just call the new interface
archive_read_data, but didn't want to upset the API that much.
with 'star' ACL handling, though there's still a
bit more work needed in this area.
Added 'write_open_fd' and 'read_open_fd' to simplify, e.g.,
tar's u and r modes. Eliminated old 'write_open_file_position'
as a bad idea. (It required closing/reopening files to
do updates, which led to unpleasant implications.)
Various other minor fixes, API tweaks, etc.
* Disabled shared-library building, as some API breakage is
still likely. (I didn't realize it was turned on by default.) If
you have an existing /usr/lib/libarchive.so.2, I recommend deleting it.
* Pax interchange format now correctly stores and reads UTF8
for extended attributes. In particular, pax format can portably
handle arbitrarily long pathnames containing arbitrary characters.
* Library compiles cleanly at -O2, -O3, and WARNS=6 on all
FreeBSD-CURRENT platforms.
* Minor portability improvements inspired by Juergen Lock
and Greg Lewis. (Less reliance on stdint.h, isolating of
various portability-challenged constructs.)
* archive_entry transparently converts multi-byte <-> wide character
strings, allowing clients and format handlers to deal with either
one, as appropriate.
* Support for reading 'L' and 'K' entries in standard tar archives
for star compatibility.
* Recognize (but don't yet handle) ACL entries from Solaris tar.
* Pushed format-specific data for format readers down into
format-specific storage and out of library-global storage. This
should make it easier to maintain individual formats without mucking
with the core library management.
* Documentation updates to track the above changes.
* Updates to tar.5 to correct a few mistakes and add some additional
information about GNU tar and Solaris tar formats.
Notes:
* The basic 'tar' reader is getting more general; there's not much
point in keeping the 'gnutar' reader separate. Merging the two
would lose a bunch of duplicate code.
* The libc ACL support is looking increasingly inadequate for my needs
here. I might need to assemble some fairly significant code for
parsing and building ACLs. <sigh>
Portability: Thanks to Juergen Lock, libarchive now compiles cleanly
on Linux. Along the way, I cleaned up a lot of error return codes and
reorganized some code to simplify conditional compilation of certain
sections.
Bug fixes:
* pax format now actually stores filenames that are 101-154
characters long.
* pax format now allows newline characters in extended attributes
(this fixes a long-standing bug in ACL handling)
* mtime/atime are now restored for directories
* directory list is now sorted prior to fix-up to permit
correct restore of non-writable dir heirarchies
What it is:
A library for reading and writing various streaming archive
formats, especially tar and cpio. Being a library, it should
be easy to incorporate into pkg_* tools, sysinstall, and any
other place that needs to read or write such archives.
Features:
* Full automatic detection of both compression and archive format.
* Extensible internal architecture to make it easy to add new formats.
* Support for "pax interchange format," a new POSIX-standard tar format
that eliminates essentially all of the restrictions of historic formats.
* BSD license
Thanks to: jkh for pushing me to start this work, gordon for
encouraging me to commit it, bde for answering endless style
questions, and many others for feedback and encouragement.
Status: Pretty good overall, though there are still a few rough edges and
the library could always use more testing. Feedback eagerly solicited.