The following information may be obsolete or inaccurate. Please take it with a
grain of salt (or even two :-) for the time being.
Here is a summary of differences between tar and cpio. The
accuracy of the following information has not been verified. The following people
contributed to this section, mainly through a survey conducted in 1991. The remainder of
this section does not otherwise try to relate topics to people.
Bent Bertelsen
David Hoopes talgras!david
Guy Harris
Kai Petzke -berlin.de
Kristen Nielsen
Leslie Mikesell
tar handles symbolic links in the form in which it comes in BSD; cpio
doesn't handle symbolic links in the form in which it comes in System V prior to S5R4, and
some vendors may have added symlinks to their system without enhancing cpio
to know about them. Others may have enhanced it in a way other than the way I did it at
Sun, and which was adopted by AT&T (and which is, I think, also present in the cpio
that Berkeley picked up from AT&T and put into a later BSD release - I think I gave
them my changes).
(S5R4 does some funny stuff with tar; basically, its cpio can
handle tar format input, and write it on output, and it probably handles
symbolic links. They may not have bothered doing anything to enhance tar as a
result.)
cpio handles special files; tar, unless you're talking about
a POSIXish version, doesn't.
tar comes with V7, System III, System V, and BSD source; cpio
comes only with System III, System V, and later BSD (4.3-tahoe and later).
tar's way of handling multiple hard links to a file can handle file
systems that support 32-bit i-numbers (e.g., the BSD file system); cpio's way
requires you to play some games (in its "binary" format, i-numbers are only 16
bits, and in its "portable ASCII" format, they're 18 bits - it would have to
play games with the "file system ID" field of the header to make sure that the
file system ID/i-number pairs of different files were always different), and I don't know
which cpios, if any, play those games. Those that don't might get confused
and think two files are the same file when they're not, and make hard links between them.
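As a rough illustration of the confusion described above, the sketch below assumes a hypothetical archiver that simply keeps the low-order bits of each i-number when it fills in the header field; real cpio implementations may behave differently. The helper name `truncate_inode` and the sample i-numbers are made up for the example.

```python
def truncate_inode(ino: int, bits: int) -> int:
    """Keep only the low-order `bits` bits, as a too-narrow header field would."""
    return ino & ((1 << bits) - 1)


BINARY_BITS = 16  # i-number width in the cpio "binary" format
ASCII_BITS = 18   # i-number width in the cpio "portable ASCII" format

# Two hypothetical, distinct 32-bit i-numbers on the same file system.
ino_a = 5
ino_b = 5 + (1 << 16)

# Compare what each format would record for the two files.
same_in_binary = truncate_inode(ino_a, BINARY_BITS) == truncate_inode(ino_b, BINARY_BITS)
same_in_ascii = truncate_inode(ino_a, ASCII_BITS) == truncate_inode(ino_b, ASCII_BITS)
print(same_in_binary, same_in_ascii)
```

Under that assumption the two files become indistinguishable in a 16-bit field but still differ in an 18-bit one, which is why the file system ID would have to be used to keep the pairs unique.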
tar's way of handling multiple hard links to a file places only one copy of
the link on the tape, but the name attached to that copy is the *only* one you can use to
retrieve the file; cpio's way puts one copy for every link, but you can
retrieve it using any of the names.
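The "one copy on the tape" behaviour can be observed with Python's standard `tarfile` module, used here only as a convenient stand-in for a tar implementation (it is not the tar program discussed in the survey). The file names `a.txt` and `b.txt` are arbitrary, and the sketch assumes a file system where `os.link` is permitted.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "a.txt")
    second = os.path.join(d, "b.txt")
    with open(first, "wb") as f:
        f.write(b"x" * 1000)
    os.link(first, second)  # second name for the same file

    # Archive both names into an in-memory tar archive.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        tf.add(first, arcname="a.txt")
        tf.add(second, arcname="b.txt")

# Read the archive back and record how each name was stored.
buf.seek(0)
with tarfile.open(fileobj=buf) as tf:
    members = [(m.name, m.islnk(), m.linkname, m.size) for m in tf.getmembers()]

for entry in members:
    print(entry)
```

The first name carries the file data; the second is recorded as a link entry that points back at the first name and carries no data of its own.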
>What type of checksum (if any) is used, and how is this calculated?
See the attached manual pages for tar and cpio format. tar
uses a checksum which is the sum of all the bytes in the tar header for a
file (with the checksum field itself counted as if it held blanks);
cpio uses no checksum.
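A minimal sketch of the tar header checksum follows. It assumes the usual 512-byte header layout, with the 8-byte checksum field at offsets 148 to 155 counted as spaces while summing. The function names are made up for the example, and Python's `tarfile` module is used only to produce a header to check against.

```python
import io
import tarfile


def tar_header_checksum(header: bytes) -> int:
    """Sum all 512 header bytes, counting the checksum field as 8 spaces."""
    if len(header) != 512:
        raise ValueError("a tar header is exactly 512 bytes")
    return sum(header[:148]) + 8 * ord(" ") + sum(header[156:])


def stored_checksum(header: bytes) -> int:
    """Parse the octal checksum recorded in the header itself."""
    field = header[148:156].rstrip(b"\0 ")
    return int(field.decode("ascii"), 8)


# Build a one-member archive in memory and take its first header block.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    data = b"hello\n"
    info = tarfile.TarInfo("hello.txt")
    info.size = len(data)
    tf.addfile(info, io.BytesIO(data))

header = buf.getvalue()[:512]
print(tar_header_checksum(header) == stored_checksum(header))
```

Note that the checksum covers only the header, not the file data, so it detects a damaged header but says nothing about the contents that follow it.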
>If anyone knows why cpio was made when tar was present
>at the unix scene,
It wasn't. cpio first showed up in PWB/UNIX 1.0; no generally-available
version of UNIX had tar at the time. I don't know whether any version that
was generally available *within AT&T* had tar, or, if so, whether the
people within AT&T who did cpio knew about it.
tar does not back up special files. I got bitten by this once. After a system
crash I did a total restore and the tty ports for my multi-port serial card did not get
restored. cpio does restore special files (I checked).
On restore, if there is corruption on the tape, tar will stop at that
point, while cpio will skip over it and try to restore the rest of the files.
cpio seems to do a better job of restoring links.
Please post the results that you get.
The main difference is just in the command syntax and header format.
tar is a little more tape-oriented in that everything is blocked to start
on a block boundary. cpio knows about special files (devices and FIFOs) and is
thus more suitable for complete backups on systems that don't have dump.
>Are there any differences between the ability to recover crashed
>archives between the two of them? (Is there any chance of recovering
>crashed archives at all?)
Theoretically it should be easier under tar since the blocking lets you
find a header with some variation of "dd skip=nn". However, modern cpios
and their variants have an option to just search for the next file header after an error, with
a reasonable chance of re-syncing. Then again, lots of tape driver software won't allow you
to continue past a media error, which should be the only reason for getting out of sync
unless a file changed size while you were writing the archive.
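The idea of finding the next tar header by stepping through block boundaries can be sketched as below. This is an illustration only, not how any particular tar or cpio recovers: it assumes 512-byte blocks and treats a block as a plausible header when its stored checksum matches the recomputed one. The names `looks_like_header` and `find_next_header` are made up for the example.

```python
import io
import tarfile

BLOCK = 512


def looks_like_header(block: bytes) -> bool:
    """True if the block's stored checksum matches the recomputed one."""
    if len(block) != BLOCK:
        return False
    field = block[148:156].rstrip(b"\0 ")
    try:
        stored = int(field.decode("ascii"), 8)
    except ValueError:  # empty, non-octal, or non-ASCII field
        return False
    computed = sum(block[:148]) + 8 * ord(" ") + sum(block[156:])
    return stored == computed


def find_next_header(archive: bytes, start: int) -> int:
    """Offset of the first plausible header at or after `start`, or -1."""
    offset = -(-start // BLOCK) * BLOCK  # round up to a block boundary
    while offset + BLOCK <= len(archive):
        if looks_like_header(archive[offset:offset + BLOCK]):
            return offset
        offset += BLOCK
    return -1


# Build a two-member archive, then wipe out the first header.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    for name, data in (("a.txt", b"A" * 600), ("b.txt", b"B" * 10)):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))

archive = buf.getvalue()
damaged = b"\xff" * BLOCK + archive[BLOCK:]
resync_at = find_next_header(damaged, 0)
print(resync_at)
```

Because every header starts on a block boundary, the scan only has to test one candidate per 512 bytes. File data that happens to pass the checksum test would be a false positive, so a real tool would want further sanity checks.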
>If anyone knows why cpio was made when tar was present
>at the unix scene, please tell me about this too.
Probably because it is more media efficient (by not blocking everything and using only
the space needed for the headers where tar always uses 512 bytes per file
header) and it knows how to archive special files.
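The space argument can be made concrete with a little arithmetic. The sketch below assumes the classic layouts: a tar member costs one 512-byte header plus its data rounded up to a multiple of 512, while a cpio "portable ASCII" member costs a 76-byte fixed header, the NUL-terminated name, and the unpadded data. The function names are made up for the example, and the tar figure is cross-checked against Python's `tarfile` module.

```python
import io
import tarfile

BLOCK = 512      # tar header size and data blocking unit
ODC_HEADER = 76  # fixed part of a cpio portable ASCII header (assumed layout)


def round_up(n: int, unit: int) -> int:
    return -(-n // unit) * unit


def tar_member_cost(size: int) -> int:
    """Bytes one regular file occupies in a tar archive."""
    return BLOCK + round_up(size, BLOCK)


def cpio_odc_member_cost(name: str, size: int) -> int:
    """Bytes one regular file occupies in a portable ASCII cpio archive."""
    return ODC_HEADER + len(name.encode()) + 1 + size


def measured_tar_cost(name: str, data: bytes) -> int:
    """Distance between this member's header and the next one, per tarfile."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for n, d in ((name, data), ("end", b"")):
            info = tarfile.TarInfo(n)
            info.size = len(d)
            tf.addfile(info, io.BytesIO(d))
    buf.seek(0)
    with tarfile.open(fileobj=buf) as tf:
        first, second = tf.getmembers()
    return second.offset - first.offset


print(tar_member_cost(1), measured_tar_cost("a", b"x"), cpio_odc_member_cost("a", 1))
```

For archives of many small files the fixed per-member overhead dominates, which is where the difference between the two formats is most visible.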
You might want to look at the freely available alternatives. The major ones are afio,
GNU tar, and pax, each of which has its own extensions with
some backwards compatibility.
Sparse files were tarred as sparse files (which you can easily test,
because the resulting archive gets smaller, and GNU cpio can no longer read
it).