# Tarnation
by Seth Kenlon

If you like open source software, chances are you are going to
encounter a ''.tar'' file in one form or another, because tar is a
free and open source archive tool. The tar utility has been around
since 1979, so it is well and truly ubiquitous in the POSIX world. Its
purpose is simple: it takes one or more files and "wraps" them into a
self-contained file, called a tape archive because when tar was first
invented, it was used to place data on storage tapes.

People new to the tar format usually equate it to .zip files, but a
tar archive is notably *not* compressed. The tar format itself only
creates a container for files, but they can be compressed with
separate utilities. Common compressions applied to a tar file are
''gzip'', ''bzip2'', and ''xz''. That's why you rarely see just a ''.tar''
file, and more commonly encounter ''.tar.gz'' or ''.tgz'' files.

## Installing tar

On Linux, BSD, Illumos, and even Mac OS, the ''tar'' command is
already installed for you.

On Windows, the easiest way to handle tar files is to install the LGPL
open source [7-zip](http://www.7-zip.org/) utility. Its name implies
it's a Zip utility, but it also works with tar archives, and even
provides commands for the ''cmd'' command-line interface.

If you really want an actual tar utility on Windows, GNU tar is installable through
[WSL](https://blogs.msdn.microsoft.com/commandline/2017/05/11/new-distros-coming-to-bashwsl-via-windows-store/)
on Windows 10, or through [Cygwin](https://www.cygwin.com/).


## Creating a tarball

A tar archive is often referred to as a "tarball", presumably because
we hackers love to shorten words to as few syllables as possible, and
"tarball" is shorter and easier than "tar archive".

In a GUI, creating a tarball is, at most, a three-step process. I'm using KDE, but the process is essentially the same on GNOME or Xfce:

1. Create a directory
2. Place your files into the directory
3. Right-click on the directory and select ''Compress''

![Creating a tarball](compress.jpg)

In a shell, it's basically the same process.

To gather a group of files into one archive, place your files in a directory and then invoke tar, providing a name for the archive that you want to create and the directory you want to archive:

   $ tar --create --verbose --file archive.tar myfiles

Power users abbreviate this:

   $ tar cvf archive.tar myfiles
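
If you want to try the whole process from scratch, here is a minimal sketch. It assumes a throwaway directory made with ''mktemp'', and the file names are made up for illustration:

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Steps 1 and 2: create a directory and place some files in it.
mkdir myfiles
printf 'one\n' > myfiles/one
printf 'two\n' > myfiles/two

# Step 3: wrap the directory into a single archive.
tar --create --file archive.tar myfiles

# Confirm what went in.
tar --list --file archive.tar
```

The listing at the end is a quick sanity check that the directory and both files made it into the archive.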

You don't have to put files into a directory before archiving them,
but it's considered poor etiquette not to, because nobody wants 50
files scattered out onto their desktop when they extract an
archive. An archive without a containing directory is sometimes called
a "tarbomb", although not always with a negative connotation. Tarbombs
are useful for patches and software installers; it's just a matter of
knowing when to use them and when to avoid them.


## Compressing archives

Creating a tar archive does not compress your files; it just makes
them easier to move around as one blob. For compression, you can have
tar call [gzip](https://www.gnu.org/software/gzip/) or
[bzip2](http://bzip.org):

   $ tar -cjf foo.tar.bz2 myfiles
   $ tar -czf foo.tar.gz myfiles

Common extensions are .tar.gz and .tgz for a gzipped tar file, and
.tbz and .tar.bz2 for a bzipped tar file.
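
To make the idea of tar "calling" a compressor more concrete, the sketch below builds the same gzipped archive twice: once with the ''-z'' option, and once by piping an uncompressed tar stream through gzip by hand. The file names and the sample content are assumptions made up for the demonstration:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir myfiles

# A file with repetitive content, so compression has something to work with.
i=0
while [ "$i" -lt 2000 ]; do
  echo 'the same line over and over'
  i=$((i + 1))
done > myfiles/sample.txt

# Plain archive: a container only, no compression.
tar -cf plain.tar myfiles

# Let tar call gzip for you...
tar -czf auto.tar.gz myfiles

# ...or do it by hand: write the archive to stdout (-f -) and pipe it to gzip.
tar -cf - myfiles | gzip > manual.tar.gz

# Compare sizes; the two compressed archives should be much smaller than plain.tar.
ls -l plain.tar auto.tar.gz manual.tar.gz
```

Either route gives you a gzipped tarball; the ''-z'' option is just less typing.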


## Extracting archives

If you've received a tarball from a friend or a software project, you can extract it in either your GUI desktop or in a shell. In a GUI, right-click the archive you want to extract and select ''Extract''.

![Extracting](extract.jpg)

The Dolphin file manager offers a feature to autodetect whether the
files extracted from an archive are contained in a directory, or
whether a new directory needs to be created for them. I use this
option so that when I extract files from a tarbomb, they remain tidy
and contained.

In a shell, the command to extract an archive is pretty intuitive:

   $ tar --extract --file archive.tar.gz

Power users shorten this to:

   $ tar xf archive.tar.gz

If your ''tar'' command is BSD tar, you can even use it to unzip ''.zip'' files (GNU tar does not read the Zip format):

   $ tar xf archive.zip


## Advanced tar

The tar utilities are very robust and flexible. Once you're
comfortable with the basics, it's useful to explore other features.

### Add a file or directory to an existing tarball

If you have an existing tarball and want to add a new file to it, you
don't have to unarchive everything first.

Most Linux and BSD desktops come with a graphical archive
utility. Using it, you can open a tar archive as if it were any other
directory, have a look inside, extract individual files, add files to
it, and even preview the text files and images it contains.

![The Ark archive utility](ark.jpg)

In the shell, you can add a file or directory to a tar archive as long as it is not compressed. If your archive has been compressed, you must uncompress it, but you do not need to unarchive it.

For instance, if an archive has been compressed with gzip:

   $ gunzip archive.tar.gz
   $ ls
   archive.tar

Now that you have an uncompressed tar archive, add a file and a directory to it:

   $ tar --append --file archive.tar foo.txt
   $ tar --append --file archive.tar bar/

The shorter version:

   $ tar rf archive.tar foo.txt
   $ tar rf archive.tar bar/
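
Put together, the round trip for a gzipped archive looks something like the sketch below, including the step of compressing the archive again afterward. The set-up lines and file names are placeholders so that the example can run on its own:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Set-up: a small gzip-compressed archive, plus a new file to add later.
mkdir myfiles
echo 'one' > myfiles/one
tar -czf archive.tar.gz myfiles
echo 'hello' > foo.txt

# 1. Remove the compression layer (archive.tar.gz becomes archive.tar).
gunzip archive.tar.gz

# 2. Append the new file to the now-uncompressed archive.
tar --append --file archive.tar foo.txt

# 3. Compress it again (archive.tar becomes archive.tar.gz).
gzip archive.tar

# The listing should now include foo.txt alongside the original files.
tar --list --file archive.tar.gz
```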


### View a list of files within a tarball

To see the files in an archive, compressed or uncompressed, use the ''--list'' option:

   $ tar --list --file archive.tar.gz
   myfiles/
   myfiles/one
   myfiles/two
   myfiles/three
   bar/
   bar/four
   foo.txt

Power users shorten this to:

   $ tar tf archive.tar.gz
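
Listings are also handy in scripts. As a rough sketch (the archive and its contents are invented for the example), you can add ''v'' for a more detailed listing, or pipe the plain listing to ''grep'' to check whether a path is in the archive before you extract anything:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Set-up: a small compressed archive to inspect.
mkdir myfiles
echo 1 > myfiles/one
echo hi > foo.txt
tar -czf archive.tar.gz myfiles foo.txt

# Adding v to the listing shows permissions, owner, size, and date for each entry.
tar -tvf archive.tar.gz

# Check for one specific path without extracting anything.
# grep -x only matches when the whole line equals the path.
if tar -tf archive.tar.gz | grep -x 'foo.txt' > /dev/null; then
  found=yes
else
  found=no
fi
echo "foo.txt in archive: $found"
```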


### Extract just one file or directory

Sometimes you don't need all the files in an archive; you just want to
extract one or two. After listing the contents of a tar archive, use
the usual tar extract command along with the path of the file you want
to extract:

   $ tar xvf archive.tar.gz bar/four
   bar/four

Now the file ''four'' is extracted to a new directory called
''bar''. If ''bar'' already exists, then ''four'' is placed inside of
the existing directory.
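
If you'd like to see this for yourself, the following sketch builds an archive shaped like the listing above and extracts a single file into an otherwise empty directory. The ''extracted'' directory is just a name I picked for the demonstration:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Set-up: build an archive shaped like the earlier listing.
mkdir myfiles bar
echo 1 > myfiles/one
echo 4 > bar/four
echo hi > foo.txt
tar -czf archive.tar.gz myfiles bar foo.txt

# Extract from inside an empty directory so it is obvious what tar created.
mkdir extracted
cd extracted
tar -xvf ../archive.tar.gz bar/four

# Only bar/ and bar/four should be here; myfiles/ and foo.txt stay in the archive.
find .
```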

Extracting multiple files or directories is basically the same:

   $ tar xvf archive.tar.gz myfiles/one bar/four
   myfiles/one
   bar/four

With GNU tar, you can even use wildcards by adding the ''--wildcards'' option:

   $ tar -xvf archive.tar.gz --wildcards '*.txt'
   foo.txt

### Extract a tarball to another directory

Previously, I mentioned that some tarballs were tarbombs that left
files scattered around your directory. If you list a tar archive and
see that its files are not contained in a directory, you can create a
destination directory for them:

   $ tar --list --file archive.tar.gz
   foo
   bar
   baz
   $ mkdir newfiles
   $ tar -xvf archive.tar.gz -C newfiles

This places all of the files in the archive neatly into the newfiles directory.
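
You can also let a script do the looking for you. This is only a sketch of one possible check, not a standard tool: it counts the distinct top-level names in the listing and extracts into a new directory if there is more than one. The variable name ''toplevel'' and the sample archive are my own inventions:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Set-up: build a small tarbomb (three files, no containing directory).
touch foo bar baz
tar -czf archive.tar.gz foo bar baz
rm foo bar baz

# Count the distinct top-level names in the archive.
# sed strips a leading "./" and drops empty lines; cut keeps the part before the first slash.
toplevel=$(tar -tf archive.tar.gz | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u | wc -l | tr -d ' ')

if [ "$toplevel" -gt 1 ]; then
  # More than one top-level entry: give the files a directory of their own.
  mkdir newfiles
  tar -xf archive.tar.gz -C newfiles
else
  # Everything already lives under one top-level name, so extract in place.
  tar -xf archive.tar.gz
fi

ls newfiles
```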

The destination directory option is useful for a lot more than just keeping extracted files tidy. It also helps when you want to distribute files that are intended to be copied into an existing directory structure. For example, if you're working on a web site and want to send the admin some new files, you can do it a few different ways. The obvious way is to email the site admin the files and some text explaining where each file is to be placed: "The attached index.php file goes into /var/www/example.com/store, and the vouchers.php file goes into /var/www/example.com/deals..."

The more efficient way would be to create a tar archive:

   $ tar cjvf updates-20170621.tar.bz2 var
   var/www/example.com/store/index.php
   var/www/example.com/deals/vouchers.php
   var/www/example.com/images/banner.jpg
   var/www/example.com/images/badge.jpg
   var/www/example.com/images/llama-eating-apple-pie.gif

Given this structure, the site admin could extract your incoming
archive directly to the server's root directory. The tar utility
reuses ''/var/www/example.com'' and any of the subdirectories
''store'', ''deals'', and ''images'' that already exist, creates any
that don't, and distributes the files into the proper directories.
It's bulk copying and pasting, done quickly and easily.
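
Here is a rough sketch of both sides of that exchange, safe to run because it uses a stand-in directory instead of the real root. The names ''staging'' and ''fakeroot'' are hypothetical, and I've used gzip instead of bzip2 only to keep the requirements minimal:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the server's root directory, with part of the tree already in place.
mkdir -p fakeroot/var/www/example.com/store
echo 'old page' > fakeroot/var/www/example.com/store/index.php

# Sender's side: stage the new files under a relative var/ tree and archive it.
mkdir -p staging/var/www/example.com/store staging/var/www/example.com/deals
echo 'new page' > staging/var/www/example.com/store/index.php
echo 'vouchers' > staging/var/www/example.com/deals/vouchers.php
tar -czf updates.tar.gz -C staging var

# Admin's side: extract relative to the (stand-in) root.
# On a real server this would be something like: tar -xzf updates.tar.gz -C /
tar -xzf updates.tar.gz -C fakeroot

# index.php is replaced, and deals/ is created because it did not exist yet.
find fakeroot -type f
```

Notice that the paths inside the archive are relative (''var/...'', not ''/var/...''), which is what lets the admin decide where the tree lands.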

## GNU tar and BSD tar

The tar format is just a format, and it's an open format, so it can be
created by more than just one tool.

There are two common tar utilities: the GNU tar utility, installed by
default on Linux systems, and the BSD tar utility, installed by
default on BSD, Mac OS, and some Linux systems. For general use,
either tar will do. Most examples in this article work the same on
either GNU or BSD tar; the exceptions are the ''--wildcards'' option,
which is specific to GNU tar, and reading ''.zip'' files, which only
BSD tar does. The two utilities have a few other minor differences, so
once you get comfortable with one, you should try the other.

You'll probably have to install the "other" tar (whatever that may be
on your system) manually. To avoid confusion between utilities, GNU
tar is often named ''gtar'' and BSD tar is named ''bsdtar'', with the
command ''tar'' being a symlink, or an "alias", to the one that came
preinstalled on your computer.
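
If you're not sure which tar you have, a quick check like the one below usually settles it. It relies on the version banner containing "GNU tar" or "bsdtar", which is what those two implementations print on the systems I've seen; treat the pattern matching as an assumption, not a guarantee:

```shell
# Grab the first line of the version banner for the default tar.
tar_banner=$(tar --version | sed -n '1p')
echo "$tar_banner"

# Guess the implementation from the banner text.
case "$tar_banner" in
  *GNU*)    echo "tar is GNU tar" ;;
  *bsdtar*) echo "tar is BSD tar" ;;
  *)        echo "tar is something else" ;;
esac

# Show which of the usual command names are installed, and where.
for name in tar gtar bsdtar; do
  command -v "$name" || echo "$name: not installed"
done
```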