The main advantages of file compression are reductions in storage space, data transmission time, and communication bandwidth. Because compressed files can require far less storage capacity than their uncompressed originals, compression can also lower storage costs significantly.
There are a number of different options for collecting files and compressing their size, and there are many compression formats that are optimized for various purposes. Some file types compress more than others, and some compression mechanisms do better at some file types than others. Testing the various mechanisms is often worth the effort.
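As a rough illustration of how much the input matters, the sketch below compresses two files of equal size with gzip: one of highly repetitive text and one of random bytes. The scratch directory and the file names (text.txt, random.bin) are made up for this demo, and exact sizes will vary from system to system.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Highly repetitive text: the same line repeated 20000 times.
yes "the quick brown fox jumps over the lazy dog" | head -n 20000 > text.txt

# Random bytes of the same size, which have almost no redundancy to remove.
text_size=$(( $(wc -c < text.txt) ))
head -c "$text_size" /dev/urandom > random.bin

# -c writes the compressed data to stdout, leaving the originals in place.
gzip -c text.txt > text.txt.gz
gzip -c random.bin > random.bin.gz

text_gz_size=$(( $(wc -c < text.txt.gz) ))
random_gz_size=$(( $(wc -c < random.bin.gz) ))

echo "text.txt:   $text_size -> $text_gz_size bytes"
echo "random.bin: $text_size -> $random_gz_size bytes"
```

The repetitive text should shrink to a small fraction of its size, while the random data should barely change. Already-compressed formats such as JPEG images or video tend to behave more like the random file.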
Note that there is a trade-off between file size savings and compression speed, especially when dealing with large file sets. You might save a few megabytes, but it might take all day to compress!
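One way to see this trade-off is to compress the same file at gzip's fastest level (-1) and its most thorough level (-9) and compare the results. This is only a sketch: sample.txt is a generated stand-in file, and on real data the size difference between levels may be larger or smaller.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a moderately compressible sample file (hypothetical data).
awk 'BEGIN { for (i = 1; i <= 200000; i++) print "record", i, "value", i * 7 }' > sample.txt

# -1 favors speed, -9 favors size; -c writes to stdout and keeps sample.txt.
gzip -1 -c sample.txt > fast.gz
gzip -9 -c sample.txt > best.gz

orig_size=$(( $(wc -c < sample.txt) ))
fast_size=$(( $(wc -c < fast.gz) ))
best_size=$(( $(wc -c < best.gz) ))

echo "original: $orig_size bytes"
echo "gzip -1:  $fast_size bytes"
echo "gzip -9:  $best_size bytes"
```

Prefixing each gzip call with the shell's time command shows how long each level takes, which makes it easier to judge whether the extra savings are worth the wait for a given file set.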
This article from Network World goes into detail about each of the 5 most common tools for compressing files.
Each Systems Administrator has his or her favorite, but here is a brief overview of two common compression methods.
gzip
If you have a large file that you would like to compress, you can use gzip:
gzip bigfile
This will replace bigfile with a compressed file named bigfile.gz.
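The following sketch walks through a full round trip, assuming a generated stand-in for bigfile in a scratch directory. It shows that gzip removes the original by default, that gzip -d restores it, and that writing to stdout with -c is one way to keep both copies.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file standing in for bigfile, plus a reference copy.
awk 'BEGIN { for (i = 1; i <= 50000; i++) print "line", i }' > bigfile
cp bigfile bigfile.orig

# Plain gzip replaces bigfile with bigfile.gz.
gzip bigfile
ls

# gzip -d (the same as gunzip) restores bigfile and removes bigfile.gz.
gzip -d bigfile.gz

# cmp exits 0 only when the two files are byte-for-byte identical.
cmp bigfile bigfile.orig && echo "round trip OK"

# To keep the original next to the compressed copy, write to stdout instead.
gzip -c bigfile > bigfile.gz
```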
You can read more about gzip at https://www.gnu.org/software/gzip/.
tar
The tar (Tape ARchiver) command gathers many small files into one archive file. A common usage is:
tar -cvf archive-file.tar mydirectory
where the options used are:
c, create an archive file
v, ‘verbose’; provide a lot of information about the process as it happens
f, use the file name that follows as the archive file to be created.
This will create one file named archive-file.tar that includes all the files in the directory named mydirectory.
To view the list of all the files in the archive file, use a command like:
tar -tvf archive-file.tar
where the t option means list the contents of this archive.
To extract the files from the archive file, use the -x (for extract) option:
tar -xvf archive-file.tar
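Putting the create, list, and extract steps together, a minimal sketch might look like the following. The directory contents and the restore directory are made up for illustration, and -v is left out to keep the output short. The -C option, supported by both GNU and BSD tar, extracts into a different directory so the restored copy can be compared with the original.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical directory with a couple of small files.
mkdir -p mydirectory/sub
echo "alpha" > mydirectory/a.txt
echo "beta" > mydirectory/sub/b.txt

# Create the archive, then list its contents.
tar -cf archive-file.tar mydirectory
tar -tf archive-file.tar

# Extract into a separate directory; -C changes directory before extracting.
mkdir restore
tar -xf archive-file.tar -C restore

# diff -r exits 0 when both directory trees have identical contents.
diff -r mydirectory restore/mydirectory && echo "extracted copy matches"
```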
You can also tar and compress using gzip in one step by adding the -z option to tar:
tar -cvzf archive-file.tgz mydirectory
Notice that the filename extension conventionally used for a tar’ed and compressed file is ‘tgz’ (or ‘tar.gz’) and not ‘tar’.
To view or extract an archive file that has also been compressed with gzip, use the same syntax as above, but include the -z option to tar:
tar -tvzf archive-file.tgz
tar -xvzf archive-file.tgz
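To see what the -z option buys, the sketch below archives the same directory with and without compression and compares the two file sizes. The log files are generated sample data, so the ratio shown here is not representative of any particular workload.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical directory holding two compressible text files.
mkdir mydirectory
awk 'BEGIN { for (i = 1; i <= 20000; i++) print "log entry", i }' > mydirectory/app.log
cp mydirectory/app.log mydirectory/app-copy.log

# Same directory, archived without and with gzip compression.
tar -cf archive-file.tar mydirectory
tar -czf archive-file.tgz mydirectory

tar_size=$(( $(wc -c < archive-file.tar) ))
tgz_size=$(( $(wc -c < archive-file.tgz) ))
echo "archive-file.tar: $tar_size bytes"
echo "archive-file.tgz: $tgz_size bytes"

# The compressed archive still lists the same members.
tar -tzf archive-file.tgz
```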
Once you have tarred and compressed the directory and confirmed that the archive-file.tgz file exists, you can then remove the directory.
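A cautious way to do this is to remove the directory only if tar can read the archive back without errors. The sketch below uses a made-up one-file directory. Note that listing the archive only confirms it is readable, not that every file matches the original, so treat this as a basic sanity check.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical directory to archive and then remove.
mkdir mydirectory
echo "data" > mydirectory/file.txt

tar -czf archive-file.tgz mydirectory

# tar -tzf reads through the whole archive and exits nonzero
# if the file is missing, truncated, or not valid gzip data.
if tar -tzf archive-file.tgz > /dev/null; then
    rm -r mydirectory
    echo "archive verified; mydirectory removed"
else
    echo "archive check failed; mydirectory kept" >&2
fi
```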
You can read more about the tar command at https://man7.org/linux/man-pages/man1/tar.1.html.