this post was submitted on 28 Nov 2025

Linux Questions

A .tar archive is basically only the files cat'ed together, with a header and index added, right?

And a .tar.gz takes forever to modify, because it needs to first extract the .tar.

So why is there no archive format that just cat'es the compressed files together?

top 8 comments
[–] moonpiedumplings@programming.dev 9 points 23 hours ago

This is what zip does. It compresses files individually, and then combines them into the archive. This comes with the advantage that you don't have to extract the whole archive to view and edit files, but it comes with a very big disadvantage, which is that there is no compression across files. Data that repeats across different files is not deduplicated.

Tar.gz does compress across files, which saves more space. That is to say, the reason we don't just tar gzipped files together is that people decided that compression savings matter more than being able to view or edit files without extracting the whole archive.
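To make the trade-off concrete, here is a minimal sketch (not a benchmark) that packs the same files once as a zip and once as a tar.gz, using Python's standard `zipfile` and `tarfile` modules. The helper name `build_archives` is made up for this example.

```python
import io
import tarfile
import zipfile


def build_archives(files):
    """Pack a dict of name -> bytes as (zip_bytes, targz_bytes).

    Hypothetical helper for illustration only.
    """
    # zip: every member is deflated on its own, so data shared
    # between members is stored once per member.
    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)

    # tar.gz: members are concatenated first, then gzip sees one
    # continuous stream and can reuse data from earlier members
    # (as long as it is still inside gzip's 32 KiB window).
    tgz_buf = io.BytesIO()
    with tarfile.open(fileobj=tgz_buf, mode="w:gz") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))

    return zip_buf.getvalue(), tgz_buf.getvalue()
```

If you feed it many small files with identical content that is hard to compress on its own, the tar.gz should come out much smaller than the zip. With large files the gap shrinks, because gzip only looks back 32 KiB.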

7z is the best of both worlds, as it compresses across files, but also lets you view and edit files without extracting the whole archive. But it's important to remember that tar.gz is ubiquitous for its compatibility, rather than its performance or features. Even the smallest, most stripped-down utilities and the oldest, most out-of-date systems always have gz and tar, whereas even on modern desktop distros 7z may need to be explicitly installed.

[–] roflo1@piefed.social 22 points 1 day ago

I’ll try to explain it in another way. First, let’s talk about “semantics”:

Usually we assume that a tarball contains multiple files, and a gz is a single file compressed.

So a .tar.gz file is a single tarball that has been compressed.

A .gz.tar is understood to be a tarball containing a single gzipped file. But if that’s indeed what the file is, it doesn’t make much sense to tar it in the first place.

Moving on to what you really want to accomplish: you can certainly create many gz files and tar them, but we wouldn’t call it a .gz.tar file, since tar doesn’t care about the format of the included files. Much like a tarball of compressed PDFs isn’t named .pdf.tar.
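A rough sketch of that "gzip each file, then tar the results" idea, using Python's standard `gzip` and `tarfile` modules. The function names `tar_of_gzips` and `read_member` are invented for this example.

```python
import gzip
import io
import tarfile


def tar_of_gzips(files):
    """Gzip each entry of a name -> bytes dict, then tar the .gz blobs.

    Hypothetical helper; the outer tar itself is NOT compressed.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in files.items():
            packed = gzip.compress(data)
            info = tarfile.TarInfo(name=name + ".gz")
            info.size = len(packed)
            tf.addfile(info, io.BytesIO(packed))
    return buf.getvalue()


def read_member(tar_bytes, name):
    """Decompress a single member; the other members stay compressed."""
    # "r:" means: read a plain tar, do not try any decompression.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tf:
        return gzip.decompress(tf.extractfile(name + ".gz").read())
```

Reading one member still means walking the tar headers from the start, but only the member you ask for gets decompressed.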

Also, I’d like to point out that neither tarballs nor gzipped files are optimized for modifying.

[–] bjoern_tantau@swg-empire.de 11 points 1 day ago

TAR is short for Tape ARchive. It was originally meant to be written to magnetic tapes. Like cassette tapes. As such they are not really meant to be changed much. You read the whole thing from beginning to end until you find what you're looking for.

Modifying it would usually mean writing the whole thing again. So the idea of editing an archive in place is more of a modern notion.

[–] charonn0@startrek.website 8 points 1 day ago

A .tar archive is basically only the files cat'ed together, with a header and index added, right?

Tar does not include an index. It's just the headers and data cat'ed together. You have to read from the beginning of the archive until you find the file you want. This is exacerbated if the archive is also gzipped, since you have to decompress all the files leading up to the one you want, as opposed to skipping over them as you could do in an uncompressed tar archive.
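To illustrate the "read from the beginning" part, here is a simplified sketch that walks the 512-byte headers of an uncompressed tar by hand. It assumes plain ustar headers and ignores pax extended headers, GNU long names, and the ustar prefix field, so treat it as a toy, not a parser. `find_member_offset` and `make_sample_tar` are names I made up.

```python
import io
import tarfile

BLOCK = 512  # tar stores headers and data in 512-byte blocks


def make_sample_tar(files):
    """Build a small ustar archive in memory (helper for the demo)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_member_offset(tar_bytes, wanted):
    """Return the byte offset of `wanted`'s data, or None if absent."""
    pos = 0
    while pos + BLOCK <= len(tar_bytes):
        header = tar_bytes[pos:pos + BLOCK]
        if header == b"\0" * BLOCK:  # an all-zero block marks the end
            return None
        # Name: bytes 0..99, NUL-terminated.
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        # Size: bytes 124..135, octal digits as ASCII text.
        size_field = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_field or "0", 8)
        if name == wanted:
            return pos + BLOCK
        # No index to consult: hop over this member's data (rounded up
        # to whole blocks) to reach the next header.
        pos += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return None
```

On an uncompressed tar the hop is a cheap seek. Inside a .tar.gz there is nothing to seek over, so everything before the wanted member has to be decompressed first.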

So why is there no archive format that just cat'es the compressed files together?

That's essentially what a zip archive does. Each file is compressed separately and cat'ed together with uncompressed headers in between. Also zip archives do have an index which is what allows for random access and easy changes. The downside is that the compression ratio of a zip archive can be worse than a tar.gz archive.
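As a small illustration of that index, this sketch uses Python's standard `zipfile` module to list where each member starts and to read a single member. The helper names `list_index` and `read_one` are just for this example.

```python
import io
import zipfile


def list_index(zip_bytes):
    """Return (name, local header offset, compressed size) per member.

    This information comes from the central directory at the end of
    the archive; no member data is decompressed to produce it.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [
            (info.filename, info.header_offset, info.compress_size)
            for info in zf.infolist()
        ]


def read_one(zip_bytes, name):
    """Decompress only the requested member."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(name)
```

Because every member has a known offset and its own compressed stream, a reader can jump straight to it, which is the random access tar.gz lacks.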

[–] aarch0x40@piefed.social 4 points 1 day ago* (last edited 1 day ago)

Archiving (catting files together) and compression are two different actions. This is true even in formats like zip or rar where the functions have always been a part of the same utility. They are separated in Unix because the Tape ARchiver wasn't initially intended to produce a file on disk. The tar utility did eventually add compression but that's not always desirable.

An archive of compressed files is just another archive regardless of the format. "Compressed files" is also a somewhat vague term. Most video and music formats are already compressed, so compressing them again doesn't really add much value and can sometimes even produce larger sizes.
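A quick way to see this for yourself, sketched with Python's standard `zlib` module. Pseudo-random bytes are used here as a stand-in for already-compressed media, which is an assumption of this example, and `compressed_size` and `pseudo_random_bytes` are made-up helper names.

```python
import random
import zlib


def compressed_size(data):
    """Size in bytes of `data` after zlib compression at level 9."""
    return len(zlib.compress(data, 9))


def pseudo_random_bytes(n, seed=0):
    """Deterministic noise, standing in for already-compressed data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


# Repetitive text shrinks a lot; noise does not shrink and picks up
# a few bytes of container overhead instead.
text = b"the same log line again and again\n" * 500
noise = pseudo_random_bytes(10_000)
print(len(text), compressed_size(text))
print(len(noise), compressed_size(noise))
```

The second pair of numbers is the interesting one: the "compressed" size should be about the same as the input, or slightly larger.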

[–] Tuuktuuk@piefed.ee 3 points 1 day ago (1 children)

Hey, you can add to a tarball without extracting it! Maybe that's what you're actually asking?
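For example, Python's standard `tarfile` module has an append mode (`"a"`) that does this, roughly like `tar -rf` on the command line. It only works on an uncompressed tar; the tarfile documentation notes that appending to a gzip, bz2 or xz compressed archive is not possible. A minimal sketch, with `append_bytes` being a name I made up:

```python
import io
import tarfile


def append_bytes(tar_path, name, data):
    """Append one member to an existing, uncompressed tar file.

    Existing members are left where they are; the new member is
    written over the old end-of-archive marker.
    """
    with tarfile.open(tar_path, mode="a") as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
```

Note that this only adds. Removing or replacing a member in the middle still means rewriting everything after it.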

Anyway, in case it wasn't, here's a bit more answer:

A .tar.gz is a file that used to be .tar and was then compressed by gzip.

A .gz.tar is a file that used to be .gz and was then made into a tarball.

I cannot really imagine why someone would want to turn a .gz file into a tarball. Maybe in order to save it on an actual tape drive? But then it wouldn't be a tarball. It would be just ones and zeroes physically saved on a tape.

[–] agitated_judge@sh.itjust.works 2 points 1 day ago (1 children)

You can save .tar.gz files to actual tapes. As you said, it's just ones and zeros.

[–] Tuuktuuk@piefed.ee 1 points 17 hours ago

Of course, and that is done very often. But why would you want to save a .gz.tar file on a tape?