I know that using tar with the -J option makes it possible to compress a folder with a high level of compression, resulting in a tar.xz file.
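For example, something like this is what I currently run (backups/ is just a placeholder for my folder of backups):

    tar -cJf backups.tar.xz backups/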

I have a folder holding multiple backups of my working space, each of which contains a lot of libraries (.so, .a, etc.) that are usually, but not always, the same files in each backup (duplicated files).

Is there a method which can compress my folder of backups while taking into account the fact that there are a lot of duplicate files in there, and therefore achieve a higher level of compression? Does passing the -J option to the tar command do the job?

I don't want to deal with the duplicate files inside each folder manually every time. Is there a smart tool which treats all the duplicate files as one file and then compresses that? If not, what is the best tool and option for compressing such a folder?

hmojtaba

1 Answer

You probably want to exclude all the back-ups entirely.

Otherwise, anything that produces a solid archive should handle duplicate files quite efficiently: tar+*, cpio+*, 7-zip (with the "solid" option), RAR (with the "solid" option), and a bunch of others, but not ZIP, which compresses each file separately.
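For instance, assuming your backups live in a directory called backups/ (adjust names and paths to taste), either of these produces a solid archive:

    # tar is inherently solid: all files go into one stream, so xz sees the duplicates
    tar -cJf backups.tar.xz backups/

    # 7-zip with solid mode enabled (-ms=on); the 7z binary comes from the p7zip-full package
    7z a -ms=on backups.7z backups/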

You can easily test that by comparing the size of an archive with exactly one random file to an archive with two copies of that same file.
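A rough sketch of such a test (file and archive names are arbitrary):

    # make a 1 MiB file of random (incompressible) data and an exact copy of it
    dd if=/dev/urandom of=file1 bs=1M count=1
    cp file1 file2

    # archive one copy, then both copies
    tar -cJf one.tar.xz file1
    tar -cJf two.tar.xz file1 file2

    # the second archive should be only marginally larger than the first
    ls -l one.tar.xz two.tar.xz

Bear in mind that this kind of deduplication only works within the compressor's window: xz's default -6 preset uses an 8 MiB dictionary (and -9 uses 64 MiB), so duplicates that sit farther apart in the archive than that won't be recognized.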

David Foerster
Thanks, you are right. I made a custom folder with some duplicated files inside, and tar with -J handles the duplicates in the best way; it's odd that I didn't see this before. Thank you, I got my answer. – hmojtaba Sep 09 '16 at 07:03