I have 200,000 XML files in a folder on a RHEL7 Linux server, and I need to zip all of them. I am using the tar command, but I get the error "Argument list too long". Is there any other way to do this? Of these 200,000 files, I need to zip only 10,000.

tar -cvf xml.tar *.xml

Each XML file needs to be archived separately, as an individual archive whose name contains the original file name.

Original files:

1.xml
2.xml
...
n.xml

Result of archiving that the OP wants:

1.xml.tgz
2.xml.tgz
...
n.xml.tgz
Alex
  • While I edited in the command verbatim from a now-deleted comment by OP, apparently that's not the case. OP's comments from a now-deleted `tar` answer: "*its working but all the files ziped and genarated into single folder i need zip each and every file : Example : filename .xml its convert in to filename1 .xml .zip*" and "*sorry i need to ".zip"*". – Bob Aug 10 '18 at 15:44
  • 2
    kesavanagaprasadthonta, please clarify: do you mean you *ran* the `tar` command, but you don't actually want a `tar` file? And do you want individual `zip`s or `gzip`s? They are different formats, and require different commands. – Bob Aug 10 '18 at 15:46
  • @Bob The shell simply chokes on the huge list of files on the command line when the OP uses `*.xml` – Alex Aug 10 '18 at 16:33
  • 1
    IMHO, by making one big TAR you are going to reproduce the problem somewhere else. Directories with over 10K files are always a problem. It could be much more practical to make TARs of a few thousand files each. – xenoid Aug 10 '18 at 16:55
  • @Bob Maybe you shouldn't have deleted your answer yet. Now we have a new answer about `tar` that is IMHO not as good as yours. – Kamil Maciorowski Aug 10 '18 at 16:58
  • [Is there any limit on number of file name we can pass in tar](https://superuser.com/q/935102/241386), [zip: Argument list too long (80.000 files in overall)](https://superuser.com/q/272696/241386) – phuclv Aug 11 '18 at 02:04
  • Possible duplicate of [zip: Argument list too long (80.000 files in overall)](https://superuser.com/questions/272696/zip-argument-list-too-long-80-000-files-in-overall) – phuclv Aug 11 '18 at 02:04
  • @phuclv The OP wants to archive each file individually, so he has two problems here: "Argument list too long" and compressing files individually. So it doesn't look like a duplicate – Alex Aug 11 '18 at 09:48
  • @Alex The few deleted comments from the OP were too obscure and hard to understand. But if it's about separate archive files, then there are already a lot of other duplicates: [How to work around shell limitation of 'Argument list too long'?](https://superuser.com/q/240183/241386), [Argument list too long](https://superuser.com/q/282533/241386), [argument list too long for rm -rf *, 4000 files](https://superuser.com/q/391811/241386)... Just replace the corresponding command with zip/tar/whatever – phuclv Aug 11 '18 at 10:27
  • @phuclv I think the OP has a serious language barrier; that's why I am patient with him – Alex Aug 11 '18 at 10:35
  • Note: archiving separate files into `.tgz` makes questionable sense. It's a two-step process: `tar` + `gzip`. Tarring a single file makes no sense; to compress one file at a time `gzip` is enough. Use `gzip`, unless you explicitly need `.tgz` format (e.g. to process these files in the exact same manner as other `.tgz` files where the `tar` part makes sense). – Kamil Maciorowski Aug 11 '18 at 10:43

1 Answer

You exceeded the maximum command-line length. The argument list passed to a new process has a finite size, which you can check with the getconf ARG_MAX command. When you run a command that includes a glob pattern such as * in a directory that contains a huge number of files, the shell expands the pattern into one argument per matching file, the resulting list exceeds that limit, and you receive the error message "Argument list too long". So it isn't a tar problem. Keep this in mind when you use other commands with glob patterns that match a huge number of files.
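To get a feel for how close a directory is to that limit, the value of getconf ARG_MAX can be compared with the total length of the matching names. The following is only a rough sketch: the scratch directory and its three sample files are made up for illustration, and the estimate ignores the environment, which also counts against the limit.

```shell
#!/bin/sh
# Scratch directory with a few sample files (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/1.xml" "$workdir/2.xml" "$workdir/3.xml"
cd "$workdir" || exit 1

# Limit, in bytes, on the arguments plus environment of a new process.
limit=$(getconf ARG_MAX)

# Approximate size of the expanded '*.xml'. find prints one name per line,
# each with a leading "./", so this slightly overestimates the real size.
# find reads the directory itself and is not affected by the limit.
needed=$(find . -maxdepth 1 -name '*.xml' -print | wc -c | tr -d ' ')

echo "ARG_MAX: $limit bytes, '*.xml' needs roughly $needed bytes"
if [ "$needed" -ge "$limit" ]; then
    echo "'*.xml' will not fit on one command line"
fi
```

Run in the real directory (without the scratch setup), the same two commands give a quick indication of whether a plain *.xml can work at all.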

To resolve your issue, you can use the find program, which "walks" through the directory and feeds the file names to tar.

To archive all files as a single compressed tar archive you can use:

find . -name "*.xml" -print | tar -czvf xml.tgz -T -
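The line-based pipe would split any file name that contains a newline. A hedged variant, assuming GNU find and GNU tar (the usual tools on RHEL), passes NUL-terminated names instead. The scratch directory and the sample names are illustrative only.

```shell
#!/bin/sh
# Scratch directory with sample files (hypothetical names).
workdir=$(mktemp -d)
mkdir "$workdir/xml"
touch "$workdir/xml/1.xml" "$workdir/xml/2.xml" "$workdir/xml/with space.xml"
cd "$workdir/xml" || exit 1

# -print0 ends every name with a NUL byte; --null tells tar that the list
# read from standard input (-T -) uses the same separator.
find . -name '*.xml' -print0 | tar -czf ../xml.tgz --null -T -

# List the archive to confirm that every file was stored.
count=$(tar -tzf ../xml.tgz | wc -l | tr -d ' ')
echo "archived $count files"
```

Writing the archive outside the directory being scanned also keeps it from ever being picked up by a broader find pattern.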

To archive all files individually as compressed tar archives (I am not really sure why a single file needs to be tarred, but as you wish), use:

find . -name "*.xml" -exec tar -czvf '{}'.tgz '{}' \;
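POSIX only requires find to substitute {} when it stands alone as an argument, so '{}'.tgz depends on the behaviour of GNU find. A more portable sketch hands batches of names to a small inline shell loop via -exec ... {} +, which also starts far fewer processes than one per file. The sample names are hypothetical.

```shell
#!/bin/sh
# Scratch directory with sample files (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/1.xml" "$workdir/2.xml" "$workdir/3.xml"
cd "$workdir" || exit 1

# find appends as many names as fit to each sh invocation ({} +), so the
# argument limit is never exceeded; the loop archives one file at a time.
find . -name '*.xml' -exec sh -c '
    for f in "$@"; do
        tar -czf "$f.tgz" "$f"
    done
' sh {} +

# Each original should now sit next to its own .tgz archive.
ls
```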

To archive all files individually as gzip archives, use:

find . -name "*.xml" -exec gzip '{}' \;

Be warned: the command above removes the original files (!!!), because gzip replaces each file with its compressed .gz version.
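If the originals must be kept, one option is to write the compressed stream to a new file with gzip -c. Newer gzip releases also offer -k (--keep), but older ones, possibly including the gzip shipped with RHEL7, do not, so this sketch avoids it. The file names and contents are made up for the example.

```shell
#!/bin/sh
# Scratch directory with sample files (hypothetical names and contents).
workdir=$(mktemp -d)
printf '<a/>\n' > "$workdir/1.xml"
printf '<b/>\n' > "$workdir/2.xml"
cd "$workdir" || exit 1

# gzip -c writes the compressed data to standard output and leaves the
# input file untouched; the redirect stores it next to the original.
find . -name '*.xml' -exec sh -c '
    for f in "$@"; do
        gzip -c "$f" > "$f.gz"
    done
' sh {} +

# Both the originals and their compressed copies should now be present.
ls
```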

To archive all files individually as zip archives, use:

find . -name "*.xml" -exec zip '{}'.zip '{}' \;

P.S. I also added the (missing?) -z option to compress the tar archive.

Alex
  • `find . -name ' *.xml' | tar -T -` would work just as well? – xenoid Aug 10 '18 at 16:53
  • @xenoid It doesn't look like it would work – Alex Aug 10 '18 at 16:55
  • @Alex Very nice, I didn't know tar can also use a list from pipe. I have been using things like `find . -name "*.xml" -print | zip xml.zip -@` for zip archive – Bernard Wei Aug 10 '18 at 18:47
  • @BernardWei I'm glad you found it useful. You might want to post your solution here too, since it's not clear whether the OP wants a `zip` or a `tar` archive. – Alex Aug 10 '18 at 18:53
  • @Bernadwe need to Zip each and every xml – kesavanagaprasad thonta Aug 11 '18 at 04:08
  • @BernardWei Thank you, but all the files ended up in a single zip file; I need each file as a separate archive. Example: demo.xml.gz – kesavanagaprasad thonta Aug 11 '18 at 07:22
  • @kesavanagaprasadthonta I updated my post, I think you can find there all solutions you need – Alex Aug 11 '18 at 09:34
  • 1
    @kesavanagaprasadthonta make up your mind! It seems from the above comment that you want `*.xml` -> `*.xml.zip`, and now you say `*.xml.gz`. And what is it about `*.xml.tgz`? Moreover, `zip` and `gzip` (besides bzip2) are very different. There's no option to `zip` directly from `tar`. If you want separate files, why don't you say it clearly? Read [ask] or you'll just get more downvotes – phuclv Aug 11 '18 at 10:18
  • @kesavanagaprasadthonta Any reason why you are creating multiple tar archives of a single file each? – Bernard Wei Aug 13 '18 at 17:25