5

I want to remove all special characters from the file names of a batch of downloaded .pdf files, and came across exactly the solution I was looking for, albeit in an OS X environment:

function to automatically remove special characters from file names during saving in MacOS X.

Could a similar method--either using sed or some other function--be implemented in a Linux environment?

nitrl

4 Answers

12

You can do this with the rename command. If you are in the folder with the .pdf files with special characters:

rename 's/[^a-zA-Z0-9]//g' *.pdf

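For every file ending in .pdf, this removes each character in the name that is not a letter (upper or lower case) or a digit. Note that this includes the dot before the extension, so add . to the bracketed list if you want to keep it. You can allow other characters in the same way: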

rename 's/[^a-zA-Z0-9_]//g' *.pdf

This version allows underscores.

Paul
  • 1
    Follow-up question: would you know how to do this when the filename contains letters that aren't from the English alphabet? When I try your commands I get "Can't rename `filename`: Invalid or incomplete multibyte or wide character" for each file. – user2044638 Jul 12 '13 at 09:56
  • @user2044638 Are you able to rename these files with a `mv`? – Paul Jul 12 '13 at 13:28
  • Yes. Worked for every file I tried it on without problems. – user2044638 Jul 12 '13 at 14:15
  • Hm, doesn't seem to be working for me. I installed the requisite util-linux package, and while it appears to run, there is no change in file names. Is there documentation for "rename" somewhere? – nitrl Jul 12 '13 at 14:30
  • 1
    @nitrl There are two separate "rename" utilities, one supports regex, the other doesn't. Check your man page to see which one you have. – Paul Jul 13 '13 at 00:24
  • @Paul I am using the 2.21 release of util-linux, which oddly enough doesn't seem to include a "rename" utility (ftp://ftp.kernel.org/pub/linux/utils/util-linux/v2.21/v2.21-ReleaseNotes). The 2.23 release appears to include one (ftp://ftp.kernel.org/pub/linux/utils/util-linux/v2.23/v2.23-ReleaseNotes). Perhaps an upgrade is in order? – nitrl Jul 15 '13 at 13:24
  • @wjandrea Feel free to edit answers to enhance them. – Paul Jan 14 '17 at 12:31
  • On OS X, it removes the . in front of the suffix, which is kind of obvious... – bot47 Aug 22 '19 at 09:44
5

To handle the whole file name and also multiple files:

  • Add /g to handle the whole file name.
  • Add _ as the replacement to substitute an underscore for each removed character (if required).
  • Add any additional 'types' of files or individual file names at the end, space separated.

rename 's/[^a-zA-Z0-9_.]/_/g' *mp4 *avi
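Before renaming anything, it can help to preview what the substitution would produce. The following is a rough sketch in POSIX shell that applies the same expression to strings rather than files; `underscore_name` and `preview_renames` are hypothetical names chosen for this example, and file names containing newlines are not handled.

```shell
# Hypothetical helper: same substitution as the rename expression above,
# applied to a string instead of a file on disk.
underscore_name() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^a-zA-Z0-9_.]/_/g'
}

# Print "old -> new" for each argument without touching any file.
preview_renames() {
    for f in "$@"; do
        new=$(underscore_name "$f")
        [ "$f" = "$new" ] || printf '%s -> %s\n' "$f" "$new"
    done
}

preview_renames 'holiday video (1).mp4' 'clean_name.avi'
# prints: holiday video (1).mp4 -> holiday_video__1_.mp4
```

Names that are already clean are skipped, so only the files that would actually change are listed. Running it as `preview_renames *mp4 *avi` in the target folder gives a quick sanity check.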

SJG
0

There is a very handy tool called detox that will do exactly this transformation/renaming for you.

You can pass it a directory name (optionally recursing into it) or a pattern matching specific files:

detox ./

or

detox *.pdf

It is available as a package in most Linux distributions.

-2

for file in *; do new=$(printf '%s' "$file" | sed -e 's/[^A-Za-z0-9.-]//g'); [ "$file" = "$new" ] || mv -- "$file" "$new"; done
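Spelled out step by step, the loop above might look like the sketch below. The function name `strip_special_in_dir` is invented for this example, and the collision check is an addition that the one-liner does not have: the function skips a file rather than overwrite an existing one.

```shell
# Hypothetical function: rename every entry in the given directory, keeping
# only ASCII letters, digits, dots and hyphens (the set used in the loop above).
strip_special_in_dir() {
    for path in "$1"/*; do
        [ -e "$path" ] || continue                  # directory was empty
        old=${path##*/}                             # file name without the directory
        new=$(printf '%s' "$old" | LC_ALL=C sed 's/[^A-Za-z0-9.-]//g')
        [ -n "$new" ] || continue                   # nothing left to use as a name
        [ "$old" = "$new" ] && continue             # already clean
        if [ -e "$1/$new" ]; then
            printf 'skipping %s: %s already exists\n' "$old" "$new" >&2
            continue
        fi
        mv -- "$path" "$1/$new"
    done
}
```

Calling `strip_special_in_dir .` processes the current folder. Quoting every expansion is what lets names with spaces pass through `mv` intact.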

  • 2
    While this may answer the question, it would be a better answer if you could provide some explanation **why** it does so. – DavidPostill Jul 04 '17 at 23:09
  • Actually the same as the accepted `rename` method but harder to enter without mistakes – xenoid Jul 04 '17 at 23:22