
We regularly download files whose names contain repeated parts. This is a problem because the drive we have to save them to has a 256-character path limit, and the files are saved many subfolders deep. At the moment I manually remove the repeated parts of each file name, as shown in the attached image (deleted parts are highlighted in red).

Is there a batch file, or some other quicker way, to find the duplicated parts of a file name and remove them? Thanks, Russ.

Appleoddity
  • So if you are aware of File Explorer's 256 character limit then you should change the download location, so the limit isn't a problem. – Ramhound Oct 30 '17 at 13:24
  • I'd say that it would be fairly complicated for a script (or program) to do that. Imagine `this_is_a_long_name_which_is_an_example.ext` - if you "deduplicate" the name, it would most likely become `this_is_a_long_name_whichn_example.ext`, as `_is_a` is mentioned twice. So if you don't have an **exact schema** that all new file names share (e.g. `file_name_-_file-name - file name.ext`), it is nearly impossible to do that in an automated way. – flolilo Oct 30 '17 at 13:31
  • This isn't actually that hard. I would use PowerShell. It appears that each duplicate is separated by a `_` character. So PowerShell could easily split the filename at `_` and then remove any duplicates in the resulting array. Finally, it could use the cleaned-up array to build a new filename without duplicates. – Appleoddity Oct 30 '17 at 13:44
  • (2/2) if you have such a schema, using PowerShell, you could try to [`.Split()`](https://ss64.com/ps/split.html) the `BaseName`, [`.Replace()`](https://ss64.com/ps/replace.html) the word-separator (space, dot, underscore, hyphen,...) so they are all the same, then [`Sort-Object -Unique`](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/sort-object?view=powershell-5.1) to compare them, then `.Join()` the substrings back again and use them as new `BaseName` in `Rename-Item`. – flolilo Oct 30 '17 at 13:50
  • Hi Ramhound, I'm unable to do that; our company job drive is out of my control. Flolilolilo, thank you, I didn't think it'd be easy. Lastly, Appleoddity, I'm a pretty basic user and have no idea how to do that. Do you have time to tell me? Thanks all :) – Russell_s_smith Oct 30 '17 at 13:54
  • You can overcome the 256 character limit quite easily. At any point in Explorer, click on an empty space in the address bar so you can type in it. Then type the following command, replacing the address itself: `subst j: .` A J: drive is now added in Explorer which links to the current path. From there you can do anything you want. – LPChip Oct 30 '17 at 14:05
  • See also: https://superuser.com/questions/755298/how-to-delete-a-file-with-a-path-too-long-to-be-deleted/755301#755301 – LPChip Oct 30 '17 at 14:06
  • Please note that https://superuser.com is not a free script/code writing service. If you tell us what you have tried so far (include the scripts/code you are already using) and where you are stuck then we can try to help with specific problems. You should also read [How do I ask a good question?](https://superuser.com/help/how-to-ask). – DavidPostill Oct 30 '17 at 20:21
  • This could have been done in a batch file as well. It can easily emulate that same PowerShell code. – Squashman Nov 01 '17 at 19:10

1 Answer

Disclaimer: This PowerShell code has not been tested enough to guarantee that it works in all environments or with every kind of unusual file name or format. It does, however, work on the examples you provided. Use it at your own risk, or run Rename-Item with the -WhatIf switch first, so that it only shows what it would do without actually altering any file names.
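
As a rough sketch of that dry run (assuming the `_`-separated naming used in the examples in this answer; `Show-RenamePreview` is a hypothetical helper name, not a built-in cmdlet), the following prints each planned rename and changes nothing on disk:

```powershell
# Hypothetical helper: previews the renames for every file in $Folder.
# Because of -WhatIf, Rename-Item only reports what it would do.
function Show-RenamePreview {
    param([string]$Folder = ".")
    Get-ChildItem -LiteralPath $Folder -File | ForEach-Object {
        # Split at "_", keep the first occurrence of each part (case sensitive), rejoin.
        $newName = (($_.BaseName.Split("_") | Select-Object -Unique) -join "_") + $_.Extension
        Rename-Item -LiteralPath $_.FullName -NewName $newName -WhatIf
    }
}

# Preview the current folder.
Show-RenamePreview -Folder .
```

Once the preview looks right, removing `-WhatIf` performs the renames for real.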


Sample folder:

CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers_CDS 202 - Glazing Packers.docx
CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers_CDS 202 - Glazing Packers.pdf
CDS 202 - GLAZING PACKERS_PX-INA-PD-RP-X-XX-XX-0026.pdf

Here are examples of how to accomplish the task:

# Remove all duplicates in filenames in current folder: (Case Sensitive)
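Get-ChildItem -Path .\* -File | ForEach-Object {
    # Split the base name at "_", keep the first occurrence of each part, then rejoin.
    $newName = (($_.BaseName.Split("_") | Select-Object -Unique) -join "_") + $_.Extension
    # -LiteralPath keeps characters such as [ ] in file names from being read as wildcards.
    Rename-Item -LiteralPath $_.FullName -NewName $newName
}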

# Results:
# CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers.docx
# CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers.pdf
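# CDS 202 - GLAZING PACKERS_PX-INA-PD-RP-X-XX-XX-0026.pdf

# Remove all duplicates in filenames in current folder: (Case Insensitive - Drawback: filenames are converted to upper case)
Get-ChildItem -Path .\* -File | ForEach-Object {
    # Upper-casing every part first makes the comparison in Select-Object -Unique ignore case.
    $newName = (($_.BaseName.Split("_").ToUpper() | Select-Object -Unique) -join "_") + $_.Extension
    # -LiteralPath keeps characters such as [ ] in file names from being read as wildcards.
    Rename-Item -LiteralPath $_.FullName -NewName $newName
}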

# Results:
# CDS 202 - GLAZING PACKERS.docx
# CDS 202 - GLAZING PACKERS.pdf
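# CDS 202 - GLAZING PACKERS_PX-INA-PD-RP-X-XX-XX-0026.pdf

# Remove all duplicates in filenames in current folder and all subfolders: (Case Sensitive)
Get-ChildItem -Path .\* -File -Recurse | ForEach-Object {
    # Split the base name at "_", keep the first occurrence of each part, then rejoin.
    $newName = (($_.BaseName.Split("_") | Select-Object -Unique) -join "_") + $_.Extension
    # -LiteralPath keeps characters such as [ ] in file names from being read as wildcards.
    Rename-Item -LiteralPath $_.FullName -NewName $newName
}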

# Results:
# CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers.docx
# CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers.pdf
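# CDS 202 - GLAZING PACKERS_PX-INA-PD-RP-X-XX-XX-0026.pdf

# Remove all duplicates in filenames in current folder and all subfolders: (Case Insensitive - Drawback: all filenames are converted to upper case)
Get-ChildItem -Path .\* -File -Recurse | ForEach-Object {
    # Upper-casing every part first makes the comparison in Select-Object -Unique ignore case.
    $newName = (($_.BaseName.Split("_").ToUpper() | Select-Object -Unique) -join "_") + $_.Extension
    # -LiteralPath keeps characters such as [ ] in file names from being read as wildcards.
    Rename-Item -LiteralPath $_.FullName -NewName $newName
}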

# Results:
# CDS 202 - GLAZING PACKERS.docx
# CDS 202 - GLAZING PACKERS.pdf
# CDS 202 - GLAZING PACKERS_PX-INA-PD-RP-X-XX-XX-0026.pdf
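
The upper-case drawback of the case-insensitive variants could probably be avoided by comparing the parts without regard to case while keeping the first spelling of each part. The following is only a minimal sketch, assuming `_` is the only separator; `Get-DedupedName` is a hypothetical helper name, not part of PowerShell:

```powershell
# Hypothetical helper: drops repeated "_"-separated parts, ignoring case,
# and keeps the first spelling of each part in its original position.
function Get-DedupedName {
    param([string]$BaseName)
    $kept = @()
    foreach ($part in $BaseName.Split("_")) {
        # -notcontains compares strings case-insensitively by default.
        if ($kept -notcontains $part) { $kept += $part }
    }
    $kept -join "_"
}

# Preview the renames for the current folder; remove -WhatIf to apply them.
Get-ChildItem -Path .\* -File | ForEach-Object {
    $newName = (Get-DedupedName $_.BaseName) + $_.Extension
    Rename-Item -LiteralPath $_.FullName -NewName $newName -WhatIf
}
```

With the sample names, `Get-DedupedName "CDS 202 - GLAZING PACKERS_CDS 202 - Glazing Packers_CDS 202 - Glazing Packers"` returns `CDS 202 - GLAZING PACKERS`. The caveat raised in the comments still applies: `this_is_a_long_name_which_is_an_example` becomes `this_is_a_long_name_which_an_example`, so this only suits names where `_` separates whole repeated blocks.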

Enjoy!

Appleoddity
  • Thanks so much Appleoddity, works perfectly for what I'm trying to achieve! Most appreciated. – Russell_s_smith Oct 30 '17 at 15:35
  • Not that I have a problem with it directly, but why did you change the results? I took them from personal tests (PowerShell 5.1), so they should have been accurate - or did I make a mistake? – flolilo Oct 30 '17 at 17:45
  • @flolilolilo I appreciate your updates. But I came back to make a small change to my answer because I had copied and pasted it wrong. That change affected the output of every one of your results. It made it appear the command didn't work right, but it actually does, and I have corrected everything in the post to match. – Appleoddity Oct 30 '17 at 17:47
  • @Appleoddity Ah - for some reason, I did not notice the change in code but only the change in the results - sorry for the bother! – flolilo Oct 30 '17 at 17:51