Here's a method for 'squeezing' filenames that contain duplicate runs of characters; the results are similar to what 'tr -s' produces. For example, suppose you have a directory full of files named like this:
LOG_FILE___STATUS1__0001.TXT
and would like to eliminate the multiple underscore characters so that the files are named this way instead:
LOG_FILE_STATUS1_0001.TXT
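The squeezing itself is easy to preview with 'tr -s' before touching any files (using the example name from above):

```shell
# Squeeze each run of underscores down to a single underscore
echo 'LOG_FILE___STATUS1__0001.TXT' | tr -s '_'
# LOG_FILE_STATUS1_0001.TXT
```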
A fairly quick way to do this is from the shell. Open a terminal and go to the directory that contains your files, then use:
find . -type f -name '*__*' | awk -F\? \
'{ s=$1 ; gsub("_+","_",s) ; print "echo n | mv -i","\""$1"\"","\""s"\"" }' \
| /bin/sh

NOTES:

- The command is shown on three lines with continuation marks; if you have any trouble with it, copy and paste it one piece at a time onto a single line.
- I prefer to use 'find' to generate file lists instead of 'ls' because you have greater control over what gets matched. In this case, we want the names of files only, not subdirectories, etc.
- The 'echo n | mv -i' section ensures that this command will safely fail when attempting to rename/overwrite an existing file with the same name.
- All the ugliness with the escaped quotation marks ("\""$1"\"", etc.) and the '-F\?' option is there to handle filenames that contain spaces.
- If you want to test the results first (without actually renaming any files), leave off the trailing '| /bin/sh'.
- sed experts: I tried to do this with sed but couldn't. You're welcome to prove me wrong.
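For what it's worth, here is one way a sed version might look, using the hold space to keep the original name alongside the squeezed copy. This is a sketch, not a drop-in replacement: it assumes GNU sed (for '-E' and for comment lines inside the script), and like the awk version it will break on filenames containing double quotes.

```shell
find . -type f -name '*__*' | sed -E '
  # keep the original name in the hold space
  h
  # squeeze underscore runs in the working copy
  s/_+/_/g
  # swap and append: pattern space becomes "original\nsqueezed"
  x
  G
  # turn the pair into the same safe rename command as above
  s/(.*)\n(.*)/echo n | mv -i "\1" "\2"/
' | /bin/sh
```

As with the awk command, drop the trailing '| /bin/sh' to preview the generated mv commands without running them.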