I want to replace any special character (anything that is not a number or letter) with a single '-'.

I tried the code below with a few characters, but it doesn't work when a character is repeated, because the result still contains more than one '-'.

#!/bin/bash
for f in *; do mv "$f" "${f// /-}"; done

for f in *; do mv "$f" "${f//_/-}"; done

for f in *; do mv "$f" "${f//-/-}"; done  

What I want:

test---file       ->  test-file

test   file       ->  test-file

test______file    ->  test-file

teeesst--ffile    ->  teeesst-ffile

test555----file__ ->  test555-file

Please explain your answer, because I don't know much about bash or regular expressions.

  • No loop needed. All you need is `tr -s [[:punct:]] '-'`, example: `echo "test______file" | tr -s [[:punct:]] '-'` Just paste that at the command line to test. – David C. Rankin Jun 21 '19 at 20:44
  • To handle the punctuation at the end, you can use a *command substitution*, e.g. `a=$(echo "test555----file__" | tr -s [[:punct:]] '-'); echo ${a%-}` yields `test555-file` – David C. Rankin Jun 21 '19 at 20:49
  • Two different file names can lead to the same file name. Be careful not to overwrite any files. – choroba Jun 21 '19 at 21:03
  • I'd go with something more beginner-friendly: `sed -e 's/\(\W\+\|\_\+\)\+/-/g' -e 's/-$//g'`. sed is for stream editing, so you can pass a stream of file names into it. You seem to be asking how to find *groups* of special characters (non-word characters and `_`) and change each group into a single `-`. In regex terms, you are looking for runs of `\W` or `_`. You also don't want any file name to end in `-`, hence the second substitution, `'s/-$//g'`. From there you could write a small script that iterates over your files and renames them. – treedust Jun 21 '19 at 21:38

2 Answers

There are a couple of different rename (or prename) commands available in various distributions of Linux that will handle regex substitutions.

But you can also use Bash's extended globbing to do some of that. The pattern ${var//+([-_ ])/-} says to replace any runs of one or more characters that are listed in the square brackets with one hyphen.

shopt -s extglob
# demonstration:
for file in test---file 'test   file' test______file teeesst--ffile test555----file__
do
    echo "${file//+([-_ ])/-}"
done

Output:

test-file
test-file
test-file
teeesst-ffile
test555-file-

The extended glob +(pattern) is similar to (pattern)+ in regex: it matches one or more repetitions of the pattern inside the parentheses. Other Bash extended globs (from man bash):

          ?(pattern-list)
                 Matches zero or one occurrence of the given patterns
          *(pattern-list)
                 Matches zero or more occurrences of the given patterns
          +(pattern-list)
                 Matches one or more occurrences of the given patterns
          @(pattern-list)
                 Matches one of the given patterns
          !(pattern-list)
                 Matches anything except one of the given patterns
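
To make the list above concrete, here is a small sketch that tries each operator against one string that should match and one that should not. The `matches` helper and the sample words are made up for this illustration; they are not part of Bash or of the original answer.

```bash
#!/usr/bin/env bash
shopt -s extglob

# Hypothetical helper: succeed if string $1 matches glob pattern $2.
# $2 is left unquoted on purpose so that [[ ]] treats it as a pattern.
matches() {
    [[ $1 == $2 ]]
}

# Each input line holds: pattern, a string that should match, one that should not.
while read -r pattern yes no; do
    matches "$yes" "$pattern" && echo "$pattern matches $yes"
    matches "$no"  "$pattern" || echo "$pattern does not match $no"
done <<'EOF'
colo?(u)r   colour  colouur
*(a)b       b       ac
+(a)b       aab     b
@(cat|dog)  dog     bird
!(cat|dog)  bird    cat
EOF
```

Note that `*(a)b` accepts a bare `b` (zero occurrences of `a`), while `+(a)b` does not; that difference is why `+()` is the right choice for collapsing runs of characters.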

Note that the final hyphen is not removed here, but could be using an additional parameter expansion:

file=${file/%-/}

which says to remove a hyphen at the end of the string.
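
Putting the two expansions together, a rename loop might look like the sketch below. The function names `sanitize_name` and `rename_in_dir` are hypothetical helpers introduced here, not something from the answer. The sketch assumes a `mv` that supports `-n` (no-clobber, as in GNU and BSD), which guards against the collision problem mentioned in the comments, where two different names clean up to the same result.

```bash
#!/usr/bin/env bash
shopt -s extglob

# Hypothetical helper: collapse each run of hyphens, underscores, and spaces
# into one hyphen, then drop a single trailing hyphen if there is one.
sanitize_name() {
    local name=$1
    name=${name//+([-_ ])/-}
    name=${name%-}
    printf '%s\n' "$name"
}

# Hypothetical helper: rename every entry in directory $1 to its cleaned name.
rename_in_dir() {
    local dir=$1 f base new
    for f in "$dir"/*; do
        [ -e "$f" ] || continue            # the glob matched nothing
        base=${f##*/}                      # strip the directory part
        new=$(sanitize_name "$base")
        [ -n "$new" ] || continue          # name consisted only of separators
        [ "$new" = "$base" ] && continue   # already clean
        mv -n -- "$f" "$dir/$new"          # -n: never overwrite an existing file
    done
}
```

Calling `rename_in_dir .` would process the current directory. While testing, it may be safer to put `echo` in front of `mv` first so the commands are only printed.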

Dennis Williamson
  • @JoãoVitorBarbosa: _**Always**_ quote variables when they are going to be expanded and add `--` as the last option before the filename arguments just in case a filename might begin with a hyphen: `mv -n -- "${file[@]}" "${f//+([-_ ])/-}"` – Dennis Williamson Jun 24 '19 at 21:03
  • Thanks for the advice. It turned out to be a problem caused by looping over an array of filenames (some filenames had spaces). Looping directly with `for file in *.txt` solved it. – João Vitor Barbosa Jun 25 '19 at 12:58
  • @JoãoVitorBarbosa: Sorry I missed that one since it's hard to read code in comments. `for file in "${files[@]}"; do echo "$file"; done` Always quote your variables. – Dennis Williamson Jun 25 '19 at 13:07

You can use tr (as shown above in the comment) or, actually, sed makes more sense in this case. For example, given your list of filenames:

$ cat fnames
test---file
test   file
test______file
teeesst--ffile
test555----file__

You can use the sed expression:

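sed -e 's/[[:punct:] ][[:punct:] ]*/-/g' -e 's/[[:punct:] ]*$//'

The first expression replaces every run of punctuation or spaces with a single hyphen (the g flag applies it to every run on the line, not only the first), and the second removes any punctuation or spaces left at the end of the name.

Example Use/Output

$ sed -e 's/[[:punct:] ][[:punct:] ]*/-/g' -e 's/[[:punct:] ]*$//' fnames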
test-file
test-file
test-file
teeesst-ffile
test555-file

Depending on how your filenames are stored, you can either use command substitution individually, or you can use process substitution and feed the updated names into a while loop or something similar.
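
As one possible sketch of the process-substitution approach, the loop below reads the original names from a list file and the cleaned names from sed, side by side. The helper names `clean_names` and `preview_renames` are invented for this example, and the sketch assumes one filename per line (names containing newlines would break it). The sed command uses the g flag so that every run in a name is replaced. Also note that [[:punct:]] includes the dot, so an extension such as .txt would become -txt.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read names on stdin, print the cleaned-up names.
clean_names() {
    sed -e 's/[[:punct:] ][[:punct:] ]*/-/g' -e 's/[[:punct:] ]*$//'
}

# Hypothetical helper: print "old -> new" for every name listed in file $1.
# Original names arrive on stdin; cleaned names arrive on file descriptor 3
# through a process substitution.
preview_renames() {
    local list=$1 old new
    while IFS= read -r old && IFS= read -r new <&3; do
        printf '%s -> %s\n' "$old" "$new"
    done < "$list" 3< <(clean_names < "$list")
}
```

Running `preview_renames fnames` only prints the planned renames. To perform them, the `printf` line could be replaced with something like `mv -n -- "$old" "$new"`, where `-n` (supported by GNU and BSD mv) refuses to overwrite an existing file.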

David C. Rankin