20

I'd like to use bash to replace multiple adjacent spaces in a string with a single space. Example:

Original string:

"too         many       spaces."

Transformed string:

"too many spaces."

I've tried things like `"${str//*( )/.}"` or `awk '{gsub(/[:blank:]/," ")}1'`, but I can't get it right.

Note: I was able to make it work with `<CMD_THAT_GENERATES_THE_INPUT_STRING> | perl -lpe's/\s+/ /g'`, but that meant relying on perl. I'd like to use bash's built-in syntax instead of calling an external program, if that is possible.

jww
Bruno Negrão Zica
  • OP is probably looking for a bash way, not necessarily asking for a `sed` or any other external tool solution. – anubhava May 09 '18 at 18:32
  • Just with built-ins, `str="too many spaces.";shopt -s extglob; printf '%s\n' "${str//+([[:space:]])/ }"` – Inian May 09 '18 at 18:32
  • Re: the `echo $string | perl` note: When IFS is at its default value, `echo $string` *itself* will replace runs of multiple spaces with a single space (along with other side effects, like replacing a wildcard in that string with a list of files) -- one needs to use `echo "$string"` to keep them in. (Even then, `echo` can still munge contents rather than emitting them exactly as they exist; `printf '%s\n' "$string"` is much more reliable). – Charles Duffy May 09 '18 at 18:39
  • @Inian, why not tag this as `string` `replace`? – codeforester May 09 '18 at 18:39
  • @codeforester: _Why_ have them? :) Those tags are independent of the programming language used. Not adding any value as such – Inian May 09 '18 at 18:40
  • @Inian: Even if those tags are independent of the programming language used, keeping them in the question will only increase visibility of this question in search results. – anubhava May 09 '18 at 18:42
  • @anubhava: Not sure about it, but added them back though – Inian May 09 '18 at 18:43
  • How many people will have "string" or "replace" in their tags-to-watch list? Color me in the skeptic camp re: those tags having any value. – Charles Duffy May 09 '18 at 18:44
  • @CharlesDuffy you may be right: [Do tags help in Google searches?](https://meta.stackoverflow.com/q/367648/6862601) – codeforester May 09 '18 at 18:58
  • @CharlesDuffy, thank you for pointing out that echo already replaces the spaces! I edited my question to show that I wasn't using echo in my tests. I was using | perl .... And I wanted to substitute the spaces in the output from SOME_COMMAND – Bruno Negrão Zica May 09 '18 at 19:35
  • Possible duplicate of [How to remove extra spaces in bash?](https://stackoverflow.com/q/13092360/608639) – jww May 26 '19 at 22:45

3 Answers

42

Using tr:

$ echo "too         many       spaces." | tr -s ' '
too many spaces.

man tr:

-s, --squeeze-repeats
       replace each sequence of a repeated character that is listed in
       the last specified SET, with a single occurrence of that character
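Note that `tr -s ' '` only squeezes spaces. If the input may also contain tabs, one possible sketch is to convert tabs to spaces first and then squeeze. The function name `squeeze_ws` is made up for this illustration and is not part of `tr`:

```bash
# squeeze_ws is a hypothetical helper name, not part of tr.
squeeze_ws() {
  # Turn every tab into a space, then squeeze runs of spaces into one.
  tr '\t' ' ' | tr -s ' '
}

printf 'too \t  many\t\tspaces.\n' | squeeze_ws
```

With the sample input above this prints `too many spaces.`; it still runs external programs, so it does not meet the "bash built-ins only" wish from the question.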

Edit: Oh, by the way:

$ s="foo      bar"
$ echo $s
foo bar
$ echo "$s"
foo      bar

Edit 2: On the performance:

$ shopt -s extglob
$ s=$(for i in {1..100} ; do echo -n "word   " ; done) # 100 times: word   word   word...
$ time echo "${s//+([[:blank:]])/ }" > /dev/null

real    0m7.296s
user    0m7.292s
sys     0m0.000s
$ time echo "$s" | tr -s ' ' >/dev/null

real    0m0.002s
user    0m0.000s
sys     0m0.000s

Over 7 seconds?! How is that even possible? Granted, this mini laptop is from 2014, but still. Then again, with a literal space in place of [[:blank:]]:

$ time echo "${s//+( )/ }" > /dev/null

real    0m1.198s
user    0m1.192s
sys     0m0.000s
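If the extglob substitution turns out to be too slow, one alternative that still uses only bash built-ins is to replace two spaces with one repeatedly until no double space is left. This is a sketch, not something measured in the timings above; `collapse_spaces` is a hypothetical name, and the loop handles spaces only (not tabs):

```bash
# collapse_spaces is a hypothetical helper name for illustration.
collapse_spaces() {
  local s=$1
  # Each pass replaces every pair of spaces with one space, so a run of
  # n spaces shrinks to about n/2; loop until no pair remains.
  while [[ $s == *"  "* ]]; do
    s=${s//"  "/ }
  done
  printf '%s\n' "$s"
}

collapse_spaces "too         many       spaces."
```

This needs no `shopt -s extglob`, because the pattern is a fixed two-character string rather than an extended glob.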
James Brown
  • I'd certainly go this route on a POSIX shell where bash extensions weren't available. That said, it's going to be much slower to run (with reasonable input sizes) than the approach given by @anubhava when one *can* use bash. – Charles Duffy May 09 '18 at 18:37
  • Added a test with a hundred words and 3 spaces between each. – James Brown May 09 '18 at 19:44
  • Over 7 seconds every time. `echo $BASH_VERSION 4.4.12(1)-release` – James Brown May 09 '18 at 20:00
  • ...waitaminute, found it -- I didn't have extglob enabled. Okay, that's a legitimate result. Might add `shopt -s extglob` to make it copy-and-pasteable for folks trying to repro. – Charles Duffy May 09 '18 at 20:01
14

Here is a way to do this using pure bash and extglob:

s="too         many       spaces."

shopt -s extglob
echo "${s//+([[:blank:]])/ }"

too many spaces.
  • Bracket expression [[:blank:]] matches a space or tab character
  • +([[:blank:]]) matches one or more of the bracket expression (requires extglob)
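Since the question is about cleaning up the output of another command, the same expansion can be wrapped in a small function and applied to a captured string. This is only a sketch: `squeeze_blanks` and `generate_input` are made-up names, the latter standing in for whatever command produces the input:

```bash
shopt -s extglob    # enable +(...) before the pattern is used

# squeeze_blanks is a hypothetical helper name for illustration.
squeeze_blanks() {
  local s=$1
  # +([[:blank:]]) matches one or more spaces or tabs.
  printf '%s\n' "${s//+([[:blank:]])/ }"
}

# generate_input is a made-up stand-in for the command producing the string.
generate_input() { printf 'too \t\t  many       spaces.\n'; }

str=$(generate_input)
squeeze_blanks "$str"
```

For this sample input the result is `too many spaces.`; mixed runs of spaces and tabs collapse to a single space because `[[:blank:]]` covers both.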
anubhava
5

Another simple sed expression using BRE is:

sed 's/[ ][ ]*/ /g'

For example:

$ echo "too         many       spaces." | sed 's/[ ][ ]*/ /g'
too many spaces.

There are a number of ways to skin the cat.

If the enclosed whitespace could consist of mixed spaces and tabs, then you could use the following (`\s` is a GNU extension, so this assumes GNU sed):

sed 's/\s\s*/ /g'
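Because `\s` is a GNU extension and may not be recognized by other sed implementations, a more portable sketch of the same substitution uses the POSIX character class `[[:space:]]`. The variable name `out` is only for illustration:

```bash
# Same substitution, written with a POSIX character class instead of \s.
out=$(printf 'too \t many\t\tspaces.\n' | sed 's/[[:space:]][[:space:]]*/ /g')
printf '%s\n' "$out"
```

For the sample input this prints `too many spaces.`.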

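And if you simply want to have bash word-splitting handle it, just echo your string without quotes (bear in mind that an unquoted expansion also undergoes filename expansion, so a wildcard such as `*` in the string would be replaced by matching file names), e.g.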

$ echo "too         many       spaces." | while read line; do echo $line; done
too many spaces.

Continuing with that same thought, if your string with spaces is already stored in a variable, you can simply use echo unquoted within command substitution to have bash remove the additional whitespace for you, e.g.

$ foo="too         many       spaces."; bar=$(echo $foo); echo "$bar"
too many spaces.
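A variation on the word-splitting idea that avoids filename expansion is to let `read` split the string into an array and then join the array again. This is a sketch under a few assumptions: IFS is at its default value, the string is a single line, and losing leading and trailing whitespace is acceptable. The names `words` and `bar` are arbitrary:

```bash
foo="too         many   *    spaces."

# read splits on IFS whitespace but does not expand the * against filenames.
read -r -a words <<< "$foo"

# "${words[*]}" joins the elements with the first character of IFS
# (a space when IFS is at its default).
bar="${words[*]}"
printf '%s\n' "$bar"
```

Here the literal `*` survives, and the result is `too many * spaces.`.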
David C. Rankin