
Commonly I see people manipulating strings using sed as follows:

echo "./asdf" | sed -n -e "s%./%%p"

I recently learned I can also do:

sed -n -e "s%./%%p" <<< "./asdf"

Is there a reason to avoid the latter? For instance, is it bash-specific behaviour?

Sigve Karolius
  • `<<<` is only available in newer versions of bash, so if you care about portability and backward-compatibility then you might want to use the first method. (From [Wikipedia entry for bash](https://en.wikipedia.org/wiki/Bash_%28Unix_shell%29#Features): "Since version 2.05b Bash can redirect standard input (stdin) from a "here string" using the <<< operator.") – Paul R Jan 27 '17 at 16:55
  • `<<<` has been in `bash` for a long time; the main concern is portability with POSIX rather than other versions of `bash`. – chepner Jan 27 '17 at 16:56
  • If portability were a concern, would it not be better to use `printf` rather than `echo`? My understanding is that `echo` implementations vary more than those of `printf`. – Fred Jan 27 '17 at 16:58
  • @Charles Duffy: Didn't want to get into the nitty-gritty details of implementation level, just wanted to suggest a point on the byte level representation. – Inian Jan 27 '17 at 16:58
  • @Fred, definitely a very good point; [POSIX echo](http://pubs.opengroup.org/onlinepubs/009695399/utilities/echo.html) has a lot of leeway in its specification, as the APPLICATION USAGE and RATIONALE parts of the spec make clear. – Charles Duffy Jan 27 '17 at 17:05
  • @BenjaminW., hmm. If that question didn't digress into unrelated content (kept itself scoped to its title), it'd be a tighter dupe. – Charles Duffy Jan 27 '17 at 17:12
  • @CharlesDuffy I actually prefer the answers here anyway. – Benjamin W. Jan 27 '17 at 17:13

3 Answers


How should I trim ./ from the beginning of a path (or perform other simple string manipulations)?

Bash's built-in syntax for this is called parameter expansion. ${s#./} will expand $s with any leading ./ trimmed internal to the shell, with no subprocess or other overhead. BashFAQ #100 covers many additional string manipulation operations.
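
As a minimal sketch of what parameter expansion covers, the following POSIX-compatible forms strip prefixes and suffixes without starting a subprocess. The variable names and the sample path are illustrative only, not taken from any real script.

```shell
s="./dir/asdf.txt"   # hypothetical sample value

trimmed=${s#./}      # remove a leading "./"
base=${s##*/}        # remove the longest prefix matching "*/" (like basename)
stem=${s%.txt}       # remove a trailing ".txt"

printf '%s\n' "$trimmed" "$base" "$stem"
```

Each expansion works on the original value of `s`, so the three results are independent of one another.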


What are the differences between echo "$s" | ... and ... <<<"$s"?

  1. Portability

    As you've noted, <<< is not available in POSIX sh; this is a ksh extension also available in bash and zsh.

    That said, if you need portability, the multiline equivalent is not far away:

    ... <<EOF
    $s
    EOF
    
  2. Disk usage

    As currently implemented by bash (and as an implementation detail subject to change), <<< creates a temporary file, populates it, and redirects from it. If your TMPDIR is not on an in-memory filesystem, this may be slower, or may generate disk churn.

  3. Process overhead

    A pipeline, as in echo foo | ..., creates a subshell -- it forks off a completely new process, responsible for running echo and then exiting. When you're running result=$(echo "$s" | ...), then that pipeline is itself in a subshell of your parent shell, and that shell has its output read by the parent.

    Modern unixlikes go to significant effort to make fork()ing off a subprocess as low-overhead as possible, but even then the cost can add up when the operation is done in a loop -- and on platforms such as Cygwin it can be even more significant.

  4. echo bugs

    Last but not least -- <<<"$s" will represent any contents of the variable s precisely, with the exception that it can add a trailing newline. By contrast, echo has a great deal of leeway in its specified behavior: It can honor backslash expansions or not depending on compliance with the optional XSI extensions to the standard (and presence or lack of the widespread but entirely noncompliant extension of -e, and/or runtime flags that disable it); the ability to avoid addition of trailing newlines with -n is not guaranteed by the standard; &c. Even if you're using a pipeline, it's better to use printf:

    # emit *exactly* the contents of "$s", with no newline added
    printf '%s' "$s" | ...
    
    # emit the contents of "$s", with an added trailing newline
    printf '%s\n' "$s" | ...
    
    # emit the contents of "$s", with '\t', '\n', '\b' &c replaced, and no added newline
    printf '%b' "$s" | ...
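
To make the portability and trailing-newline points above concrete, here is a minimal POSIX sh sketch. It assumes the sample value `./asdf` from the question and uses a here-document as the portable stand-in for `<<<"$s"`. The sed expression is anchored and escaped (`^\./`) for the illustration, so it differs slightly from the one in the question.

```shell
s='./asdf'   # sample value from the question

# Pipeline form: printf emits the value plus exactly the newline we ask for.
from_pipe=$(printf '%s\n' "$s" | sed -n -e 's%^\./%%p')

# Here-document form: the POSIX-portable equivalent of <<<"$s".
from_heredoc=$(sed -n -e 's%^\./%%p' <<EOF
$s
EOF
)

# Byte counts show the trailing newline that the here-document adds.
# tr strips the padding some wc implementations print.
bytes_exact=$(printf '%s' "$s" | wc -c | tr -d ' ')
bytes_heredoc=$(wc -c <<EOF | tr -d ' '
$s
EOF
)

printf '%s\n' "$from_pipe" "$from_heredoc" "$bytes_exact" "$bytes_heredoc"
```

Both forms feed sed the same line, so the two results should match; the byte counts differ by one because `printf '%s'` adds no newline while the here-document does.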
    
Charles Duffy

Using sed at all is not desirable if it can be helped (see Charles Duffy's answer); put the string in a variable and let the shell do it with POSIX-compatible parameter expansion.

$ s="./asdf"
$ echo "${s#./}"
asdf
chepner
  • Good comment that is relevant to my example. However, it was merely a "dummy" example; the real regular expressions cannot be replaced by parameter expansion. – Sigve Karolius Jan 27 '17 at 17:02

I think there are two things at play here.

  1. <<< vs. pipelines
  2. sed (or other external command) vs parameter expansion

If you can do something with expansion, it will very likely be much quicker, since it avoids launching an external command.

However, not everything can be done with expansion. So you may have to use an external command and use as input something you have in a variable. In this case, you will have to make your choice based on portability considerations. As for performance, if it matters, you should probably test in your context what performs best.
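
One way to test this in your own context is sketched below, assuming a trivial trim operation. The function names `trim_expansion` and `trim_sed` and the iteration count `n` are hypothetical, chosen only for this illustration. The sketch runs the same operation in a loop both ways so the two can be timed separately.

```shell
s="./asdf"   # sample input
n=200        # hypothetical iteration count; raise it for a meaningful timing

trim_expansion() {   # no external process is started
    i=0
    while [ "$i" -lt "$n" ]; do
        out_expansion=${s#./}
        i=$((i + 1))
    done
}

trim_sed() {         # one pipeline and one sed process per iteration
    i=0
    while [ "$i" -lt "$n" ]; do
        out_sed=$(printf '%s\n' "$s" | sed -e 's%^\./%%')
        i=$((i + 1))
    done
}

trim_expansion
trim_sed
printf '%s %s\n' "$out_expansion" "$out_sed"
```

In bash, prefixing each call with the `time` keyword (for example `time trim_sed`) reports how long that loop took; the absolute numbers depend heavily on the platform, so treat them as relative indications only.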

Fred