19

I'm trying to make some base64 substitution with sed.

What I'm trying to do is this:

sed -i "s|\(some\)\(pattern\)|\1 $(echo "\2" | base64 -d)|g" myFile

In English that would be:

  • Match a pattern
  • Capture groups
  • Use the captured group in a bash command
  • Use the output of this command as a replacement string

So far my command doesn't work since \2 is known only by sed and not by the bash command I'm calling.
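
A minimal sketch of the problem, using a stand-in command (sed 's/./X/g') in place of base64 -d so the effect is visible. The shell runs $(...) while it builds the argument for sed, so the inner command only ever receives the two literal characters \2:

```shell
#!/bin/sh
# The shell expands $(...) before sed is started, so the inner command
# receives the literal text \2, not the captured group.
# The inner sed 's/./X/g' is only a visible stand-in for base64 -d.
script="s|\(some\)\(pattern\)|\1 $(printf '%s' '\2' | sed 's/./X/g')|g"

# This is the script that sed actually receives.
printf '%s\n' "$script"

# Applying it inserts the stand-in output, not a transformed "pattern".
result=$(printf '%s\n' 'somepattern' | sed "$script")
printf '%s\n' "$result"
```

By the time sed parses its script, the command substitution has already been replaced by its output, so no backreference is left for sed to fill in.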

What elegant solution do I have to pass a capture group to a command whose output I want to use as the replacement?


Edit

Here is a minimal example of what I'm trying to do:

I have the following file:

someline
someline
Base64Expression stringValue=&quot;Zm9v&quot;
someline
Base64Expression stringValue=&quot;YmFy&quot;

And I want to replace the base64 by plain text:

someline
someline
Base64Expression stringValue=&quot;foo&quot;
someline
Base64Expression stringValue=&quot;bar&quot;

In the future I'll have to do the reverse operation (encoding the strings in base64 in the decoded file).

I've started using awk, but I thought it could be simpler (and much more elegant) with sed. So far with awk I have this (where $bundle is the file I'm editing):

#For each line containing "Base64Expression"
#Put in the array $substitutions[]:
# The number of the line (NR)
# The encoded expression ($2)
# The decoded expression (x)
substitutions=($(awk -v bd=$bundle '
    BEGIN {
        # Change the separator from default
        FS="&quot;"
        ORS=","
        OFS=","
    }
    /Base64Expression/ {
        #Decode the base64 lines
        cmd="echo -ne \""$2"\" | base64 -d"
        cmd | getline x

        if ( (cmd | getline) == 0 ){
            print NR, $2, x
        }
    }
' $bundle))

# Substitute the encoded expressions by the decoded ones
# Use the entries of the array 3 by 3
# Create a sed command which takes the lines numbers
for ((i=0; i<${#substitutions[@]}; i+=3))
do
    # Do the substitution only if the string is not empty
    # Allows to handle properly the empty variables
    if [ ${substitutions[$((i+1))]} ]
    then
        sed -i -e "${substitutions[$i]}s#${substitutions[$((i+1))]}#${substitutions[$((i+2))]}#" $bundle
    fi
done
fedorqui
statox
  • This is not possible because `$(echo "\2" | base64 -d)` is done first.. Moreover you need to replace single quote with double quote if you use shell variables within sed.. – sjsam Sep 06 '16 at 12:34
  • `awk` is designed for such processing. But we'll need to see the smallest set of sample data to reproduce your issue AS WELL AS your required output given that input in order to help. Please edit your Q to include that information. Good luck. – shellter Sep 06 '16 at 12:37
  • @shellter I edited my question with what I have done with awk. @ sjsam, thanks for pointing out my quoting, I edited that too. – statox Sep 06 '16 at 12:46
  • @shellter I don't think this can be done with `awk` -- it doesn't have a way to run a shell command and get the output. `perl` seems like the best way. – Barmar Sep 06 '16 at 12:54
  • @Barmar Actually my solution with `awk` then `sed` works but it has a lot of flaws. Nevertheless, I'd really need a solution with either `awk` or `sed`: I know `perl` is really good for this kind of operation but in this context I can't use it and I can't change that. – statox Sep 06 '16 at 12:59
  • GNU `sed` has an `/e` flag which passes the substitution string to a shell for evaluation. – tripleee Sep 06 '16 at 13:30
  • @barmar : "awk .. doesn't have a way to run a shell command and get the output" . Note the O.P. revised Q with awk code that shows `cmd | getline x` . This assigns the output of `cmd` to the var `x` . C'mon, you knew that, didn't you? ;-) And thanks for tip about clicking on screen shots in another msg thread (I still can't read that one). Didn't know about that. Good luck to all! – shellter Sep 06 '16 at 16:34
  • @shellter No, forgot about that. – Barmar Sep 06 '16 at 16:35
  • tried perl as learning exercise... `perl -MMIME::Base64 -pe 's/.*;\K(.*)(?=&.*)/decode_base64($1)/e' myFile` – Sundeep Sep 07 '16 at 08:49
  • @spasic: damn perl looks really like a cool language: this version is so simple compared to the other answer. Too bad I can't use it in this case, thanks for your participation! – statox Sep 07 '16 at 09:24
  • 1
    @statox yeah, am learning perl-oneliners along with sed/awk.. got idea for this one from http://www.catonmat.net/series/perl-one-liners-explained – Sundeep Sep 07 '16 at 09:29

2 Answers

18

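You can use the e flag of the s command in GNU sed: after a successful substitution, sed executes the resulting pattern space as a shell command and replaces the pattern space with that command's output. This way, the replacement can be a command such as: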

printf "%s %s" "something" "\1"

Where \1 holds a captured group. All together:

$ sed -r 's#match_([0-9]*).*#printf "%s %s" "something" "\1"#e' <<< "match_555 hello"
something 555

This comes in handy when you want to perform some shell action with a captured group, as in this case.
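
One detail worth keeping in mind, sketched below with made-up input: the e flag runs the entire pattern space, not just the replacement text. That is why the regex above ends in .* to consume the rest of the line. If part of the line is left unmatched, it becomes part of the command:

```shell
#!/bin/sh
# Requires GNU sed (the e flag is a GNU extension).

# The regex consumes the whole line, so the pattern space is exactly the command.
whole=$(echo "match_555 hello" | sed -r 's#^match_([0-9]*).*$#printf "%s" "\1"#e')
printf '%s\n' "$whole"

# Only part of the line is matched: the unmatched tail " hello" stays in the
# pattern space and is handed to the shell as an extra argument to echo.
partial=$(echo "match_555 hello" | sed -r 's#match_([0-9]*)#echo "\1"#e')
printf '%s\n' "$partial"
```

Text left in front of the match would instead be read by the shell as the start of the command, which typically fails, so anchoring the regex to the whole line is the safer habit.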

So, let's capture the first part of the line, then the part that needs to be decoded, and finally the rest. Once this is done, let's print those pieces back with printf, running base64 -d on the second slice:

$ sed -r '/^Base64/s#(.*;)([^\&]*)(&.*)# printf "%s%s%s" "\1" $(echo "\2" | base64 -d) "\3";#e' file
someline
someline
Base64Expression stringValue=&quot;foo&quot;
someline
Base64Expression stringValue=&quot;bar&quot;

Step by step:

sed -r '/^Base64/s#(.*;)([^\&]*)(&.*)# printf "%s%s%s" "\1" $(echo "\2" | base64 -d) "\3";#e' file
#        ^^^^^^^    ^^^  ^^^^^^  ^^^                        ^^^^^^^^^^^^^^^^^^^^^^^^       ^
#           |   first part  |   the rest                decode the 2nd captured group      |
#           |               |                                                              |
#           |           important part                                      execute the command
#           |
# on lines starting with Base64, do...
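
A hedged variation on the command above, assuming GNU sed and coreutils base64. It quotes the command substitution so a decoded value containing spaces stays a single printf argument, and it sketches the reverse (encoding) operation the question mentions. The temporary file and the sample value "foo bar" are illustrative only:

```shell
#!/bin/sh
# Requires GNU sed (for -r and the e flag) and base64 from coreutils.
# The temporary file and its contents are illustrative sample data.
sample=$(mktemp)
printf '%s\n' 'someline' \
    'Base64Expression stringValue=&quot;Zm9vIGJhcg==&quot;' > "$sample"

# Decode: same regex as above, with the command substitution quoted so a
# decoded value containing spaces ("foo bar") stays a single printf argument.
decoded=$(sed -r '/^Base64/s#(.*;)([^&]*)(&.*)#printf "%s%s%s" "\1" "$(echo "\2" | base64 -d)" "\3"#e' "$sample")
printf '%s\n' "$decoded"

# Encode: the reverse operation; printf "%s" avoids encoding a trailing newline.
encoded=$(printf '%s\n' "$decoded" |
    sed -r '/^Base64/s#(.*;)([^&]*)(&.*)#printf "%s%s%s" "\1" "$(printf "%s" "\2" | base64)" "\3"#e')
printf '%s\n' "$encoded"

rm -f "$sample"
```

Because the captured text is pasted into a shell command, a value containing double quotes, $ or backticks would be interpreted by the shell. That is harmless when decoding, since the base64 alphabet contains none of those characters, but the encoding direction should only be run on trusted input.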

The idea comes from this superb answer by anubhava on How to change date format in sed?.

Community
fedorqui
  • 1
    Now that is some great sed skills! The trick of using the `e` flag combined with `printf` is what I needed! Thanks a lot! – statox Sep 06 '16 at 13:43
  • Too bad: this approach looked promising but I ran into `Syntax error: end of file unexpected` exceptions in my case that were not verbose enough `cat my-short-file.csv | sed -E 's#\{media path=([^\}]+)\}#echo "cheese"#e'`. I ended up writing a PHP script with preg_replace_callback() to achieve the same results. — Would this be compatible with the `g` modifier? – WoodrowShigeru Jun 09 '21 at 11:27
  • @WoodrowShigeru I cannot reproduce the error here. `gsed -E 's#\{media path=([^\}]+)\}#echo "cheese"#e' <<< "{media path=(hola)}"` works well to me. May be best to provide a more meaningful example or to post a new question with all the details. – fedorqui Jun 10 '21 at 15:11
3

It sounds like this is what you're trying to do:

$ cat tst.awk
BEGIN { FS=OFS="&quot;" }
/^Base64Expression/ {
    cmd="echo -ne \""$2"\" | base64 -d"
    if ( (cmd | getline x) > 0 ) {
        $2 = x
    }
    close(cmd)
}
{ print }

$ awk -f tst.awk file
someline
someline
Base64Expression stringValue=&quot;foo&quot;
someline
Base64Expression stringValue=&quot;bar&quot;

assuming your echo | base64 is the right approach.
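
For the reverse operation mentioned in the question, the same structure can be sketched with base64 in place of base64 -d. This is an illustration under the same assumptions as the script above, with made-up sample data, not a tested production script:

```shell
#!/bin/sh
# Sample data is illustrative; requires base64 (e.g. from coreutils).
decoded_file=$(mktemp)
printf '%s\n' 'someline' \
    'Base64Expression stringValue=&quot;foo&quot;' > "$decoded_file"

encoded=$(awk '
BEGIN { FS = OFS = "&quot;" }        # split and rejoin fields on the literal text &quot;
/^Base64Expression/ {
    # printf "%s" keeps a trailing newline out of the encoded value
    cmd = "printf \"%s\" \"" $2 "\" | base64"
    if ((cmd | getline x) > 0) {
        $2 = x                        # replace the plain text with its base64 form
    }
    close(cmd)                        # release the pipe before the next line
}
{ print }                             # no pattern: runs for every line, changed or not
' "$decoded_file")
printf '%s\n' "$encoded"

rm -f "$decoded_file"
```

The final { print } rule has no pattern in front of it, so awk applies it to every input line. Lines that did not match /^Base64Expression/ are printed unchanged, and matching lines are printed after $2 has been reassigned.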

Ed Morton
  • I'm not sure I understand how the `{ print }` statement is used. I'm not an awk veteran, and seeing the statement without a regex or `BEGIN|END` before it leaves me perplexed. It would be really great if you could explain a little more how your code works. – statox Sep 06 '16 at 14:29
  • OK thanks for the rewriting then, it is clearer now. And I'll definitely look for this book. – statox Sep 06 '16 at 15:45