
I want to mask bits in a file on Linux. The input looks like this:

 actual : 00-00000000-00000011-00110010-10000000
 expect : 00-00000000-00000001-01101010-00000000
 error  : 00-00000000-00000010-01011000-00000000

In the output, each actual bit should be kept as is, unless the corresponding error bit is 1, in which case the actual bit should be masked with X.

As can be seen, error is the XOR of the actual and expected data.

Output should look somewhat like this:

output : 00-00000000-000000X1-0X1XX010-10000000

Is there any way of doing this with sed, awk, or similar commands?

What I have done so far:

grep 'err' gpbat | sed 's?-??g' | cut -c 11-35 | sed 's?1?X?g' > a1
grep 'act' gpbat | sed 's?-??g' | cut -c 11- > a2

But from here, I can't work out how to merge a1 and a2.

John809

7 Answers


One awk idea using the substr() function to step through the positions in the strings:

awk '
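/^ *actual / { actual = $3 }                 # tolerate leading blanks, as in the posted sample
/^ *error /  { error  = $3
               output = ""                   # start fresh for each actual/error pair
               for (i=1;i<=length(error);i++)
                   output = output (substr(error,i,1) == "1" ? "X" : substr(actual,i,1))
               print "output :",output
             }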
' gpbat

This generates:

output : 00-00000000-000000X1-0X1XX010-10000000

One verbose GNU awk (for patsplit() support) idea:

awk '
/actual[ ]+:/ {   patsplit($3,actual,"[0-9]") }       # split digits from 3rd field and place in array actual[]
/error[ ]+:/  { n=patsplit($3,error ,"[0-9]")         # split digits from 3rd field and place in array error[]
                for (i=1;i<=n;i++)                    # loop through indices of
                    if (error[i]=="1")                # error[] array and if value == "1" then ...
                       actual[i]="X"                  # overwrite corresponding entry in actual[] array with "X"
                for (i=2;i<n;i+=8)                    # for the 2nd, 10th, 18th and 26th entries of actual[] array ...
                    actual[i]=actual[i] "-"           # append a "-"
                printf "output : "                    # start printing output
                for (i=1;i<=n;i++)                    # loop through indices of actual[] array and ...
                    printf "%s", actual[i]            # print to stdout
                print ""                              # terminate the line
              }
' gpbat

This generates:

output : 00-00000000-000000X1-0X1XX010-10000000
markp-fuso

With many versions of awk:

awk '
    /actual/ { split($3,a,"") }
    /error/ { split($3,e,"") }

    END {
        for (i in a)
            c[i] = e[i]==1 ? "X" : a[i]

        for (i = length(c); i>0; --i)
            s = c[i] s

        printf "output : %s\n", s
    }
' gpbat

POSIX leaves the result of split with an empty separator string unspecified "to allow a proposed future extension that would split up a string into an array of individual characters", but I have personally not encountered a version that doesn't already do so. YMMV. For complete portability, @markp-fuso's approach using substr is preferable.
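For reference, here is a minimal sketch of that portable route: a small helper fills an array one character at a time with substr() instead of relying on split with an empty separator. The names chars and mask_bits are made up for illustration, and the input layout is assumed to be the one posted in the question.

```shell
# mask_bits: hypothetical wrapper; reads the actual/error lines on stdin
# and prints the masked line. Uses only POSIX awk features.
mask_bits() {
    awk '
    # chars(s, arr): store each character of s in arr[1..n] and return n
    function chars(s, arr,    i, n) {
        n = length(s)
        for (i = 1; i <= n; i++)
            arr[i] = substr(s, i, 1)
        return n
    }
    /actual/ { n = chars($3, a) }
    /error/  {
        chars($3, e)
        o = ""
        for (i = 1; i <= n; i++)
            o = o (e[i] == "1" ? "X" : a[i])   # mask where the error bit is 1
        print "output :", o
    }'
}

printf '%s\n' \
    ' actual : 00-00000000-00000011-00110010-10000000' \
    ' expect : 00-00000000-00000001-01101010-00000000' \
    ' error  : 00-00000000-00000010-01011000-00000000' | mask_bits
# prints: output : 00-00000000-000000X1-0X1XX010-10000000
```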


Or slightly more efficiently (per @markp-fuso):

awk '
    /actual/ { n = split($3,a,"") }
    /error/ { split($3,e,"") }

    END {
        for (i = 1; i<=n; ++i)
            o = (o) ( e[i]==1 ? "X" : a[i] )
        print "output :", o
    }
' gpbat

And, if the order of input is guaranteed, processing can stop after the "error" line is read, so the END actions can move into the /error/ actions along with an exit to avoid having to read the rest of the file:

awk '
    /actual/ { n = split($3,a,"") }
    /error/ {
        split($3,e,"")
        for (i = 1; i<=n; ++i)
            o = (o) ( e[i]==1 ? "X" : a[i] )
        print "output :", o
        exit
    }
' gpbat
jhnc

Solution in TXR Lisp.

$ txr xbit.tl < data
 output : 00-00000000-000000X1-0X1XX010-10000000
(flow (get-string)
  (match ` actual : @actual\n\
         \ expect : @nil\n\
         \ error  : @error\n`
          @1
    (mapcar (do if (eql @2 #\1) #\X @1) actual error))
  ` output : @1`
  put-line)
Kaz

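Might as well directly "add" the actual and expect bit-strings as if they were integers: per position, a sum of 1 means the bits differ (X) and a sum of 2 means both are 1. Because the mask comes from actual and expect rather than from the posted error line, the last group comes out as X0000000: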

echo ' actual : 00-00000000-00000011-00110010-10000000
       expect : 00-00000000-00000001-01101010-00000000' |

mawk '{ ___[NR] = $NF } END { NF = +(\
                              FS = OFS = "-" (____ = ""))
                                                __ = 6
    for(_ in ___)____ = (____)___[_] FS

    $(_ = NF) = ____
            _ = __-- 
    while(--_) 
        $_ = sprintf("%.*d", 2^3^(1 < _), $(_+__) + $_)

    gsub(_ = "1","X")
    gsub(_ + _, _)

    print $!(NF = __) }'

00-00000000-000000X1-0X1XX010-X0000000
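The same "add the digits" idea can be sketched more readably, one position at a time. This is an illustration rather than a rewrite of the one-liner above; the name mask_from_expect is made up, and it assumes the actual line comes before the expect line.

```shell
# mask_from_expect: hypothetical helper; derives the mask from actual and
# expect directly, without looking at the error line.
mask_from_expect() {
    awk '
    /actual/ { act = $3 }
    /expect/ {
        out = ""
        for (i = 1; i <= length(act); i++) {
            a = substr(act, i, 1)
            e = substr($3, i, 1)
            if (a == "-")          out = out "-"   # keep group separators
            else if (a + e == 1)   out = out "X"   # sum 1: the two bits differ
            else                   out = out a     # sum 0 or 2: bits agree, keep actual
        }
        print out
    }'
}

printf '%s\n' \
    ' actual : 00-00000000-00000011-00110010-10000000' \
    ' expect : 00-00000000-00000001-01101010-00000000' | mask_from_expect
# prints: 00-00000000-000000X1-0X1XX010-X0000000
```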
RARE Kpop Manifesto

This might work for you (GNU sed):

sed -E '/actual/{s//output/;h}
        /error/!d
        G
        :a;s/^1(.*)\n./\1X\n/;s/^[^1](.*)\n(.)/\1\2\n/;s/\n$//;ta' file

If a line contains actual, replace that string with output and make a copy in the hold space.

If a line does not contain error, delete it.

Otherwise, append the copy to the current line.

Create a loop.

If the first character is a 1, replace the character in the same position in the copied line with an X, delete the first character and shift the newline one character to the right.

If the first character is not a 1, delete the first character and shift the newline right by one character.

If the last character of the line is a newline, remove it.

If a substitution has taken place, repeat the loop, otherwise print the result.

N.B. This assumes the actual and error lines are the same length and follow the same template.

potong

Here is a Ruby version:

ruby -e '
inf=$<.read.split(/\R/).map(&:strip).map{|s| s.split(/\s+:\s+/,2)}.to_h
puts "output : " + inf["actual"].split("").zip(inf["error"].split("")).
    map{|a,e| if e=="1" then "X" else a end}.join("")
' gpbat

Prints:

output : 00-00000000-000000X1-0X1XX010-10000000

The first line creates a hash of the input (assuming a file as posted):

echo ' actual : 00-00000000-00000011-00110010-10000000
 expect : 00-00000000-00000001-01101010-00000000
 error  : 00-00000000-00000010-01011000-00000000' >gpbat 

ruby -e '
p $<.read.split(/\R/).map(&:strip).map{|s| s.split(/\s+:\s+/,2)}.to_h' gpbat

{"actual"=>"00-00000000-00000011-00110010-10000000", "expect"=>"00-00000000-00000001-01101010-00000000", "error"=>"00-00000000-00000010-01011000-00000000"}

The right-hand side of the second line creates the output array as described, by zipping the characters of "actual" and "error" together:

ruby -e '
inf=$<.read.split(/\R/).map(&:strip).map{|s| s.split(/\s+:\s+/,2)}.to_h
p inf["actual"].split("").zip(inf["error"].split("")).
    map{|a,e| if e=="1" then "X" else a end}
' gpbat

["0", "0", "-", "0", "0", "0", "0", "0", "0", "0", "0", "-", "0", "0", "0", "0", "0", "0", "X", "1", "-", "0", "X", "1", "X", "X", "0", "1", "0", "-", "1", "0", "0", "0", "0", "0", "0", "0"]

Then join that array into a string and add the other parts for the output.

dawg

Perl, just because I love this gibberish:

perl -ne 'print; s/actual/output/ and $o=$_;
@E=$_=~/[01]/g if /error/; 
END {$o =~ s/[01]/$E[$i++]==1?X:$&/ge; print $o}' file
  • for every line of input:
    • print the line
    • substitute actual with output (s/actual/output/) and if that succeeds:
      • save the line ($_) as $o
    • if the line matches /error/, save the [01] bits as array @E using match /[01]/g
  • At the END {} of the input:
    • substitute every bit in string $o with the result of this expression:
      • $E[$i++]==1?X:$& : if the corresponding bit in array @E is 1, then replace with X. Otherwise, replace with the match $& (no change)
    • print string $o

Note that generating the "error" line from just the actual and expect lines is also fun in Perl, using a similar s///ge iteration:

perl -ne 'print; @A=$_=~/[01]/g if /actual/; 
s/expect/error / and s/[01]/(0+$&)^$A[$i++]/ge and print' file

... There the bitwise XOR operator ^ can be used, but I found that it requires the 0+ numification of the match $&, because Perl's ^ performs a string-wise XOR when both operands are strings.
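For comparison, here is a rough awk sketch of the same "derive error from actual and expect" step. awk has no string XOR, so it simply compares the two characters at each position. The name xor_line is made up, and the actual line is assumed to come before the expect line. On the posted sample this yields 10000000 for the last group, whereas the posted error line shows 00000000 there.

```shell
# xor_line: hypothetical helper; prints an error line computed from the
# actual and expect lines read on stdin.
xor_line() {
    awk '
    /actual/ { act = $3 }
    /expect/ {
        err = ""
        for (i = 1; i <= length(act); i++) {
            a = substr(act, i, 1)
            e = substr($3, i, 1)
            # copy separators as is; otherwise emit 1 where the two bits differ
            err = err (a ~ /[01]/ ? (a != e) : a)
        }
        print "error  :", err
    }'
}

printf '%s\n' \
    ' actual : 00-00000000-00000011-00110010-10000000' \
    ' expect : 00-00000000-00000001-01101010-00000000' | xor_line
# prints: error  : 00-00000000-00000010-01011000-10000000
```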

stevesliva