I have a number of text files (*.log) containing lines like these:

...

Mismatched '9:132170673': f[G,T] = [0.32,0.68]

Mismatched '9:132228706': f[C,T] = [0.27,0.73]

Possible strand flip for '9:132280822': f[C,G,T] = [0.16,0.00,0.84]

...

I am trying to extract the number:number string between the single quotes, from the command line.

I can manage it with a script, but I would like to understand how to do it from the command line. There must be an easy way!

I have been trying the obvious solutions, for example:

  1. perl -ne 'if (/Possible/ or /Mismatch/) {/'(\S+)'/ ;print "$1\n";}' *.log

  2. perl -ne 'if (/Possible/ or /Mismatch/) {/\'(\S+)\'/ ;print "$1\n";}' *.log

Both get this error from bash:

-bash: syntax error near unexpected token `('

I also tried splitting on /'/ with the -F option and got the same error.

How do I escape the ' inside the command line?

ROMANIA_engineer
Ekan

4 Answers

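Put your code inside double quotes, so that ' can appear in it literally. Any part that contains " or $ (here, the print statement) goes inside single quotes instead, so that bash does not expand $1 before perl sees it.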

$ perl -ne "if (m/Possible|Mismatch/) {/'(\S+)\'/ ;print "'"$1\n";}' file
9:132170673
9:132228706
9:132280822

OR

perl -ne "if(/Possible/ or /Mismatch/) {/'([^']+)'/ ;print "'"$1\n";}' file
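To see what bash actually hands to perl in each case, here is a minimal sketch that only builds the script strings and prints them. The names shell_arg, all_double and mixed are made up for the illustration, and the set -- line exists only so that the shell's own $1 has a visible value.

```shell
set -- shell_arg   # give the shell's own $1 a value, just for this demo

# Entirely double-quoted: bash substitutes its own $1 before perl runs.
all_double="/'(\S+)'/; print \"$1\n\";"

# Double quotes around the regex, single quotes around the print statement:
# the $1 reaches perl untouched.
mixed="/'(\S+)'/; print "'"$1\n";'

printf '%s\n' "$all_double" "$mixed"
```

The first printed line contains shell_arg where $1 used to be. The second still contains a literal $1, which is what perl needs.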
Avinash Raj

Because bash itself interprets the single quotes, I split the script into three single-quoted strings: the first ends right after the /, then comes an escaped single quote (\'), and the next string continues from there. Bash concatenates the adjacent pieces into one argument, so Perl gets the right input.

perl -ne 'if (/Possible/ or /Mismatch/) {/'\''(\S+)'\''/ ;print "$1\n";}' *.txt
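The gluing can be checked without running perl at all. The sketch below stores the same argument in a variable (the name script is chosen just for this example) and prints it, showing the text perl would receive.

```shell
# Three single-quoted pieces with an escaped quote (\') between each pair.
# bash joins them into one word before anything else sees it.
script='if (/Possible/ or /Mismatch/) {/'\''(\S+)'\''/ ;print "$1\n";}'

# %s prints the argument verbatim, so this is exactly what perl would get.
printf '%s\n' "$script"
```

The printed line has plain single quotes around (\S+), and the $1 is untouched because it sat inside single quotes the whole time.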
nlu

I'd do:

perl -nE '/\b(?:Mismatched|Possible)\b.*?'\''(\S+)'\''/ and say $1'  in1.txt

Output:

9:132170673
9:132228706
9:132280822
Toto

I find using double quotes for all or part of the script argument to be problematic, particularly when the script needs to include a $ or a !, both of which the shell can expand inside double quotes.

Another approach is to use the fact that the single quote character is at code point 27 (hex) in the ASCII/Unicode chart. In a Perl double-quoted string or regex you can refer to it as \x27 or \x{27}:

perl -ne 'if (/Possible/ or /Mismatch/) {/\x27(\S+)\x27/ ;print "$1\n";}' *.log

You could use a named variable to make things clearer, but that's probably overkill for a one-liner:

perl -ne 'BEGIN { $apos = "\x27" } if (/Possible/ or /Mismatch/) {/$apos(\S+)$apos/ ;print "$1\n";}' *.log
Grant McLean
  • Thank you Grant! perl -ne 'if (/Possible/ or /Mismatch/) {/\x27(\S+)\x27/ ;print "$1\n";}' *.log is exactly the kind of answer I was looking for. Also, thanks to all you other guys! – Ekan Mar 13 '15 at 11:05