-2

I have a ~300 MB text file full of Asterisk calls that needs to be sent to a customer, but it cannot include certain specific information.

The only information I would like to extract is as follows:

Everything between the asterisks: *NUMBER#NUMBER,sip-out*

I was thinking of using awk with a pattern similar to .*#(\d+),sip-out.* on the file numbers.txt,

although my pattern is slightly wrong. Any ideas?

The goal is to print to the screen, one match per line, whatever lies between the asterisks above.

Thanks in advance.

Ashley

Suvarna Pattayil
  • Can you post some sample input and desired output to make it more understandable? – fedorqui Feb 28 '14 at 10:37
  • Sample input and output would be more helpful. – Fidel Feb 28 '14 at 10:42
  • Hi, apologies. For example, it should print only 123456#624634763,sip-out from masses of text instead of everything. If I cat file.txt it brings back everything, but I would like the output to contain only the information matching the pattern *NUMBER#NUMBER,sip-out* – Ashley Jordan Feb 28 '14 at 11:00
  • @user3327188 - no, don't try to tell us what the file looks like in a comment, update your question to include an actual SAMPLE input file of say 10 lines plus the output you'd like to get given that input file. – Ed Morton Feb 28 '14 at 12:54

3 Answers

0

Maybe this GNU awk command (GNU-specific because of the multi-character RS) will extract the correct data?

awk -v RS=",sip-out" 'NF{print $NF RS}' file

cat file

some data 123456#624634763,sip-out more data
just 223456#624634763,sip-out more
not this line
1234666#62468883,sip-out

gives this

123456#624634763,sip-out
223456#624634763,sip-out
1234666#62468883,sip-out

If you do not like the sip-out text, just remove the RS from the print like this:

awk -v RS=",sip-out" 'NF {print $NF}' file
123456#624634763
223456#624634763
1234666#62468883
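
For an awk without multi-character RS support, a minimal sketch of the same extraction could loop over match() instead. The temporary file and its contents below are just the sample data from this answer, and the pattern assumes both numbers are plain digits:

```shell
# Hypothetical sample file holding the data shown in this answer.
sample=$(mktemp)
printf '%s\n' \
  'some data 123456#624634763,sip-out more data' \
  'just 223456#624634763,sip-out more' \
  'not this line' \
  '1234666#62468883,sip-out' > "$sample"

# Print every NUMBER#NUMBER,sip-out occurrence, one per line.
# match() sets RSTART and RLENGTH; the loop picks up several matches per line.
result=$(awk '{
  line = $0
  while (match(line, /[0-9]+#[0-9]+,sip-out/)) {
    print substr(line, RSTART, RLENGTH)
    line = substr(line, RSTART + RLENGTH)
  }
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Unlike the RS approach, this only prints text that actually matches the pattern, so trailing text after the last sip-out is not picked up by accident.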
Jotne
  • Thank you so much, that is really helpful. Now just another quick question: it seems to be printing everything up to the sip-out. How could I limit it to everything in between the # and the sip-out, for example? – Ashley Jordan Feb 28 '14 at 11:19
  • @user3327188 Try this: `awk -v RS=",sip-out" 'NF {i=split($NF,a,"#");print a[i]}'` – Jotne Feb 28 '14 at 11:29
  • Or this: `awk -v RS=",sip-out" 'NF {sub(/[^#]*#/,"",$NF);print $NF}'` – Jotne Feb 28 '14 at 11:31
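
If awk is not a requirement, the same "only the number between # and ,sip-out" result might also be obtained with grep and sed. This is only a sketch: it assumes a grep that supports -o (GNU and BSD grep do) and that both numbers consist of digits only; the input line is made up for illustration.

```shell
# Extract NUMBER#NUMBER,sip-out, then strip the leading "NUMBER#"
# and the trailing ",sip-out" so only the second number remains.
result=$(printf '%s\n' \
  'some data 123456#624634763,sip-out more data' \
  'not this line' |
  grep -oE '[0-9]+#[0-9]+,sip-out' |
  sed -e 's/^[0-9]*#//' -e 's/,sip-out$//')

printf '%s\n' "$result"   # prints 624634763
```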
0

Using grep with the -o option:

grep -o "\*.*\*" file
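
One thing to watch for: `.*` is greedy, so if a line holds more than one asterisk-delimited entry, the command above returns everything from the first asterisk to the last one. A small sketch of the difference, using a made-up input line and a negated bracket expression as one possible alternative:

```shell
# Hypothetical line containing two asterisk-delimited entries.
line='a *111#222,sip-out* b *333#444,sip-out* c'

# Greedy: a single match spanning from the first * to the last *.
greedy=$(printf '%s\n' "$line" | grep -o '\*.*\*')

# [^*]* cannot cross an asterisk, so each entry is matched separately.
narrow=$(printf '%s\n' "$line" | grep -o '\*[^*]*\*')

printf 'greedy:\n%s\n' "$greedy"
printf 'narrow:\n%s\n' "$narrow"
```

The second form still matches any text between two asterisks, not just NUMBER#NUMBER,sip-out, so it relies on asterisks not appearing elsewhere in the file.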
BMW
0
egrep -o '\*[0-9]+#[0-9]+,sip-out\*' numbers.txt | tr -d '*'
  • Uses egrep -o to extract only the substrings of interest, including the enclosing * chars.
  • Then removes the enclosing * chars using tr.
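
As a quick check of the two steps, here is a sketch run against made-up input (the question does not include a real sample, so these lines are assumptions). It uses grep -E, which is equivalent to egrep:

```shell
# Hypothetical input: two real entries plus an asterisk-delimited decoy.
result=$(printf '%s\n' \
  'call *123456#624634763,sip-out* ok' \
  'noise *not a call* here' \
  'x *1234666#62468883,sip-out* y' |
  grep -oE '\*[0-9]+#[0-9]+,sip-out\*' |   # step 1: keep only *NUMBER#NUMBER,sip-out*
  tr -d '*')                               # step 2: drop the enclosing asterisks

printf '%s\n' "$result"
```

The decoy line is skipped because the pattern requires digits, a #, more digits and ,sip-out between the asterisks.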

Note: With GNU grep you could get away with just a grep command by using look-around assertions:

grep -Po '(?<=\*)[0-9]+#[0-9]+,sip-out(?=\*)' numbers.txt
mklement0