Each entry spans multiple lines, and entries are separated by two blank lines. Each entry has to be joined into a single line followed by a delimiter (;).

Sample Input:

Name:Sid
ID:123


Name:Jai
ID:234


Name:Arun
ID:12

I tried replacing the blank lines with cat test.cap | tr -s [:space:] ';'

Output:

Name:Sid;ID:123;Name:Jai;ID:234;Name:Arun;ID:12;

Expected Output:

Name:SidID:123;Name:JaiID:234;Name:ArunID:12;

xargs gives the same result.

I've also tried sed, but it only joined two lines into one, whereas I have 132 lines per entry and 1000 such entries in one file.

sesha

3 Answers

Could you please try the following?

awk 'NF{val=(val?$0~/^ID/?val $0";":val $0:$0)} END{print val}' Input_file

Output will be as follows.

Name:SidID:123;Name:JaiID:234;Name:ArunID:12;

Explanation: here is the same code with comments.

awk '                                    ## Start the awk program.
NF{                                      ## Run this block only for non-blank lines (NF is non-zero).
  val=(val?$0~/^ID/?val $0";":val $0:$0) ## Append the line to val; if the line starts with ID, also append a semicolon. On the very first line val is empty, so it is set to the line itself.
}
END{                                     ## After all input has been read,
  print val                              ## print the accumulated string.
}
'  Input_file                            ## Input_file is the name of the input file.
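
The one-liner above appends the delimiter only after a line starting with ID, which fits the two-line sample. A minimal sketch of a variation that flushes an entry at the first blank line instead, so the number of lines per entry does not matter, could look like this. The function name join_entries and the temporary sample file are made up for illustration, and the sketch assumes entries are separated by at least one blank line.

```shell
# Hypothetical sample file reproducing the input from the question
# (entries separated by two blank lines).
input_file=$(mktemp)
printf 'Name:Sid\nID:123\n\n\nName:Jai\nID:234\n\n\nName:Arun\nID:12\n' > "$input_file"

# join_entries is an illustrative helper, not part of the original answer.
join_entries() {
  awk '
    NF        { buf = buf $0; next }            # non-blank line: append to the current entry
    buf != "" { printf "%s;", buf; buf = "" }   # first blank line after an entry: emit it
    END       { if (buf != "") printf "%s;", buf; print "" }  # emit the last entry, then a newline
  ' "$1"
}

joined=$(join_entries "$input_file")
echo "$joined"
rm -f "$input_file"
```

Because the delimiter is tied to the blank separator lines rather than to the content of a particular line, the same call should also cover entries that are 132 lines long.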
RavinderSingh13
  • Thanks Ravinder. While this worked on the sample input, for some reason it produced nothing for the original cap file (neither output nor an error was shown). – sesha Sep 17 '18 at 09:38
  • @sesha, not sure; it worked for me. Could you please check whether the file contains Control-M (carriage return) characters by running `cat -v Input_file`? If it does, remove them with `tr -d '\r' < Input_file > Output_file` and let me know how it goes. – RavinderSingh13 Sep 17 '18 at 09:39
  • @RavinderSingh13 I doubt CRs are the culprit, since my solution worked (and it is based on the assumption there are no CRs in the input). – Wiktor Stribiżew Sep 17 '18 at 09:42
  • @WiktorStribiżew, not sure but it is working fine for me sir. – RavinderSingh13 Sep 17 '18 at 09:42
  • @RavinderSingh13 [Works on my machine meme](https://me.me/i/broken-in-productionp-works-on-my-machine-5227598e96d74afc9419806607e5a152) :) – Wiktor Stribiżew Sep 17 '18 at 09:53
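
The carriage-return check discussed in the comments can be sketched roughly as follows. This only demonstrates how to detect and strip CRs; whether CRs were actually the problem with the original file is not established in the thread. The CRLF test file is built here purely for the demonstration, and all variable names are made up.

```shell
# Hypothetical file with Windows (CRLF) line endings, created only for this demo.
crlf_file=$(mktemp)
clean_file=$(mktemp)
printf 'Name:Sid\r\nID:123\r\n\r\n\r\nName:Jai\r\nID:234\r\n' > "$crlf_file"

cr=$(printf '\r')                                   # a literal carriage return
cr_lines=$(grep -c "$cr" "$crlf_file" || true)      # number of lines containing a CR
echo "lines with CR: $cr_lines"

tr -d '\r' < "$crlf_file" > "$clean_file"           # strip every CR, as suggested in the comment
cr_left=$(grep -c "$cr" "$clean_file" || true)      # grep exits non-zero on 0 matches, hence || true
echo "lines with CR after cleanup: $cr_left"

rm -f "$crlf_file" "$clean_file"
```

If the first count is non-zero, running the awk command on the cleaned copy rather than on the original file is a cheap thing to try.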

You may use

cat file | awk  'BEGIN { FS = "\n"; RS = "\n\n"; ORS=";" } { gsub(/\n/, "", $0); print }' | sed 's/;;*$//' > output.file

Output:

Name:SidID:123;Name:JaiID:234;Name:ArunID:12

Notes:

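  • FS = "\n" will set the field separator to a newline
  • RS = "\n\n" will set the record separator to a double newline (GNU awk treats a multi-character RS as a regular expression; POSIX leaves this case unspecified)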
  • gsub(/\n/, "", $0) will remove all newlines from a found record
  • sed 's/;;*$//' will remove the trailing ; added by awk
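
As a hedged alternative to the multi-character RS, the same idea can be sketched with awk's paragraph mode (RS = ""), in which runs of empty lines separate records. The helper name paragraph_join and the temporary sample file are assumptions for illustration. The sketch also assumes the separator lines are truly empty (no spaces or CRs) and, unlike the command above, it keeps the trailing ; from the expected output.

```shell
# Hypothetical sample file reproducing the input from the question.
sample=$(mktemp)
printf 'Name:Sid\nID:123\n\n\nName:Jai\nID:234\n\n\nName:Arun\nID:12\n' > "$sample"

# RS = ""   : paragraph mode, records are separated by runs of empty lines
# FS = "\n" : each line of a record becomes one field
# OFS = ""  : rebuilding the record joins the fields with nothing in between
# ORS = ";" : every printed record is followed by the delimiter
paragraph_join() {
  awk 'BEGIN { RS = ""; FS = "\n"; OFS = ""; ORS = ";" } { $1 = $1; print }' "$1"
}

result=$(paragraph_join "$sample")
echo "$result"
rm -f "$sample"
```

Assigning $1 = $1 forces awk to rebuild the record from its fields using OFS, which is what removes the newlines inside each entry without a gsub call.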

Wiktor Stribiżew

This might work for you (GNU sed):

sed -r '/./{N;s/\n//;H};$!d;x;s/.//;s/\n|$/;/g' file

If the current line is not blank, append the following line, remove the newline between them, and append the result to the hold space. Unless it is the end of the file, delete the pattern space. At the end of the file, swap to the hold space, remove the first character (which will be a newline), and then replace every newline, as well as the end of the string, with a semicolon, so that the last entry also gets its trailing delimiter.

potong