-1

I have an HTML text file and I want to format it so that each paragraph is always on a single line, e.g.

<p>paragraph info here</p>

instead of

<p>paragraph
info here </p>

Is there a tool that enables me to do this?

rurounisuikoden

2 Answers

0

You can use sed:

 sed ':a;N;$!ba;s/\n/ /g' test.html | sed 's/<\/p> /<\/p>\n/g'

The first sed command removes all line breaks, and the second adds a line break back after each closing paragraph tag.

It is not easy to read, but it works.
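
As a rough illustration of the two steps, here is the same pipeline wrapped in a small function and run on a three-line sample. The function name `join_paragraphs_sed` and the sample input are made up for this sketch, and it assumes GNU sed (the `\n` in the replacement and the `;` after the label are not portable to every sed).

```shell
# Hypothetical helper: reads HTML on stdin, writes the result to stdout.
join_paragraphs_sed() {
    # Step 1: slurp the whole input and turn every newline into a space.
    # Step 2: put a newline back after each closing </p> tag.
    sed ':a;N;$!ba;s/\n/ /g' | sed 's/<\/p> /<\/p>\n/g'
}

# Made-up sample: the first paragraph is split over two lines.
printf '%s\n' '<p>paragraph' 'info here </p>' '<p>second</p>' | join_paragraphs_sed
```

Note that step 1 joins every line of the file, not only those inside paragraphs, so any other markup between paragraphs ends up on the same line as the text that follows it.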

newman
0

While the requirement that paragraphs are always on the same line would be met by simply joining the whole file into a single line, this solution is less radical:

perl -pe 'if (/<p>/../<\/p>/) { s/\n/ / unless /<\/p>/ }' test.html
Armali