I have an HTML text file and I want to format it so that each paragraph is always on a single line, e.g.
<p>paragraph info here</p>
instead of
<p>paragraph
info here </p>
Is there a tool that enables me to do this?
You can use sed:
sed ':a;N;$!ba;s/\n/ /g' test.html | sed 's/<\/p> /<\/p>\n/g'
The first sed removes all line breaks, and the second adds one back after each closing paragraph tag. This relies on GNU sed, which accepts \n in the replacement text.
It is not elegant, but it works.
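To make the two steps of the sed pipeline above easier to follow, here is a minimal sketch that wraps them in a shell function. The function name join_paragraphs is hypothetical, and the sketch assumes GNU sed, since other sed implementations may not treat \n in the replacement as a newline.

```shell
# Hypothetical helper; assumes GNU sed.
join_paragraphs() {
    # Step 1: ":a;N;$!ba" loops until the whole input is in the pattern space,
    #         then "s/\n/ /g" turns every embedded line break into a space.
    # Step 2: put a line break back after each "</p>" that is followed by a space.
    sed ':a;N;$!ba;s/\n/ /g' "$1" | sed 's/<\/p> /<\/p>\n/g'
}
```

Calling join_paragraphs test.html prints the result to standard output, so it can be redirected to a new file. Note that everything between two closing tags ends up on one line, including any markup that sits outside a paragraph.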
While the requirement "paragraphs are always on the same line" would be met by simply joining the whole file into a single line, this solution is less radical, because it only joins lines that fall between <p> and </p>:
perl -pe 'if (/<p>/../<\/p>/) { s/\n/ / unless /<\/p>/ }' test.html