I am writing a Perl script that runs inside an Automator app to process documents that were previously processed by hand. I need to do this weekly, always removing the same junk data. The files are RTF, converted from HTML on Mac OS X by another Automator script in order to preserve formatting. I have created a new droplet script that processes the RTF files to remove the unnecessary junk data.
My shell script is:
#!/bin/bash
#
# strip recurring junk text from the RTF files passed as arguments
#
/usr/bin/perl -CSDA -pi <<'EOF' - "$@"
s/dateformat//og;
s/text1//og;
s/text2//og;
s/text3//og;
s///og;
EOF
This takes care of 99% of what needs to be done. However, the final file comes out with excess line breaks. Is there a way to make the substitution of text1, text2, etc. also remove the line break that follows? My only restriction is that this must be runnable in an Automator shell script window.
Input sample data is formatted as such:
Text1 Dateformat
[Content1]
Text2 Dateformat
[Content2]
Text3 Dateformat
[Content3]
The script above produces output:
[Content1]

[Content2]

[Content3]
Desired output should be formatted as:
[Content1]
[Content2]
[Content3]
In the original document, there is a single line break after a content block, then the Text1 and Dateformat line. After processing, Text1 and Dateformat are removed, but as you can see there are now two line breaks between content blocks.