We process a lot of SRT files on Linux to generate derivatives, but some of them contain Ctrl-M (carriage return) characters because they were generated on Windows. Right now I run two commands to find and remove the hidden characters:
tr -d '\015' <${file}.srt >${file}.srt
awk '/^$/{ if (! blank++) print; next } { blank=0; print }' ${file}.srt | tee ${file}.srt
But some SRT files still slip through these commands and end up with Ctrl-M characters in them. Does anyone have a solution that keeps only one empty line between subtitle blocks? So if the pre-processed SRT file looks like
1
00:00:05,569 --> 00:00:07,569
Welcome to this overview of ShareStream,
2
00:00:07,820 --> 00:00:11,940
which is a new digital streaming service
from Information Technology Services
3
00:00:11,940 --> 00:00:13,740
at the University of Iowa.
then after taking out the Ctrl-M characters and extra blank lines it should be
1
00:00:05,569 --> 00:00:07,569
Welcome to this overview of ShareStream,

2
00:00:07,820 --> 00:00:11,940
which is a new digital streaming service
from Information Technology Services

3
00:00:11,940 --> 00:00:13,740
at the University of Iowa.
Any help is appreciated, thanks!
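
One possible culprit, offered as a guess rather than a confirmed diagnosis: both commands read from and write to the same file. With "<file >file" the shell truncates the file before tr reads it, and "awk file | tee file" races in a similar way, so the result can be empty or only partly cleaned. Below is a minimal sketch that runs the same tr and awk steps but writes to a temporary file first. The function name clean_srt is hypothetical, and it assumes mktemp is available.

```shell
#!/bin/sh
# clean_srt FILE (hypothetical helper name): remove carriage returns and
# squeeze each run of empty lines down to a single empty line, in place.
clean_srt() {
    src=$1
    # Write to a temp file next to the source; redirecting straight back
    # onto "$src" would truncate it before tr had a chance to read it.
    tmp=$(mktemp "${src}.XXXXXX") || return 1

    if tr -d '\015' < "$src" |
        awk '/^$/ { if (!blank++) print; next } { blank = 0; print }' > "$tmp"
    then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$src"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp"
    return "$status"
}
```

It could then be called once per file, for example: for f in *.srt; do clean_srt "$f"; done. Lines that contain only spaces or tabs are not treated as empty here, the same as in the original awk command.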