I know there are many questions on this site about matching multiline regexes with Perl, but I'm still having trouble figuring out how to do the following. Any help or links to relevant questions would be highly appreciated.
I have a text file, input.txt, that is structured as field labels (each introduced by a backslash) followed by field contents, like this:
\x text
\y text text
text text
\z text
Field contents can contain line breaks, but for further processing I need all of a field's contents to be on one line. The following script appears to match across multiple lines correctly, but instead of deleting the line break it seems to reinsert it.
    #!/usr/bin/perl
    use strict;
    use warnings;

    {
        local $/ = undef;  # slurp mode: read the whole file in one go
        open(my $in, "<", "input.txt") or die "Can't open input.txt: $!";
        open(my $out, ">", "output.txt") or die "Can't open output.txt: $!";
        while (<$in>) {
            # replace each line break with a single space unless it is followed by a backslash
            s/\n([^\\])/ $1/g;
            print $out $_;
        }
    }
It adds the space at the front of the continuation line (so I know the pattern matches), but the line break is still there. The output looks like this:
\x text
\y text text
text text
\z text
Whereas I was hoping to get this:
\x text
\y text text text text
\z text
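To make the intended behaviour reproducible without my file, here is a minimal sketch that runs the substitution on an in-memory string. The helper name `join_fields` is made up for this example. The `\r?` is only an assumption: I have not checked whether input.txt uses Unix (`\n`) or Windows (`\r\n`) line endings, so the pattern allows for both.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the script above.
# Joins continuation lines onto the field they belong to.
sub join_fields {
    my ($text) = @_;
    # Replace a line break (LF, or CRLF as an untested assumption about my file)
    # with one space, but only when the next character exists and is not a backslash.
    $text =~ s/\r?\n(?=[^\\])/ /g;
    return $text;
}

# Same structure as the sample input in the question, with LF line endings.
my $sample = "\\x text\n\\y text text\ntext text\n\\z text\n";
print join_fields($sample);
```

On this sample the sketch prints the three lines I was hoping for, with the trailing line break kept because nothing follows it.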