
There are numerous questions on how to do a multiline regex in Perl. Most of them mention the /s modifier, which makes a dot match a newline. However, I want to match an exact phrase (so, not a pattern) and I don't know where the newlines will be. So the question is: can you ignore newlines, instead of matching them with .?

MWE:

$pattern = "Match this exact phrase across newlines";

$text1 = "Match\nthis exact\nphrase across newlines";
$text2 = "Match this\nexact phra\nse across\nnewlines";

$text3 = "Keep any newlines\nMatch this exact\nphrase across newlines\noutside\nof the match";

$text1 =~ s/$pattern/replacement text/s;
$text2 =~ s/$pattern/replacement text/s;
$text3 =~ s/$pattern/replacement text/s;

print "$text1\n---\n$text2\n---\n$text3\n";

I can put dots in the pattern instead of spaces ("Match.this.exact.phrase") but that does not work for the second example. I can delete all newlines as preprocessing but I would like to keep newlines that are not part of the match (as in the third example).

Desired output:

replacement text
---
replacement text
---
Keep any newlines
replacement text
outside
of the match
Marijn
  • Most of the time, you are treating newlines as spaces. Then there's the one time you want to ignore it. Doing either is easy. Doing both is next to impossible. – ikegami May 24 '16 at 14:09

4 Answers


Just replace the literal spaces with a character class that matches a space or a newline:

$pattern = "Match[ \n]this[ \n]exact[ \n]phrase[ \n]across[ \n]newlines";

Or, if you want to be more lenient, use \s or \s+ instead, since \s also matches newlines.
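As a rough sketch of the \s+ variant, the pattern can be derived from the phrase instead of being typed by hand. The helper name phrase_to_regex is made up for this illustration. Note that this only allows a line break where the phrase has a space, so it handles the first and third strings from the question but leaves the second one, where a newline falls inside a word, unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: quote each word literally and allow any run of
# whitespace (spaces or newlines) wherever the phrase has a space.
sub phrase_to_regex {
    my ($phrase) = @_;
    my $re = join '\s+', map { quotemeta($_) } split / /, $phrase;
    return qr/$re/;
}

my $re = phrase_to_regex("Match this exact phrase across newlines");

my $text1 = "Match\nthis exact\nphrase across newlines";
my $text2 = "Match this\nexact phra\nse across\nnewlines";
my $text3 = "Keep any newlines\nMatch this exact\nphrase across newlines\noutside\nof the match";

# The loop variable aliases each scalar, so the substitution edits them in place.
s/$re/replacement text/ for $text1, $text2, $text3;

print "$text1\n---\n$text2\n---\n$text3\n";
```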

Mark Reed

Most of the time, you are treating newlines as spaces. If that's all you wanted to do, all you'd need is

$text =~ s/\n/ /g;
$text =~ /\Q$text_to_find\E/;    # or $text =~ /$regex_pattern_to_match/

Then there's the one time you want to ignore it. If that's all you wanted to do, all you'd need is

$text =~ s/\n//g;
$text =~ /\Q$text_to_find\E/;    # or $text =~ /$regex_pattern_to_match/

Doing both is next to impossible if you have a regex pattern to match. But you seem to want to match literal text, so that opens up some possibilities.

( my $pattern = $text_to_find )
   =~ s/(.)/ $1 eq " " ? "[ \\n]" : "\\n?" . quotemeta($1) /seg;
$pattern =~ s/^\\n\?//;
$text =~ /$pattern/;
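
One possible way to try the snippet above against the three strings from the question is sketched below. The array name @texts is only an illustration; the pattern construction is copied from the snippet.

```perl
use strict;
use warnings;

my $text_to_find = "Match this exact phrase across newlines";

# A space becomes [ \n]; every other character is quoted and preceded
# by an optional newline.
( my $pattern = $text_to_find )
    =~ s/(.)/ $1 eq " " ? "[ \\n]" : "\\n?" . quotemeta($1) /seg;
$pattern =~ s/^\\n\?//;    # drop the optional newline before the first character

# Illustrative container for the question's three test strings.
my @texts = (
    "Match\nthis exact\nphrase across newlines",
    "Match this\nexact phra\nse across\nnewlines",
    "Keep any newlines\nMatch this exact\nphrase across newlines\noutside\nof the match",
);

# The loop variable aliases the array elements, so they are edited in place.
s/$pattern/replacement text/ for @texts;

print join("\n---\n", @texts), "\n";
```

With these inputs all three strings should be rewritten, and the newlines outside the match in the third string are kept.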
ikegami

It sounds like you want to change your "exact" pattern to match newlines anywhere, and also to allow newlines instead of spaces. So change your pattern to do so:

$pattern = "Match this exact phrase across newlines";
$pattern =~ s/\S\K\B/\n?/g;
$pattern =~ s/ /[ \n]/g;
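
A small sketch of how the resulting pattern might be used follows. The variables $text and $result and the test string are made up for illustration. Two assumptions are worth stating: \K needs Perl 5.10 or later, and the phrase is interpolated without quoting, so it is assumed to contain no regex metacharacters.

```perl
use strict;
use warnings;

my $pattern = "Match this exact phrase across newlines";
$pattern =~ s/\S\K\B/\n?/g;    # optional newline between adjacent letters of a word
$pattern =~ s/ /[ \n]/g;       # each space may be a space or a newline

# Illustrative input: newlines both inside and outside the phrase.
my $text = "Keep any newlines\nMatch this\nexact phra\nse across\nnewlines\noutside\nof the match";

# Copy first, then substitute on the copy, so $text stays untouched.
( my $result = $text ) =~ s/$pattern/replacement text/;

print "$result\n";
```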
ysth

It certainly is ugly, but it works:

M\n?a\n?t\n?c\n?h\st\n?h\n?i\n?s\se\n?x\n?a\n?c\n?t\sp\n?h\n?r\n?a\n?s\n?e\sa\n?c\n?r\n?o\n?s\n?s\sn\n?e\n?w\n?l\n?i\n?n\n?e\n?s

For every pair of letters inside a word, allow a newline between them with \n?. And replace each space in your regex with \s.

May not be usable, but it gets the job done ;)

Check it out at regex101.
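
A regex of this shape does not have to be typed by hand. The following is a minimal sketch that generates one from the phrase according to the rule described above; the helper name spell_out is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: inside each word, join the quoted characters with \n?;
# then join the words with \s.
sub spell_out {
    my ($phrase) = @_;
    my @words;
    for my $word ( split / /, $phrase ) {
        push @words, join '\n?', map { quotemeta($_) } split //, $word;
    }
    return join '\s', @words;
}

my $regex = spell_out("Match this exact phrase across newlines");
print "$regex\n";

# The second string from the question, with a newline inside "phrase".
my $text = "Match this\nexact phra\nse across\nnewlines";
print "matched\n" if $text =~ /$regex/;
```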

SamWhan