28

I cannot match a String containing newlines when the newline is produced by %n in a Formatter object or in String.format(). Please have a look at the following program:

public class RegExTest {

  public static void main(String[] args) {
    String input1 = String.format("Hallo\nnext line");
    String input2 = String.format("Hallo%nnext line");
    String pattern = ".*[\n\r].*";
    System.out.println(input1+": "+input1.matches(pattern));
    System.out.println(input2+": "+input2.matches(pattern));
  }

}

and its output:

Hallo
next line: true
Hallo
next line: false

What is going on here? Why doesn't the second string match?

Java version is 1.6.0_21.

Kalle Richter
Axel

2 Answers

69

You can set the Pattern.DOTALL flag to make . match line terminators; by default it does not. The flag can be enabled inline with the (?s) notation. So, this regex does what you want:

    String pattern = "(?s).*[\n\r].*";
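
To see why the flag matters, here is a minimal sketch that compares the original pattern with and without DOTALL. The class and method names are made up for illustration, and the inputs are hard-coded with `\n` and `\r\n` (rather than `%n`) so the result does not depend on the platform. With `\r\n`, the character class consumes only the `\r`, and without DOTALL the trailing `.*` cannot get past the remaining `\n`.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DotAllDemo {
    // Hard-coded inputs so the outcome is independent of the platform line separator.
    static final String LF_INPUT = "Hallo\nnext line";
    static final String CRLF_INPUT = "Hallo\r\nnext line";

    // Default mode: '.' does not match line terminators such as \n or \r.
    static boolean matchesDefault(String s) {
        return s.matches(".*[\n\r].*");
    }

    // DOTALL mode: '.' matches any character, including line terminators.
    static boolean matchesDotAll(String s) {
        return Pattern.compile(".*[\n\r].*", Pattern.DOTALL).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesDefault(LF_INPUT));   // true
        System.out.println(matchesDefault(CRLF_INPUT)); // false: '.' cannot consume the \n after \r
        System.out.println(matchesDotAll(LF_INPUT));    // true
        System.out.println(matchesDotAll(CRLF_INPUT));  // true
    }
}
```

Passing `Pattern.DOTALL` to `Pattern.compile` should behave the same as prefixing the pattern with `(?s)`.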
Keppil
  • Then why does the first one match (I'm on windows)? – Axel Jul 25 '12 at 07:04
  • 1
    Also, you might want to switch the `[\r\n]` part to `\r?\n` to be able to match both `\n` and `\r\n`. – Keppil Jul 25 '12 at 07:05
  • 3
    Just found out. On Windows, the line ending is `\r\n`. The `\n` in `input1` is not considered a line end and so the regex matches. – Axel Jul 25 '12 at 07:08
  • Thank you, but in this case it is not needed. I'm matching to find out whether field quoting is necessary while creating a CSV file, and so it is sufficient to know if any of these characters are contained in the string. – Axel Jul 25 '12 at 07:11
  • The second one using `"(?m).*[\n\r].*"` doesn't work either, but `"(?s).*[\n\r].*"` does. Please update your answer so that I can accept it. – Axel Jul 25 '12 at 07:15
  • First one should be "(?s).*[\n\r].*", second one doesn't work. If you update the first and remove the second, I will accept. – Axel Jul 26 '12 at 05:45
  • @Axel: Sorry for all the different answers, I had trouble understanding exactly what you were trying to achieve. – Keppil Jul 26 '12 at 06:56
  • I don't get this answer, why not just `"(?s).*"` or `"(?s).*\\R.*"` if you want there to be at least one new line? – Maarten Bodewes Nov 13 '18 at 19:07
  • @Axel if you're writing CSV output, you should be aware that it requires `\r\n` row terminators, regardless of the platform's line ending. – OrangeDog May 28 '20 at 16:11
  • The current regex is overly verbose, and adding the `(?s)` doesn't truly "fix" the problem. The question is worded very narrowly and the actual use-case (if you read the comments) is even more narrow. So this works for this particular use-case but this answer doesn't actually explain why. There are much better ways to (a) write a regular expression to solve the original problem, (b) use regular expressions in general to solve this problem, and (c) solve the problem _without_ regular expressions. – Christopher Schultz May 20 '22 at 18:59
22

On Windows, in Java, \n is LF, \r is CR and %n is CRLF. Your pattern does not match the latter: [\n\r] consumes only one of the two characters, and . does not match the one left over.
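
A quick way to check what `%n` produces on a given machine is to print it with the control characters escaped. This is only a sketch: the class name and the `escape` helper are made up for illustration, and `System.lineSeparator()` requires Java 7 or later, so it is not available on the Java 6 version from the question.

```java
// Hypothetical demo class; the names are illustrative only.
class LineSeparatorDemo {
    // Renders CR and LF visibly, so CR LF is shown as the four characters \r\n.
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String formatted = String.format("%n");
        // Output is platform dependent: typically \r\n on Windows, \n on Linux and macOS.
        System.out.println(escape(formatted));
        // %n is documented to produce the platform line separator.
        System.out.println(formatted.equals(System.lineSeparator()));
    }
}
```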

As of Java 8, you can now use \R in regular expressions to match any end-of-line sequence.

Linebreak matcher

\R Any Unicode linebreak sequence, is equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]

Example:

String pattern = ".*\\R.*";
String.format("Hallo\nnext line").matches(pattern); // true
String.format("Hallo%nnext line").matches(pattern); // true
String.format("Hallo same line").matches(pattern); // false
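
If the goal is only to detect whether a string contains a line break, as in the CSV quoting case mentioned in the comments, a search with `find()` avoids the surrounding `.*` entirely. The following is a sketch under that assumption; the class and method names are hypothetical, and `\R` requires Java 8 or later.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class LineBreakCheck {
    // Compiled once and reused; \R matches any Unicode linebreak sequence (Java 8+).
    private static final Pattern LINE_BREAK = Pattern.compile("\\R");

    // Returns true if the string contains at least one linebreak sequence anywhere.
    static boolean containsLineBreak(String s) {
        return LINE_BREAK.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLineBreak("Hallo\nnext line"));   // true
        System.out.println(containsLineBreak("Hallo\r\nnext line")); // true
        System.out.println(containsLineBreak("Hallo same line"));    // false
    }
}
```

Because `find()` searches for the pattern anywhere in the input, this also handles strings with several line breaks without needing the `(?s)` flag.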
OrangeDog
  • Yes, ".*\r?\n.*" works, but not if there are multiple line breaks. I now am using "(?s).*[\n\r].*". – Axel Jul 26 '12 at 05:49
  • `(?s).*\\R.*` can be used if you want there to be at least one line end. Otherwise just use `(?s).*` to allow any number of line endings. – Maarten Bodewes Nov 13 '18 at 19:09
  • @Axel if the aim is to test whether a string contains any linebreak, use the pattern `\\R` and the `Matcher.find()` method. – OrangeDog May 28 '20 at 16:13
  • Ah, \R was introduced in Java 8. It wasn’t available when I posted the question, and I missed this addition to the Java regex implementation. The second thing I learned today. As that library still exists and now has a minimum Java version of 8, I will update my code right away. – Axel May 28 '20 at 16:25