19

What are the differences between Perl and Java with regard to which regular expression constructs are supported?

This question is limited to the regular expressions themselves. It specifically excludes differences in how regexes can be used (i.e. the functions/methods that take a regex) and syntactic differences between the languages, such as Java's requirement to escape backslashes in string literals.

Of particular interest is the partial/occasional support Java has for variable-length look-behinds.
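To make the look-behind point concrete, here is a minimal sketch of the kind of probe I have in mind. The class and method names (`LookbehindProbe`, `finds`, `describe`) are made up for illustration. It assumes that look-behinds with an obvious maximum length (a bounded quantifier, or an alternation of fixed strings) are accepted by `java.util.regex`. The unbounded case is only reported, not asserted, because its handling appears to differ between JDK releases.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LookbehindProbe {
    // True if the pattern compiles and matches somewhere in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Reports whether this JDK accepts the pattern, without assuming the answer.
    static String describe(String regex) {
        try {
            Pattern.compile(regex);
            return regex + " -> accepted";
        } catch (PatternSyntaxException e) {
            return regex + " -> rejected: " + e.getDescription();
        }
    }

    public static void main(String[] args) {
        // Bounded quantifier inside the look-behind (1 to 3 characters wide).
        System.out.println(finds("(?<=a{1,3})b", "xaab"));
        // Alternation of different fixed lengths inside the look-behind.
        System.out.println(finds("(?<=ab|c)d", "xcd"));
        // Unbounded look-behind: the result depends on the JDK in use.
        System.out.println(describe("(?<=a+)b"));
    }
}
```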

Bohemian
  • 1
    A bit off the question, but this is a rough comparison between **ECMA** (JS) regex and Perl: http://stackoverflow.com/a/12127503/1400768 – nhahtdh Dec 25 '12 at 11:57

3 Answers

20

The "Comparison to Perl 5" section of the java.util.regex.Pattern documentation lists many of the differences. For example, Java does not support Perl's conditional constructs, (?(condition)X) and (?(condition)X|Y). For those, you need an external library such as JRegex.
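As a rough illustration of the conditional-construct gap, the sketch below tries to compile a Perl-style conditional with `java.util.regex`. The helper class and method names are hypothetical. The conditional pattern is expected to be rejected with a `PatternSyntaxException`, while a plain alternation that covers the same simple case compiles.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ConditionalConstructCheck {
    // Returns true if java.util.regex accepts the given pattern source.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Perl conditional: if group 1 matched, require "b", otherwise require "c".
        System.out.println(compiles("(a)?(?(1)b|c)"));
        // A plain alternation that covers the same simple case.
        System.out.println(compiles("ab|c"));
    }
}
```

For patterns this small, rewriting the conditional as an alternation is usually enough. More involved conditionals may not have a tidy equivalent.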

nhahtdh
Rohit Jain
  • 8
    That seems quite complete. To that, I'd add that the Perl regex engine has much better Unicode support than others. In fact, the Perl devs recently encountered shortcomings in the Unicode standard because no one has ever gotten as far as Perl has before. (The standard was updated as a result!) – ikegami Dec 25 '12 at 11:54
1

The java.util.regex.Pattern API documentation has a section titled "Comparison to Perl 5".

Evgeniy Dorofeev
  • Please add a link. The page I got from searching that is saying Perl regex doesn't have a feature that was added half a decade ago. – ikegami Dec 25 '12 at 11:30
  • @ikegami Possibly you landed on the Java 6 or older page. Search for `Pattern class in Java 7`. I have added the link in my answer. – Rohit Jain Dec 25 '12 at 11:40
  • The link to the Pattern class and its comparison to Perl: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#jcc – Jeryl Cook Apr 04 '22 at 19:44
1

The slides from Tom Christiansen's OSCON talk Unicode Support Shootout: The Good, the Bad, & the (mostly) ugly cover some of the differences between Perl and Java (and other languages) regarding support of the Unicode technical recommendations for regexes, and they distinguish between Java 1.6 and 1.7 (which improves support significantly).
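One Java 7 addition in this area is the `Pattern.UNICODE_CHARACTER_CLASS` flag. The sketch below is only an illustration of that flag, not something taken from the slides, and the class and method names are made up. It checks whether `\w` matches a single non-ASCII letter with and without the flag.

```java
import java.util.regex.Pattern;

class UnicodeWordCheck {
    // True if the whole string is matched by a single \w under the given flags.
    static boolean isWordChar(String s, int flags) {
        return Pattern.compile("\\w", flags).matcher(s).matches();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        // Default behaviour: \w covers only [a-zA-Z_0-9].
        System.out.println(isWordChar(eAcute, 0));
        // With the Java 7 flag, \w follows the Unicode definition of a word character.
        System.out.println(isWordChar(eAcute, Pattern.UNICODE_CHARACTER_CLASS));
    }
}
```

In Perl, `\w` generally matches such letters without an extra switch when the string holds decoded characters, which is one example of the gap the slides describe.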

hobbs