84

Assume I have a Regex pattern I want to match many Strings to.

val Digit = """\d""".r

I just want to check whether a given String fully matches the Regex. What is a good and idiomatic way to do this in Scala?

I know that I can pattern match on Regexes, but this is syntactically not very pleasing in this case, because I have no groups to extract:

scala> "5" match { case Digit() => true case _ => false }
res4: Boolean = true

Or I could fall back to the underlying Java pattern:

scala> Digit.pattern.matcher("5").matches
res6: Boolean = true

which is not elegant, either.

Is there a better solution?

mkneissl
  • I think `"5" match { case Digit() => true case _ => false }` looks better than using underlying pattern object. – Mygod Jan 31 '17 at 13:20

6 Answers

71

Answering my own question: I'll use the "pimp my library" pattern

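import scala.util.matching.Regex

object RegexUtils {
  // Value class: adds `matches` to Regex without allocating a wrapper.
  implicit class RichRegex(val underlying: Regex) extends AnyVal {
    // True only if the whole string matches the pattern.
    def matches(s: String): Boolean = underlying.pattern.matcher(s).matches
  }
}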

and use it like this

import RegexUtils._
val Digit = """\d""".r
if (Digit matches "5") println("match")
else println("no match")

unless someone comes up with a better (standard) solution.

Notes

  • I didn't pimp String to limit the scope of potential side effects.

  • unapplySeq does not read very well in that context.

Nick
mkneissl
  • Did you have any particular side effect in mind? I pimped `String` instead, and this works fine so far, in spite of `String`'s member function `matches(regex: String)`. – KajMagnus Apr 05 '12 at 12:37
  • 1
    I pimped with a function `misses` too. Match and missmatch :-) It's so annoying to have to write `!s.matches(r)` instead of `s misses r`. Hmm – KajMagnus Apr 05 '12 at 12:38
  • 2
    How about the built-in `"5" matches "\\d"` which @polygenelubricants suggested? – Erik Kaplun Feb 16 '14 at 21:17
  • 2
    Data matches a pattern, not vice-versa. The scaladoc on Regex makes a big deal about the lack of a boolean for "matches". Personally, I think you've swapped a nice match for a clunkier if-else. If you don't care about groups, use `case r(_*) =>`. – som-snytt May 14 '14 at 07:02
  • 1
    There has to be a way to do this without importing an external library... – Jameela Huq Dec 22 '17 at 22:38
  • 2
    @JameelaHuq Folks visiting this question will be pleased with 2.13, where Regex finally gets matches. https://github.com/scala/scala/pull/6521 – som-snytt May 25 '18 at 14:13
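
As a follow-up to the last comment, here is a minimal sketch of the built-in method, assuming Scala 2.13 or later. The object name `BuiltInMatches` is only illustrative and not part of the thread.

```scala
import scala.util.matching.Regex

// Illustrative wrapper object; requires Scala 2.13+, where Regex has `matches`.
object BuiltInMatches {
  val Digit: Regex = """\d""".r

  // `matches` tests the whole input against the pattern.
  val single: Boolean = Digit.matches("5")
  val triple: Boolean = Digit.matches("123")
  val letter: Boolean = Digit.matches("a")
}
```

On older Scala versions this method is not available, so the implicit class from the answer above (or the underlying Java pattern) is still needed.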
61

I don't know Scala all that well, but it looks like you can just do:

"5".matches("\\d")

References

polygenelubricants
  • 26
    Well, that works, but has the disadvantage that the pattern is compiled on every try to match. I'd like to avoid that for performance reasons. – mkneissl Jun 11 '10 at 10:57
  • 3
    @mkneissl: then it looks like your `.pattern.matcher(text).matches` is the way to go. You can hide the verbosity under some utility method or overloaded operator or something if Scala supports it. – polygenelubricants Jun 11 '10 at 11:18
  • 4
    Thanks, that's what I am going to do, see my answer. I hope answering one's own questions is accepted behaviour on Stack Overflow... Meta says it is... – mkneissl Jun 11 '10 at 14:04
  • 2
    @ed. that's even slower and cruftier, so why? – Erik Kaplun Feb 16 '14 at 21:20
  • 2
    The link given as a reference is broken – Valy Dia Apr 12 '19 at 22:50
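
The performance concern raised in the comments can be addressed by compiling the pattern once and reusing it. The following is a rough sketch of that idea; the object `PrecompiledDigits` and the helper `isDigit` are hypothetical names introduced for illustration.

```scala
import java.util.regex.Pattern

// Hypothetical helper object: the pattern is compiled once and reused.
object PrecompiledDigits {
  private val digit: Pattern = """\d""".r.pattern

  // True only when the whole string is a single digit.
  def isDigit(s: String): Boolean = digit.matcher(s).matches

  // Each call creates a new Matcher but reuses the compiled Pattern.
  val results: List[Boolean] = List("5", "55", "x").map(isDigit)
}
```

By contrast, `"5".matches("\\d")` hands the pattern over as a plain string on every call.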
15

For a full match you can use unapplySeq. This method tries to match the whole target string and returns the captured groups (an empty list here, since the pattern has no groups).

scala> val Digit = """\d""".r
Digit: scala.util.matching.Regex = \d

scala> Digit unapplySeq "1"
res9: Option[List[String]] = Some(List())

scala> Digit unapplySeq "123"
res10: Option[List[String]] = None

scala> Digit unapplySeq "string"
res11: Option[List[String]] = None
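
To show what unapplySeq returns when the pattern does contain groups, here is a small sketch. The pattern `Pair` and the object name are made up for this example and do not come from the thread.

```scala
// Hypothetical example object; `Pair` has two capturing groups.
object UnapplySeqGroups {
  val Pair = """(\d)-(\d)""".r

  // Whole string matches: the two groups are returned in order.
  val full: Option[List[String]] = Pair.unapplySeq("1-2")

  // Trailing extra digit: no full match, so the result is None.
  val partial: Option[List[String]] = Pair.unapplySeq("1-23")
}
```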
Vasil Remeniuk
12
  """\d""".r.unapplySeq("5").isDefined            //> res1: Boolean = true
  """\d""".r.unapplySeq("a").isDefined            //> res2: Boolean = false
Jack
  • Hmm. Why posting a duplicate of http://stackoverflow.com/a/3022478/158823 two years later? – mkneissl Jan 16 '13 at 19:26
  • 3
    Your original question asked for a result ending in 'true' or 'false', not 'Some' or 'None'. As far as I'm aware isDefined was not part of the library 2 years ago, but maybe it was. Anyway, my answer is not a duplicate ;-) – Jack Jan 16 '13 at 19:38
  • 1
    I see, it isn't a duplicate. Sorry. – mkneissl Jan 16 '13 at 20:41
  • 2
    No probs ;-) My mistake, I should have explained why I'm using isDefined in my answer. Just giving code as an answer is generally a bad idea, so it's my bad. – Jack Jan 16 '13 at 20:52
1

Using the standard Scala library with a pre-compiled regex and pattern matching (which is the state of the art in Scala):

val digit = """(\d)""".r

"2" match {
  case digit(a) => println(a + " is Digit")
  case _ => println("it is something else")
}

more to read: http://www.scala-lang.org/api/2.12.1/scala/util/matching/index.html
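
If the captured value is not needed, the `case r(_*)` form suggested in a comment on the first answer ignores all groups. A minimal sketch, with `IgnoreGroups` and `isDigit` as illustrative names only:

```scala
// Illustrative object: pattern match for a yes/no answer, ignoring groups.
object IgnoreGroups {
  val Digit = """(\d)""".r

  def isDigit(s: String): Boolean = s match {
    case Digit(_*) => true   // whole string matched; groups are discarded
    case _         => false
  }
}
```

The `_*` wildcard accepts any number of groups, so the same `case` keeps working if groups are later added to or removed from the pattern.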

Sven
0

The answer is in the regex:

val Digit = """^\d$""".r

Then use one of the existing methods.

Daniel C. Sobral
  • 3
    I don't think anchors are the issue here. `String/Pattern/Matcher.matches`, in Java at least, is whole string match already. I think the issue is just style/idiom for regex-ing in Scala, i.e. what those "one of the existing methods" are. – polygenelubricants Jun 12 '10 at 17:18
  • @polygenelubricants Well, `Matcher.matches` is an aberration. Ok, it makes some optimizations possible, though I don't know if the Java library actually takes advantage of it. But the _standard_ way for Regular Expressions to express that a full match is required is to use anchors. Since the Scala library does _not_ provide a full match method, then the proper way to do it is to use anchors. Either that, or use the Java library. – Daniel C. Sobral Jun 14 '10 at 13:32
  • Anchoring is not the problem. See also the "123" example in Vasil's answer. – mkneissl Jun 14 '10 at 18:15
  • @mkneissl In what way is `findFirstIn` with `"""^\d$""".r` different from using `unapplySeq` with `"""\d""".r`? – Daniel C. Sobral Jun 14 '10 at 22:32
  • 5
    @Daniel You might be missing the point: my question was, if I only need to know whether a regex matches fully, what is a good way to express that in Scala? There are a lot of working solutions, but in summary I think there is a method missing in Regex that does just that and nothing else. To answer the question in your comment: the difference between unapplySeq and findFirstIn is that I have to change the Regex to add the anchors. Neither method immediately expresses my intent, and neither returns a boolean value, so I'd have to go from Option to Boolean (no problem, but it adds more clutter). – mkneissl Jun 15 '10 at 07:45
  • 1
    @mkneissl I dislike the concept of Java's `matches`, but ok. As for `Option` vs `Boolean`, add `nonEmpty` to the end and you'll get the `Boolean`. – Daniel C. Sobral Jun 15 '10 at 12:33
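
The two approaches debated above can be compared side by side. This is a sketch under the assumption of default Java regex flags, and the object and value names are invented for the example. One difference worth knowing: with default flags, `$` also matches just before a final line terminator, so an anchored `findFirstIn` and a whole-string `matches` can disagree on input with a trailing newline.

```scala
// Illustrative comparison of anchored find vs. whole-string matching.
object AnchorsVsMatches {
  val Plain    = """\d""".r
  val Anchored = """^\d$""".r

  // Unanchored find only needs a digit somewhere in the input.
  val unanchoredFind: Option[String] = Plain.findFirstIn("123")

  // Anchored find rejects "123", like a full match would.
  val anchoredFind: Boolean = Anchored.findFirstIn("123").nonEmpty

  // Trailing newline: `$` matches before the final line terminator...
  val newlineViaAnchors: Boolean = Anchored.findFirstIn("5\n").nonEmpty

  // ...while Matcher.matches requires the entire input to match.
  val newlineViaMatches: Boolean = Plain.pattern.matcher("5\n").matches
}
```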