
I wrote the following function:

import scala.util.matching.Regex
val COL1 = "COL1"
val COL2 = "COL2"
val COL3 = "COL3"
val COL4 = "COL4"
val COL5 = "COL5"
val reg = ".+-([\w\d]{3})-([\d\w]{3})-([\d\w]{3})-([\w]+)$-([\w]+)".r.unanchored
val dataExtraction: String => Map[String, String] = {
  string: String => {
    string match {
      case reg(col1, col2, col3, col4, col5) =>
        Map(COL1 -> col1, COL2 -> col2, COL3 -> col3, COL4 -> col4, COL5 -> col5)
      case _ => Map(COL1 -> "", COL2 -> "", COL3 -> "", COL4 -> "", COL5 -> "")
    }
  }
}

It is supposed to parse strings like "dep-gll-cde3-l4-result" or "cde3-gll-dep-l4-result".

Any idea how to define a regex that parses both of these?

scalacode

1 Answer


You may use the following regex:

val reg = """(\w{3,4})-(\w{3})-(\w{3,4})-(\w+)-(\w+)""".r

You do not need to make it unanchored, since the pattern matches each of your inputs in full.

Note that inside a triple-quoted string literal a backslash is written once (\w), whereas in a regular string literal like the one in your code each backslash must be doubled (\\w). Also, the {3,4} quantifiers seem sufficient for the cases you provided.

See the online Scala demo and the regex demo.
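As a minimal sketch of how the suggested pattern could be plugged into the function from the question: the wrapper object `ExtractionSketch` and the name `regEscaped` are illustrative assumptions, not part of the original code, and `regEscaped` is only there to show the doubled-backslash form of the same pattern.

```scala
import scala.util.matching.Regex

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object ExtractionSketch {
  val COL1 = "COL1"
  val COL2 = "COL2"
  val COL3 = "COL3"
  val COL4 = "COL4"
  val COL5 = "COL5"

  // Triple-quoted literal: each backslash is written once.
  val reg: Regex = """(\w{3,4})-(\w{3})-(\w{3,4})-(\w+)-(\w+)""".r

  // The same pattern in a regular string literal: each backslash is doubled.
  val regEscaped: Regex = "(\\w{3,4})-(\\w{3})-(\\w{3,4})-(\\w+)-(\\w+)".r

  val dataExtraction: String => Map[String, String] = {
    // A Regex used as an extractor must match the whole input.
    case reg(col1, col2, col3, col4, col5) =>
      Map(COL1 -> col1, COL2 -> col2, COL3 -> col3, COL4 -> col4, COL5 -> col5)
    // Anything that does not match falls back to empty values.
    case _ =>
      Map(COL1 -> "", COL2 -> "", COL3 -> "", COL4 -> "", COL5 -> "")
  }
}
```

Calling `ExtractionSketch.dataExtraction` on either sample string from the question should fill all five columns, while an input that does not fit the pattern falls through to the empty-valued map.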

Wiktor Stribiżew
  • This doesn't keep the same order. For dataExtraction("dfp-vll-cvi3-l2-result") I got Map[String,String] = Map(COL4 -> l2, COL2 -> vll, COL3 -> cvi3, COL1 -> dfp, COL5 -> result). Is there a way to keep the same order, please? – scalacode Nov 29 '18 at 10:38
  • @scalacode `Map` does not keep the order of the items; if you need to keep the order, [use `LinkedHashMap`](https://stackoverflow.com/questions/3835743/scala-map-implementation-keeping-entries-in-insertion-order). – Wiktor Stribiżew Nov 29 '18 at 10:41
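The `LinkedHashMap` suggestion in the comment above could look roughly like the following sketch. The object `OrderedExtractionSketch` and the method `orderedExtraction` are hypothetical names introduced here for illustration; the regex is the one from the answer. Note that this returns a mutable map, which is an assumption about what the caller can accept.

```scala
import scala.collection.mutable.LinkedHashMap
import scala.util.matching.Regex

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object OrderedExtractionSketch {
  val reg: Regex = """(\w{3,4})-(\w{3})-(\w{3,4})-(\w+)-(\w+)""".r

  // Hypothetical variant of dataExtraction that keeps insertion order.
  def orderedExtraction(s: String): LinkedHashMap[String, String] = s match {
    case reg(c1, c2, c3, c4, c5) =>
      // LinkedHashMap iterates its entries in the order they were inserted.
      LinkedHashMap("COL1" -> c1, "COL2" -> c2, "COL3" -> c3, "COL4" -> c4, "COL5" -> c5)
    case _ =>
      LinkedHashMap("COL1" -> "", "COL2" -> "", "COL3" -> "", "COL4" -> "", "COL5" -> "")
  }
}
```

Iterating over the result, for example with `.toList`, should then yield the entries from COL1 through COL5 in that order.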