I am trying to replace every comma that appears inside quotation marks with a space in this string:

(,(,data,"quoted,data",123,4.5,),(,data,(,!@#,(,4.5,),"(,more","data,)",),),)

I am currently using this function, which uses a JavaScript-style regex:

removeNeedlessCommmas sExpression =
    sExpression
      |> (\_ -> replaceSpacesWithCommas sExpression)
      |> Regex.replace Regex.All (Regex.regex ",") (\_ -> ",(?!(?:[^"]*"[^"]*")*[^"]*$)g")

This regex works correctly on sites such as regex101.com.

However, I have tried many ways of escaping the regex so that it works in Elm 0.16, and the editor still highlights everything after it as if the rest of the file were inside a string literal. This is the error I get with my current code:

(line 1, column 64): unexpected "_" expecting space, "&" or escape code

39│     printToBrowser "((data \"quoted data\" 123 4.5) (data (!@#(4.5) \"(more\" \"data)\")))"

Maybe <http://elm-lang.org/docs/syntax> can help you figure it out.

I will post the main function that the error is referring to so that it makes more sense:

main : Html.Html
main =
    printToBrowser "((data \"quoted data\" 123 4.5) (data (!@# (4.5) \"(more\" \"data)\")))"

Any assistance would be greatly appreciated. Thanks in advance.

Thomas Lloyd

2 Answers

I think you need 3 things:

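  1. Add a closing ) to the last anonymous function in removeNeedlessCommmas (this may just have been a copy-paste error).
  2. Escape every inner " in your regex, and drop the trailing g: it is a flag on regex101, but inside the Elm string it becomes a literal character in the pattern, and Regex.All already replaces every match. That gives: ",(?!(?:[^\"]*\"[^\"]*\")*[^\"]*$)"
  3. Use the regex as the pattern to match, and return a space from the replacement function: Regex.replace Regex.All (Regex.regex ",(?!(?:[^\"]*\"[^\"]*\")*[^\"]*$)") (\_ -> " ")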
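
Putting the three points together gives roughly the following. This is a sketch for Elm 0.16, not a tested drop-in: the names `commaInsideQuotes` and `replaceQuotedCommas` are made up for illustration, and the trailing `g` is left out on the assumption that `Regex.regex` treats its whole argument as the pattern, so a `g` there would be matched as a literal character.

```elm
import Regex

-- Matches a comma that is followed by an odd number of double quotes,
-- which (for balanced quotes) means the comma sits inside a quoted section.
-- No trailing "g" here: Regex.All below already asks for every match.
commaInsideQuotes : Regex.Regex
commaInsideQuotes =
  Regex.regex ",(?!(?:[^\"]*\"[^\"]*\")*[^\"]*$)"

-- Hypothetical helper name: replaces each matched comma with a space.
replaceQuotedCommas : String -> String
replaceQuotedCommas sExpression =
  Regex.replace Regex.All commaInsideQuotes (\_ -> " ") sExpression
```

In the original function this step would come after the `replaceSpacesWithCommas` call, for example `sExpression |> replaceSpacesWithCommas |> replaceQuotedCommas`.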
robertjlooby
  • Thank you very much for your response. You are right about point 1, but the regex you posted seems to give the same error as before. Here is the new removeNeedlessCommmas function: `removeNeedlessCommmas sExpression = sExpression |> (\_ -> replaceSpacesWithCommas sExpression) |> Regex.replace Regex.All (Regex.regex ",") (\_ -> ",(?!(?:[^\"]*\"[^\"]*\")*[^\"]*$)g")` – Thomas Lloyd Feb 07 '16 at 17:59
  • hmm, that regex and the string in `main` compile fine for me http://www.share-elm.com/sprout/56b78887e4b070fd20da9a28 – robertjlooby Feb 07 '16 at 18:11
  • Ok, it seems that the error is gone, but every comma in the string has been replaced with `,(?!(?:[^\"]*\"[^\"]*\")*[^\"]*$)g`. Can you tell if this would be the result of an error in the regex or an error in my program's logic? – Thomas Lloyd Feb 07 '16 at 18:26

If you'd consider a cowardly workaround alternative to a death-defying super-regex, I can offer this:

removeNeedlessCommas sExpr = 
  replace All (regex "\"[^\"]*?\"")
    (\{match} -> String.map (\c -> if c == ',' then ' ' else c) match)
    sExpr

It lets the regex find the quoted strings but does the comma substitution on those strings in a separate step. If preferred, that step could be done by regex as well.
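
As a sketch of that last remark, the inner `String.map` can be swapped for a second `replace` call. This assumes the same Elm 0.16 `Regex` module and the same open import used elsewhere in this answer; it is an illustration, not something that was run as part of the original answer.

```elm
import Regex exposing (..)

-- The outer regex finds each quoted string; the inner replace then
-- turns every comma inside that match into a space.
removeNeedlessCommas : String -> String
removeNeedlessCommas sExpr =
  replace All (regex "\"[^\"]*?\"")
    (\{match} -> replace All (regex ",") (\_ -> " ") match)
    sExpr
```

The `String.map` version avoids building a second regex, so which one to use is mostly a matter of taste.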

Here's my test harness, which ran fine in http://elm-lang.org/try :

import Html exposing (..)
import Regex exposing (..)
import String

str = """(,(,data,"quoted,data",123,4.5,),(,data,(,!@#,(,4.5,),"(,more","data,)",),),)"""
main = div [] 
  [ (text str)
  , br [] []
  , (text (removeNeedlessCommas str))]

Output:

(,(,data,"quoted,data",123,4.5,),(,data,(,!@#,(,4.5,),"(,more","data,)",),),)
(,(,data,"quoted data",123,4.5,),(,data,(,!@#,(,4.5,),"( more","data )",),),)

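Just for good measure, here's an algorithmic solution that does without regex entirely:

import String exposing (cons, foldl, reverse)

-- Walk the string once, tracking whether we are inside quotes (inQ).
-- cons prepends, so acc is built backwards and reversed at the end.
removeNeedlessCommas str = 
  reverse
  <| snd
  <| foldl (\c (inQ, acc) ->
              case c of
                '"' -> (not inQ, cons c acc)  -- a quote toggles the state
                ',' -> (inQ, cons (if inQ then ' ' else c) acc)
                _ -> (inQ, cons c acc))
           (False, "")
           str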
Carl Smotricz