I am trying to unquote a single-quoted string in Go (the syntax is the same as Go string literal syntax, but with single quotes instead of double quotes):

'\'"Hello,\nworld!\r\n\u1F60ANice to meet you!\nFirst Name\tJohn\nLast Name\tDoe\n'

should become

'"Hello,
world!
Nice to meet you!
First Name      John
Last Name       Doe

How do I accomplish this?

strconv.Unquote doesn't work on \n newlines (https://github.com/golang/go/issues/15893 and https://golang.org/pkg/strconv/#Unquote), and chaining strings.ReplaceAll calls would be a pain, since it would have to support all Unicode code points as well as other backslash escapes such as \n, \r, and \t.

I may be asking for too much, but it would be nice if it also validated the Unicode automatically, the way strconv.Unquote seems to (it knows that several Unicode code points may become one character), since I can do the same with unicode/utf8.ValidString.

Jonathan Hall
Ken Shibata
  • I'm unfamiliar with Go, but could you deserialise it with a JSON parser? – Matthew Jan 09 '21 at 05:51
  • Thanks, @Matthew! It works: https://play.golang.org/p/Q6cy-anCwds – Ken Shibata Jan 09 '21 at 05:58
  • @CeriseLimón It's in Go and JSON: '\n' gets converted to a rune with a newline in it, and the same goes for '\u4E16'. – Ken Shibata Jan 09 '21 at 05:59
  • Single quotes are not used as string delimiters in Go or JSON. Edit the question to state the syntax used. Is it from Javascript, Vimscript, ??? – Charlie Tumahai Jan 09 '21 at 05:59
  • @CeriseLimón https://yourbasic.org/golang/multiline-string/ – Ken Shibata Jan 09 '21 at 06:00
  • Single quotes in Go define runes. a string is a slice of runes (EDIT incorrect: a string is a slice of bytes). A rune is a unicode code point. – Ken Shibata Jan 09 '21 at 06:01
  • The example in the question uses single quotes. The article you linked uses backquotes. Which is it? A string is not a slice of runes. – Charlie Tumahai Jan 09 '21 at 06:02
  • Sorry, I think I misunderstood. To clarify, I want to unquote single-quoted strings using Go code, not use single-quoted strings in Go code. The syntax is the same as Go's, but with single quotes. – Ken Shibata Jan 09 '21 at 06:03
  • Oops, sorry, yes - a string is a slice of bytes. – Ken Shibata Jan 09 '21 at 06:03
  • If the syntax is Go syntax with `"` replaced with `'`, then use `strconv.Unquote(strings.ReplaceAll(s, "'", "\""))`. Also, edit the question to state the syntax. A string is a sequence of bytes, not a slice of bytes. – Charlie Tumahai Jan 09 '21 at 06:05
  • The problem with the above code is that when `'` becomes `"`, I do not know if `"` in the results is originally `'` or `"`. For example, `'hello\''` becomes `hello\"`. – Ken Shibata Jan 09 '21 at 06:09

1 Answer

@CeriseLimón came up with this answer; I just wrapped it in a function with some extra handling for \n. First, it swaps ' and " and replaces each \n escape with an actual newline. Then it calls strconv.Unquote on each line separately, since strconv.Unquote rejects input that contains a literal newline. Finally, it swaps ' and " back and joins the lines together again.

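// Requires the "fmt", "strconv", and "strings" packages.
func unquote(s string) (string, error) {
        if len(s) < 2 {
            return "", fmt.Errorf("unquote: %q is too short to be a quoted string", s)
        }
        // Swap ' and " so each line can be handed to strconv.Unquote, and turn
        // \n escapes into real newlines so the body can be split into lines.
        replaced := strings.NewReplacer(
            `'`, `"`,
            `"`, `'`,
            `\n`, "\n",
        ).Replace(s[1 : len(s)-1])
        lines := strings.Split(replaced, "\n")
        for i, line := range lines {
            // strconv.Unquote rejects literal newlines, so unquote line by line.
            tmp, err := strconv.Unquote(`"` + line + `"`)
            if err != nil {
                return "", fmt.Errorf("unquote: invalid line %q: %w", line, err)
            }
            lines[i] = tmp
        }
        // Swap the quotes back and rejoin the lines.
        return strings.NewReplacer(
            `"`, `'`,
            `'`, `"`,
        ).Replace(strings.Join(lines, "\n")), nil
}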
Ken Shibata