The problem I ran into: given a string that contains several escaped sequences such as "\\u0301", "\\b", "\\t", etc.

I want to convert all of these sequences into "\u0301", "\b", "\t", etc.

Note that simply removing one backslash leaves the plain text \u0301 rather than the Unicode combining accent mark.

A one-by-one solution is to replace each special string:

str = strings.Replace(str, "\\u0301","\u0301", -1)

Is there a general solution that handles all escape sequences?

bguiz
WeiAnHsieh
  • 3
    What format is the original string encoded in? There is a chance that a proper decoder exists for it, so that you don't need to mess with string replace functions. – zerkms May 02 '23 at 03:56
  • 1
    Use package strconv after studying the documentation and understanding how string literals work in Go (or C). – Volker May 02 '23 at 04:25
  • 2
    Could you explain where this string comes from? For example, is it in a JSON payload? – LeGEC May 02 '23 at 04:44
  • My original string is converted from a []byte. Those bytes are the content of a response body. I printed out the content of the []byte and it's `[92 117 48 51 48 49]`, which is the byte data of `"\\u0301"` – WeiAnHsieh May 02 '23 at 08:11
  • So, is it a plain text HTTP response, consisting of only 6 characters? (backslash, u, 0, 3, 0, 1) – qrsngky May 02 '23 at 10:14
  • Not exactly. These characters are part of an HTTP response. They are in a JSON payload. – WeiAnHsieh May 03 '23 at 02:52
  • 1
    If you have the complete JSON, you don't need to worry about manually un-escaping the byte sequence, because `Unmarshal` will take care of it. [Playground link](https://play.golang.com/p/rINCgKCVhgZ) – qrsngky May 03 '23 at 03:06
  • Somehow I just didn't see any proper struct in the previous work. After I created one, `Unmarshal` worked perfectly. Thank you all for your patience. – WeiAnHsieh May 03 '23 at 07:08
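
As a rough illustration of the point made in the comments, the sketch below decodes a small JSON document with `json.Unmarshal` and lets the decoder resolve the `\u0301` and `\t` escapes. The `payload` struct, its `text` field, the `decodeText` helper, and the sample body are all assumptions made up for this example; the real response shape is not shown in the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical struct; the field name "text" is an
// assumption, since the real JSON layout is not given in the question.
type payload struct {
	Text string `json:"text"`
}

// decodeText unmarshals the body and returns the decoded text field.
// The JSON decoder resolves escapes such as \u0301 and \t on its own.
func decodeText(body []byte) (string, error) {
	var p payload
	if err := json.Unmarshal(body, &p); err != nil {
		return "", err
	}
	return p.Text, nil
}

func main() {
	// The raw string literal keeps the backslashes exactly as they
	// would arrive on the wire: backslash, u, 0, 3, 0, 1, and so on.
	body := []byte(`{"text":"A\u0301\tX"}`)

	text, err := decodeText(body)
	if err != nil {
		panic(err)
	}
	// %+q prints the decoded string in ASCII-only quoted form,
	// which makes the combining accent and the tab visible.
	fmt.Printf("%+q\n", text)
}
```

When the escaped text really is part of a JSON document, this avoids any manual string replacement.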

1 Answer

If you need to convert a byte sequence that contains escaped Unicode sequences and control sequences into a Unicode string, you can use the strconv.Unquote function.

strconv.Unquote expects a quoted Go string literal, and a double-quoted literal may not contain a raw double quote or a raw newline. So convert the bytes to a string, escape the double-quote and newline characters, and add a double-quote character at the beginning and end of the string.

package main

import (
    "fmt"
    "strconv"
    "strings"
)

func main() {
    b := []byte{65, 92, 117, 48, 51, 48, 49, 9, 88, 10, 34, 65, 34}
    // the same as
    // b := []byte("A\\u0301\tX\n\"A\"")

    // convert to string
    s := string(b)

    fmt.Println(s)
    fmt.Println("=========")

    // escape double quotes
    s = strings.ReplaceAll(s, "\"", "\\\"")
    // escape newlines
    s = strings.ReplaceAll(s, "\n", "\\n")

    r, err := strconv.Unquote(`"` + s + `"`)
    if err != nil {
        panic(err)
    }
    fmt.Println(r)
    fmt.Println("=========")
}

Output:

A\u0301 X
"A"
=========
Á  X
"A"
=========

https://go.dev/play/p/WRsNGOT1zLR
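
If pre-escaping the quotes and newlines feels fragile, a variation is to decode one character at a time with strconv.UnquoteChar. The following is only a sketch: `unescape` is a hypothetical helper name, and passing 0 as the quote argument is a choice made here so that raw `"` and `'` characters pass through unchanged. With that setting, the escaped forms `\"` and `\'` in the input are rejected with an error.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// unescape is a hypothetical helper that resolves Go-style escape
// sequences (\u0301, \t, \b, \x41, ...) found in s.
func unescape(s string) (string, error) {
	var sb strings.Builder
	for len(s) > 0 {
		// quote == 0: raw single and double quotes are allowed,
		// but the escapes \" and \' are not.
		r, multibyte, tail, err := strconv.UnquoteChar(s, 0)
		if err != nil {
			return "", err
		}
		if r < utf8.RuneSelf || !multibyte {
			// ASCII, or a single byte produced by \x or octal escapes.
			sb.WriteByte(byte(r))
		} else {
			sb.WriteRune(r)
		}
		s = tail
	}
	return sb.String(), nil
}

func main() {
	// Same input bytes as in the example above.
	b := []byte{65, 92, 117, 48, 51, 48, 49, 9, 88, 10, 34, 65, 34}

	r, err := unescape(string(b))
	if err != nil {
		panic(err)
	}
	fmt.Println(r)
}
```

This version needs no strings.ReplaceAll step, because raw newlines and quotes are copied through as ordinary characters.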

Wild Zyzop