
I have a string that can contain Unicode escape sequences written out literally, in the form "\\u{0026}", and I want each one converted to the character it represents, here "&".

How do I do that?

let input = "\\u{0026} something else here"
let expectedOutput = "& something else here"

Thanks a lot!

Junz
  • `"\u{0026} something else here"` logs "& something else here". What's the problem? – Ahmad F Dec 21 '16 at 09:31
  • Made an edit to add an extra backslash. Thanks for pointing it out – Junz Dec 21 '16 at 09:41
  • Where does the string come from? *Why* is it encoded like that? – Martin R Dec 21 '16 at 09:49
  • It comes from Google Translate's unofficial API. https://ctrlq.org/code/19909-google-translate-api – Junz Dec 21 '16 at 09:59
  • It actually returns it in the format "\u0026", but I managed to use a regex to add the curly braces – Junz Dec 21 '16 at 10:01
  • AHA! So we have a typical [XY problem](http://xyproblem.info) here. What you apparently get is [JSON](http://json.org) (which uses `"\u0026"` as Unicode escape sequence). *Don't* add curly braces. Use (NS)JSONSerialization and you are done. – Martin R Dec 21 '16 at 10:03
  • See also [What is the XY problem?](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem): *"To avoid falling into this trap, always include information about a broader picture along with any attempted solution."* – Martin R Dec 21 '16 at 10:12
  • The unofficial API actually gives a reply in this format [[["你好&","hello\u0026",,,1]],,"en"] which wouldn't exactly qualify as JSON, so I'm kinda at a loss here – Junz Dec 21 '16 at 14:27

2 Answers


You may need to use a regular expression:

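import Foundation

// NSRegularExpression subclass that replaces each \u{XXXX} match with the scalar it names.
class StringEscpingRegex: NSRegularExpression {
    override func replacementString(for result: NSTextCheckingResult, in string: String, offset: Int, template templ: String) -> String {
        let nsString = string as NSString
        if
            result.numberOfRanges == 2,
            // Capture group 1 holds the hex digits; range(at:) is the Swift 4+ spelling of rangeAt(_:).
            case let capturedString = nsString.substring(with: result.range(at: 1)),
            let codePoint = UInt32(capturedString, radix: 16),
            // Skip the noncharacters U+FFFE and U+FFFF.
            codePoint != 0xFFFE, codePoint != 0xFFFF,
            // The initializer returns nil for surrogates (U+D800...U+DFFF) and values above U+10FFFF.
            let scalar = UnicodeScalar(codePoint)
        {
            return String(Character(scalar))
        } else {
            // Anything else falls back to the default template handling.
            return super.replacementString(for: result, in: string, offset: offset, template: templ)
        }
    }
}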

let pattern = "\\\\u\\{([0-9A-Fa-f]{1,6})\\}"
let regex = try! StringEscpingRegex(pattern: pattern)

let input = "\\u{0026} something else here"
let expectedOutput = "& something else here"

let actualOutput = regex.stringByReplacingMatches(in: input, range: NSRange(0..<input.utf16.count), withTemplate: "?")

assert(actualOutput == expectedOutput) //assertion succeeds

I don't know how you got your input, but if it used a standards-based representation, you could get the expectedOutput more simply.
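As a rough illustration of the "standards-based" route mentioned in the comments: if the text still carries JSON-style escapes such as `\u0026` (no braces), JSONSerialization can decode them. This is only a sketch. The helper name `decodeJSONEscapes` is made up for this example, and it assumes the fragment contains no unescaped double quotes or other characters that would be invalid inside a JSON string.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): wraps a raw fragment in a
// one-element JSON array so JSONSerialization decodes its \uXXXX escapes.
// Assumes the fragment is valid as the body of a JSON string literal.
func decodeJSONEscapes(_ fragment: String) -> String? {
    let json = "[\"" + fragment + "\"]"
    guard let data = json.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data),
          let array = object as? [String]
    else { return nil }
    return array.first
}

// The Swift literal below contains the six characters \u0026, not "&".
let raw = "hello\\u0026 something else here"
let decoded = decodeJSONEscapes(raw)
```

Wrapping the fragment in an array keeps the top-level value a JSON array, so no special reading options are needed.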

OOPer
  • @WongJunMing, I don't know much about the result of the API you are referring to, but at least, if you do not insert braces into it, the code can be simpler. And remember, the code shown in my answer does not work if the result of the API uses escaped surrogate pairs to represent non-BMP characters. You'd be better off starting a new thread asking how to decode the original response (before inserting braces); you would find many simpler answers. – OOPer Dec 21 '16 at 14:44
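To make the surrogate-pair caveat concrete, here is a minimal sketch that expands brace-free `\uXXXX` escapes by treating each one as a UTF-16 code unit, so an escaped pair such as `\uD83D\uDE00` combines into a single non-BMP character. The function names are hypothetical, and the sketch assumes a backslash is never itself escaped in the input (it does not implement full JSON string parsing).

```swift
// Hypothetical helper: value of one ASCII hex digit given as a UTF-16 unit.
func hexValue(_ unit: UInt16) -> UInt16? {
    switch unit {
    case 0x30...0x39: return unit - 0x30        // "0"..."9"
    case 0x41...0x46: return unit - 0x41 + 10   // "A"..."F"
    case 0x61...0x66: return unit - 0x61 + 10   // "a"..."f"
    default: return nil
    }
}

// Hypothetical helper: replaces every \uXXXX (exactly four hex digits) with
// the UTF-16 code unit it names, then decodes the whole sequence as UTF-16.
func decodeUTF16Escapes(_ input: String) -> String {
    let units = Array(input.utf16)
    var output: [UInt16] = []
    var i = 0
    while i < units.count {
        // 0x5C is a backslash, 0x75 is "u".
        if i + 6 <= units.count, units[i] == 0x5C, units[i + 1] == 0x75 {
            var value: UInt16 = 0
            var valid = true
            for unit in units[(i + 2)..<(i + 6)] {
                guard let digit = hexValue(unit) else { valid = false; break }
                value = (value << 4) | digit
            }
            if valid {
                output.append(value)
                i += 6
                continue
            }
        }
        // Not an escape: copy the code unit through unchanged.
        output.append(units[i])
        i += 1
    }
    // Adjacent surrogate units form one scalar; unpaired ones become U+FFFD.
    return String(decoding: output, as: UTF16.self)
}
```

Because the escapes are collected as code units before decoding, no separate surrogate-pair logic is needed.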

In fact, I'm not familiar with what @MartinR suggested in his comment(s); it might be the solution to your issue...

However, you can simply achieve what you are trying to do by using the replacingOccurrences(of:with:) String method:

Returns a new string in which all occurrences of a target string in the receiver are replaced by another given string.

So, applied to your string:

let input = "\\u{0026} something else here"

let output1 = input.replacingOccurrences(of: "\\u{0026}", with: "\u{0026}") // "& something else here"

// OR...

let output2 = input.replacingOccurrences(of: "\\u{0026}", with: "&") // "& something else here"
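Note that this approach only handles escapes you spell out in advance. If a handful of them are expected, the same call can be applied over a small lookup table. A minimal sketch follows; the table contents and the names `knownEscapes` and `replaceKnownEscapes` are assumptions for illustration, not something the API guarantees.

```swift
import Foundation

// Hypothetical lookup table: only the escapes listed here are replaced.
let knownEscapes = [
    "\\u{0026}": "&",
    "\\u{003C}": "<",
    "\\u{003E}": ">"
]

// Hypothetical helper: applies replacingOccurrences(of:with:) once per entry.
func replaceKnownEscapes(in text: String) -> String {
    var result = text
    for (escape, character) in knownEscapes {
        result = result.replacingOccurrences(of: escape, with: character)
    }
    return result
}

let tableOutput = replaceKnownEscapes(in: "\\u{0026} something else here")
```

Any escape missing from the table is left untouched, which is the main limitation compared with the regex-based answer.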

Hope it helped.

Community
Ahmad F