The Facebook Graph API returns the user's email address as

foo\u0040bar.com.

in a JSON object. I need to convert it to

foo@bar.com.

There must be a built-in method in .NET that converts a Unicode escape sequence (\u1234) to the actual Unicode character.

Do you know what it is?

Note: I prefer not to use JSON.NET or JavaScriptSerializer, for performance reasons.

I think the problem is in my StreamReader:

        requestUrl = "https://graph.facebook.com/me?access_token=" + accessToken;
        request = WebRequest.Create(requestUrl) as HttpWebRequest;
        try
        {
            using (HttpWebResponse response2 = request.GetResponse() as HttpWebResponse)
            {
                // Get the response stream  
                reader = new StreamReader(response2.GetResponseStream(),System.Text.Encoding.UTF8);
                string json = reader.ReadToEnd();

I tried different encodings for the StreamReader (UTF8, UTF7, Unicode, ...); none worked.

Many thanks!

Thanks to L.B for correcting me. The problem was not in the StreamReader.

Barka
  • Looks like something went wrong when reading/decoding this. Fixing it after the fact will not be so straightforward. – H H Nov 30 '11 at 18:08
  • Not trivial: how do you extract the `\u0040` symbol from the string? – sll Nov 30 '11 at 18:09
  • Looking back at it, it looks like I need to give the StreamReader that reads the JSON the right encoding. – Barka Nov 30 '11 at 18:15
  • `I prefer not to use JSON.NET or JavaScriptSerializer for performance issues` How many requests do you make to Facebook per second? How many more CPU cycles do you need? – L.B Nov 30 '11 at 18:46
  • @L.B, I think (and @Henk Holterman also thinks) that my StreamReader is at fault. So performance aside, switching to other libraries will not solve the problem. – Barka Nov 30 '11 at 18:51
  • No, I said 'something'. And that something probably is _not using_ a proper deserializer lib. – H H Nov 30 '11 at 19:06
  • @user277498, yes, it will **solve** it; see my answer. – L.B Nov 30 '11 at 19:13

2 Answers

Yes, there are built-in ways to do that, but they would involve something like using a compiler to parse the string as code.

Use a simple replace:

s = s.Replace(@"\u0040", "@");

For a more flexible solution, you can use a regular expression that handles any four-digit \uXXXX Unicode escape:

s = Regex.Replace(s, @"\\u([\dA-Fa-f]{4})", v => ((char)Convert.ToInt32(v.Groups[1].Value, 16)).ToString());
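
As a minimal sketch, the one-liner above could be wrapped in a reusable helper. The class name `UnicodeEscapes` and method name `Decode` are hypothetical, chosen only for illustration. The pattern uses `[0-9A-Fa-f]` instead of `\d`, because `\d` in .NET also matches non-ASCII digits, which the hex parser would reject.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of .NET: decodes \uXXXX escape sequences in a string.
public static class UnicodeEscapes
{
    // A backslash, the letter 'u', then exactly four hex digits, e.g. \u0040.
    private static readonly Regex EscapePattern =
        new Regex(@"\\u([0-9A-Fa-f]{4})", RegexOptions.Compiled);

    public static string Decode(string input)
    {
        // Replace every match with the UTF-16 code unit that the hex digits encode.
        return EscapePattern.Replace(input, match =>
        {
            int codeUnit = int.Parse(
                match.Groups[1].Value,
                NumberStyles.HexNumber,
                CultureInfo.InvariantCulture);
            return ((char)codeUnit).ToString();
        });
    }
}
```

Calling `UnicodeEscapes.Decode(@"foo\u0040bar.com")` should give `foo@bar.com`. Note that this sketch only handles `\uXXXX`; other JSON escapes such as `\/` or `\"` are left untouched, and an escaped backslash followed by `u0040` would be decoded incorrectly. A real JSON parser handles those cases.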
Guffa
  • The latter is exactly what I was typing. +1 – J. Holmes Nov 30 '11 at 18:26
  • This is an excellent answer, however looking back I think the fix really should be in my StreamReader. I edited the question to reflect my new thinking. Many thanks! – Barka Nov 30 '11 at 18:32
  • @user277498: I don't know whose version you used; it was edited several times after I posted it. Anyway, I have fixed it now. – Guffa Nov 30 '11 at 19:17

JSON responses are not binary data that becomes a string only under the right encoding. They are text, already decoded correctly by your browser or by HttpWebResponse, as in your example. The \uXXXX sequences are JSON string escapes, so you need a second processing step (a regex, a deserializer, etc.) to get the final data.

See what you get with webClient.DownloadString("https://graph.facebook.com/HavelVaclav?access_token=????") without specifying any encoding:

{"id":"100000042150992",
    "name":"Havel V\u00e1clav",
    "first_name":"Havel",
    "last_name":"V\u00e1clav",
    "link":"http:\/\/www.facebook.com\/havel.vaclav",
    "username":"havel.vaclav",
    "gender":"male",
    "locale":"cs_CZ"
}

Would any choice of encoding change \/ to /?

So, the problem is not in your StreamReader.
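
To illustrate the deserializer route, here is a minimal sketch. It assumes `System.Text.Json` is available; that library is not mentioned anywhere in this thread and is newer than it, so treat it as a stand-in for Json.NET or JavaScriptSerializer. The class name `GraphResponse` and method name `ReadString` are hypothetical.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: reads one top-level string property from a JSON object.
public static class GraphResponse
{
    public static string ReadString(string json, string propertyName)
    {
        // Parsing resolves JSON string escapes such as \u0040 and \/ for us.
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            // GetProperty throws KeyNotFoundException if the property is missing.
            return document.RootElement.GetProperty(propertyName).GetString();
        }
    }
}
```

With the response above, `GraphResponse.ReadString(json, "link")` should return the link with plain `/` characters, and an `email` property containing `foo\u0040bar.com` should come back as `foo@bar.com`, with no change to the StreamReader encoding.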

L.B
  • But you accepted another answer, when your best solution would be Json.Net, JavaScriptSerializer, etc.? – L.B Nov 30 '11 at 20:15
  • Both answers were good; it was a toss-up. Your answer was what I implemented, but @Guffa's answer was the correct answer to my exact question. I want to give you both credit. – Barka Nov 30 '11 at 20:37