I'm currently trying to POST some JSON containing emoji to a Python API. I tried passing the string containing the emoji from my UITextField directly to NSJSONSerialization, but the serializer crashed with no meaningful explanation. Afterwards I tried to do some format conversion and ended up with something like this:

NSString *uniText = mytextField.text;
NSData *msgData = [uniText dataUsingEncoding:NSNonLossyASCIIStringEncoding];
NSString *goodMsg = [[NSString alloc] initWithData:msgData encoding:NSUTF8StringEncoding] ;

This basically works, except that the resulting UTF-8 is kind of double-"escaped", which gives the following:

"title":"\\ud83d\\udc8f\\ud83d\\udc8f\\ud83d\\udc8f\\ud83d"

Any suggestions how to fix that?

Binarian
Tim Specht
  • NSJSONSerialization should work. – Martin R May 22 '14 at 18:21
  • I thought that is exactly what `NSNonLossyASCIIStringEncoding` was supposed to do. – Brad Allred May 22 '14 at 18:24
  • JSON should ALWAYS be UTF. – Hot Licks May 22 '14 at 18:25
  • uniText contains the text that you want. goodMsg contains the text in some messed up format. NSJSONSerialization would have serialized uniText perfectly well, instead you asked it to serialize your messed up "goodMsg". – gnasher729 May 22 '14 at 18:47
  • No. I tried inputting it directly but it just SIGABRTs, no exception printed, exception breakpoint also doing nothing >. – Tim Specht May 22 '14 at 18:56
  • You're encoding the string using one encoding and trying to decode the result using a different encoding! That's probably not what you meant to do. – Ian Henry May 22 '14 at 19:13
  • Do not use `NSNonLossyASCIIStringEncoding` use `NSUTF32LittleEndianStringEncoding`. emoji are in UTF Plane 1 and thus are 21 bit code points. – zaph May 22 '14 at 19:23
  • I tried using your suggestion @Zaph but I get a nil data :( – Tim Specht May 22 '14 at 19:29
  • so what can I do to resolve the problem? – Tim Specht May 22 '14 at 19:32
  • See my answer example code, I didn't use your unicode characters, can you post, copy and paste, the emoji characters. Or in your code change `NSNonLossyASCIIStringEncoding` to `NSUTF32LittleEndianStringEncoding`, run the code and post the output. – zaph May 22 '14 at 19:54
  • (NSString *) uniText = 0x1680e0a0 @"" – Tim Specht May 22 '14 at 19:56
  • Okay, thanks for that. If I copy and paste your code everything seems to work fine, only when I try to use my text field everything goes berserk... any ideas? – Tim Specht May 22 '14 at 20:15
  • Log what is coming back from the text field: `NSLog(@"uniText utf-32: %@", [uniText dataUsingEncoding:NSUTF32LittleEndianStringEncoding]);` and post. – zaph May 22 '14 at 20:23

2 Answers

There are two difficulties:
1. Apple hosed NSString with respect to Unicode Planes 1 and above: the underlying use of UTF-16 shows through. For example, length returns 2 for a single emoji character.
2. Whoever decided to put emoji in Plane 1 was just being difficult: they are the first widely used Plane 1 characters, and a lot of legacy Unicode code does not handle them correctly.
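
The first point can be checked directly. The following is a minimal sketch, assuming a Foundation command-line target; the literal U+1F48F is taken from the escaped output in the question (`\ud83d\udc8f`), and the counter variable is only for illustration:

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // U+1F48F lies in Plane 1, so UTF-16 stores it as the surrogate pair D83D DC8F.
        NSString *emoji = @"\U0001F48F";

        // length counts UTF-16 code units, not user-visible characters.
        NSLog(@"length: %lu", (unsigned long)emoji.length);

        // Enumerating composed character sequences counts what the user sees.
        __block NSUInteger visible = 0;
        [emoji enumerateSubstringsInRange:NSMakeRange(0, emoji.length)
                                  options:NSStringEnumerationByComposedCharacterSequences
                               usingBlock:^(NSString *substring, NSRange range,
                                            NSRange enclosingRange, BOOL *stop) {
            visible++;
        }];
        NSLog(@"composed characters: %lu", (unsigned long)visible);
    }
    return 0;
}
```

This should log a length of 2 and a composed character count of 1, which is why code that truncates or indexes by length can split an emoji in half.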

Example code (adapted from @Hot Licks): Updated with OP emoji

NSString *uniText = @"💦💏👒👒💦";
NSDictionary* jsonDict = @{@"title":uniText};

NSData * utf32Data = [uniText dataUsingEncoding:NSUTF32LittleEndianStringEncoding];
NSLog(@"utf32Data: %@", utf32Data);

NSError* error = nil;
NSData* jsonData = [NSJSONSerialization dataWithJSONObject:jsonDict options:0 error:&error];
if (jsonData == nil) {
    NSLog(@"JSON serialization error: %@", error);
}
else {
    NSString* jsonString = [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding];
    NSLog(@"The JSON result is %@", jsonString);
    NSLog(@"jsonData: %@", jsonData);
}

NSLog output

utf32Data: a6f40100 8ff40100 52f40100 52f40100 a6f40100
The JSON result is {"title":"💦💏👒👒💦"}
jsonData: 7b227469 746c6522 3a22f09f 92a6f09f 928ff09f 9192f09f 9192f09f 92a6227d

zaph

Sigh:

NSString* uniText = mytextField.text;
NSDictionary* jsonDict = @{@"title":uniText};
NSError* error = nil;
// Serialize the dictionary directly; NSJSONSerialization emits UTF-8 and escapes as needed.
NSData* jsonData = [NSJSONSerialization dataWithJSONObject:jsonDict options:0 error:&error];
if (jsonData == nil) {
    NSLog(@"JSON serialization error: %@", error);
}
else {
    // Decode the UTF-8 bytes only to log them; jsonData is what gets POSTed.
    NSString* jsonString = [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding];
    NSLog(@"The JSON result is %@", jsonString);
}

If mytextField.text is a valid NSString, then no other conversions should be required. NSJSONSerialization will provide all necessary "escaping".
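
To illustrate where the doubled backslashes in the question likely come from, here is a minimal sketch, assuming a Foundation command-line target. `JSONStringForTitle` is a hypothetical helper written for this example, not part of any API, and the emoji literal stands in for the text field contents:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: serializes {"title": title} and returns the JSON text.
static NSString *JSONStringForTitle(NSString *title) {
    NSError *error = nil;
    NSData *data = [NSJSONSerialization dataWithJSONObject:@{@"title": title}
                                                   options:0
                                                     error:&error];
    if (data == nil) {
        NSLog(@"JSON serialization error: %@", error);
        return nil;
    }
    return [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
}

int main(void) {
    @autoreleasepool {
        // Stand-in for mytextField.text (assumption for this sketch).
        NSString *uniText = @"\U0001F48F";

        // Direct serialization: the emoji is written out as UTF-8.
        NSLog(@"direct: %@", JSONStringForTitle(uniText));

        // The detour from the question: the non-lossy ASCII encoding turns the
        // emoji into the literal ASCII text \ud83d\udc8f, backslashes included.
        NSData *asciiData = [uniText dataUsingEncoding:NSNonLossyASCIIStringEncoding];
        NSString *escaped = [[NSString alloc] initWithData:asciiData
                                                  encoding:NSUTF8StringEncoding];

        // The serializer then escapes those backslashes a second time.
        NSLog(@"detour: %@", JSONStringForTitle(escaped));
    }
    return 0;
}
```

The second log line should show the same doubled backslashes as in the question, because by that point the string no longer contains an emoji, only ASCII text that looks like an escape sequence.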

Hot Licks
  • I already tried that. I really just copy pasted your code and the line where the serialization to NSData happens just crashes with SIGABRT. No useful exception, setting an exceptions breakpoint also provides no further results ... – Tim Specht May 22 '14 at 18:48
  • I fixed them, otherwise I wouldn't have been able to compile it obviously ... `NSString* text = metadata[@"title"]; NSDictionary* jsonDict = @{@"title":text}; NSError* error = nil; NSData* jsonData = [NSJSONSerialization dataWithJSONObject:jsonDict options:0 error:&error]; if (jsonData == nil) { NSLog(@"JSON serialization error: %@", error); } else { NSString* jsonString = [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding]; NSLog(@"The JSON result is %@", jsonString); }` – Tim Specht May 22 '14 at 19:22
  • @TimSpecht What about Hot Licks' answer is different from Zaph's? – Aaron Brager May 22 '14 at 22:00