Update: After further investigation, I've narrowed the problem down to the JSON encoder. Passing the input straight through works fine, but passing it through MultiJson.encode causes the problem.
I'm sending the following to a RESTful web service via curl:
$ curl -v -X POST "http://my/url" -d "{\"body\": \"💳\"}"
The character that you probably can't see is the Credit Card emoji character, which is U+1F4B3.
The response I get back from the service is essentially:
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2013 02:38:04 GMT
< Content-Type: application/json;charset=utf-8
< Content-Length: 266
< Connection: close
<
{ [data not shown]
100 304 100 266 100 38 936 133 --:--:-- --:--:-- --:--:-- 936
* Closing connection 0
{
"body": "\uf4b3"
}
This escaped character (\uf4b3) does not correspond to what I sent, and I would expect the character to be returned as sent (in this case).
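For reference, here is a minimal sketch of how the value in the response relates to the character that was sent. The helper names (truncate_to_16_bits, surrogate_pair, json_escape) are made up for illustration, and the last line uses Ruby's standard-library JSON module rather than MultiJson, so it only shows what a correct escape looks like, not what the server's encoder does.

```ruby
require "json"

# Keep only the low 16 bits of a code point. Doing this to U+1F4B3
# yields U+F4B3, the value seen in the response.
def truncate_to_16_bits(code_point)
  code_point & 0xFFFF
end

# Split a code point above U+FFFF into its UTF-16 surrogate pair,
# which is how JSON \u escapes represent such characters.
def surrogate_pair(code_point)
  offset = code_point - 0x10000
  [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]
end

# Build the \uXXXX\uXXXX escape for a code point above U+FFFF.
def json_escape(code_point)
  surrogate_pair(code_point).map { |unit| format("\\u%04x", unit) }.join
end

credit_card = 0x1F4B3

puts format("\\u%04x", truncate_to_16_bits(credit_card)) # prints \uf4b3
puts json_escape(credit_card)                            # prints \ud83d\udcb3

# The standard-library encoder, asked to emit ASCII only, for comparison.
puts JSON.generate({ "body" => [credit_card].pack("U") }, ascii_only: true)
```

If the encoder in use produces the first form instead of the second, it is dropping the bits above 16 rather than emitting a surrogate pair.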
I have access to the server's source code. It's built on Ruby, Sinatra and ActiveRecord. Some processing happens before the response is sent:
- First the content is passed through
ERB::Util.html_escape
- Then, a series of regexes are applied via
str.gsub!(reg, " ### ")
- Finally, the response is returned via
MultiJson.encode
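The three steps above can be sketched roughly as follows, which makes it possible to check each stage in isolation. This is not the server's actual code: PATTERNS is a hypothetical placeholder for the real regexes, trace_pipeline is a made-up helper name, and the standard-library JSON.generate stands in for MultiJson.encode (MultiJson delegates to whichever adapter is installed, so its output may differ).

```ruby
require "erb"
require "json"

# Hypothetical placeholder; the service's real patterns are not shown here.
PATTERNS = [/spam/i].freeze

CARD = "\u{1F4B3}" # the Credit Card emoji from the request

# Run the input through each stage and keep every intermediate result.
def trace_pipeline(input)
  escaped = ERB::Util.html_escape(input)

  # Work on a copy: html_escape may return the input object itself
  # when there is nothing to escape.
  replaced = escaped.dup
  PATTERNS.each { |reg| replaced.gsub!(reg, " ### ") }

  # Stand-in for MultiJson.encode (assumption, see above).
  encoded = JSON.generate({ "body" => replaced })
  decoded = JSON.parse(encoded)["body"]

  { escaped: escaped, replaced: replaced, encoded: encoded, decoded: decoded }
end

# Report whether the emoji is still intact after each stage.
trace_pipeline("card: #{CARD}").each do |stage, value|
  puts "#{stage}: emoji intact? #{value.include?(CARD)}"
end
```

Swapping the JSON.generate line for the real MultiJson.encode call on the server should show whether the character is lost in the encoder or in an earlier step.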
I'm not a Ruby person, but can provide additional details if necessary. Would appreciate someone pointing me in the right direction. Thanks!