This SO post has an example of a server that generates JSON with a byte order mark (BOM). RFC 7159 says:
Implementations MUST NOT add a byte order mark to the beginning of a JSON text. In the interests of interoperability, implementations that parse JSON texts MAY ignore the presence of a byte order mark rather than treating it as an error.
Currently yajl, and hence jsonlite, choke on the BOM. I would like to follow the RFC's suggestion and strip the BOM from the UTF-8 string if present. What is an efficient way to do this? A naive implementation:
if (substr(json, 1, 1) == "\uFEFF") {
  json <- substring(json, 2)
}
However, substr is a bit slow for large strings, and I am not sure this is the correct way to do it. Is there a more efficient way in R or C to remove the BOM if present?
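To make the C side of the question concrete, here is a minimal sketch of what I have in mind, assuming the input is already a UTF-8 byte buffer of known length. In UTF-8 the BOM (U+FEFF) is encoded as the three bytes EF BB BF, so the check can be a fixed three-byte comparison. The function name skip_utf8_bom is hypothetical and not part of yajl or jsonlite.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of yajl or jsonlite.
 * Returns a pointer just past a leading UTF-8 BOM (bytes EF BB BF),
 * or the original pointer if the buffer does not start with one.
 * Nothing is copied; only the start pointer moves. */
const char *skip_utf8_bom(const char *s, size_t len) {
    if (len >= 3 && memcmp(s, "\xEF\xBB\xBF", 3) == 0) {
        return s + 3;
    }
    return s;
}
```

Since this only compares three bytes and advances a pointer, the cost should not depend on the length of the string. The caller would still need to reduce the length it passes to the parser by 3 when the returned pointer differs from the input.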