I downloaded my Google search history from here, but the diacritics (Latin Extended characters) in the JSON files (encoded in UTF-8) are garbled.

E.g.:

dva na ôsmu

displays as

dva na �smu

and when I use a JSON indentation package in Sublime Text, I get this:

dva na \ufffdsmu

All the special characters are replaced with this same broken character. Is there any way to fix this, or is Google simply exporting broken JSON, so that non-English users can't use this export? I want to build an app that displays statistics on the words used in my searches, but that is not possible with JSON files broken this way.

wewo

1 Answer

The JSON appears to be corrupt at the source. I inspected the raw bytes with a hex dump, and the character is stored as the bytes EF BF BD, which is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The original letter is already lost in the exported file; what is stored there is the replacement character itself.
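If you want to repeat the check without a hex editor, something like the following should work. It is only a sketch: the sample strings are made up to mirror the example in the question, and `count_in_file` and its path argument are hypothetical helpers, not part of any Google tooling. It counts how often the byte sequence EF BF BD occurs in the raw data.

```python
from pathlib import Path

REPLACEMENT_CHAR = "\ufffd"
# UTF-8 encoding of U+FFFD is the three bytes EF BF BD.
REPLACEMENT_BYTES = REPLACEMENT_CHAR.encode("utf-8")


def count_replacement_chars(raw: bytes) -> int:
    """Count occurrences of U+FFFD (as UTF-8) in raw bytes."""
    return raw.count(REPLACEMENT_BYTES)


def count_in_file(path: str) -> int:
    """Hypothetical helper: run the same check on an exported file."""
    return count_replacement_chars(Path(path).read_bytes())


# Made-up samples mirroring the question: one damaged, one intact.
damaged = '{"query": "dva na \ufffdsmu"}'.encode("utf-8")
intact = '{"query": "dva na ôsmu"}'.encode("utf-8")

print(REPLACEMENT_BYTES.hex(" "))        # ef bf bd
print(count_replacement_chars(damaged))  # 1
print(count_replacement_chars(intact))   # 0
```

A non-zero count on the file as downloaded would suggest the damage happened before the file reached your machine, so re-decoding it with another encoding will not bring the letters back.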

screenshot

towolf
  • So it is on Google to fix its export app, right? Is there any way to report this kind of bug? – wewo Apr 23 '15 at 14:36