
I was trying to figure out which characters are encoded in this URL: http://whatyouth.com/9236/roadtripppp-%f0%9f%8c-%b4%f0%9f%9a-%8c%f0%9f%90-roadtrip-throwback-again-sorry-missingsummer-palmtrees-rememberwhatyouth/

When I use the JavaScript function decodeURI, I get this error:

decodeURI("http://whatyouth.com/9236/roadtripppp-%f0%9f%8c-%b4%f0%9f%9a-%8c%f0%9f%90-roadtrip-throwback-again-sorry-missingsummer-palmtrees-rememberwhatyouth/")
> URIError: URI malformed

Does anyone know what these characters are?

  • %f0%9f%8c
  • %b4%f0%9f%9a
  • %8c%f0%9f%90
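
One way to see why decodeURI rejects the URL is to test each fragment on its own. The sketch below is an illustration, not part of the original post: `tryDecode` is a hypothetical helper, and the idea that the hyphens were inserted between bytes when the slug was generated is an assumption. Under that assumption, joining the fragments gives two complete four-byte UTF-8 sequences (U+1F334 and U+1F68C) followed by a truncated third one.

```javascript
// Hypothetical helper (not from the original post): returns the decoded
// string, or null when decodeURIComponent throws a URIError.
function tryDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch (e) {
    return null; // incomplete or invalid UTF-8 byte sequence
  }
}

const fragments = ['%f0%9f%8c', '%b4%f0%9f%9a', '%8c%f0%9f%90'];

// Each fragment alone is either cut short or starts with a continuation byte.
const perFragment = fragments.map(tryDecode);

// Assumption: the hyphens split the byte stream every three bytes.
const joined = fragments.join('');

// The first 8 escapes (24 characters) form two complete 4-byte sequences.
const firstTwo = tryDecode(joined.slice(0, 24));

// The whole run still fails because the last sequence lacks its 4th byte.
const all = tryDecode(joined);

console.log(perFragment, firstTwo, all);
```

If the assumption holds, `firstTwo` contains a palm tree and a bus emoji, which would match the "palmtrees" and "roadtrip" tags in the slug. The final character cannot be recovered because its last byte is missing.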
Apolo
  • Those encoded bytes are not valid UTF-8; it may be some other encoding. No idea which one it is, though. – Karol S Dec 17 '14 at 11:56
  • Another strange thing: try this: `var url = 'http://whatyouth.com/9236/roadtripppp-%f0%9f%8c-%b4%f0%9f%9a-%8c%f0%9f%90-roadtrip-throwback-again-sorry-missingsummer-palmtrees-rememberwhatyouth/'; console.log(url);` => bug in the Chrome console. – Apolo Dec 17 '14 at 13:28
  • I don't have Chrome. What does it do? – Karol S Dec 17 '14 at 13:52
  • A weird bug: nothing shows up and the console becomes buggy. – Apolo Dec 17 '14 at 13:55

1 Answer


I suppose this is Windows-1252 encoding: see the ASCII Encoding Reference (W3Schools). (Sorry for the W3Schools link; it is not my favourite website.)

I replaced every '%' with '\x' in my URL and used the functions from this answer: https://stackoverflow.com/a/4129920/3484498

var url = 'http://whatyouth.com/9236/roadtripppp-\xf0\x9f\x8c-\xb4\xf0\x9f\x9a-\x8c\xf0\x9f\x90-roadtrip-throwback-again-sorry-missingsummer-palmtrees-rememberwhatyouth/';

decodeBytes(url,'cp1252');
> "http://whatyouth.com/9236/roadtripppp-ðŸŒ-´ðŸš-ŒðŸ�-roadtrip-throwback-again-sorry-missingsummer-palmtrees-rememberwhatyouth/"

decodeBytes(url,'cp1251');
> "http://whatyouth.com/9236/roadtripppp-рџЊ-ґрџљ-Њрџђ-roadtrip-throwback-again-sorry-missingsummer-palmtrees-rememberwhatyouth/"
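
The decodeBytes helper comes from the linked answer and is not reproduced here. As a rough stand-in, the same experiment can be sketched with the standard TextDecoder API. `decodeEscapesAs` is a hypothetical name, and the sketch assumes a runtime whose TextDecoder supports the `windows-1252` and `windows-1251` labels (browsers do; Node.js does when built with full ICU).

```javascript
// Hypothetical stand-in for the linked decodeBytes helper (not the original code).
// Collects every %XX escape in the input as a raw byte, then decodes the bytes
// with the given encoding label. Non-escaped characters are ignored to keep
// the sketch short.
function decodeEscapesAs(fragment, label) {
  const hex = fragment.match(/%[0-9a-f]{2}/gi) || [];
  const bytes = Uint8Array.from(hex, (h) => parseInt(h.slice(1), 16));
  return new TextDecoder(label).decode(bytes);
}

// Same three bytes, read under two different single-byte code pages.
const asCp1252 = decodeEscapesAs('%f0%9f%8c', 'windows-1252');
const asCp1251 = decodeEscapesAs('%f0%9f%8c', 'windows-1251');

console.log(asCp1252, asCp1251);
```

For the first fragment this reproduces the "ðŸŒ" and "рџЊ" seen in the outputs above. Neither reading produces meaningful text, so both guesses remain unconfirmed.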
Apolo