Ok, here is what Google said (https://developers.google.com/webmasters/ajax-crawling/docs/getting-started).
When a crawler sees a URL like www.example.com/ajax.html#!key=value, it temporarily converts that URL into www.example.com/ajax.html?_escaped_fragment_=key=value
However, it also escapes certain characters in the fragment during the transformation. For example:
www.example.com/ajax.html#!key=value;car=%
to www.example.com/ajax.html?_escaped_fragment_=key=value;car=%25
So if we want to convert www.example.com/ajax.html?_escaped_fragment_=key=value;car=%25
back to the original URL, we need to unescape all %XX sequences in the fragment.
Google said:
Note: The crawler escapes certain characters in the fragment during the transformation. To retrieve the original fragment, make sure to unescape all %XX characters in the fragment. More specifically, %26 should become &, %20 should become a space, %23 should become #, and %25 should become %, and so on.
But Google doesn't say how to do that in Java.
String originalUrl = changedStr.replace("?_escaped_fragment_=", "#!");
// then what to do next so that all the escaped characters will go back to normal?
Is it OK to do it like this?
originalUrl=java.net.URLDecoder.decode(originalUrl, "UTF-8");
Which encoding do we have to use: "UTF-8" or "ASCII"?
When the crawler escapes the URL, does it use something equivalent to URLEncoder.encode()?
If it does, which encoding does it use: "UTF-8" or "ASCII"?
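
To make the question concrete, here is a minimal sketch of the whole conversion I have in mind. It assumes the crawler percent-encodes the fragment as UTF-8 bytes, which is exactly the part I am unsure about. The class and method names (EscapedFragment, toOriginalUrl) are made up for illustration. It decodes only the fragment part, not the whole URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not taken from Google's documentation.
final class EscapedFragment {
    private static final String MARKER = "_escaped_fragment_=";

    // Turns ".../ajax.html?_escaped_fragment_=key=value;car=%25"
    // back into ".../ajax.html#!key=value;car=%".
    static String toOriginalUrl(String crawlerUrl) {
        int idx = crawlerUrl.indexOf(MARKER);
        if (idx <= 0) {
            return crawlerUrl; // no escaped fragment present, nothing to do
        }
        // Drop the '?' (or '&') that precedes the marker.
        String base = crawlerUrl.substring(0, idx - 1);
        String escaped = crawlerUrl.substring(idx + MARKER.length());
        try {
            // Assumption: the %XX sequences are UTF-8 bytes.
            // Caveat: URLDecoder also turns a literal '+' into a space.
            String fragment = URLDecoder.decode(escaped, "UTF-8");
            return base + "#!" + fragment;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toOriginalUrl(
            "www.example.com/ajax.html?_escaped_fragment_=key=value;car=%25"));
        // prints www.example.com/ajax.html#!key=value;car=%
    }
}
```

For plain ASCII characters such as %25 or %26 the choice between "UTF-8" and "ASCII" should give the same result; it would only matter for non-ASCII characters in the fragment.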