I've been through Google's documentation and countless blog posts on this subject, and depending on the date and source, there seems to be some conflicting information. Please shine your wisdom upon this lowly peasant, and all will be good.
I'm building a website pro bono where a large part of the audience is from African countries with poor internet connectivity, and the client can't afford any decent infrastructure. So I've decided to serve everything as static HTML files; if JavaScript is available, I load page content directly into the DOM when a user clicks a nav link, to avoid the overhead of loading a whole page.
My client-side routes look like this:
//domain.tld/#!/page
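For context, the client-side routing is just a small hash-based router. A minimal sketch of the kind of thing I'm doing (function names like `parseHashbang` and `loadContentIntoDOM` are illustrative, not my actual code):

```javascript
// Extract the route from a hashbang fragment like "#!/page".
// Returns null when the hash isn't a hashbang route.
function parseHashbang(hash) {
  if (hash.indexOf('#!/') !== 0) return null;
  return hash.slice(3); // drop the leading "#!/"
}

// In the browser, this would drive the router:
// window.addEventListener('hashchange', function () {
//   var route = parseHashbang(window.location.hash);
//   if (route) loadContentIntoDOM(route); // fetch static HTML, inject into DOM
// });
```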
My first question is: does Googlebot translate that into
//domain.tld/_escaped_fragment_/page
or
//domain.tld/?_escaped_fragment_=/page ?
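If it's the query-string form, my understanding (a sketch, not verified against the spec) is that the crawler takes everything after the `#!`, URL-encodes it, and appends it as a `_escaped_fragment_` query parameter:

```javascript
// Sketch: map a hashbang URL to the query-string form, assuming the
// crawler URL-encodes the fragment after "#!" (my reading of the scheme).
function toEscapedFragment(url) {
  var i = url.indexOf('#!');
  if (i === -1) return url; // no hashbang, nothing to translate
  var base = url.slice(0, i);
  var fragment = url.slice(i + 2); // everything after "#!"
  var sep = base.indexOf('?') === -1 ? '?' : '&';
  return base + sep + '_escaped_fragment_=' + encodeURIComponent(fragment);
}
// toEscapedFragment('http://domain.tld/#!/page')
//   → 'http://domain.tld/?_escaped_fragment_=%2Fpage'
```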
I've made a simple server-side router in PHP that builds the requested pages for Googlebot, and my plan was to redirect //d.tld/_escaped_fragment_/page to //d.tld/router/page.
But when using Google's "Fetch as Googlebot" (for the first time, I might add), it doesn't seem to recognize any links on the page. It just returns "Success" and shows me the HTML of the main page. (Update: when pointing Fetch as Googlebot at //d.tld/#!/page, it just returns the content of the main page, without doing any _escaped_fragment_ magic.) Which leads me to my second question:
Do I need to follow a particular syntax in my hashbang links for Googlebot to crawl them?
My links look like this in the HTML:
<a href="#!/page">Page Headline</a>
Update 1: So, when I ask Fetch as Googlebot to get //d.tld/#!/page, this shows up in the access log: "GET /_escaped_fragment_/page HTTP/1.1" 301 502 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
But it doesn't seem to follow the 301 I set up, and shows the main page instead. Should I use a 302 instead? This is the rule I'm using: RedirectMatch 301 /_escaped_fragment_/(.*) /router/$1
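To sanity-check that the pattern itself matches what's in the access log, here is the same rewrite mirrored in JavaScript (just the regex logic of the RedirectMatch rule, not the Apache behaviour):

```javascript
// Mirror of: RedirectMatch 301 /_escaped_fragment_/(.*) /router/$1
// Returns the rewritten path, or the input unchanged when the rule wouldn't fire.
function rewriteEscapedFragmentPath(path) {
  var m = path.match(/\/_escaped_fragment_\/(.*)/);
  return m ? '/router/' + m[1] : path;
}
// rewriteEscapedFragmentPath('/_escaped_fragment_/page') → '/router/page'
```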
Update 2: I've changed my plans and will account for Googlebot as part of my non-JavaScript fallback tactic. So now all the links point to the router (/router/page) and are then changed to /#!/page/ onload with JavaScript. I'm keeping the question open for a bit, in case someone has a brilliant solution that might help others.
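The fallback now works the other way around: the HTML carries the server-side router links, and JavaScript rewrites them on load. A sketch of that rewrite (the selector and function name are illustrative, not my actual code):

```javascript
// Rewrite a server-side router href (/router/page) to the hashbang
// route (#!/page/) so JS-capable browsers use client-side routing.
function routerHrefToHashbang(href) {
  var m = href.match(/^\/router\/(.+)$/);
  return m ? '#!/' + m[1] + '/' : href;
}

// In the browser, run once on load:
// var links = document.querySelectorAll('a[href^="/router/"]');
// for (var i = 0; i < links.length; i++) {
//   links[i].setAttribute('href', routerHrefToHashbang(links[i].getAttribute('href')));
// }
```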