I'm having real trouble working out how to run a headless browser to serve static HTML snapshots of a site that uses JavaScript (sammy.js, to be specific) to deliver its AJAX content.
I'm working off Google's specification for making AJAX apps crawlable:
http://code.google.com/web/ajaxcrawling/docs/getting-started.html
which for the most part is great and very clear, and I'm having no problems picking up the ?_escaped_fragment_ URLs.
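For context, the URL mapping I'm relying on looks roughly like this. It's a minimal sketch of my reading of the spec: `escapedFragmentToHashBang` and the example.com URLs are made-up names for illustration, not anything from Google's docs or sammy.js.

```javascript
// Hypothetical helper (name is my own): turn the "ugly" URL a crawler
// requests (?_escaped_fragment_=...) back into the "pretty" #! URL that
// the client-side app routes on.
function escapedFragmentToHashBang(uglyUrl) {
  var parsed = new URL(uglyUrl);

  // searchParams.get() also percent-decodes the value for us.
  var fragment = parsed.searchParams.get('_escaped_fragment_');
  if (fragment === null) {
    return null; // not a snapshot request, serve the page as normal
  }

  // Drop the crawler-only parameter but keep any other query parameters.
  parsed.searchParams.delete('_escaped_fragment_');
  parsed.hash = '#!' + fragment;
  return parsed.toString();
}
```

So a crawler request for `http://example.com/?_escaped_fragment_=/products/42` maps back to `http://example.com/#!/products/42`, which is the route the snapshot has to render.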
Most of the templating is done server side, so I was tempted to just write a PHP snapshot-building file that uses the same regex matches as the sammy app code (there are a lot of routes) to include the various template files. However, a lot of the action happens in the JavaScript app, so it would mean mirroring all of that processing in PHP, which then means maintaining both files side by side, cross-language - which is a lot of work!
Now, I've read that you can use a headless browser to 'render' the page and execute all the JavaScript (matching the #!/ route and delivering the correct content for the request) and then return the entire DOM contents as HTML, which would be served to Googlebot.
I've searched long and hard and can't find any step-by-step guides on running headless browsers from PHP (ones aimed at total Java newbies), which I suppose means I just don't know what to search for.
What I'm wondering: is setting up and using a headless browser to serve these HTML snapshots even more work than mirroring the logic in PHP? And if so, is it worth doing anyway?
Also, if there are any guides you could point me to, that'd be great!
Thanks!
Joss