I have created a single-page web app using Firebase, where each page's content is loaded dynamically from the Firebase Database.

Search engines, however, would see only blank pages rather than the dynamic content, so I created a Firebase function to pre-render each page for SEO purposes. That has worked great.

The issue is that this has noticeably hurt the user experience: there is an extra delay while the function runs, followed by a flash of unstyled content (FOUC) when the dynamic content loads along with the rest of the JS.

Is it possible to trigger the pre-rendering function only for Googlebot (and other known crawlers/bots), so that users get the normal website experience and bots get a pre-rendered HTML page?

Thanks

Edit:

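const functions = require('firebase-functions');

exports.helloWorld = functions.https.onRequest((request, response) => {
  // Log the User-Agent header exactly as the function receives it
  console.log(request.get('User-Agent'));
  response.send('ok');
});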

The user agent I expect is:

"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"

However, the one I receive has an extra snippet appended to it:

"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 AppEngine-Google; (+http://code.google.com/appengine; appid: s~gcf-http-proxy)"

I have tried several plugins that detect bots, but each of them reports every request as a bot because of the AppEngine-Google token.

Finlay Percy

1 Answer

Cloud Functions for Firebase uses Express middleware for its HTTP triggers, so it is quite possible to detect crawlers, as shown in "Detect social bots in Node Express" and "How to detect web crawlers for SEO, using Express?".
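
As a rough sketch of what such detection could look like (not taken from either linked question), one can match the User-Agent against a list of known crawler tokens. The function name `isCrawler` and the token list are assumptions for illustration, and the list is far from exhaustive. Matching specific tokens such as `googlebot`, rather than anything containing `google`, should avoid flagging the `AppEngine-Google` suffix from the question.

```javascript
// Illustrative, non-exhaustive list of crawler tokens (an assumption, extend as needed).
const CRAWLER_PATTERN =
  /googlebot|bingbot|yandex|baiduspider|duckduckbot|slurp|facebookexternalhit|twitterbot|linkedinbot/i;

// Returns true when the User-Agent string contains a known crawler token.
// Missing or non-string values are treated as "not a crawler".
function isCrawler(userAgent) {
  return typeof userAgent === 'string' && CRAWLER_PATTERN.test(userAgent);
}

// Inside an HTTP function one might branch on the result, for example:
//   if (isCrawler(request.get('User-Agent'))) { /* send pre-rendered HTML */ }
//   else { /* send the normal app shell */ }
```

Keep in mind that user agents are easy to spoof, so this is only suitable for deciding what to render, not for anything security-related.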

But I wonder if you're steering in the right direction there.

While pre-rendering may take more time than serving the raw content, that time should be offset by the page showing meaningful content straight away. There was a great article about server-side rendering with Cloud Functions and Express recently.

In addition, unless your data is very dynamic, most of your users should be hitting a cached version of most HTML. See David East's talk about dynamic HTTP at I/O, specifically his explanation of setting cache headers in Cloud Functions.
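
A minimal sketch of setting such cache headers, assuming the function is served through Firebase Hosting so that the CDN honours `s-maxage`. The helper name `applyCacheHeaders` and the durations in the usage comment are illustrative assumptions, not values from the talk.

```javascript
// Sets a Cache-Control header on an Express-style response object.
// browserSeconds: how long the browser may cache the page (max-age).
// cdnSeconds: how long a shared cache or CDN may cache it (s-maxage).
function applyCacheHeaders(response, browserSeconds, cdnSeconds) {
  response.set(
    'Cache-Control',
    `public, max-age=${browserSeconds}, s-maxage=${cdnSeconds}`
  );
  return response;
}

// Inside an HTTP function, before sending the pre-rendered HTML:
//   applyCacheHeaders(response, 300, 600);
//   response.send(html);
```

With headers like these, only the first request in each caching window should pay the cost of running the function; later visitors would get the cached HTML.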

Frank van Puffelen
  • Thanks Frank, however every request's user agent seems to be from the `AppEngine-Google` family, no matter which browser is used or where the request comes from. – Finlay Percy Aug 09 '17 at 13:16
  • Hmmm.... I don't know if that is expected. Can you update your question to include the [minimal code needed to reproduce this problem](http://stackoverflow.com/help/mcve) as well as the user agent you get and how you're triggering the function? – Frank van Puffelen Aug 09 '17 at 14:15
  • Hi Frank, I've added an edit, let me know if you need any more clarification. – Finlay Percy Aug 09 '17 at 14:25
  • It looks like somewhere along the way a ` (+http://code.google.com/appengine; appid: s~gcf-http-proxy)` gets **added** to the user-agent. While that *is* unexpected, it is not uncommon for proxies to add information to the UA. You'll have to filter accordingly in your server-side code. – Frank van Puffelen Aug 09 '17 at 16:45
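
One possible way to do that filtering, sketched under the assumption that the proxy always appends the suffix in the exact shape shown in the question, is to strip it before handing the user agent to a bot-detection library. The name `stripProxySuffix` is hypothetical.

```javascript
// Matches the trailing " AppEngine-Google; (+http://code.google.com/appengine; appid: ...)"
// token that the question shows being appended to the original User-Agent.
const PROXY_SUFFIX =
  /\s*AppEngine-Google;\s*\(\+http:\/\/code\.google\.com\/appengine;\s*appid:[^)]*\)\s*$/;

// Returns the User-Agent with the proxy suffix removed.
// Non-string input (for example a missing header) yields an empty string.
function stripProxySuffix(userAgent) {
  return typeof userAgent === 'string' ? userAgent.replace(PROXY_SUFFIX, '') : '';
}

// Inside an HTTP function:
//   const originalUA = stripProxySuffix(request.get('User-Agent'));
//   // pass originalUA to the bot-detection library of your choice
```

If the proxy ever changes the format of what it appends, the pattern would need adjusting, so it is worth logging a few raw user agents to confirm.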