
I set up a website with Docusaurus, which works great, but when I check its indexing status on Google I get a lot of 'Discovered - currently not indexed' problems, which appear to be caused by routing.
The Docusaurus build generates an 'index.html' file in a separate folder for every page or md file in the project, but the generated links don't include 'index.html' and instead end like this: https://cam-inspector.com/contact. The browser doesn't seem to request 'https://cam-inspector.com/contact/index.html'; instead it uses the root index.html page and appears to handle the routing locally. This works for viewing, but the Google crawler only gets to see the root, so every page contains a canonical URL of 'https://cam-inspector.com'. When you browse directly to 'https://cam-inspector.com/contact/index.html', the correct file is retrieved from the server and the canonical tag is correctly set to 'https://cam-inspector.com/contact'.

I tried to add a redirect to the app.yaml file of the Google App Engine deployment so that it would append 'index.html' to all routes that don't end with a file extension, but that doesn't seem to work:

handlers:
# for url without extensions, needs to go to index.html to get seo correct
- url: /(?:/|^)[^./]+$
  static_files: www/\1/index.html
  upload: www/(.*)/index.html
# files with extension, needed to get all the files
- url: /(.*\..+)$
  static_files: www/\1
  upload: www/(.*\..+)$
# Catch all handler to index.html, needed to get the root
- url: /.*
  static_files: www/index.html
  upload: www/index.html

The post "app.yaml : Redirect any URL ending with / to {url}/index.html" appears to say that you can't add a redirect like that to the YAML definition.

So now I'm a little stuck. The only thing I can think of is to add '/index.html' to all the links in the Docusaurus code, but that would be difficult and tricky to maintain (especially with the auto-generated sidebar of the docs, where it's much harder to change the links).
Any ideas on how to fix this?

Jan
  • You are saying `/contact/index.html` has the correct canonical tag but that `/contact/` does not? – Stephen Ostermiller Nov 17 '21 at 10:59
  • What if you create a sitemap and submit it to Google Search Index? You could also write code that generates the sitemap automatically whenever it is accessed, adding 'index.html' to the end of each URL you have. – NoCommandLine Nov 17 '21 at 12:15
  • @StephenOstermiller yep, that's what was happening – Jan Nov 17 '21 at 13:02

2 Answers


Found the solution:

  • use the following config in docusaurus.config.js: trailingSlash: true,
  • use these handlers in app.yaml:
- url: /
  static_files: www/index.html
  upload: www/index.html
- url: /(.*)/$
  static_files: www/\1/index.html
  upload: www/(.*)
- url: /(.*)
  static_files: www/\1
  upload: www/(.*)
  • make certain you declare the links to your pages with a leading and a trailing slash, like this: /contacts/

Then things start to behave as expected.
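
For reference, here is a minimal sketch of where the trailingSlash option from the first bullet might sit in the Docusaurus config file. Only trailingSlash comes from this answer; the title is a placeholder I made up, and url is simply the site from the question.

```javascript
// docusaurus.config.js (sketch, not a complete config)
const config = {
  title: 'My Site',                 // placeholder, not from the original post
  url: 'https://cam-inspector.com', // site URL taken from the question
  baseUrl: '/',
  // Emit links and canonical URLs with a trailing slash, e.g. /contact/
  trailingSlash: true,
};

module.exports = config;
```

With this set, the generated links end in a slash, which is what the `/(.*)/$` handler above matches.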

Jan

Here's my version of deploying a Docusaurus site to Google App Engine.

handlers:
  # static files with a URL ending with a file extension
  # (e.g. favicon.ico, manifest.json, jylade.png)
  - url: /(.*\..+)$
    static_files: build/\1
    upload: build/(.*\..+)$

  # index page
  - url: /
    static_files: build/index.html
    upload: build/index.html

  # anything that ends with a slash (e.g. /docs/)
  - url: /(.*)/$
    static_files: build/\1/index.html
    upload: build/(.*)

  # anything else (e.g. /docs)
  - url: /(.*)
    static_files: build/\1/index.html
    upload: build/(.*)

Works for me with the default Docusaurus configuration. It does not require trailingSlash: true.
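
To see which file each request ends up at, the handlers above can be modelled with a small script. This is only a sketch under two assumptions: that the first handler whose pattern matches wins, and that a pattern has to match the whole request path. The `resolve` function and the `handlers` array are hypothetical helpers for illustration, not part of App Engine.

```javascript
// Hypothetical model of the four handlers above, in the same order.
const handlers = [
  { url: '/(.*\\..+)$', staticFiles: 'build/\\1' },
  { url: '/', staticFiles: 'build/index.html' },
  { url: '/(.*)/$', staticFiles: 'build/\\1/index.html' },
  { url: '/(.*)', staticFiles: 'build/\\1/index.html' },
];

// Resolve a request path to the file the first matching handler would serve.
// Assumption: each pattern must match the whole path.
function resolve(path) {
  for (const { url, staticFiles } of handlers) {
    const match = new RegExp(`^(?:${url})$`).exec(path);
    if (match) {
      // Substitute \1 with the first captured group, if there is one.
      return staticFiles.replace(/\\1/g, () => match[1] ?? '');
    }
  }
  return null;
}

for (const path of ['/', '/contact', '/contact/', '/img/logo.png']) {
  console.log(path, '->', resolve(path));
}
```

In this model both /contact and /contact/ resolve to build/contact/index.html, which is why trailingSlash is not needed here. Note that any path containing a dot is caught by the first handler, so a page URL with a dot in it would be treated as a plain file.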

Tzach