I have a site built on ExpressionEngine (EE). By default, EE requires index.php to be present in the first segment of the URL. To pretty up my URLs, I use a .htaccess RewriteRule:

# Remove index.php from ExpressionEngine URLs
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?/$1 [L]

The entire site is also served with SSL, which I accomplish with another RewriteRule:

# Force SSL
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R,L]

Recently, the client asked to move their RSS feeds to Feedburner. However, Feedburner doesn't like https URLs, so I had to modify my SSL RewriteRule to not force SSL on feed pages:

# Force SSL except on RSS feeds
RewriteCond %{SERVER_PORT} 80
RewriteCond %{REQUEST_URI} !^/feeds/ [NC]
RewriteCond %{REQUEST_URI} !^/index\.php [NC]
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R,L]

So my whole .htaccess file looks like this:

RewriteEngine On
RewriteBase /

# Force SSL except on RSS feeds
RewriteCond %{SERVER_PORT} 80
RewriteCond %{REQUEST_URI} !^/feeds/ [NC]
RewriteCond %{REQUEST_URI} !^/index\.php [NC]
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R,L]

# Remove index.php from ExpressionEngine URLs
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?/$1 [L]

As soon as I added the feeds rule to the .htaccess file, however, Google stopped indexing the site's pages. The sitemap URL that's submitted to Google is /index.php/sitemap, so I'm thinking that index.php is playing a role here.

How can I adjust my .htaccess file to skip forced SSL on my feed pages without breaking Google's indexing?

kmgdev
  • So what do you see when you make a request to http://example.com/sitemap and http://example.com/index.php/sitemap/? – AllInOne Nov 04 '13 at 19:58
  • @AllInOne `example.com/sitemap` displays the sitemap (and redirects to https). `example.com/index.php/sitemap` also displays the sitemap (but *doesn't* redirect to https) – kmgdev Nov 05 '13 at 16:25

1 Answer

This happened because the condition

RewriteCond %{REQUEST_URI} !^/index\.php [NC]

prevented any URL starting with /index.php from being redirected to HTTPS.

Google stopped indexing the site because the sitemap is dynamically generated and uses the current host URL, including its scheme, to build its links.

Since /index.php/sitemap was no longer being redirected to HTTPS, Google fetched the sitemap over HTTP and received links starting with http://. As far as Google was concerned these were entirely new URLs, because it had been indexing the https:// versions up to that point.
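
A minimal sketch of one possible fix, untested against this site: drop the /index\.php exemption and check the original request line instead. %{THE_REQUEST} holds the request line exactly as the client sent it and does not change when the second block rewrites the URL internally to /index.php, so a rewritten feed request is still recognised as a feed. The /feeds/ prefix and the rest of the file come from the question; the switch to THE_REQUEST is an assumption about a suitable replacement, not a confirmed configuration.

```apache
RewriteEngine On
RewriteBase /

# Force SSL except on RSS feeds.
# THE_REQUEST is the raw request line, e.g. "GET /feeds/news HTTP/1.1"
# ("news" is a made-up example), so it still starts with /feeds/ after
# the internal rewrite to /index.php below. Assumption: this replaces
# the /index\.php exemption, so /index.php/sitemap is forced to HTTPS again.
RewriteCond %{SERVER_PORT} 80
RewriteCond %{THE_REQUEST} !^[A-Z]+\s/feeds/ [NC]
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R,L]

# Remove index.php from ExpressionEngine URLs
RewriteCond $1 !\.(gif|jpe?g|png)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?/$1 [L]
```

With this in place, a plain-HTTP request for /index.php/sitemap should redirect to HTTPS again, so the generated sitemap would list https:// links, while requests under /feeds/ stay on HTTP for Feedburner. A feed requested as /index.php/feeds/... would still be redirected under this sketch.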

kmgdev