I have a site with a large .htaccess containing many dynamic rules. Everything works, but Google is indexing duplicate URLs: it treats the same URL with and without a trailing slash as two different pages. My .htaccess is pasted below; could someone help me enforce the trailing slash without causing a 301 redirect loop?

#Options -MultiViews
RewriteEngine On
RewriteCond %{SERVER_PORT} 443
RewriteRule ^(.*)$ http://www.advogadosaqui.com.br/$1 [R=301,L]

RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

## Adding a trailing slash <<<< (HERE IS WHAT I TRIED) >>>>
#RewriteCond %{REQUEST_FILENAME} !-f
#RewriteRule [^/]$ %{REQUEST_URI}/ [L,R=301]

# external redirect rule to remove /artigos/ from URLs
RewriteCond %{THE_REQUEST} \s/artigos/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_agencia/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_agencia/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_locker/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_locker/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_estado/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_estado/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_cidade/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_cidade/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_bairro/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_bairro/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# Remove .php extension externally
# To externally redirect /dir/file.php to /dir/file
# %{THE_REQUEST} \s/+(.+?)\.php[\s?] [NC]
#RewriteRule ^ /%1 [R=301,NE,L]

#Hide and Redirect Extension
RewriteCond %{THE_REQUEST} ^[A-Z]+\s.+\.php\sHTTP
RewriteRule ^(.+)\.php$ /$1 [R=301,L]

# internal rewrite from root to /artigos/
RewriteCond %{HTTP_HOST} advogadosaqui [NC]
RewriteCond %{DOCUMENT_ROOT}/artigos/$1.php -f
RewriteRule ^([\w-]+)/?$ artigos/$1.php [L]

# internal rewrite from root to /unidades/pagina_agencia/
RewriteCond %{HTTP_HOST} advogadosaqui [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_agencia/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_agencia/$1.php [L]

# internal rewrite from root to /unidades/pagina_locker/
RewriteCond %{HTTP_HOST} advogadosaqui [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_locker/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_locker/$1.php [L]

# internal rewrite from root to /unidades/pagina_estado/
RewriteCond %{HTTP_HOST} advogadosaqui [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_estado/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_estado/$1.php [L]

# internal rewrite from root to /unidades/pagina_cidade/
RewriteCond %{HTTP_HOST} advogadosaqui [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_cidade/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_cidade/$1.php [L]

# internal rewrite from root to /unidades/pagina_bairro/
RewriteCond %{HTTP_HOST} advogadosaqui [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_bairro/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_bairro/$1.php [L]

# handle .php extension internally
#RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{DOCUMENT_ROOT}/$1.php -f
#RewriteRule ^(.+?)/?$ $1.php [L]

# remove .php extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]

##ErrorDocument 404 https://www.advogadosaqui.com.br/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . / [L,R=301]

I tried to add this rule, but it's generating a 301 loop:

## Adding a trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule [^/]$ %{REQUEST_URI}/ [L,R=301]

UPDATED Question

After using anubhava's rules, everything works without any 301 loops. The remaining problem is a request that arrives without the trailing slash: it now triggers an extra, unnecessary 301 to add the slash. I need a way to do this with only one redirect.

Screenshots of the redirect chain (images not preserved) showed three hops: the first jump (the requested URL), a second jump, and the last jump.

Instead of this, I need a single redirect straight to the URL with https + www + trailing slash.
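
A minimal sketch of how the hops could collapse into one, assuming the .htaccess sits in the document root, the canonical host is www.advogadosaqui.com.br (the host from the original file), and TLS terminates at Apache so that `%{HTTPS}` reflects the visitor's connection. Behind a proxy or CDN that talks plain HTTP to the origin, the second rule could loop and would need a different condition. The idea is that the trailing-slash rule already points at the full canonical URL, so a request without the slash never needs a second hop:

```apache
RewriteEngine On

# Sketch only. Assumption: canonical form is https + www + trailing slash,
# and the canonical host below is the one used in the original file.

# Case 1: no trailing slash and not a physical file.
# Redirect straight to the full canonical URL in one 301.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule [^/]$ https://www.advogadosaqui.com.br%{REQUEST_URI}/ [R=301,L,NE]

# Case 2: the slash is already there (or it is a real file),
# but the scheme or the host is not canonical.
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ https://www.advogadosaqui.com.br%{REQUEST_URI} [R=301,L,NE]
```

With these two rules placed before the others, a request for a hypothetical http://advogadosaqui.com.br/page would match the first rule and receive a single 301 to https://www.advogadosaqui.com.br/page/.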

FINAL UPDATE

OK, I found a solution: I moved the https and www enforcement to Cloudflare page rules, removed it from the local .htaccess, and changed the trailing-slash rule to:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)([^/])$ /$1$2/ [L,R=301]

The final .htaccess looked like this:

Options -MultiViews
RewriteEngine On

# Adding a trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)([^/])$ /$1$2/ [L,R=301]

# external redirect rule to remove /artigos/ from URLs
RewriteCond %{THE_REQUEST} \s/artigos/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# Remove .php extension externally
RewriteCond %{THE_REQUEST} \s/+(.+?)\.php[\s?] [NC]
RewriteRule ^ /%1/ [R=301,NE,L]

# internal rewrite from root to /artigos/
RewriteCond %{DOCUMENT_ROOT}/artigos/$1.php -f
RewriteRule ^([\w-]+)/?$ artigos/$1.php [L]

# handle .php extension internally
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.+?)/?$ $1.php [L]

By doing this I managed to avoid the extra redirect ;)

Arvind Kumar Avinash
Sophie
  • That rule alone should not result in a redirect loop. And I can't see another rule that would conflict to result in a "loop". What is the nature of the redirect "loop" you are seeing? What URLs are you being repeatedly redirected from/to? There is, however, an issue with your redirect that removes the `.php` extension if you are favoring URLs that end in a trailing slash (but that would only cause an additional redirect and that should only be an edge case). – MrWhite Jan 12 '23 at 12:46
  • However, there is the matter of how Google found both (slash and no-slash) URLs to begin with. Have you confirmed that you are consistently linking to the with-trailing-slash URLs throughout your application? And you have set the `rel="canonical"` meta in all your pages accordingly? – MrWhite Jan 12 '23 at 12:48
  • Right now I can't say for sure which URLs were looping, because I removed the trailing-slash rule so users can keep using the site. In a few hours I will put it back, keep looking for a solution, and describe exactly what happens. Thank you so much for taking the time to read my question – Sophie Jan 12 '23 at 14:39
  • Your original directives specifically redirect from HTTPS to HTTP - is that intentional? Your screenshots show HTTPS (due to the updated directives from anubhava's answer). – MrWhite Jan 14 '23 at 13:35
  • I am using anubhava's answer – Sophie Jan 14 '23 at 14:35
  • The problem is not being redirected to https; the problem is that it takes two 301s to get there. It must be a single, direct redirect – Sophie Jan 14 '23 at 14:36

2 Answers

Based on your samples and attempts, please try the following .htaccess rules. Apart from fixing the trailing slashes, I have combined four rules into one.

Make sure to clear your browser cache before testing your URLs.

#Options -MultiViews
RewriteEngine On
RewriteCond %{SERVER_PORT} 443
RewriteRule ^(.*)/?$ http://www.avantitecnologiati.com.br/$1 [R=301,L]

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)/?$ http://www.%{HTTP_HOST}/$1 [R=301,L]

## Adding a trailing slash.....
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+[^/])$  $1/ [L,R=301]

# external redirect rule to remove /artigos/ from URLs
RewriteCond %{THE_REQUEST} \s/artigos/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_agencia/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_agencia/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]

# external redirect rule to remove /unidades/pagina_(locker|estado|cidade|bairro)/ from URLs
RewriteCond %{THE_REQUEST} \s/+unidades/pagina_(?:locker|estado|cidade|bairro)/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301]


# Remove .php extension externally
# To externally redirect /dir/file.php to /dir/file
# %{THE_REQUEST} \s/+(.+?)\.php[\s?] [NC]
#RewriteRule ^ /%1 [R=301,NE,L]

#Hide and Redirect Extension
RewriteCond %{THE_REQUEST} ^[A-Z]+\s.+\.php\sHTTP
RewriteRule ^(.+)\.php$ /$1 [R=301,L]

# internal rewrite from root to /artigos/
RewriteCond %{HTTP_HOST} avantitecnologiati [NC]
RewriteCond %{DOCUMENT_ROOT}/artigos/$1.php -f
RewriteRule ^([\w-]+)/?$ artigos/$1.php [L]

# internal rewrite from root to /unidades/pagina_agencia/
RewriteCond %{HTTP_HOST} avantitecnologiati [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_agencia/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_agencia/$1.php [L]

# internal rewrite from root to /unidades/pagina_locker/
RewriteCond %{HTTP_HOST} avantitecnologiati [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_locker/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_locker/$1.php [L]

# internal rewrite from root to /unidades/pagina_estado/
RewriteCond %{HTTP_HOST} avantitecnologiati [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_estado/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_estado/$1.php [L]

# internal rewrite from root to /unidades/pagina_cidade/
RewriteCond %{HTTP_HOST} avantitecnologiati [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_cidade/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_cidade/$1.php [L]

# internal rewrite from root to /unidades/pagina_bairro/
RewriteCond %{HTTP_HOST} avantitecnologiati [NC]
RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_bairro/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_bairro/$1.php [L]

# handle .php extension internally
#RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{DOCUMENT_ROOT}/$1.php -f
#RewriteRule ^(.+?)/?$ $1.php [L]

# remove .php extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]

##ErrorDocument 404 http://www.avantitecnologiati.com.br/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . / [L,R=301]
RavinderSingh13
  • Thank you so much for taking the time to help me. I am trying to test here, but caching is a problem, lol. I cleared history, cookies, and all data, and even flushed DNS. Through an online proxy it works, but I am really trying to test from my own network, haha – Sophie Jan 12 '23 at 09:27
  • @Sophie, Or try in any other browser where you haven't tried it yet? – RavinderSingh13 Jan 12 '23 at 09:28
  • Yes, I did; I tried Edge, lol. Anyway, using this proxy site: https://www.proxysite.com I could simulate a request without the trailing slash on any page, and I got redirected to the home page ;( – Sophie Jan 12 '23 at 09:30
  • @Sophie, what is the sample URL you are hitting? – RavinderSingh13 Jan 12 '23 at 09:31
  • Testing through a proxy browser is terrible, so forgive me if it is actually working, but I'm not sure that it is, lol – Sophie Jan 12 '23 at 09:34
  • OK, now I managed to test it, and it really isn't working properly: when I request any URL without the trailing slash, I get redirected to the index instead of to the same URL with a trailing slash – Sophie Jan 12 '23 at 09:46
  • @RavinderSingh13 Your rule to "add the trailing slash" is missing the slash prefix on the target URL (and there is no `RewriteBase` directive), so it will result in a malformed redirect to a URL that does not exist, which then gets redirected a second time to the _root_ by the very last rule. The rule that removes the `.php` extension is also missing the trailing slash, so it will result in an additional (unnecessary) redirect. – MrWhite Jan 14 '23 at 17:40
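
The two fixes described in that last comment might look like the sketch below, assuming the .htaccess is in the document root and no `RewriteBase` is set. This is not the answerer's code, only an illustration of adding the slash prefix to the first target and the trailing slash to the second:

```apache
# Sketch only: same conditions and patterns as in the answer above,
# with the two substitutions adjusted.

# Add the trailing slash. The leading "/" makes the target root-relative,
# so the redirect no longer depends on RewriteBase.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+[^/])$ /$1/ [L,R=301]

# Remove the .php extension and land directly on the trailing-slash URL,
# so the rule above does not fire again on the redirected request.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s.+\.php\sHTTP
RewriteRule ^(.+)\.php$ /$1/ [R=301,L]
```

In a document-root .htaccess the pattern is matched against the path without its leading slash, which is why the substitution has to supply that slash itself.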

You have a massive .htaccess. I will combine and merge a few rules to shorten it and also handle adding the trailing slash:

Options -MultiViews
RewriteEngine On

## Adding a trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule [^/]$ https://www.avantitecnologiati.com.br%{REQUEST_URI}/ [L,R=301,NE]

## add www and turn on https in same rule
RewriteCond %{HTTP_HOST} !^www\. [NC,OR]
RewriteCond %{HTTPS} !on
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://www.%1%{REQUEST_URI} [R=301,L,NE]

# external redirect rule to remove /artigos/ and /unidades/pagina_*/ from URLs
RewriteCond %{THE_REQUEST} \s/+(?:artigos|unidades/pagina_(?:agencia|locker|estado|cidade|bairro))/(\S*)\sHTTP [NC]
RewriteRule ^ /%1 [L,R=301,NE]

# Remove .php extension externally
RewriteCond %{THE_REQUEST} \s/+(.+?)\.php[\s?] [NC]
RewriteRule ^ /%1/ [R=301,NE,L]

# internal rewrite from root to /artigos/
RewriteCond %{DOCUMENT_ROOT}/artigos/$1.php -f
RewriteRule ^([\w-]+)/?$ artigos/$1.php [L]

RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_agencia/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_agencia/$1.php [L]

RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_locker/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_locker/$1.php [L]

RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_estado/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_estado/$1.php [L]

RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_cidade/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_cidade/$1.php [L]

RewriteCond %{DOCUMENT_ROOT}/unidades/pagina_bairro/$1.php -f
RewriteRule ^([\w-]+)/?$ unidades/pagina_bairro/$1.php [L]

# handle .php extension internally
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.+?)/?$ $1.php [L]

Make sure to use a different browser or clear your browser cache when testing these rules, so that old cached redirects do not interfere.

anubhava
  • Hey anubhava, thank you so much for your time! You always have a solution, hehe. Just one point: after using these rules to force the trailing slash, I get 3 jumps (301s) when I request a URL without the trailing slash. That's a problem for SEO because I lose ranking strength. Do you know how I can solve this? I will update the question with screenshots – Sophie Jan 14 '23 at 12:12
  • Open your page in Chrome dev tools without the trailing slash and with caching disabled. Then check the Network tab to see which redirects you get. Make sure you don't have any other .htaccess and that your site root .htaccess is exactly as shown in my answer. – anubhava Jan 14 '23 at 12:23
  • I updated the question with the screenshots, hehe – Sophie Jan 14 '23 at 12:31
  • There was a typo. In the first redirect rule I should have used `https` but it was `http`. It should all work fine now once you clear your browser cache. – anubhava Jan 14 '23 at 17:08