
I wrote two different RewriteRules for my site:

# Enable URL Rewriting
RewriteEngine on

# exclude followed stuff
RewriteRule ^(js|img|css|favicon\.ico|image\.php|anprobe|content|libs|flash\.php|securimage)/ - [L,QSA,S=2]

# conditions (REQUEST dont point @ file|dir|link)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-F
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l

# rules
RewriteRule ^(?!index\.php)brillen/(.*(brillen)|360|neu)/(.*)([a-zA-Z0-9]{5}-[a-zA-Z0-9]{5}(?!\.))(.*)$     /index.php/brillen/$1?art_id=$4&$5&%{QUERY_STRING}      [NS,QSA,L]
RewriteRule ^(?!index\.php)(.*)$                                                            /index.php/$1                                   [NS,QSA,L]

... and I'm running into a strange problem: every request causes the page to be loaded twice internally, so database actions and email dispatching are also executed twice.

Does anyone have an idea what could cause this?

Thanks in advance!

Note 1: All requested resources are valid and available according to the browser's resource tracking.

Note 2: Could the problem originate from retaining and post-processing the PATH_INFO? (/index.php/$1 => /index.php/foo/bar/...)

proximus

3 Answers

The rewrite engine cannot make a single HTTP request run twice. It routes the request within Apache to a static file, a proxy function, or a module (such as PHP), possibly altering the request along the way. It cannot clone the request and hand it to Apache twice.

When you have a "run twice" problem, chances are that you have been hit by the empty image URL bug. In fact it is not really a bug: it is a feature of HTML (at least before HTML5) and of URL parsing.

If the page contains an empty GET URL somewhere, HTML states that the browser should re-send the same query (the one that produced the current page) with the same parameters. This can make a POST request happen twice (if the first page was requested via POST). So where do these empty GET URLs come from? Most of the time you have either:

<IMG SRC="" ...> (in the HTML)

or:

url() (in the css)

or:

<script type="text/javascript" src=""></script>
<link rel="stylesheet" type="text/css" href=""> (in the HTML headers)
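To check generated markup for the patterns above, a small scanner can help. This is only a sketch: `find_empty_urls`, `EmptyUrlFinder`, and the `URL_ATTRS` table are names made up for this example, it relies on Python's standard `html.parser`, and it does not look inside CSS, so an empty `url()` would still have to be searched for separately.

```python
from html.parser import HTMLParser

# Hypothetical table for this sketch: attributes that make a browser fetch a URL.
URL_ATTRS = {"img": "src", "script": "src", "link": "href", "iframe": "src"}


class EmptyUrlFinder(HTMLParser):
    """Collects (tag, attribute, line) for URL attributes that are empty."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        wanted = URL_ATTRS.get(tag)
        for name, value in attrs:
            # value is None for a bare attribute such as <img src>
            if name == wanted and not (value or "").strip():
                self.hits.append((tag, name, self.getpos()[0]))


def find_empty_urls(html):
    """Return a list of (tag, attribute, line number) for empty URL attributes."""
    finder = EmptyUrlFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Feeding it the HTML that your script actually emitted (for example the output saved by wget) lists every tag that would make a browser request the current page again.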

Also read @Jon's answer about the favicon request. You should always test the result without browser behaviour, by issuing the request with wget or via telnet on port 80.

Update: detailed explanations and follow-ups are available on this blog, with HTML5 additions which should remove this behavior in modern browsers.
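As an alternative to wget or telnet, the same kind of browser-free test can be scripted. The sketch below assumes plain HTTP and uses Python's standard `http.client`; `fetch_once` is a name invented for this example. It sends one request, does not follow redirects, and never fetches images, scripts, or a favicon, so any second execution seen in the server logs cannot come from the client.

```python
import http.client


def fetch_once(host, path="/", port=80, method="GET"):
    """Send exactly one request and return (status, body length)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        body = response.read()
        return response.status, len(body)
    finally:
        conn.close()
```

Call it once against the affected page, then count how many times the script ran according to your own logs.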

regilero
  • I triple-checked all resources and searched for the patterns you mention, but none of them apply... there's not even any output in the browser's (JS) console. – proximus May 10 '11 at 10:05
  • In Firebug's network panel, don't you see a second HTML request? And if you try with lynx or telnet, do you still get the second execution? – regilero May 10 '11 at 10:35
  • There's no second request visible in firebug. The second request is actually not visible to the user at all, it seems just to be done internally (therefore it's only visible in my syslogs). Same happens calling the page with lynx. – proximus May 10 '11 at 10:53
  • Thanks for your help regilero, but I just found out that the problem comes from outside of the rewrites. A recent coincidence misled me to the conclusion that the reload comes from the rewrites. I appreciate your answer, which is correct in applicable cases! – proximus May 10 '11 at 11:12
  • @proximus, can you give us the source of the bug, for future readers? Proxy cache, FCGI? – regilero May 10 '11 at 11:40
  • What was your problem? I am having the same issue and really can't get to the bottom of it... – Mihai Fratu Apr 22 '13 at 12:49
  • @regilero Could you elaborate please? I'm having a run-twice problem too, but it's intermittent. I have scripts that I run by typing in the URL in the browser. Sometimes they do produce `<img src="">`, but shouldn't that just show a broken link on the page? Why would the browser reload the whole page just due to one bad image? Thanks for any light you can shed on the issue. – SaganRitual Oct 07 '13 at 23:56
  • Because that's the HTTP norm: an empty GET URL (and an empty img src is an empty GET URL) means `same-url-and-method-used-to-produce-containing-current-page`. So never generate empty URLs, never, really, and never assume this means a broken URL, because it does not mean broken URL at all. – regilero Oct 14 '13 at 14:17
  • @GreatBigBore I've added a link to follow-up details where we can see that this behavior will be removed from modern browsers (so you will be allowed to forget it in ten years) – regilero Oct 14 '13 at 14:25
  • Thanks for the answer. I was having problems with a captcha code always being incorrect. – Twifty Jan 08 '15 at 19:47
  • The same issue happens if you request a JS file that does not exist, which might help explain why @Avi's answer is also a way to resolve it. – Ham Dong Kyun Apr 07 '16 at 15:54

I had the same problem. I was doing some URL rewriting, and the script was being loaded twice because I had not added this rule:

RewriteRule ^(js|img|css|favicon\.ico)/ - [L,QSA,S=2]

This will stop the script from being loaded twice; it solved my problem.

Nick Cox
Avi

I had the same issue (or so I thought). It was caused by the request for favicon.ico, which I hadn't considered in my rewrite rule.
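To see how a favicon request can end up in the front controller, here is a rough simulation of the pattern matching only. It is a sketch under several assumptions: it uses Python's `re` instead of Apache's PCRE (these two patterns behave the same in both), it assumes per-directory context so paths have no leading slash, and it ignores the RewriteCond file checks and all rule flags. `route` is a name made up for this example. Note that the exclusion pattern requires a slash after the matched name, so a bare `favicon.ico` is not covered by it; in a real setup the `!-f` condition would still skip the rewrite if the file exists on disk.

```python
import re

# Patterns copied from the rules in this thread (per-directory context,
# so the path has no leading slash).
EXCLUDE = re.compile(r"^(js|img|css|favicon\.ico)/")
CATCH_ALL = re.compile(r"^(?!index\.php)(.*)$")


def route(path):
    """Return where a request path would end up under the two patterns."""
    if EXCLUDE.search(path):
        return path  # left untouched, like the "-" substitution
    match = CATCH_ALL.search(path)
    if match:
        return "/index.php/" + match.group(1)
    return path  # already starts with index.php, nothing matches


for p in ("css/site.css", "favicon.ico", "index.php/foo", "brillen/neu"):
    print(p, "->", route(p))
```

If `favicon.ico` does not exist, each page view in a browser can therefore trigger a second run of index.php, which looks like the page loading twice.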

Gavin Miller
Jon