
UPDATE:

The solution in my case was this:

map $http_host$request_uri $pageCache {
    default "nonexistent";
    "~^(?<subdomain1>.{4})(?<subdomain2>.*)\.example\.com(?<folder1>.*?)\/?(\?.*)?$" page-cache/subdomains/$subdomain1/$subdomain1$subdomain2$folder1/1.html;
    "~^example\.com(?<folder1>.*?)\/?(\?.*)?$" page-cache$folder1/1.html;
}

Especially this part:

(?<folder1>.*?)\/?(\?.*)?

Thank you Gerard H. Pille!
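To sanity-check the map outside nginx, here is a minimal Python sketch of the same two patterns. It is only an approximation: Python writes named groups as (?P<name>...) where nginx's PCRE accepts (?<name>...), the helper name page_cache and the sample hosts and paths are made up for illustration, and it assumes the map tries the regexes in the order they are written.

```python
import re

# Python spelling of the two map patterns: (?P<name>...) instead of (?<name>...).
SUBDOMAIN = re.compile(
    r"^(?P<subdomain1>.{4})(?P<subdomain2>.*)\.example\.com"
    r"(?P<folder1>.*?)/?(\?.*)?$"
)
APEX = re.compile(r"^example\.com(?P<folder1>.*?)/?(\?.*)?$")


def page_cache(http_host: str, request_uri: str) -> str:
    """Hypothetical helper mirroring the map: cache path, or the default value."""
    key = http_host + request_uri  # same input as $http_host$request_uri
    m = SUBDOMAIN.match(key)
    if m:
        s1, s2, folder = m.group("subdomain1", "subdomain2", "folder1")
        return f"page-cache/subdomains/{s1}/{s1}{s2}{folder}/1.html"
    m = APEX.match(key)
    if m:
        return f"page-cache{m.group('folder1')}/1.html"
    return "nonexistent"  # the map's default


if __name__ == "__main__":
    print(page_cache("example.com", "/sub/folders/?querystring=blah"))
    print(page_cache("shopping.example.com", "/a?x=1"))
```

With these made-up inputs, the first call yields page-cache/sub/folders/1.html and the second yields page-cache/subdomains/shop/shopping/a/1.html: the trailing slash and the query string are dropped in both cases.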


How can I get everything from a URL except the last / (forward slash) and any query string such as ?querystring=blah? I need it captured into a group, like the "path" group below. The following example captures a "path" group, but it won't work if the last character is not / or ?:

   ^(?<path>.*[^\/?])

The following will capture everything including the last forward-slash (but nothing after), but I need to omit the last forward slash too:

^(?<path>.*[\/])
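For anyone who wants to see what these two attempts actually capture, here is a small Python sketch. It is not nginx itself: Python spells the named group (?P<path>...), the helper captured_path and the sample URIs are invented for illustration, and the patterns are used unanchored at the end, exactly as written above.

```python
import re

# The two attempts from the question, in Python's named-group spelling.
ATTEMPT_NOT_SLASH_OR_QMARK = re.compile(r"^(?P<path>.*[^/?])")
ATTEMPT_UP_TO_LAST_SLASH = re.compile(r"^(?P<path>.*[/])")


def captured_path(pattern: re.Pattern, uri: str) -> str:
    """Return the 'path' group, or an empty string when the pattern does not match."""
    m = pattern.match(uri)
    return m.group("path") if m else ""


# Made-up sample URIs: plain, trailing slash, trailing slash plus query string.
SAMPLES = ["/sub/folders", "/sub/folders/", "/sub/folders/?querystring=blah"]

if __name__ == "__main__":
    for uri in SAMPLES:
        first = captured_path(ATTEMPT_NOT_SLASH_OR_QMARK, uri)
        second = captured_path(ATTEMPT_UP_TO_LAST_SLASH, uri)
        print(f"{uri!r}: first={first!r} second={second!r}")
```

In this sketch the first attempt strips a trailing slash but swallows the whole query string, because the greedy .* runs to the end and only backs off one character. The second attempt stops at the last slash, so it keeps the trailing slash and loses the final path segment when there is no trailing slash.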

For example, I need:

If you're wondering why I need this regex, it's because I need the full path to determine where a "cached HTML" file is located, so that nginx can serve that file instead of going through PHP:

map $request_uri $request_uri_path {
  "~^(?<path>.*[^\/?])$" $path;
}
# Get the page cache path
map $http_host$request_uri_path $pageCache {
    default "nonexistent";
    "~^(?<subdomain1>.{4})(?<subdomain2>.*)\.example\.com(?<folder1>.*)$" page-cache/subdomains/$subdomain1/$subdomain1$subdomain2$folder1/1.html;
    "~^example\.com(?<folder1>.*)$" page-cache$folder1/1.html;
}

Note: When I use $uri, it returns a value containing "/index.php", which is not what I want. I cannot use $scheme://$http_host either, since it doesn't include the folder path of the URL (e.g. /sub/folders).

PS. Yes, I have asked this question before, but I posed it without explaining it properly, so I deleted it and resubmitted it with more clarification.

Update: Full server block as requested:

server {
    listen 80;
    listen [::]:80;
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name .example.com;
    root /home/sys/example.com/public;

    # Block Bad Bots
    if ($http_user_agent ~* (bingbot|360Spider|80legs.com) ) {
        return 444;
    }

    # SSL (DO NOT REMOVE!)
    ssl_certificate /etc/nginx/ssl/example.com/123/server.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com/123/server.key;

    ssl_protocols TLSv1.2;
    ssl_ciphers blahblah;
    ssl_prefer_server_ciphers on;
    ssl_dhparam /etc/nginx/dhparams.pem;

    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-XSS-Protection "1; mode=block";
    add_header X-Content-Type-Options "nosniff";

    index index.html index.htm index.php;

    charset utf-8;

    include sys-conf/example.com/server/*;    

    location / {
        limit_req zone=one burst=10 nodelay;

        if ($http_user_agent ~* "^.*wkhtmltoimage.*$"){
            return 403;
        }

        try_files $pageCache $uri $uri/ /index.php?$query_string;
    }

    location = /favicon.ico {
        access_log off;
        log_not_found off;
    }
    location = /robots.txt  {
        access_log off;
        log_not_found off;
    }
    location = /ads.txt  {
        access_log off;
        log_not_found off;
    }

    location ~*  \.(js)$ {
        expires 3d;
    }

    access_log off;
    error_log  /var/log/nginx/example.com-error.log error;

    error_page 404 /index.php;

    location ~ \.php$ {
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass unix:/var/run/php/php7.3-fpm.sock;
        fastcgi_index index.php;
        include fastcgi_params;

        #tweaks
        fastcgi_buffers 8 16k; # increase the buffer size for PHP-FPM
        fastcgi_buffer_size 32k; # increase the buffer size for PHP-FPM
        fastcgi_connect_timeout 60;
        fastcgi_send_timeout 300;
        fastcgi_read_timeout 300;
    }

    location ~ /\.(?!well-known).* {
        deny all;
    }
}
NAMAssist

1 Answer


Please give this a whirl:

map $uri $pageCache {
    "~^(?<folder1>.*?)/?(\?.*)?$" page-cache$folder1/1.html;
}
Gerard H. Pille
  • Thank you! This helped me write the code needed. I have to use $http_host$request_uri instead of $uri in my case, since $uri returns 'index.php' each time. – NAMAssist May 02 '20 at 13:13
  • It would be no problem to strip index.php too. The trick was the ".*?", an ungreedy match-all. Is the index.php problem caused by the split_path_info? – Gerard H. Pille May 02 '20 at 13:54
  • Honestly, not sure what causes the index.php response from $uri. But regardless thank you! – NAMAssist May 02 '20 at 13:59
  • I think I found it. When does something get stored in the pagecache? – Gerard H. Pille May 02 '20 at 14:06
  • PHP generates the html files that are stored in pagecache when that page is visited for the first time (when the cache file does not exist yet). – NAMAssist May 02 '20 at 14:11
  • When the file hasn't been cached yet, try_files changes the uri to index.php. – Gerard H. Pille May 02 '20 at 14:18
  • Ok, so $uri won't be usable then based on my current setup it seems. I've got the above implementation working now - try_files is calling $pageCache and returning those files successfully :) – NAMAssist May 02 '20 at 14:48
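
The point made in the comments about the ungreedy ".*?" can be illustrated with a short Python sketch that compares the answer's pattern against a greedy variant. The greedy variant, the helper folder1, and the sample URIs are assumptions added for comparison only, and Python's (?P<name>...) spelling stands in for nginx's (?<name>...).

```python
import re

# Same tail in both patterns; only the quantifier on folder1 differs.
GREEDY = re.compile(r"^(?P<folder1>.*)/?(\?.*)?$")
LAZY = re.compile(r"^(?P<folder1>.*?)/?(\?.*)?$")


def folder1(pattern: re.Pattern, uri: str) -> str:
    """Return what the folder1 group captures (both patterns match any single-line input)."""
    return pattern.match(uri).group("folder1")


if __name__ == "__main__":
    for uri in ["/sub/folders/", "/sub/folders/?querystring=blah"]:  # made-up URIs
        print(f"{uri!r}: greedy={folder1(GREEDY, uri)!r} lazy={folder1(LAZY, uri)!r}")
```

Because the trailing slash and the query string are both optional, a greedy .* is free to consume them and the optional parts simply match nothing. The lazy .*? takes as little as possible, which forces the slash and query string into the optional parts and leaves only the path in folder1.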