Why? I am currently purchasing traffic from AdWords for my e-commerce site. AdWords shows how many visits I have purchased, and I am trying to verify that number by analyzing Apache's access_log.

Note that I also have Mixpanel installed, and it reports a much lower number.

So I tried Apache's mod_unique_id. I added its ID to my access_log format and visited the front page of my site (a WordPress page). This is the log:

W2P5AswfRANRW1uZBgVINAAAAAA 50.74.231.163 - - [03/Aug/2018:06:41:06 +0000] "GET / HTTP/1.1" 200 15571
W2P5A8wfRANRW1uZBgVINQAAAAE 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/plugins/LayerSlider/static/layerslider/css/layerslider.css?ver=6.7.1 HTTP/1.1" 200 3876
W2P5A8wfRANRW1uZBgVINgAAAAM 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/plugins/contact-form-7/includes/css/styles.css?ver=5.0.1 HTTP/1.1" 200 656
W2P5A8wfRANRW1uZBgVINwAAADw 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/themes/themextend/css/magnific-popup.css?ver=4.9.5 HTTP/1.1" 200 1816
W2P5A8wfRANRW1uZBgVIOAAAADI 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/themes/themextend/css/animate.css?ver=4.9.5 HTTP/1.1" 200 4348
W2P5A8wfRANRW1uZBgVIOQAAACY 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/themes/themextend/css/theme-style.css?ver=4.9.5 HTTP/1.1" 200 3035
W2P5A8wfRANRW1uZBgVIOgAAABI 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/themes/themextend/style.css?ver=4.9.5 HTTP/1.1" 200 2136
W2P5A8wfRANRW1uZBgVIOwAAACQ 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /wp-content/plugins/kingcomposer/includes/frontend/vendors/owl-carousel/owl.theme.css?ver=2.6.17 HTTP/1.1" 200 658
W2P5BMwfRANRW1uZBgVIPAAAADE 50.74.231.163 - - [03/Aug/2018:06:41:08 +0000] "GET /wp-content/plugins/kingcomposer/includes/frontend/vendors/owl-carousel/owl.carousel.css?ver=2.6.17 HTTP/1.1" 200 528
W2P5BMwfRANRW1uZBgVIPQAAACU 50.74.231.163 - - [03/Aug/2018:06:41:08 +0000] "GET /wp-content/themes/themextend/css/slick.css?ver=4.9.5 HTTP/1.1" 200 557
W2P5BMwfRANRW1uZBgVIPgAAAC0 50.74.231.163 - - [03/Aug/2018:06:41:08 +0000] "GET /wp-content/themes/themextend/css/owl.carousel.min.css?ver=4.9.5 HTTP/1.1" 200 912
W2P5BJARzZ5ERwnzv6rD5gAAAIs 50.74.231.163 - - [03/Aug/2018:06:41:08 +0000] "GET /wp-content/themes/themextend/css/meanmenu.min.css?ver=4.9.5 HTTP/1.1" 200 682
W2P5BNIyikkB8rhsjsyUkQAAAMw 50.74.231.163 - - [03/Aug/2018:06:41:08 +0000] "GET /wp-content/themes/themextend/css/theme-default.css?ver=4.9.5 HTTP/1.1" 200 2643
W2P5BISOELTKgpmuElDYuQAAAEI 50.74.231.163 - - [03/Aug/2018:06:41:08 +0000] "GET /wp-content/themes/themextend/css/blog-post.css?ver=4.9.5 HTTP/1.1" 200 4773

As you can see:

  1. Everything shown above is a single visit.
  2. mod_unique_id generates a unique ID for each web request. The HTML of my front page includes more than 30 assets, such as CSS and images hosted locally.
  3. The timestamps of the requests above can differ.
  4. Multiple visits can happen simultaneously (the IPs will be different).

How can I configure Apache to allow me to count each visit to my site properly?

Tinker

2 Answers

Use client-side tracking, such as cookies. Remember that most IPv4 addresses are behind NAT, so a single address may represent multiple users.

One implementation for httpd is mod_usertrack.

If you are all in on Google, you can get Analytics along with your Ads.

John Mahowald
  • I just read mod_usertrack and learned that it assigns a cookie to each user. This kind of solves my problem, but if the user decides to go to another page on my site I still have the problem with the same cookie visiting multiple pages. – Tinker Aug 03 '18 at 16:43
  • Access logs are raw data and know nothing about user sessions. You would need to analyze them, for example by importing them into a database and querying for unique cookies per day. You would also need some way to separate ad referrals from non-ad traffic. – John Mahowald Sep 09 '18 at 20:03
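
The per-day query on unique cookies described in the comment above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it assumes each log line starts with the visitor cookie value followed by the usual Common Log Format fields, and the function name unique_visitors_per_day and the sample lines are made up for the example.

```python
import re
from collections import defaultdict

# Assumed layout: "<cookie-id> <ip> <ident> <user> [dd/Mon/yyyy:HH:MM:SS zone] ..."
LINE_RE = re.compile(r'^(?P<visitor>\S+) \S+ \S+ \S+ \[(?P<day>[^:]+):')


def unique_visitors_per_day(lines):
    """Return {day: number of distinct visitor IDs seen on that day}."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        # Skip malformed lines and requests that carried no cookie ("-").
        if match is None or match.group("visitor") == "-":
            continue
        seen[match.group("day")].add(match.group("visitor"))
    return {day: len(ids) for day, ids in seen.items()}


# Hypothetical sample lines, not taken from a real log.
sample = [
    'abc 50.74.231.163 - - [03/Aug/2018:06:41:06 +0000] "GET / HTTP/1.1" 200 15571',
    'abc 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] "GET /about HTTP/1.1" 200 900',
    'xyz 198.51.100.7 - - [03/Aug/2018:07:00:00 +0000] "GET / HTTP/1.1" 200 15571',
    '- 198.51.100.9 - - [04/Aug/2018:07:00:00 +0000] "GET / HTTP/1.1" 200 15571',
]
print(unique_visitors_per_day(sample))
```

Counting distinct IDs rather than lines means that a visitor who loads several pages, or many assets, is still counted once per day.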

The following PHP snippet sets a cookie with a random value, overwriting the cookie if it is already set. Add it to the very beginning of every page that you would like to count as an individual visit, and give the page a .php extension. setcookie() must be called before any output is sent, so placing the snippet at the end of a page only works if output buffering is enabled.

<?php
    /*
     * generateRandomString() is used to generate the ID, taken from 
     * https://stackoverflow.com/questions/4356289/php-random-string-generator/4356295#4356295
     */
    function generateRandomString($length = 10) {
        $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
        $charactersLength = strlen($characters);
        $randomString = '';
        for ($i = 0; $i < $length; $i++) {
             $randomString .= $characters[rand(0, $charactersLength - 1)];
        }
        return $randomString;
    }

    // Generate a 12-character ID and send it as the "visit" cookie.
    // This must run before any output is sent to the client.
    setcookie("visit", generateRandomString(12));
?>

Then put %{visit}C in the LogFormat entry in place of the ID that mod_unique_id was generating. The script sets a new cookie ID every time a client loads an actual page, and Apache logs that cookie wherever %{visit}C appears. Note that %{visit}C records the cookie the client sent with the request, so the page request itself logs the previous value (or - on a first visit), and the new ID first appears on the requests that follow it. When I was writing this answer, I was going to use a PHP session variable instead, but couldn't figure out a way to include that in an Apache log file.


This is a simpler solution that I wrote before coming up with the PHP script and didn't want to delete. This solution doesn't answer your question as accurately as the other, but you (or someone else that reads this) may still find it useful.

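You could record the requests for assets in a separate log, so that the main log contains only page visits (and robots). The following directives, placed in the VirtualHost section of your site config, flag any request whose path ends in .css or .jpg (add or remove extensions as necessary). SetEnvIf takes a bare regular expression without /.../ delimiters, and Request_URI excludes the query string, so anchoring on the end of the path still matches URLs such as style.css?ver=4.9.5:

SetEnvIf Request_URI "\.css$" assets
SetEnvIf Request_URI "\.jpg$" assets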

To record 'visits', append env=!assets to your CustomLog entry:

CustomLog /path/to/visit/log <logformat identifier> env=!assets

To record the assets matched by the SetEnvIf rules in their own log:

CustomLog /path/to/asset/log <logformat identifier> env=assets
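
The same split can also be applied after the fact to a log that already mixes pages and assets. The Python sketch below is only an illustration: the extension list and the helper names is_asset and page_requests are assumptions rather than anything Apache provides, and the two sample lines are copied from the log in the question.

```python
import re
from urllib.parse import urlsplit

# Extensions treated as assets; this list is an assumption, adjust it to the site.
ASSET_EXTENSIONS = (".css", ".js", ".jpg", ".png", ".gif", ".ico")

# Pulls the request target out of the quoted request line, e.g. "GET /x HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<target>\S+) HTTP/[\d.]+"')


def is_asset(line):
    """True if the request path in this log line ends with an asset extension."""
    match = REQUEST_RE.search(line)
    if match is None:
        return False  # lines without a recognizable request are not treated as assets
    path = urlsplit(match.group("target")).path  # drops "?ver=..." query strings
    return path.lower().endswith(ASSET_EXTENSIONS)


def page_requests(lines):
    """Keep only the log lines that are not asset requests."""
    return [line for line in lines if not is_asset(line)]


sample = [
    'W2P5AswfRANRW1uZBgVINAAAAAA 50.74.231.163 - - [03/Aug/2018:06:41:06 +0000] '
    '"GET / HTTP/1.1" 200 15571',
    'W2P5A8wfRANRW1uZBgVINgAAAAM 50.74.231.163 - - [03/Aug/2018:06:41:07 +0000] '
    '"GET /wp-content/plugins/contact-form-7/includes/css/styles.css?ver=5.0.1 HTTP/1.1" 200 656',
]
for line in page_requests(sample):
    print(line)
```

Filtering on the path rather than the full request target keeps versioned URLs such as styles.css?ver=5.0.1 in the asset group.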

Billy