I'm trying to block a large amount of referral spam that some of our websites are receiving, so I've created the following regular expression:

.*(event-tracking|porn|hulfington|free-share-buttons|buttons-for-your-website|Get-Free-Traffic|darodar|best-seo-offer|buy-cheap-online|theguardlan|googlsucks).*

I've then gone into Analytics > Admin > Filters > +New Filter > Custom Filter > Exclude Referral > and then added this regular expression.

When verifying though, I receive:

This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small.

Is there any reason why this regex wouldn't work in Google Analytics?

Liam McArthur
  • This is really about configuration, not about programming (plus you might want to try to filter by campaign source, I think the referring url is stored there). – Eike Pierstorff May 13 '15 at 09:55

2 Answers

The expression is correct; you can even drop the leading and trailing .* from it. I know that using Referral as the filter field sounds logical, but you should use Campaign Source instead. Then both your filter and the verification will work (the filter verification only takes a sample of your data).
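To illustrate why the wildcards are optional, here is a minimal sketch that uses Python's `re` module as a stand-in for the Analytics filter engine. It assumes the filter pattern is applied as a partial match, which `re.search` mimics; the sample Campaign Source values are invented for illustration, and the real engine may differ in details such as case sensitivity.

```python
import re

# Spam terms copied from the pattern in the question.
TERMS = (
    "event-tracking|porn|hulfington|free-share-buttons|"
    "buttons-for-your-website|Get-Free-Traffic|darodar|best-seo-offer|"
    "buy-cheap-online|theguardlan|googlsucks"
)

WITH_WILDCARDS = re.compile(".*(" + TERMS + ").*")
WITHOUT_WILDCARDS = re.compile(TERMS)

# Hypothetical Campaign Source values, invented for this example.
SAMPLE_SOURCES = [
    "darodar.com",
    "forum.darodar.com",
    "free-share-buttons.com",
    "google",
    "example.org",
]


def is_spam(pattern, source):
    # re.search looks for the pattern anywhere in the string (partial match).
    return pattern.search(source) is not None


results_with = [is_spam(WITH_WILDCARDS, s) for s in SAMPLE_SOURCES]
results_without = [is_spam(WITHOUT_WILDCARDS, s) for s in SAMPLE_SOURCES]

for source, flagged in zip(SAMPLE_SOURCES, results_without):
    print(f"{source}: {'excluded' if flagged else 'kept'}")
```

Both patterns flag the same sources in this sketch, so under the partial-match assumption the shorter form should be enough.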

Here is the recommendation from Google: https://support.google.com/analytics/answer/1034842?hl=en

Although this is the common way to stop referrer spam, spammers have lately been sending direct visits along with the referrals. Your filter will only catch the referral part, so you will still get spam from the direct visits. Here is a demonstration:

https://webmasters.stackexchange.com/a/81193/49561

If you want to get rid of ghost spam no matter how it arrives (referral, keyword or direct), you should use a valid hostname filter. Ghost spam either uses a fake hostname or has its hostname "not set". Here is detailed information about this solution:

https://stackoverflow.com/a/28354319/3197362

https://stackoverflow.com/a/29717606/3197362
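As a rough sketch of the valid hostname idea, the Python snippet below builds an include pattern from a list of legitimate hostnames and keeps only the hits whose hostname matches it. The hostnames and the hit records are hypothetical; in practice you would list the hostnames that really serve your tracking code and paste the resulting pattern into an include filter on Hostname.

```python
import re

# Hypothetical hostnames; replace them with the ones that legitimately
# serve your tracking code.
VALID_HOSTNAMES = ["example.com", "shop.example.com"]


def build_hostname_pattern(hostnames):
    # Escape each hostname (so "." is literal) and join them as alternatives.
    return "|".join(re.escape(h) for h in hostnames)


VALID_HOSTNAME_RE = re.compile(build_hostname_pattern(VALID_HOSTNAMES))

# Invented hit records: ghost spam shows a fake hostname or "(not set)".
hits = [
    {"hostname": "example.com", "source": "google"},
    {"hostname": "(not set)", "source": "darodar.com"},
    {"hostname": "fake-host.xyz", "source": "(direct)"},
    {"hostname": "shop.example.com", "source": "(direct)"},
]

# Include filter: keep a hit only if its hostname matches a valid one.
kept = [h for h in hits if VALID_HOSTNAME_RE.search(h["hostname"])]

for hit in kept:
    print(hit["hostname"], hit["source"])
```

Because the check is on the hostname rather than the source, the spam hit arriving as a direct visit is dropped as well, which an exclude filter on the referral alone would miss.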

Carlos Escalera Alonso

In Filter help, it is written:

Limitations of filters

Filters require up to 24 hours before they are applied to your data.

Fields specified in a filter must exist in the hit and not be null in order for the filter to be applied to that hit. For example, if you are filtering on Hostname, but the hit does not contain that field (perhaps the hit was sent via the Measurement Protocol and that request did not contain the &dh parameter), then any filters acting on Hostname will be ignored and the hit will be processed as if there was no filter.
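The limitation quoted above can be pictured with a small Python sketch. It builds two Measurement Protocol style payloads, one with the `dh` (hostname) parameter and one without, and checks whether a Hostname filter would have a field to act on. The property ID, client ID and hostname are placeholders, nothing is sent over the network, and `hostname_filter_applies` is a hypothetical helper that only mirrors the rule as described, not how Analytics is implemented.

```python
from urllib.parse import urlencode, parse_qs


def build_hit(include_hostname):
    # Measurement Protocol pageview parameters; all values are placeholders.
    params = {
        "v": "1",
        "tid": "UA-XXXXX-Y",  # placeholder property ID
        "cid": "555",         # placeholder client ID
        "t": "pageview",
        "dp": "/home",
    }
    if include_hostname:
        params["dh"] = "example.com"  # hypothetical hostname
    return urlencode(params)


def hostname_filter_applies(payload):
    # Hypothetical helper: a Hostname filter is only evaluated when the
    # dh field exists in the hit and is not empty.
    fields = parse_qs(payload)
    return bool(fields.get("dh"))


with_dh = build_hit(include_hostname=True)
without_dh = build_hit(include_hostname=False)

print(hostname_filter_applies(with_dh))
print(hostname_filter_applies(without_dh))
```

In this sketch the second hit has no hostname field, so a Hostname filter would be skipped for it and the hit would be processed as if no filter existed.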

So either wait 24 hours for the filter to take effect (your regex will work, since it is valid), or check whether you are filtering on Hostname; in that case, the filter will not be applied to hits that lack that field.

Also, you can check the filter configuration, as the warning suggests. Here you can find a good step-by-step tutorial on how to use the exclusion filter.

Wiktor Stribiżew
  • I don't think that quite answers the question - the OP was wondering why the filter verification tool returned an error (which is there precisely so that you *don't* have to wait 24 hours). – Eike Pierstorff May 13 '15 at 09:47
  • I do not see any error here (this is just a warning). Also, this answers *Is there any reason why this regex wouldn't work in Google Analytics?* - no, there is no reason for it, since the regex is valid. – Wiktor Stribiżew May 13 '15 at 09:53