1

I need to filter out spam requests whose URL contains an email address or any other personal information.

For example, if anyone enters the URL

www.mydomain.com/en-us?email=abc@gmail.com

it should redirect to

www.mydomain.com/en-us?email=

[Use a regex to pattern-match the email and remove it.] Basically, it should keep the URL as is and remove only the email address.

Another example:

Redirect

www.mydomain.com/en-us/sompePage/SomeStructure?query=abc.gmail.com

to

www.mydomain.com/en-us/sompePage/SomeStructure?query=

This is what I tried :

(http|https)://mydomain.com/(^((?!\.)[\w-_.]*[^.])(@\w+)(\.\w+(\.\w+)?[^.\W])$

but it throws an error.

Back story and reasoning, if interested

We looked at Google Analytics and realized that our website is getting requests with a random email address in a random query string. Google marks these as storing personal information and hence sees them as a policy violation. So we are trying to place some regex in Akamai so that these requests never hit the server. (We also have fallback JavaScript in place to handle the same case.)

Night Monger

4 Answers

0

Your examples do not explain the problem very well. The main issue is how to identify the 'mail' that you would like to remove.

I would take another approach:

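use strict;
use warnings;

use Data::Dumper;

my $debug = 0;

while ( my $line = <DATA> ) {
    chomp $line;
    print "$line\n" if $debug;

    # Split the URL into protocol, domain name, path and query string.
    # Lines that do not match are skipped instead of reusing stale captures.
    next unless $line =~ m{^(https?)://([\w.-]+)/([^?]*)\?(.*)$};

    my %url;
    @url{qw(proto dn path query)} = ( $1, $2, $3, $4 );

    print Dumper( \%url ) if $debug;

    # Now do whatever your heart desires with $url{query}.
    # Here: keep everything up to and including the last '=' and drop the value.
    $url{query} = $1 if $url{query} =~ /(.*=)/;

    print Dumper( \%url ) if $debug;

    # Rejoin the parts; the query string is separated from the path by '?'.
    printf "%s://%s/%s?%s\n",
                $url{proto},    # protocol
                $url{dn},       # domain name
                $url{path},     # directory path
                $url{query};    # query

}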

__DATA__
http://www.example.org/en-us?email=abc@gmail.com
https://www.example.org/en-us/sompePage/SomeStructure?query=abc.gmail.com
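
For comparison, here is a minimal Python sketch of the same split-and-rebuild idea using the standard library's urllib.parse instead of a hand-written URL regex. The function name blank_query_values is hypothetical, and the sketch assumes that every parameter value should be emptied while the parameter names are kept, which is what the Perl script above does for a single parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def blank_query_values(url: str) -> str:
    """Return url with every query parameter kept but its value emptied."""
    parts = urlsplit(url)
    # keep_blank_values=True so parameters that are already empty survive.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    blanked = urlencode([(key, "") for key, _value in pairs])
    # Rebuild the URL with the original scheme, host and path.
    return urlunsplit(parts._replace(query=blanked))


if __name__ == "__main__":
    for url in (
        "http://www.example.org/en-us?email=abc@gmail.com",
        "https://www.example.org/en-us/sompePage/SomeStructure?query=abc.gmail.com",
    ):
        print(blank_query_values(url))
```

Unlike a greedy /(.*=)/, this handles several parameters separated by & one by one, but it is only a sketch and does not try to decide whether a value is actually an email address.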
Polar Bear
0

Well, I have read your question once more and realized that Akamai probably allows only regex modifications and does not support full-fledged Perl scripts.

Then what you are looking for is probably s|=.*|=| or s/=.*/=/

use strict;
use warnings;

while( <DATA> ) {
    s|=.*|=|;
    print;
}

__DATA__
www.mydomain.com/en-us?email=abc@gmail.com
www.mydomain.com/en-us/sompePage/SomeStructure?query=abc.gmail.com

But again, this method does not identify an e-mail address in the query; it removes everything after the first '='. Your question is not complete enough to make a full judgement on the problem.

You could try s/[\w\d\.\-]+@[\w\d\.]+// as a substitution regex that matches an e-mail address.
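
To show how that substitution behaves on the URLs from the question, here is a small Python sketch. The pattern is modelled on the regex above; allowing a hyphen in the domain part is my own assumption, and the names EMAIL_RE and strip_emails are hypothetical. It is a loose "looks like user@domain" match, not a validator.

```python
import re

# Loose pattern: user part, '@', domain part. Assumption: hyphens are
# allowed in the domain as well as in the user part.
EMAIL_RE = re.compile(r"[\w.\-]+@[\w.\-]+")


def strip_emails(url: str) -> str:
    """Remove every substring of url that looks like user@domain."""
    return EMAIL_RE.sub("", url)


if __name__ == "__main__":
    print(strip_emails("www.mydomain.com/en-us?email=abc@gmail.com"))
    print(strip_emails("www.mydomain.com/en-us?email=abc@gmail.com&lang=en"))
    print(strip_emails("www.mydomain.com/en-us/sompePage/SomeStructure?query=abc.gmail.com"))
```

Because & and = are not in either character class, the match stops at the next parameter, so other parameters are left alone. The third URL is printed unchanged: abc.gmail.com contains no @, so it is not treated as an e-mail address.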

Polar Bear
  • Not sure what the confusion is. All I want is to strip the email address from the URL; the rest of the URL remains the same. I am trying to find the regex that would find the email in the URL – Night Monger Nov 07 '19 at 03:34
  • Will you have any other parameters in the query? What do you plan to do with them? See above: `s/[\w\d\.\-]@[\w\d\.\*]//` should strip the e-mail address if there are no other parameters provided. Otherwise you need to provide some '_stop_' after the e-mail, and this '_stop_' will probably be the **&** symbol (you need to see how it is encoded in the query). For example, `abc.gmail.com` does not _match_ an e-mail address (an e-mail should look like **user@domain**, where user can include _letters/digits/./-_ and domain _letters/digits/./-_). – Polar Bear Nov 07 '19 at 03:42
  • Please check the following [question](https://stackoverflow.com/questions/37703864/regular-expression-to-validate-an-email-in-perl) about verification of e-mail addresses. In particular, you should refer to [RFC822 Section 6.1](https://www.w3.org/Protocols/rfc822/#z8) for the email address 'specification'. Sorry, in the previous reply please read `s/[\w\d\.\-]+@[\w\d\.]+//` – Polar Bear Nov 07 '19 at 03:51
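
The point in the comments about other parameters and encoding can be sketched in Python as follows. This assumes the query string may hold several parameters separated by & and that the @ may arrive percent-encoded as %40; the names EMAIL_RE and scrub_query are hypothetical, and the pattern is the same loose user@domain match discussed above, not an RFC-compliant validator.

```python
import re
from urllib.parse import parse_qsl, urlencode

# Loose "user@domain" pattern; an assumption, not a full address validator.
EMAIL_RE = re.compile(r"[\w.\-]+@[\w.\-]+")


def scrub_query(query: str) -> str:
    """Blank out e-mail-like text in each parameter value of a query string."""
    # parse_qsl splits on '&' and percent-decodes, so '%40' becomes '@'.
    pairs = parse_qsl(query, keep_blank_values=True)
    cleaned = [(key, EMAIL_RE.sub("", value)) for key, value in pairs]
    # urlencode re-encodes whatever is left in each value.
    return urlencode(cleaned)


if __name__ == "__main__":
    print(scrub_query("email=abc%40gmail.com&lang=en"))
    print(scrub_query("query=abc.gmail.com"))
```

Decoding before matching means an encoded address is caught as well, and working per parameter keeps the other parameters intact.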
0

EDIT:

Looking more closely, I forgot we had added the Match with Regular Expressions. Our RegEx engine defaults to PCRE syntax, so you could theoretically make a comprehensive Match in the whole query string:

[image: match on regular expression]

Double check and test on the Staging platform before you commit. And double check the RegEx. I took that from emailregex.com and didn't test it myself.

ORIGINAL:

With Akamai, the Property Manager tool lets you do this with a new rule. You should check the documentation and test before deploying, or consult with your account team for more specific questions.

In the rule, you'll want to add a match for the query string like so:

[image: property manager with new rule and match]

From there, add the behavior to have the Akamai platform do what you want. If it's a simple redirect, then you can use the Redirect behavior and remove the query strings completely. Something like this:

[image: redirect behavior]

As the info box says, this specific use case might be better handled with the Redirector Cloudlet. But there are many things you can do once you've matched on that query string.

Josh Cheshire
-1

The code for Email::Valid contains a regex for validating an email address. It's rather more complex than most people think, though :-)

Dave Cross