6

I can create an unlimited number of email aliases in Google Accounts (Gmail); see: http://www.gizmodo.com.au/2014/09/how-to-use-the-infinite-number-of-email-addresses-gmail-gives-you/

But I need to filter email addresses so that a user can't register in my application more than once with the same underlying address.

Is there an existing solution for this, or is filtering with a regex my only option?

Jonas WebDev
  • 2
    You shouldn't do this. There are many domains that use Gmail that you don't know are using Gmail. There are other reasons someone might have a `+` in their username. – Brad Oct 29 '14 at 17:57
  • @Brad Can I check whether an email domain is linked to Google Accounts, e.g. via DNS? – Jonas WebDev Oct 29 '14 at 18:07
  • @Jonas Not reliably. A company can use Google Apps and still have some other SMTP server in front of Google. This isn't all that uncommon. A lot of companies will split their mail users between Google Apps and Exchange for example. This is a very common scenario during migration from something to Google Apps. – Brad Oct 29 '14 at 18:19
  • 3
    Gave you an upvote on this question because someone downvoted it and my answer without explaining why. I think it's a valid question based on its regex merits. –  Jan 07 '15 at 14:00
  • Nice, @tristan, thanks. I consider this question important too: any web application that provides a service based on unique email addresses has to deal with this issue. – Jonas WebDev Jan 08 '15 at 19:48

2 Answers

14

I don't agree with the comment outright stating that you shouldn't strip out the "filters" (e.g. user_email+some_filter_to_flag_incoming_messages@example.org). "Your use cases aren't the same as my use cases" and so on.[0]

tl;dr: The regex pattern you're looking for is: '(\+.*)(?=\@)'

Explanation:

To start, write a regex that matches the literal '+' sign followed by any character, any number of times:

'(\+.*)'

Replacing this pattern with an empty string turns tristan+some_filter@example.org into tristan. If you decide to split on the @ symbol, congrats: append '@' + domain.TLD to the resulting string and you're done. I mention this in case you've already split the e-mail address and it's just hanging around anyway.

If you're not splitting the user's e-mail address on the @ symbol, you need to use a "positive look-ahead" (match this pattern only if it's followed by this thing I specify) to tell your match when to stop (so we don't take away too much):

'(\+.*)(?=\@)'

with this in place, we get tristan@example.org. Hooray, that wasn't actually so rough.
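For anyone who wants to see the two variants side by side, here is a minimal sketch in PHP (chosen only because the other answer uses it). The function names `strip_filter_lookahead` and `strip_filter_split` are made up for this illustration, and neither function validates the address.

```php
<?php

// Variant 1: look-ahead. Removes everything from the first '+' up to
// (but not including) the '@'. Addresses without a '+' come back unchanged.
function strip_filter_lookahead(string $email): string
{
    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace('/\+.*(?=@)/', '', $email) ?? $email;
}

// Variant 2: split on '@' first, strip the filter from the local part,
// then glue the two pieces back together.
function strip_filter_split(string $email): string
{
    $parts = explode('@', $email, 2);
    if (count($parts) !== 2) {
        return $email; // no '@' at all: leave the input alone
    }
    [$local, $domain] = $parts;
    $local = preg_replace('/\+.*/', '', $local) ?? $local;
    return $local . '@' . $domain;
}

echo strip_filter_lookahead('tristan+some_filter@example.org'), "\n"; // tristan@example.org
echo strip_filter_split('tristan+some_filter@example.org'), "\n";     // tristan@example.org
```

Both variants give the same result here; the split version is just handy when the local part and domain are needed separately anyway.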


[0]: In one of my applications, I store the original filter-containing e-mail address that users give me for communications, but track filter usage and consider the canonical account (referenced-internally) to be the version without the filter (e.g. user_email@gmail.com). I do this to make it easier for users that opt-in to be located by e-mail address to search for each other.
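A rough sketch of that idea, keeping the address as typed for communications and using the filter-free form for lookups and duplicate checks. Everything here is hypothetical: `canonical_email`, `register_user`, and the in-memory array stand in for whatever storage the application really uses (most likely a database column with a unique index).

```php
<?php

// Hypothetical helper: derives the canonical (filter-free) form of an address.
function canonical_email(string $email): string
{
    return preg_replace('/\+.*(?=@)/', '', $email) ?? $email;
}

// Hypothetical in-memory "user table", keyed by canonical address.
// Returns false when the same underlying mailbox is already registered.
function register_user(array &$users, string $email): bool
{
    $canonical = canonical_email($email);
    if (isset($users[$canonical])) {
        return false;
    }
    $users[$canonical] = [
        'email'     => $email,     // what the user typed; used for sending mail
        'canonical' => $canonical, // used for lookups and duplicate checks
    ];
    return true;
}

$users = [];
var_dump(register_user($users, 'user_email+news@gmail.com')); // bool(true)
var_dump(register_user($users, 'user_email+shop@gmail.com')); // bool(false)
var_dump(register_user($users, 'other@gmail.com'));           // bool(true)
```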

I understand why people use aliases/filters:

  • they give the illusion that they can be used to track spammers (as if an e-mail harvester wouldn't remove the filter before selling a list)
  • they're useful in routing emails or triggering events (e.g. sending a text when you get an e-mail from me+package_delivery@domain.tld)
  • the "omg i can do this?" factor

Which is all to say, "I get it, people like filters," but there are valid reasons as an application author or a company to make note of them.

  • 4
    Edit: wow, why the downvote? This is a regex question and now it has an answer that works. –  Jan 07 '15 at 13:59
  • 1
    About your [0] point: exactly, it's very easy to manipulate an email list to remove the alias. – Jonas WebDev Jan 08 '15 at 19:51
  • A new question: I could filter only addresses on Gmail's domains, to prevent new registrations in my web application that reuse the same real address (i.e. a duplicate once the alias is removed). What do you think? PS: I say this taking into account the popularity of Gmail. – Jonas WebDev Jan 08 '15 at 19:57
  • 1
    New questions mean new questions on Stackoverflow, but for what it's worth, I only remove filters on gmail domains for my application because I know their rules (highly unlikely, but a different service might consider a+bc@example.org to be a different account than a@example.org). If my answer helped you, mark it accepted. –  Jan 09 '15 at 04:51
6

I wrote a function in PHP that I've been using to do this too:

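// Strips a "+alias" suffix from the local part of an address.
// The match runs from the '+' up to (not including) the '@'.
function unalias_gmail($email) {
    return preg_replace('/(\+[^\@]+)/', '', $email);
}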

This will return the email address without the alias if it's there, or the given email address if there is no alias. I have a dataProvider for phpunit so you can see my tests:

<?php

require_once __DIR__ . '/path/to/helper.php';
use PHPUnit\Framework\TestCase;

class HelperTest extends TestCase
{
    public static function data_test_unalias_gmail(): array
    {
        return array(
            // Provided email vs the expected result
            array('foo@gmail.com', 'foo@gmail.com'),
            array('foo_bar@gmail.com', 'foo_bar@gmail.com'),
            array('foo-bar@gmail.com', 'foo-bar@gmail.com'),
            array('foo+bar@gmail.com', 'foo@gmail.com'),
        );
    }

    /**
     * @dataProvider data_test_unalias_gmail
     */
    public function test_unalias_gmail($email, $expected): void
    {
        $actual = unalias_gmail($email);
        $this->assertEquals($expected, $actual);
    }
}

Gives me a happy OK (4 tests, 4 assertions) =]

Note: This will need some love if they start allowing plus signs in domains!
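If you only want to strip aliases on Gmail addresses, as suggested in the comments on the other answer, a variation might look like the sketch below. The name `unalias_if_gmail` and the `GMAIL_DOMAINS` list are assumptions for illustration; custom domains hosted by Google are deliberately not detected. It splits on the last '@' rather than using a regex, so a plus sign in the domain would not confuse it.

```php
<?php

// Assumption: the domains to treat as Gmail. Adjust as needed.
const GMAIL_DOMAINS = ['gmail.com', 'googlemail.com'];

function unalias_if_gmail(string $email): string
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return $email; // nothing to split on; leave untouched
    }
    $local  = substr($email, 0, $at);
    $domain = strtolower(substr($email, $at + 1));
    if (!in_array($domain, GMAIL_DOMAINS, true)) {
        return $email; // other services may treat '+' differently
    }
    $plus = strpos($local, '+');
    if ($plus !== false) {
        $local = substr($local, 0, $plus); // drop the '+' and everything after it
    }
    // Note: for Gmail addresses the domain comes back lowercased.
    return $local . '@' . $domain;
}

echo unalias_if_gmail('foo+bar@gmail.com'), "\n";  // foo@gmail.com
echo unalias_if_gmail('a+bc@example.org'), "\n";   // a+bc@example.org
```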

Rohjay