2

I'm having some trouble writing a regex for these lines in an Exim log:

 1. 2011-05-12 11:30:26 1QKRHt-0001aD-Vd => mail <mail@mail.example.com> F=<root@example.com> bla bla 
 2. 2011-04-22 12:01:31 1QDCF0-0002ss-Nw => /var/mail/mail <root@mail.mealstrom.org.ua> F=<root@example.com> bla bla 
 3. 2011-05-12 11:29:01 1QKRGU-0001a5-Ok => mail@mail.example.com F=<root@example.com> bla bla

I want to capture the address (here mail@mail.example.com) into a variable using a single regexp. I've tried to use logic like this: find the last string before 'F=', separated by whitespace, which may be enclosed in < >.

Can you help me to write this logic?

MealstroM
  • If you want to validate e-mail addresses according to [RFC 822](http://www.ietf.org/rfc/rfc0822.txt?number=822), it is [not easy at all, a good regex is awesomely long](http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html) – Benoit May 13 '11 at 07:05

4 Answers

2

You can use the following regex:

    # the line should be in variable $line
    if ($line =~ /.*?\s+<?(\S+?)>?\s+F=/) {
      # ...
    }
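
To see how this pattern behaves, here is a minimal sketch that runs it over the three sample lines from the question. The helper name `extract_recipient` is an assumption made for illustration and is not part of any module; the regex itself is the one shown above.

```perl
use strict;
use warnings;

# Sample log lines copied from the question.
my @lines = (
    '2011-05-12 11:30:26 1QKRHt-0001aD-Vd => mail <mail@mail.example.com> F=<root@example.com> bla bla',
    '2011-04-22 12:01:31 1QDCF0-0002ss-Nw => /var/mail/mail <root@mail.mealstrom.org.ua> F=<root@example.com> bla bla',
    '2011-05-12 11:29:01 1QKRGU-0001a5-Ok => mail@mail.example.com F=<root@example.com> bla bla',
);

# Hypothetical helper: returns the last whitespace-separated token before
# "F=", with optional surrounding angle brackets stripped, or undef if the
# line does not match.
sub extract_recipient {
    my ($line) = @_;
    return $line =~ /.*?\s+<?(\S+?)>?\s+F=/ ? $1 : undef;
}

for my $line (@lines) {
    my $rcpt  = extract_recipient($line);
    my $shown = defined($rcpt) ? $rcpt : '(no match)';
    print "$shown\n";
}
```

For these three lines the loop should print `mail@mail.example.com`, `root@mail.mealstrom.org.ua` and `mail@mail.example.com`. The capture is lazy (`\S+?`) so that an optional closing `>` is left outside the captured text.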

It is then a good idea to validate your match with the Mail::RFC822::Address Perl module, so the full code could be:

    use Mail::RFC822::Address qw(valid);

    # the line should be in variable $line
    if ($line =~ /.*?\s+<?(\S+?)>?\s+F=/) {
      if (valid($1)) {
        # ...
      }
    }

KARASZI István
  • I've used this package to validate e-mails, and your regexp works. You have contributed to this project :D https://github.com/mealstrom/plp2sql/ The e-mails are valid because they are already in the log. – MealstroM May 13 '11 at 07:26
1

Use:

    /(?<=<)\S*(?=>\s*F=)/

The (?<= xxx ) syntax is a lookbehind assertion, and the (?= xxx ) is a lookahead assertion.

This will not check the validity of the e-mail address; it just extracts that part of the line.
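
As a rough illustration of the lookaround pattern, the sketch below applies it to two of the question's sample lines. The helper name `extract_bracketed` is hypothetical. Because the lookbehind requires a literal `<`, the pattern only matches when the address is wrapped in angle brackets, so the third sample line (bare address) yields no match.

```perl
use strict;
use warnings;

# Two sample lines from the question: one with the recipient in angle
# brackets, one with a bare recipient address.
my $bracketed = '2011-05-12 11:30:26 1QKRHt-0001aD-Vd => mail <mail@mail.example.com> F=<root@example.com> bla bla';
my $bare      = '2011-05-12 11:29:01 1QKRGU-0001a5-Ok => mail@mail.example.com F=<root@example.com> bla bla';

# Hypothetical helper: returns the text between "<" and the ">" that
# directly precedes "F=", or undef when there is no such bracketed token.
sub extract_bracketed {
    my ($line) = @_;
    # $& holds the whole matched text; the lookarounds consume nothing.
    return $line =~ /(?<=<)\S*(?=>\s*F=)/ ? $& : undef;
}

for my $line ($bracketed, $bare) {
    my $found = extract_bracketed($line);
    my $shown = defined($found) ? $found : '(no match)';
    print "$shown\n";
}
```

If the log can contain both forms, a pattern that treats the brackets as optional (for example with `<?` and `>?` around a capture group) is one way to cover the bare case too.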

Benoit
0

Here is an email validation regex:

    \b[\w\.-]+@[\w\.-]+\.\w{2,4}\b

It will extract an email from anywhere.
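
A small sketch of what "anywhere" means in practice, assuming the pattern is used with the `/g` flag in list context. The helper name `find_addresses` is hypothetical, and the `@` is written as `\@` only to rule out any array interpolation inside the Perl regex. Note that the pattern finds every address-like substring, so on an Exim line it also returns the sender after `F=`.

```perl
use strict;
use warnings;

# The simple pattern from this answer, compiled once.
my $email_re = qr/\b[\w\.-]+\@[\w\.-]+\.\w{2,4}\b/;

# Hypothetical helper: in list context, returns every substring of $text
# that matches the pattern, in order of appearance.
sub find_addresses {
    my ($text) = @_;
    return $text =~ /($email_re)/g;
}

my $log_line = '2011-05-12 11:30:26 1QKRHt-0001aD-Vd => mail <mail@mail.example.com> F=<root@example.com> bla bla';

# Recipient first, then the sender from the F= field.
print join(', ', find_addresses($log_line)), "\n";

# "+" is not in the character class, so only the part after it is found.
print join(', ', find_addresses('foo+bar@gmail.com')), "\n";
```

The first call should print `mail@mail.example.com, root@example.com`, and the second `bar@gmail.com`, which shows why this pattern alone is not enough to pick out the recipient or to handle addresses containing `+`.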

I hope this RFC2822 one posts correctly.

    [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

the Tin Man
Craig White
  • I think this is missing some square brackets in the part after the `@` to define character classes – stema May 13 '11 at 06:59
  • It wouldn't seem to post; it kept getting cut off for some reason. I added another one instead. – Craig White May 13 '11 at 07:00
  • It is non-standard. For example, `foo+bar@gmail.com` is a valid e-mail address, and you do not support it. See [this](http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html)! – Benoit May 13 '11 at 07:00
  • I have managed to get the RFC2822 one posted. Took a bit but got it to display on the site. – Craig White May 13 '11 at 07:04
0

Regex is not the measure, Email::Valid is.

daxim