
I am trying to create a regular expression that looks for an email address after the Cc field in an email header. I don't have programmatic control over the string, so this isn't specific to any particular programming language. It's part of integrating some software that expects a regular expression as a search criterion.

The email header looks like this:

Received: by hermit.cdu-staff.local 
    id <01CCE6E3.19910AB8@hetmit.ere-tyumm.local>; Thu, 9 Feb 2012 13:57:14 +0930
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="----_=_NextPart_001_01CCE6E3.19910AB8"
Content-class: urn:content-classes:message
X-MimeOLE: Produced By Microsoft Exchange V6.5
Subject: Email header example
Date: Thu, 9 Feb 2012 13:57:10 +0930
Message-ID: <6434D994F5A495428AB3B69877565EF97040C469A@hermit.cdi-stann.local>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Email header example
Thread-Index: Aczm4xa7dGVpHUWERSSOuR8HCNmrAw==
From: "Bishnu Paudel" <Bishnu.Paudel@company.com>
To: "Study" <study@company.com>
Cc: "Cameron Loudon" <Cameron.Loudon@company.com>

I have created a regular expression which works great if the string is a single line (the last line in the header). Here is the expression:

(^|,)\s*.*Cc:.*(bishnu.paudel|cameron.loudon)@company[.]com\s*($|,)

Any help would be greatly appreciated.

Bishnu Paudel

3 Answers


The following regex should solve the problem:

\b[A-Z0-9._%+-]+@yourcompany\.com\b

which is an adaptation of the regex presented here. It assumes case-insensitive matching, since the character class lists only uppercase letters. Please note the comments on the original website about what counts as a 'valid' e-mail address.
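As a rough illustration of the pattern above, here is a minimal Python sketch. The sample Cc line and the `elsewhere.example` address are made up for the demonstration, and the dot before `com` is escaped so that it only matches a literal dot. It suggests the pattern finds nothing in a mixed-case address unless case-insensitive matching is switched on.

```python
import re

# Hypothetical sample line; "yourcompany.com" is the answer's placeholder domain.
cc_line = (
    'Cc: "Cameron Loudon" <Cameron.Loudon@yourcompany.com>, '
    '"Other" <someone@elsewhere.example>'
)

# The answer's pattern, with the dot before "com" escaped to match a literal dot.
pattern = r'\b[A-Z0-9._%+-]+@yourcompany\.com\b'

# Without a flag, [A-Z0-9._%+-] rejects the lowercase letters in the local part.
case_sensitive = re.findall(pattern, cc_line)

# With IGNORECASE, the character class also covers lowercase letters.
case_insensitive = re.findall(pattern, cc_line, re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```

How the flag is enabled depends on the host application; many engines also accept an inline `(?i)` prefix, but whether this one does is not stated in the thread.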

Till Hoffmann
  • Hi Till, I finally got it working by constructing this regular expression: `Cc:.*<(student.admin|study|summer|midyear|changeyourworld)@ourcompany[.]com>`, which matches a header such as `To: <paudel_bishnu@hotmail.com> Cc: "studentadmin" <student.admin@ourcompany.com>, "Bishnu Paudel" <Bishnu.Paudel@ourcompany.com>`, but it won't match a header where the targeted address (student.admin) is not first in the Cc field, for example `To: <paudel_bishnu@hotmail.com> Cc: "Bishnu Paudel" <Bishnu.Paudel@ourcompany.com>, "studentadmin" <student.admin@ourcompany.com>` – Bishnu Paudel Feb 29 '12 at 00:59

I finally got it working by constructing this regular expression: Cc:.*<(student.admin|study|summer|midyear|changeyourworld)@ourcompany[.]com> which matches a header such as:

To: <paudel_bishnu@hotmail.com> 
Cc: "studentadmin" <student.admin@ourcompany.com>, "Bishnu Paudel" <Bishnu.Paudel@ourcompany.com> 

but it won't match a header where the targeted address (student.admin) is not first in the Cc field. For example:

To: <paudel_bishnu@hotmail.com> 
Cc: "Bishnu Paudel" <Bishnu.Paudel@ourcompany.com>, "studentadmin" <student.admin@ourcompany.com> 
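For comparison, here is a small Python sketch that runs the same pattern against both headers using an unanchored search. This is only a check in one engine (Python's `re`), not the application in question. There, the greedy `.*` backtracks until it reaches the targeted address, so both orderings are found. If the application still rejects the second header, the cause may lie in how it applies the pattern (for example whole-string matching), which is an assumption the thread does not confirm.

```python
import re

# The pattern from the answer, unchanged.
pattern = re.compile(
    r'Cc:.*<(student.admin|study|summer|midyear|changeyourworld)@ourcompany[.]com>'
)

# Targeted address listed first in the Cc field.
target_first = (
    'To: <paudel_bishnu@hotmail.com> \n'
    'Cc: "studentadmin" <student.admin@ourcompany.com>, '
    '"Bishnu Paudel" <Bishnu.Paudel@ourcompany.com> '
)

# Targeted address listed last in the Cc field.
target_last = (
    'To: <paudel_bishnu@hotmail.com> \n'
    'Cc: "Bishnu Paudel" <Bishnu.Paudel@ourcompany.com>, '
    '"studentadmin" <student.admin@ourcompany.com> '
)

# search() looks for the pattern anywhere in the string (no implicit anchoring).
first_match = pattern.search(target_first)
last_match = pattern.search(target_last)

print(first_match is not None, last_match is not None)
```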

Cheers,

A5C1D2H2I1M1N2O1R2T1
Bishnu Paudel

You haven't specified which programming language you're using, but in general, you can write something like this:

(^|,)\s*(admin|clients)@ourcompany[.]com\s*($|,)

That will match admin@ourcompany.com or clients@ourcompany.com, provided that it's preceded by start-of-string or a comma (with optional whitespace) and followed by a comma or end-of-string (with optional whitespace).
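The behaviour described above can be checked with a short Python sketch. The sample strings, including the `elsewhere.example` address, are invented for illustration and assume the input is a bare comma-separated address list.

```python
import re

# The pattern from the answer, unchanged.
pattern = re.compile(r'(^|,)\s*(admin|clients)@ourcompany[.]com\s*($|,)')

# Hypothetical inputs mapped to whether a match is expected.
samples = {
    'admin@ourcompany.com': True,                                # whole string
    'someone@elsewhere.example, clients@ourcompany.com': True,   # after a comma
    'clients@ourcompany.com , someone@elsewhere.example': True,  # before a comma
    'notadmin@ourcompany.com': False,                            # extra prefix
    'admin@ourcompany.com.au': False,                            # extra suffix
}

# Record, for each input, whether the pattern is found anywhere in it.
results = {text: pattern.search(text) is not None for text in samples}

for text, matched in results.items():
    print(f'{matched!s:5} {text}')
```

The two anchoring groups are what reject the last two inputs: the address must sit directly between a comma or string boundary on each side.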

Note that e-mail addresses are actually pretty complicated — for example, if I recall correctly, admin@ourcompany.com and "admin"@ourcompany.com are technically equivalent — so I'd be cautious: this sort of ad-hoc parsing approach may not be advisable. (Basically, I'd ask: how big a problem is it if your regex returns a false positive or a false negative? If you need to be very confident of the results, then this approach is probably not the way to go.)
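Where code can be run against the header (which the question says is not the case for this application), a header parser sidesteps much of that complexity. The following is a minimal sketch using Python's standard `email` package. The header text and the `watched` set are illustrative assumptions, not taken from the application.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical raw header block, ending with the blank line that closes the headers.
raw_header = (
    'To: <paudel_bishnu@hotmail.com>\n'
    'Cc: "Bishnu Paudel" <Bishnu.Paudel@ourcompany.com>, '
    '"studentadmin" <student.admin@ourcompany.com>\n'
    '\n'
)

# Addresses of interest (assumed), stored in lowercase for comparison.
watched = {'student.admin@ourcompany.com', 'study@ourcompany.com'}

message = message_from_string(raw_header)

# getaddresses() returns (display name, address) pairs for every Cc header.
cc_addresses = {
    address.lower()
    for _name, address in getaddresses(message.get_all('Cc', []))
}

found = cc_addresses & watched
print(sorted(found))
```

The comparison here lowercases the whole address for simplicity; strictly speaking only the domain part is guaranteed to be case-insensitive.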

ruakh
  • Thanks Ruakh, yes, your regular expression does match the email address. But I need a regular expression to search for our email addresses in an email header that contains CC:, TO: and a bunch of other fields. I just need to find out whether any one of our email addresses is present in the string after "CC:". The email header comes in as a string. I don't have programmatic control over this, so it's not related to any programming language. – Bishnu Paudel Feb 01 '12 at 00:47
  • Hi Ruakh, I ended up creating a regular expression that matches any number of characters before, after or between the emails and looks for the email addresses we care about: `(^|,)\s*.*Cc:.*(bishnu.paudel|cameron.loudon)@ourcompany[.]com\s*($|,)` But this does not match a string that contains a newline character. Any ideas, please? – Bishnu Paudel Feb 09 '12 at 03:58
  • @BishnuPaudel: By default, `.` means "any non-newline character", but you can specify that it should match newlines as well. Unfortunately, you *still* haven't specified what programming language this is in, so it's hard to give appropriate advice. You can try wrapping the entire regular expression in `(?s:...)` -- e.g., `(?s:(^|,)\s*.*Cc:.*(bishnu.paudel|cameron.loudon)@ourcompany[.]com\s*($|,))` -- which works in a number of languages. – ruakh Feb 09 '12 at 13:05
  • Hi Ruakh, thank you very much. I am using an existing Windows application which expects a regular expression to filter strings while generating reports. I don't have programmatic control over the string. I have just learned that the application was built in .NET. I will give it a try and let you know how I go. Cheers :) – Bishnu Paudel Feb 09 '12 at 22:45
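The effect of the `(?s:...)` wrapper suggested in the comments can be sketched in Python (3.6 or later, where scoped inline flags are available). Python is used only for illustration; the application itself is reportedly .NET. The header below is a simplified, hypothetical one with lowercase bare addresses, chosen so that the pattern's case-sensitive alternatives and trailing `\s*($|,)` can match. With `<...>` around the address, the closing `>` would stop that trailing part from matching.

```python
import re

# Hypothetical multi-line header with bare lowercase addresses (an assumption).
header = (
    'From: someone@elsewhere.example\n'
    'To: study@ourcompany.com\n'
    'Cc: cameron.loudon@ourcompany.com'
)

# The pattern from the comments, before wrapping.
core = r'(^|,)\s*.*Cc:.*(bishnu.paudel|cameron.loudon)@ourcompany[.]com\s*($|,)'

# By default "." stops at newlines, so ".*" cannot reach "Cc:" from the start.
default_match = re.search(core, header)

# Inside (?s:...), "." also matches newlines, so ".*" can span header lines.
dotall_match = re.search('(?s:' + core + ')', header)

print(default_match)
print(dotall_match.group(2) if dotall_match else None)
```

One side effect worth keeping in mind: once `.` crosses newlines, the `.*` after `Cc:` can also run into later header lines, so an address in a field that follows Cc would be accepted as well.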