I am capturing all valid email addresses from an email body using the method below.
    public static IEnumerable<string> ParseAllEmailAddressess(string data)
    {
        HashSet<String> emailAddressess = new HashSet<string>();
        Regex emailRegex = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*", RegexOptions.IgnoreCase);
        MatchCollection emailMatches = emailRegex.Matches(data);
        foreach (Match emailMatch in emailMatches)
        {
            emailAddressess.Add(emailMatch.Value);
        }
        return emailAddressess;
    }
The problem is that Outlook converts the signature image into a random email-like address, something like image001.png@01D36870.C9EE4D60. My method treats it as a valid email address and captures it. I want to strip out such addresses while parsing the email body.
I could split the part before the @ sign on "." and check whether the piece after the dot matches an image extension such as "png" to decide whether the address is valid, but I don't think that is very efficient. Applying a regex to strip the signature image content would probably be faster.
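To make the idea concrete, here is a minimal sketch of the regex-based filter I have in mind. It reuses the same address pattern and adds a second check that rejects any match whose part before the @ sign ends in an image extension. The class name EmailParser, the method name ParseEmailAddressesWithoutImageIds, and the extension list are all my own assumptions for illustration, and I am assuming that Outlook's image references always start with the image file name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmailParser
{
    // Same address pattern as in ParseAllEmailAddressess.
    private static readonly Regex EmailRegex = new Regex(
        @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
        RegexOptions.IgnoreCase);

    // Hypothetical filter: the part before '@' ends in an image extension.
    // The extension list is an assumption, not an exhaustive set.
    private static readonly Regex ImageContentIdRegex = new Regex(
        @"\.(png|jpe?g|gif|bmp|tiff?)@",
        RegexOptions.IgnoreCase);

    public static IEnumerable<string> ParseEmailAddressesWithoutImageIds(string data)
    {
        HashSet<string> emailAddresses = new HashSet<string>();
        foreach (Match emailMatch in EmailRegex.Matches(data))
        {
            // Skip matches such as image001.png@01D36870.C9EE4D60.
            if (!ImageContentIdRegex.IsMatch(emailMatch.Value))
            {
                emailAddresses.Add(emailMatch.Value);
            }
        }
        return emailAddresses;
    }
}
```

One drawback I can already see: a real address whose local part happens to end in an image extension (for example john.png@example.com) would also be dropped by this check.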
Any help would be appreciated.