I was trying to write a regex to detect email addresses of the type 'abc@xyz.com' in java. I came up with a simple pattern.
String line = // my line containing email address
Pattern myPattern = Pattern.compile("()(\\w+)( *)@( *)(\\w+)\\.com");
Matcher myMatcher = myPattern.matcher(line);
This will however also detect email addresses of the type 'abcd.efgh@xyz.com'. I went through http://www.regular-expressions.info/ and links on this site like
How to match only strings that do not contain a dot (using regular expressions)
Java RegEx meta character (.) and ordinary dot?
So I changed my pattern to the following to avoid detecting 'efgh@xyz.com'
Pattern myPattern = Pattern.compile("([^\\.])(\\w+)( *)@( *)(\\w+)\\.com");
Matcher myMatcher = myPattern.matcher(line);
String mailid = myMatcher.group(2) + "@" + myMatcher.group(5) + ".com";
If String 'line' contained the address 'abcd.efgh@xyz.com', my String mailid will come back with 'fgh@yyz.com'. Why does this happen? How do I write the regex to detect only 'abc@xyz.com' and not 'abcd.efgh@xyz.com'?
Also how do I write a single regex to detect email addresses like 'abc@xyz.com' and 'efg at xyz.com' and 'abc (at) xyz (dot) com' from strings. Basically how would I implement OR logic in regex for doing something like check for @ OR at OR (at)?
After some comments below I tried the following expression to get the part before the @ squared away.
Pattern.compile("((([\\w]+\\.)+[\\w]+)|([\\w]+))@(\\w+)\\.com")
Matcher myMatcher = myPattern.matcher(line);
what will the myMatcher.groups be? how are these groups considered when we have nested brackets?
System.out.println(myMatcher.group(1));
System.out.println(myMatcher.group(2));
System.out.println(myMatcher.group(3));
System.out.println(myMatcher.group(4));
System.out.println(myMatcher.group(5));
the output was like
abcd.efgh
abcd.efgh
abcd.
null
xyz
for abcd.efgh@xyz.com
abc
null
null
abc
xyz
for abc@xyz.com
Thanks.