-1

Weird one but:

Let's say you have a huge HTML page, and if the page contains an email address (detected by looking for an @ sign) you want to return that email address.

So far I know I need something like this:

 String email;

 if (myString.contains("@")) {

      email = myString.substring("@")
 }

I know how to get to the @ but how do I go back in the string to find what's before it etc?

  • Maybe it's better to use a regex? – Alejandro Alcalde May 25 '14 at 20:00
  • @algui91 if you parse HTML with a regex you'll be kicked in your ass by Prussian horses. I guarantee it. – Thomas Jungblut May 25 '14 at 20:01
  • I wouldn't use plain String methods or regular expressions to read data from HTML pages. There are specialized tools for web crawling and HTML parsing. Off the top of my head, take a look at [jsoup](http://jsoup.org/) – toniedzwiedz May 25 '14 at 20:02
  • This doesn't even have to do with HTML parsing or crawling. You guys attach such a stigma to regex + HTML. This is really just a simple text search, **except** if the e-mail address appears in, say, the `href` attribute of an `a` tag. – MarioDS May 25 '14 at 20:03
  • @MDeSchaepmeester it's not about stigma, you just don't *parse* a non-regular language with a regular expression. Try a regex for email scraping on the text of this page: I see at least one particular spot where it already breaks when finding emails, no matter how ingenious your regex is. – Thomas Jungblut May 25 '14 at 20:09
  • @ThomasJungblut you've got a point if you can show what that particular spot is. For starters, off the top of my head I don't know of a valid HTML tag or attribute that has an `@`. Furthermore, the OP never said he actually cared about the HTML in question; he only wants to find an e-mail address in a stream that happens to contain HTML. – MarioDS May 25 '14 at 20:15
  • May we know the context of this question, and what your eventual goal is? Chances are you're taking an entirely wrong approach to something. – MarioDS May 25 '14 at 20:19
  • Not to mention that using `@` for other purposes is extremely common, such as on Twitter and SE. My suspicious side thinks the OP is writing a spambot. – chrylis -cautiouslyoptimistic- May 25 '14 at 21:05

4 Answers

0

If myString is already the email string you extracted from the HTML page, then you can simply return that same string when it contains an @, something like this:

String email;

 if (myString.contains("@")) {

      email = myString;
 }

What's the challenge here? Can you explain what difficulty you are running into?

Karibasappa G C
0
String email = null;

// Locate the @ (indexOf returns -1 when there is none)
int atLocation = myString.indexOf('@');
if (atLocation >= 0) {
    // Get the string before the @
    String start = myString.substring(0, atLocation);
    // Keep only what follows the last space before the @
    // (lastIndexOf returns -1 if there is no space, so + 1 gives 0)
    start = start.substring(start.lastIndexOf(' ') + 1);
    // Get the string after the @
    String end = myString.substring(atLocation + 1);
    // Cut at the first space after the @, if there is one
    int space = end.indexOf(' ');
    if (space >= 0) {
        end = end.substring(0, space);
    }
    // Stick it all together
    email = start + "@" + end;
}

This may be a little off as I've been writing javascript all day. :)

JakeSidSmith
0

This method will give you a list of all the email addresses contained in a string.

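import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

static ArrayList<String> getEmailAdresses(String str) {
    ArrayList<String> result = new ArrayList<>();
    // Turn every whitespace character into a plain space so the pattern only has to exclude ' '
    Matcher m = Pattern.compile("\\S+?@[^. ]+(\\.[^. ]+)*").matcher(str.replaceAll("\\s", " "));
    // Collect every match: non-space characters, an @, then dot-separated labels
    while(m.find()) {
        result.add(m.group());
    }
    return result;
}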
Tesseract
0

Rather than exact code, I would like to give you an approach.

Checking only for the @ symbol might not be appropriate, since an @ can appear in other contexts as well.

Search the internet for a regex pattern that matches an email address, or create your own (if you want, you can add a check for email providers as well). [Here is a link](http://www.mkyong.com/regular-expressions/how-to-validate-email-address-with-regular-expression/)

Use the regex to get the index of the match in the string, then extract the substring at that position (the email, in your case).
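As a rough sketch of that approach: the class name `EmailFinder`, the method `findEmails`, and the pattern itself are assumptions made up for illustration. The pattern is deliberately simple and is not a complete email validator. `Matcher.start()` and `Matcher.end()` give the indices of each match, which are then passed to `substring`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailFinder {
    // Illustrative pattern (an assumption of this sketch, not a full validator):
    // a local part, an @, then at least two dot-separated domain labels.
    private static final Pattern EMAIL = Pattern.compile(
            "[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+");

    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            // start() and end() are the indices of the current match within text
            found.add(text.substring(m.start(), m.end()));
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "<p>Contact <a href=\"mailto:jane.doe@example.com\">Jane</a> or follow @jane</p>";
        System.out.println(findEmails(page));
    }
}
```

Because the pattern requires characters before the @ and a dot in the domain, a bare handle such as `@jane` is skipped. Real pages can still produce false positives or misses, so treat this as a starting point only.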

Mohit Kanwar