
I am using Ruby on Rails to make a university-exclusive website that categorizes all registered users into their specific universities via their ".edu" email. Nearly all US-based universities have an "xyz.edu" email domain. In essence, everyone that signs up with their ".edu" email would all be categorized with a similar "domain.edu".

I've searched for a regex that finds matching ".edu" domains and assigns them to a variable or to specific indexes, but I must be looking in the wrong place because I cannot find how to do this.

Would I use regex for this? Or maybe a method after their email has been verified?

I would appreciate any help or feedback I can get.

the Tin Man
  • You can use regex to match patterns, not to sort things. Of course, you could use Ruby to sort things based on matches which you've made using regex. – Vasili Syrakis Dec 19 '13 at 02:44
  • What code have you written? "Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See http://SSCCE.org for guidance." – the Tin Man Dec 19 '13 at 03:18

4 Answers


You could use a regex to extract domain names:

"gates@harvard.edu" =~ /.*@(.*)$/

This simple regexp captures everything after the @ symbol in its first group, which is available as $1 after a successful match.

However, what you have to think about is how to handle cases like gates@harvard.edu vs gates@seas.harvard.edu.

My example will parse them out as different entities: harvard.edu vs seas.harvard.edu.
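
As a minimal sketch of the capture described above, the same pattern can be wrapped in a small method that returns the first group. The method name `domain_of` is hypothetical and chosen only for illustration; it is not part of the original answer.

```ruby
# Hypothetical helper: returns everything after the last "@",
# or nil when the string contains no "@" at all.
def domain_of(email)
  match = email.match(/.*@(.*)$/)
  match && match[1]
end

puts domain_of("gates@harvard.edu")
puts domain_of("gates@seas.harvard.edu")
```

The two calls print different strings, which is exactly the sub-domain issue mentioned above: the pattern alone does not fold seas.harvard.edu into harvard.edu.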

Arman H
  • You can do `"gates@harvard.edu".scan(/.*@(?<domain>.*)$/)` and then `domain.split('.')[-2]`. It will return `harvard` for both `gates@harvard.edu` and `gates@seas.harvard.edu`. – Hauleth Dec 19 '13 at 02:58
  • @ŁukaszNiemier, you can also use negative look-ups in the regexp to parse out only the TLDs. I gave my solution because it's not clear what the OP needs. Perhaps he wants to preserve 2nd level names... – Arman H Dec 19 '13 at 03:02
  • My solution preserves 2nd level names. Moreover, it keeps all the domain parts. – Hauleth Dec 19 '13 at 09:48
  • `scan` isn't that useful for this. It wants to find repeated occurrences of the pattern and return an array, forcing us to deal with an array as a result. – the Tin Man Dec 19 '13 at 14:38

I would probably go ahead and create an institution/university/group model that would hold those users. It would be easier now than later down the line. But, in an effort to answer your question, you could do something like:

array_of_emails = ['d@xyz.edu', 'a@abc.edu', 'c@xyz.edu', 'b@abc.edu' ]
array_of_emails.sort_by! { |email| "#{email[email.index('@')..-1]}#{email[0..email.index('@')]}" }

EDIT: Changed sort! to sort_by!
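
If the goal is to bucket users by school rather than only order them, `Enumerable#group_by` may be a closer fit than sorting. The following is a minimal sketch that assumes plain email strings rather than real user records; the method name `group_by_domain` is hypothetical.

```ruby
# Hypothetical helper: builds a hash that maps each domain
# (the part after the "@") to the addresses that use it.
def group_by_domain(emails)
  emails.group_by { |email| email.split('@').last }
end

array_of_emails = ['d@xyz.edu', 'a@abc.edu', 'c@xyz.edu', 'b@abc.edu']

group_by_domain(array_of_emails).each do |domain, emails|
  puts "#{domain}: #{emails.sort.join(', ')}"
end
```

In a Rails app the same grouping would more likely live in the institution model suggested above, with the domain stored as a column, but that design is outside what this sketch shows.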

kddeisz

Dealing with domains is going to get a lot more complex in the future, with new TLDs coming online. Assuming that .edu is the only educational TLD will be wrong.

A simple way to grab just the domain for now is:

"gates@harvard.edu"[/(@.+)$/, 1] # => "@harvard.edu"

That will handle things like:

"gates@mail.harvard.edu"[/(@.+)$/, 1] # => "@mail.harvard.edu"

If you don't want the @, simply shift the opening parenthesis right one character:

pattern = /@(.+)$/
"gates@harvard.edu"[pattern, 1] # => "harvard.edu"
"gates@mail.harvard.edu"[pattern, 1] # => "mail.harvard.edu"

If you want to normalize the domain to strip off sub-domains, you can do something like:

pattern = /(\w+\.\w+)$/
"harvard.edu"[pattern, 1] # => "harvard.edu"
"mail.harvard.edu"[pattern, 1] # => "harvard.edu"

which only grabs the last two "words" that are separated by a single dot (.).

That's somewhat naive, as non-US domains can have a country code, so if you need to handle those you can do something like:

pattern = /(\w+\.edu(?:\.\w+)?)$/
"harvard.edu"[pattern, 1] # => "harvard.edu"
"harvard.edu.cc"[pattern, 1] # => "harvard.edu.cc"
"mail.harvard.edu.cc"[pattern, 1] # => "harvard.edu.cc"
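
The two steps above (take the part after the @, then normalize it) can be combined in one place. This is only a sketch under stated assumptions: the names `EDU_DOMAIN` and `school_domain` are hypothetical, and the `downcase` call is an addition based on the fact that domain names are case-insensitive.

```ruby
# Same pattern as above: "name.edu" plus an optional country code.
EDU_DOMAIN = /(\w+\.edu(?:\.\w+)?)$/

# Hypothetical helper: returns the normalized school domain,
# or nil when the address has no "@" or is not an .edu address.
def school_domain(email)
  host = email[/@(.+)$/, 1]
  host && host.downcase[EDU_DOMAIN, 1]
end

puts school_domain("gates@harvard.edu")
puts school_domain("gates@mail.harvard.edu")
puts school_domain("gates@mail.harvard.edu.cc")
```

Because `\w` does not match a hyphen, a domain such as some-school.edu would be trimmed to school.edu by this pattern, so the character class may need widening for real data.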

And, as to whether you should do this before or after you've verified their address? Do it AFTER. Why waste your CPU time and disk space processing invalid addresses?

the Tin Man
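array_of_emails = ['d@xyz.edu', 'a@abc.edu', 'c@xyz.edu', 'b@abc.edu']
# Sort by everything from the "@" onward, so addresses from the same domain end up adjacent.
sorted = array_of_emails.sort_by { |email| email.match(/@.*/)[0] }
sorted.each { |email| puts email }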
devanand
  • Wait did you literally just copy paste my code and then put in sort_by instead of proposing an edit? – kddeisz Dec 19 '13 at 14:21