This particular problem is a great example of how choosing the right algorithm, and even more importantly the right data structure, can massively simplify the solution. In fact, in this case, choosing the right data structure makes the algorithm so trivial that it all but vanishes: the data structure already is the answer.
The data structure I am talking about is a Multiset: a Multiset is like a Set, except that it doesn't store only unique items; instead, it stores a count of how often each item is in the Multiset. Basically, a Set tells you whether a particular item is in the Set at all; a Multiset also tells you how often that particular item is in the Multiset.
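To make the difference between the two concrete, here is a minimal sketch. Set comes from Ruby's standard library; TinyMultiset is a hypothetical class written only for this illustration (it is not the API of any of the Multiset gems mentioned in this answer) and assumes that a Hash of counts is enough to model multiplicity.

```ruby
require 'set'

ary = ["student", "student", "teacher", "teacher", "teacher"]

# A Set only records membership, so duplicates collapse.
set = Set.new(ary)
set.include?("teacher")   # => true
set.size                  # => 2

# A multiset also records how often each item occurs.
# TinyMultiset is a hypothetical, illustration-only class.
class TinyMultiset
  def initialize(items = [])
    @counts = Hash.new(0)
    items.each { |item| @counts[item] += 1 }
  end

  # Membership test, same question a Set answers.
  def include?(item)
    @counts.key?(item)
  end

  # Number of occurrences of item (0 if it was never added).
  def multiplicity(item)
    @counts.fetch(item, 0)
  end
end

bag = TinyMultiset.new(ary)
bag.include?("teacher")      # => true
bag.multiplicity("teacher")  # => 3
bag.multiplicity("janitor")  # => 0
```

A real Multiset implementation would add iteration, set operations and so on; the point here is only that the counts are part of the data structure itself.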
Unfortunately, there is no Multiset implementation in the Ruby core library or standard library, but there are a couple of implementations floating around the web.
You literally just have to construct a Multiset from your Array. Here's an example:
require 'multiset'
ary = ["student", "student", "teacher", "teacher", "teacher"]
print Multiset[*ary]
Yes, that's all there is to it. This prints:
#2 "student"
#3 "teacher"
Example, using https://GitHub.Com/Josh/Multimap/:
require 'multiset'
histogram = Multiset.new(*ary)
# => #<Multiset: {"student", "student", "teacher", "teacher", "teacher"}>
histogram.multiplicity('teacher')
# => 3
Example, using http://maraigue.hhiro.net/multiset/index-en.php:
require 'multiset'
histogram = Multiset[*ary]
# => #<Multiset:#2 'student', #3 'teacher'>
Another possibility is to use a Hash, which basically just means that instead of the Multiset taking care of the element counting for you, you have to do it yourself:
histogram = ary.inject(Hash.new(0)) {|hsh, item| hsh.tap { hsh[item] += 1 }}
print histogram
# { "student" => 2, "teacher" => 3 }
But you can make that easier: instead of counting yourself, use Enumerable#group_by to group the elements by themselves and then map the groupings to their sizes. Lastly, convert back to a Hash:
Identity = ->x { x }
print Hash[ary.group_by(&Identity).map {|n, ns| [n, ns.size] }]
# { "student" => 2, "teacher" => 3 }
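As a side note, and assuming you are on Ruby 2.7 or newer, Enumerable#tally builds the same kind of Hash in a single call. The sketch below is not part of the approaches above; it only checks that tally agrees with the group_by version.

```ruby
ary = ["student", "student", "teacher", "teacher", "teacher"]

# Enumerable#tally (Ruby 2.7+) counts the occurrences of each element.
histogram = ary.tally

# The group_by approach, for comparison.
identity = ->x { x }
grouped = Hash[ary.group_by(&identity).map { |n, ns| [n, ns.size] }]

histogram == grouped                                # => true
histogram == { "student" => 2, "teacher" => 3 }     # => true
```

On older Rubies, the inject or group_by versions remain the way to go.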