My Ruby script filters a log and generates a hash like this:

scores = {"Rahul" => "273", "John"=> "202", "coventry" => "194"}

It does this by skipping repeated values for a key, which is expected.

The log file looks like this:

Rahul has 273 Rahul has 217 John has 202 Coventry has 194

Is it possible to generate something like either of these?

scores = {"Rahul" => "273", "Rahul" =>"217",
          "John"=> "202", "coventry" => "194"}

scores = {"Rahul" => "273","217",
          "John"=> "202", "coventry" => "194"}

Is there a way to force a write into a hash even though the key already exists in it?

I would be grateful for any help or suggestions.

Hameed Basha
  • It looks like you want a data structure where the hash values are arrays, e.g. `{"Rahul" => ["273","217"], "John"=> ["202"], "coventry" => ["194"]}`. – Darshan Rivka Whittle Mar 14 '18 at 09:19
  • @Darshan Not necessarily arrays all I want is my script should not skip writing the value into the hash though the key already exists – Hameed Basha Mar 14 '18 at 09:29
  • _"My ruby script filters a log and generates a hash like this"_ – how? Please show your code. – Stefan Mar 14 '18 at 09:36
  • @Stefan Here is the code: `Scores = Hash.new; file = 'LOG.txt'; F = File.open(file,'r'); F.each do |line|; score = line.split('has')[1]; player = line.split('has')[0]; scores.store(player,score); end` – Hameed Basha Mar 14 '18 at 10:03
  • @HameedBasha: "not necessarily array" - yes, necessarily arrays, if you want to keep all the values. If you want to keep only the last value, then you simply write it normally. No "force" needed. – Sergio Tulentsev Mar 14 '18 at 10:22
  • @HameedBasha: "all I want is my script should not skip writing the value into the hash" - it does not skip already. No work needed. :) – Sergio Tulentsev Mar 14 '18 at 10:28
  • @SergioTulentsev If the key is existing in the hash then it is skipping the writing of duplicate key and its value into hash. how can we append a value for an existing key – Hameed Basha Mar 14 '18 at 10:39
  • @HameedBasha by definition, hashes cannot contain duplicate keys. Keys in a hash are always unique. – Stefan Mar 14 '18 at 10:45
  • @HameedBasha: yes, it doesn't add a duplicate key, but it overwrites the old value for that key. So, not skipping. – Sergio Tulentsev Mar 14 '18 at 10:46
  • @HameedBasha: "how can we append a value for an existing key" - now that we established that we need arrays as values, you can no longer do `scores.store(player, score)`. Rather something like `scores[player] ||= []; scores[player].push(score)` – Sergio Tulentsev Mar 14 '18 at 10:48
  • I was searching for a way to append multiple values to an existing key. Thanks all for your inputs – Hameed Basha Mar 16 '18 at 05:54
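
The approach suggested in the comments above (arrays as values, `||= []` followed by `push`) could be sketched like this. It assumes one `name has score` entry per line, which the question does not confirm, and the `log_lines` array is a hypothetical stand-in for the lines read from the log file. The second loop shows what `store` does with a repeated key for comparison.

```ruby
# Hypothetical stand-in for the lines of the log file (assumed: one entry per line).
log_lines = [
  "Rahul has 273",
  "Rahul has 217",
  "John has 202",
  "Coventry has 194"
]

# Keep every score: each key maps to an array of values.
scores = {}
log_lines.each do |line|
  player, score = line.split(" has ")
  scores[player] ||= []       # create the array the first time a player is seen
  scores[player].push(score)  # append instead of overwriting
end

# For comparison: store (same as []=) replaces the previous value for a key.
overwritten = {}
log_lines.each do |line|
  player, score = line.split(" has ")
  overwritten.store(player, score)
end

p scores
p overwritten
```

With `store`, the later `217` replaces the earlier `273` for `"Rahul"`, so nothing is skipped; the old value is simply overwritten.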

2 Answers

"Rahul has 273 Rahul has 217 John has 202 Coventry has 194".
  scan(/(\w+) has (\d+)/).group_by(&:shift)
#⇒ {"Rahul"=>[["273"], ["217"]],
#   "John"=>[["202"]],
#   "Coventry"=>[["194"]]}

To flatten the values, please check the comment by Johan Wentholt below.
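
As a rough sketch, the flattening step mentioned in the comments can be chained onto the snippet above. The variable names `grouped` and `collected` are only for illustration. The second form is an alternative that avoids mutating the scanned pairs with `shift`; it is not part of the original answer.

```ruby
log = "Rahul has 273 Rahul has 217 John has 202 Coventry has 194"

# group_by(&:shift) removes the name from each [name, score] pair and uses it
# as the key, so each remaining element is a one-element array such as ["273"].
grouped = log.scan(/(\w+) has (\d+)/).group_by(&:shift)

# flatten! collapses the nested one-element arrays in place.
grouped.each_value(&:flatten!)

# Alternative without mutation: build the arrays directly.
collected = log.scan(/(\w+) has (\d+)/).each_with_object({}) do |(name, score), result|
  (result[name] ||= []) << score
end

p grouped
p collected
```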

Aleksei Matiushkin
  • `group_by(&:shift)` – the kind of code that keeps me awake at night ;-) – Stefan Mar 14 '18 at 09:53
  • @Stefan :) indeed, but here it’s called on the result of `scan` which is always a temporary array, so here it’s safe. – Aleksei Matiushkin Mar 14 '18 at 09:55
  • And for removing the nested arrays you can call `.each_value(&:flatten!)`, resulting in: `{"Rahul"=>["273", "217"], "John"=>["202"], "Coventry"=>["194"]}`. – 3limin4t0r Mar 14 '18 at 10:06
  • @Stefan I actually find `group_by(&:shift)` very useful when the first value is the identifier and the continuing values are data points. I use this quite a bit when aggregating data since it is better than `group_by(&:first)` followed by dropping the first value or selecting all but the first value – engineersmnky Mar 14 '18 at 14:40
  • @engineersmnky despite the additional nesting? – Stefan Mar 14 '18 at 14:49
  • @Stefan yes because I am usually aggregating rows of data and I intend to retain the row data in groups so it becomes `{identifier: [[row],[row],[row]]}` – engineersmnky Mar 14 '18 at 14:50

To store your scores, you could create a hash which has an empty array as its default value:

scores = Hash.new { |hash, key| hash[key] = [] }

scores['Rahul'] #=> [] <- a fresh and empty array
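
As an aside, the block form matters here. A minimal comparison, with the hypothetical names `shared` and `fresh`, shows how `Hash.new([])` behaves differently: it hands out the same array object for every missing key and never stores a key on lookup.

```ruby
# Default object form: one array shared by all missing keys, nothing is stored.
shared = Hash.new([])
shared["Rahul"] << "273"
shared["John"] << "202"
p shared            # => {}
p shared["Anyone"]  # => ["273", "202"]

# Block form: a new array is created and assigned for each missing key.
fresh = Hash.new { |hash, key| hash[key] = [] }
fresh["Rahul"] << "273"
fresh["John"] << "202"
p fresh             # => {"Rahul"=>["273"], "John"=>["202"]}
```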

You can now extract the values from the log and add each one to the respective key's array. I'm using `scan` with a block (with the pattern from mudasobwa's answer):

log = 'Rahul has 273 Rahul has 217 John has 202 Coventry has 194'

log.scan(/(\w+) has (\d+)/) { |name, score| scores[name] << score.to_i }

scores #=> {"Rahul"=>[273, 217], "John"=>[202], "Coventry"=>[194]}

Although not required, I've converted each score to an integer before adding it to the array.
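
Since the question reads the log from a file, the same idea might be applied line by line, roughly as sketched below. The `StringIO` object and its contents are a made-up stand-in for the real file so that the example runs on its own; with an actual file, `File.foreach('LOG.txt')` could take the place of `log_io.each_line`. The one-entry-per-line layout is an assumption.

```ruby
require "stringio"

# Stand-in for the real log file; the contents are illustrative only.
log_io = StringIO.new(<<~LOG)
  Rahul has 273
  Rahul has 217
  John has 202
  Coventry has 194
LOG

scores = Hash.new { |hash, key| hash[key] = [] }

# scan also copes with several entries on one line, so the exact layout
# of the log matters less than with split.
log_io.each_line do |line|
  line.scan(/(\w+) has (\d+)/) { |name, score| scores[name] << score.to_i }
end

# Integer scores make follow-up calculations straightforward,
# for example the highest score per player (transform_values needs Ruby 2.4+).
best = scores.transform_values(&:max)

p scores
p best
```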

Stefan