
I have a hash in which some of the values are arrays. How do I delete duplicated elements from one array, together with the corresponding ids in the other array, in the most performant way?

Here's an example of my hash

hash = { 
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

The idea I have is something like this

hash["options"].each_with_index { |value, index |
  h = {}
  if h.key?(value)
    delete(value)
    delete hash["option_ids"].delete_at(index)
  else 
    h[value] = index
  end
}

The result should be

hash = { 
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Answer"],
  "option_ids" => ["12345", "23456", "45678"]
}

I know I have to take into account that when I delete values from options and option_ids, the indexes of the remaining values change, but I'm not sure how to handle that.

mechnicov
C. Yee
  • Is there a reason that `{"Language" => "12345", "Question" => "23456", "Answer" => "45678"}` would not be preferable? – engineersmnky May 01 '19 at 20:49
  • Yes, that would make more sense but that was the problem given to me. – C. Yee May 01 '19 at 21:23
  • What do you mean by "repeated elements"? Is `2` (as well as `1`) a repeated element in `[1,2,2,3,1]`? – Cary Swoveland May 01 '19 at 21:53
  • C., technically, in answering @engineersmnky's question, I think you actually mean "No" (there's no reason). :-) You say, "The result should be...`hash = {...`". That's a bit confusing. If you wrote `hash #=> {...`, that would mean you want to modify the existing hash `hash` in place. If you wrote just `{...` that would imply (unless you stated otherwise) that you wish to create a new hash and leave the existing hash unchanged. When asking questions the general rule is that input objects are not to be modified (aka *mutated*) unless the asker explicitly states they are to be modified. – Cary Swoveland May 01 '19 at 22:45
  • @CarySwoveland Yes, `duplicated` would be a better wording :) Thanks for your comments and help! – C. Yee May 02 '19 at 12:52

3 Answers


The first idea I had is to zip the two arrays and call uniq, then find a way to get back to the initial form:

h['options'].zip(h['option_ids']).uniq(&:first).transpose
#=> [["Language", "Question", "Answer"], ["12345", "23456", "45678"]]


Then, via parallel assignment:

h['options'], h['option_ids'] = h['options'].zip(h['option_ids']).uniq(&:first).transpose

h #=> {"id"=>"sjfdkjfd", "name"=>"Field Name", "type"=>"field", "options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}

These are the steps:

h['options'].zip(h['option_ids'])
#=> [["Language", "12345"], ["Question", "23456"], ["Question", "34567"], ["Answer", "45678"], ["Answer", "56789"]]

h['options'].zip(h['option_ids']).uniq(&:first)
#=> [["Language", "12345"], ["Question", "23456"], ["Answer", "45678"]]
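One edge case worth noting: `[].transpose` returns `[]`, so with empty arrays the parallel assignment above would set both keys to `nil`. A minimal sketch of a non-mutating wrapper that guards against this follows. The method name `dedupe_options` is made up for illustration, and it assumes both arrays have the same length:

```ruby
# Hypothetical helper (the name is not from the answer above): returns a new
# hash and leaves the input hash untouched.
def dedupe_options(h)
  pairs = h['options'].zip(h['option_ids']).uniq(&:first)
  # [].transpose is [], and destructuring that would give nil, nil,
  # so fall back to two empty arrays when there is nothing to transpose.
  options, option_ids = pairs.empty? ? [[], []] : pairs.transpose
  h.merge('options' => options, 'option_ids' => option_ids)
end

h = {
  'id' => 'sjfdkjfd',
  'options' => ['Language', 'Question', 'Question', 'Answer', 'Answer'],
  'option_ids' => ['12345', '23456', '34567', '45678', '56789']
}

result = dedupe_options(h)
result['options']    #=> ["Language", "Question", "Answer"]
result['option_ids'] #=> ["12345", "23456", "45678"]
h['options'].size    #=> 5 (the original hash is unchanged)
```

Because it uses `merge` rather than assignment, this version also fits the convention raised in the comments that input objects are not mutated unless that is asked for.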
iGian
hash = { 
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["L", "Q", "Q", "Q", "A", "A", "Q"],
  "option_ids" => ["12345", "23456", "34567", "dog", "45678", "56789", "cat"]
}

I assume that "repeated elements" refers to contiguous equal elements (2 only in [1,2,2,1]) as opposed to "duplicated elements" (both 1 and 2 in the previous example). I do show how the code would be altered (simplified, in fact) if the second interpretation applies.

idx = hash["options"].
  each_with_index.
  chunk_while { |(a,_),(b,_)| a==b }.
  map { |(_,i),*| i }
  #=> [0, 1, 4, 6]

hash.merge(
  ["options", "option_ids"].each_with_object({}) { |k,h| h[k] = hash[k].values_at(*idx) }
)
  #=> {"id"=>"sjfdkjfd",
  #    "name"=>"Field Name",
  #    "type"=>"field",
  #    "options"=>["L", "Q", "A", "Q"],
  #    "option_ids"=>["12345", "23456", "45678", "cat"]}

If "repeated elements" is interpreted to mean that the values of "options" and "option_ids" are to only have the first three elements shown above, calculate idx as follows:

idx = hash["options"].
  each_with_index.
  uniq { |s,_| s }.
  map(&:last)
    #=> [0, 1, 4]

See Enumerable#chunk_while (Enumerable#slice_when could be used instead) and Array#values_at. The steps are as follows.

a = hash["options"]
  #=> ["L", "Q", "Q", "Q", "A", "A", "Q"] 
e0 = a.each_with_index
  #=> #<Enumerator: ["L", "Q", "Q", "Q", "A", "A", "Q"]:each_with_index> 
e1 = e0.chunk_while { |(a,_),(b,_)| a==b }
  #=> #<Enumerator: #<Enumerator::Generator:0x000055e4bcf17740>:each> 

We can see the values the enumerator e1 will generate and pass to map by converting it to an array:

e1.to_a
  #=> [[["L", 0]],
  #    [["Q", 1], ["Q", 2], ["Q", 3]],
  #    [["A", 4], ["A", 5]], [["Q", 6]]] 

Continuing,

idx = e1.map { |(_,i),*| i }
  #=> [0, 1, 4, 6] 

c = ["options", "option_ids"].
      each_with_object({}) { |k,h| h[k] = hash[k].values_at(*idx) } 
  #=> {"options"=>["L", "Q", "A", "Q"],
  #    "option_ids"=>["12345", "23456", "45678", "cat"]} 
hash.merge(c)
  #=> {"id"=>"sjfdkjfd",
  #    "name"=>"Field Name",
  #    "type"=>"field",
  #    "options"=>["L", "Q", "A", "Q"],
  #    "option_ids"=>["12345", "23456", "45678", "cat"]}
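For completeness, here is a sketch of the Enumerable#slice_when variant mentioned above. The only change is that the block is negated: slice_when starts a new slice where the block returns true, whereas chunk_while continues the current chunk. The hash is trimmed to the two array keys for brevity.

```ruby
hash = {
  "options"    => ["L", "Q", "Q", "Q", "A", "A", "Q"],
  "option_ids" => ["12345", "23456", "34567", "dog", "45678", "56789", "cat"]
}

# Start a new slice whenever the option differs from the previous one,
# then keep the index of the first element of each slice.
idx = hash["options"].
  each_with_index.
  slice_when { |(a, _), (b, _)| a != b }.
  map { |(_, i), *| i }
  #=> [0, 1, 4, 6]

result = hash.merge(
  "options"    => hash["options"].values_at(*idx),
  "option_ids" => hash["option_ids"].values_at(*idx)
)
result["options"]    #=> ["L", "Q", "A", "Q"]
result["option_ids"] #=> ["12345", "23456", "45678", "cat"]
```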
Cary Swoveland

Using Array#transpose

hash = {
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

hash.values.transpose.uniq(&:first).transpose.map.with_index {|v,i| [hash.keys[i], v]}.to_h
#=> {"options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}

After the OP's edit (the hash now also contains non-array values):

hash = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

hash_array = hash.to_a.select {|v| v.last.is_a?(Array)}.transpose
hash.merge([hash_array.first].push(hash_array.last.transpose.uniq(&:first).transpose).transpose.to_h)
#=> {"id"=>"sjfdkjfd", "name"=>"Field Name", "type"=>"field", "options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}
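The same idea can be written in a more step-by-step form, which may be easier to follow. This is only a sketch: the method name `dedupe_array_values` is invented here, and it assumes that every array-valued key belongs to the same set of parallel arrays, with the first such key ("options" in the example) being the one to de-duplicate on.

```ruby
# Hypothetical helper: de-duplicates all array-valued entries in parallel,
# keyed on the first array-valued entry. Returns a new hash.
def dedupe_array_values(hash)
  array_keys = hash.select { |_, v| v.is_a?(Array) }.keys
  return hash.dup if array_keys.empty?

  # Each row pairs up the i-th element of every array. Note that transpose
  # raises IndexError if the arrays differ in length.
  rows = hash.values_at(*array_keys).transpose.uniq(&:first)
  # Transposing an empty array gives [], so rebuild empty columns by hand.
  columns = rows.empty? ? array_keys.map { [] } : rows.transpose
  hash.merge(array_keys.zip(columns).to_h)
end

hash = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

deduped = dedupe_array_values(hash)
deduped["options"]    #=> ["Language", "Question", "Answer"]
deduped["option_ids"] #=> ["12345", "23456", "45678"]
```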
mechnicov