-2

I have an array of hashes called array_of_hash:

array_of_hash = [
 {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
 {:name=>"5", :address=>"UK", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
 {:name=>"6", :address=>"CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"},
 {:name=>"29", :address=>"GERMANY", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
 {:name=>"30", :address=>"CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}
]

I wish to group these hashes by consecutive values of the key :name. The first group would contain only the hash with :name=>"1", since no hash has :name => "1".succ #=> "2". The second group would contain the hashes whose :name values are "5" and "6". The third group would be the last two hashes in the array, for which :name=>"29" and :name=>"30".

My desired array of hashes should look like this:

[
   {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
   {:name=>"5-6", :address=>"UK,CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC,CD"},
   {:name=>"29-30", :address=>"GERMANY,CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"},
]

Use case II

array_of_hash = [
 {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
 {:name=>"2", :address=>"UK", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
 {:name=>"3", :address=>"CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"},
 {:name=>"29", :address=>"GERMANY", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
 {:name=>"30", :address=>"CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}
]

Desired result for use case II

[
   {:name=>"1-3", :address=>"USA,UK,CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB,BC,CD"},
   {:name=>"29-30", :address=>"GERMANY,CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"},
]

What I did so far:

new_array_of_hashes = []
new_array_of_hashes << { name: array_of_hash.map { |h| h[:name].to_i } } <<
  { address: array_of_hash.map { |h| h[:address] } } <<
  { collection: array_of_hash.map { |h| h[:collection] } } <<
  { sequence: array_of_hash.map { |h| h[:sequence] } }

[{:name=>[1, 5, 6, 29, 30]},
 {:address=>["USA", "UK", "CANADA", "GERMANY", "CHINA"]},
 {:collection=>
[["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAPTOP", "SHIP", "MOUNTAIN"],
["LAPTOP", "SHIP", "MOUNTAIN"]]},
 {:sequence=>["AB", "BC", "CD", "DE", "FG"]}]

So far I am only able to combine all the values for each key; I don't know how to group them by consecutive :name.

Cary Swoveland
kavin
  • 3
  • 5
  • 1
    How do you determine which elements to combine and which to leave separate? – moveson Mar 23 '17 at 21:46
  • @moveson If the `:collection` value is same then combine – kavin Mar 23 '17 at 21:48
  • 2
    In that case, shouldn't the first three elements all be combined? – moveson Mar 23 '17 at 21:54
  • that's the most difficult part. if they are in sequence only they will be combined else they should be separated. – kavin Mar 23 '17 at 21:57
  • Don't use comments to display code. Instead, edit your question and add the information requested into the question as if it'd been there originally. Don't use "edit" or "update" tags to mark the changes. We can see what changed and when if we need to. – the Tin Man Mar 23 '17 at 21:58
  • 1
    Please read "[mcve]". Your explanation of what should be joined doesn't make sense. 1, 5 and 6 have matching `:collection` arrays so they should all be combined but your example counters that. – the Tin Man Mar 23 '17 at 22:01
  • Does "in sequence" mean when the `:name` values are consecutive? – Sagar Pandya Mar 23 '17 at 22:02
  • @sagarpandya82 yes! – kavin Mar 23 '17 at 22:03
  • @theTinMan Sorry kind of new here. So trying to learn. – kavin Mar 23 '17 at 22:07
  • No need to be sorry (and don't use comments for thanks). Remember that SO isn't a "help me" site, it's a reference site to create articles of programming problems and their associated answers for others in the future. As such it means questions have to be complete so you have to do your research and write a detailed, complete question. If there is a lot of back and forth trying to figure it out then we'll close the question because it's broad or not defined or doesn't meet "[mcve]". https://meta.stackoverflow.com/questions/260263 is also important. – the Tin Man Mar 23 '17 at 22:37

3 Answers

2

First, let's make an array of the groups that we ultimately want. We'll use Ruby's Enumerable#slice_when method, which passes each pair of adjacent elements (the current one and the next one) to its block so that we can compare the two. Our condition tells Ruby to slice the array between two elements if their names (converted to integers) are not sequential or if their collections are not identical.

>> groups = array_of_hash.slice_when { |i, j| i[:name].to_i + 1 != j[:name].to_i || i[:collection] != j[:collection] }.to_a

But because you are using Ruby 2.1, and slice_when was only added in Ruby 2.2, you'll need to use slice_before and local variables to keep track of the previous element. Per the documentation, we can accomplish this by first priming a local variable:

>> prev = array_of_hash[0]

and then resetting it and a second local variable as we iterate over the array:

>> groups = array_of_hash.slice_before { |e| prev, prev2 = e, prev; prev2[:name].to_i + 1 != prev[:name].to_i || prev2[:collection] != prev[:collection] }.to_a

In either case, groups should now look like this:

=> [[{:name=>"1",
   :address=>"USA",
   :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
   :sequence=>"AB"}],
 [{:name=>"5",
   :address=>"UK",
   :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
   :sequence=>"BC"},
  {:name=>"6",
   :address=>"CANADA",
   :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
   :sequence=>"CD"}],
 [{:name=>"29",
   :address=>"GERMANY",
   :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
   :sequence=>"DE"},
  {:name=>"30",
   :address=>"CHINA",
   :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
   :sequence=>"FG"}]]
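As a quick check that the two approaches agree, here is a minimal sketch on trimmed-down toy data. The names `rows` and `split_here` are made up for this illustration, and only the keys that drive the grouping are included.

```ruby
# Toy data (an assumption for illustration): only :name and :collection matter here.
rows = [
  { name: "1",  collection: ["LAND"] },
  { name: "5",  collection: ["LAND"] },
  { name: "6",  collection: ["LAND"] },
  { name: "29", collection: ["SHIP"] },
  { name: "30", collection: ["SHIP"] }
]

# True when a new group should start between a and b.
split_here = lambda do |a, b|
  a[:name].to_i + 1 != b[:name].to_i || a[:collection] != b[:collection]
end

# Ruby 2.2+: slice_when hands the block each adjacent pair.
with_slice_when = rows.slice_when { |a, b| split_here.call(a, b) }.to_a

# Older Rubies: slice_before sees one element at a time,
# so the previous element is remembered in a local variable.
prev = rows.first
with_slice_before = rows.slice_before { |e|
  before, prev = prev, e
  split_here.call(before, e)
}.to_a

group_names = with_slice_when.map { |g| g.map { |h| h[:name] } }
```

Both variables should hold the same three groups; `group_names` just makes that easier to read at a glance.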

Now we take the resulting array and map its elements to a new hash, formatted as you specified.

For :name, we take the names of the first and last elements of the group, call .uniq to eliminate duplicates, and join them with a hyphen. (If the group has only one element, join returns that name unchanged.)

For :address, we join the addresses of each element of the group with a comma.

For :collection, we simply use the collection found in the first element of the group.

For :sequence, we join the sequences of each element of the group with a comma. (Again, single elements are returned unchanged.)

>> groups.map { |group| {name: [group.first[:name], group.last[:name]].uniq.join('-'), 
                         address: group.map { |e| e[:address] }.join(','), 
                         collection: group.first[:collection], 
                         sequence: group.map { |e| e[:sequence] }.join(',') } }

=> [{:name=>"1",
  :address=>"USA",
  :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
  :sequence=>"AB"},
 {:name=>"5-6",
  :address=>"UK,CANADA",
  :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
  :sequence=>"BC,CD"},
 {:name=>"29-30",
  :address=>"GERMANY,CHINA",
  :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
  :sequence=>"DE,FG"}]
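Putting the grouping and mapping steps together, the whole thing might be wrapped in one method. This is only a sketch: `merge_consecutive` is a hypothetical name, it assumes Ruby 2.2+ for slice_when, it joins :address the same way as :sequence to match the question's desired output, and the sample data is shortened from use case II.

```ruby
# merge_consecutive is a hypothetical helper name, not part of the original answer.
def merge_consecutive(rows)
  # Start a new group when names are not consecutive or collections differ.
  groups = rows.slice_when do |a, b|
    a[:name].to_i + 1 != b[:name].to_i || a[:collection] != b[:collection]
  end

  groups.map do |group|
    {
      name:       [group.first[:name], group.last[:name]].uniq.join('-'),
      address:    group.map { |h| h[:address] }.join(','),
      collection: group.first[:collection],
      sequence:   group.map { |h| h[:sequence] }.join(',')
    }
  end
end

# Shortened version of use case II (collections abbreviated for brevity).
use_case_two = [
  { name: "1",  address: "USA",     collection: %w[LAND WATER], sequence: "AB" },
  { name: "2",  address: "UK",      collection: %w[LAND WATER], sequence: "BC" },
  { name: "3",  address: "CANADA",  collection: %w[LAND WATER], sequence: "CD" },
  { name: "29", address: "GERMANY", collection: %w[LAPTOP SHIP], sequence: "DE" },
  { name: "30", address: "CHINA",   collection: %w[LAPTOP SHIP], sequence: "FG" }
]

merged = merge_consecutive(use_case_two)
```

With this input, `merged` should contain two hashes, the first with :name "1-3" and the second with :name "29-30".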
moveson
0
def slice_when(array)
  big = []
  small = []
  last_index = array.size - 1
  (0..last_index).each do |i|
    small << array[i]
    if last_index == i || yield(array[i], array[i + 1])
      big << small
      small = []
    end
  end
  big
end

You can try this if you do not want to use slice_before. Keep in mind that it returns an Array directly, not an Enumerator.
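A possible way to call such a helper is sketched below. To keep the snippet self-contained, the method body is repeated under the hypothetical name `slice_pairs`, which also avoids any confusion with the built-in Enumerable#slice_when on newer Rubies.

```ruby
# Same logic as the helper above, under a hypothetical name (slice_pairs).
def slice_pairs(array)
  big = []
  small = []
  last_index = array.size - 1
  (0..last_index).each do |i|
    small << array[i]
    # Close the current group at the end of the array,
    # or when the block says the pair should be split.
    if last_index == i || yield(array[i], array[i + 1])
      big << small
      small = []
    end
  end
  big
end

names  = %w[1 5 6 29 30]
groups = slice_pairs(names) { |a, b| a.to_i + 1 != b.to_i }
# groups is [["1"], ["5", "6"], ["29", "30"]]
```

The block receives the current and the next element, just as with slice_when, so the condition from the first answer can be reused unchanged.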

Adam Zapaśnik
0

Code

def aggregate(array_of_hash)
  array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }.
    flat_map { |a| a.chunk { |g| g[:collection] }.map { |_c,b| combine(b) } }
end

def combine(arr)
  names     = values_for_key(arr, :name)
  addresses = values_for_key(arr, :address)
  sequences = values_for_key(arr, :sequence)
  arr.first.merge(
    name: names.size==1 ? names.first : "%s-%s" % [names.first, names[-1]],
    address:  addresses.join(','),
    sequence: sequences.join(',')
  )
end

def values_for_key(arr, key)
  arr.map { |h| h[key] }
end

Example

aggregate(array_of_hash)
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
  #    {:name=>"5-6", :address=>"UK,CANADA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC,CD"},
  #    {:name=>"29-30", :address=>"GERMANY,CHINA",
  #     :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"}]   

Here is a second example.

array_of_hash[2][:collection] = ['dog', 'cat', 'pig']
aggregate(array_of_hash)
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
  #    {:name=>"5", :address=>"UK",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
  #    {:name=>"6", :address=>"CANADA",
  #     :collection=>["dog", "cat", "pig"], :sequence=>"CD"},
  #    {:name=>"29-30", :address=>"GERMANY,CHINA",
  #     :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"}]

In this example the hashes with :name=>"5" and :name=>"6" cannot be grouped because their values of :collection are different. The question does not state whether this situation could occur. If it cannot, the code is still correct, but it could be simplified to the following.

def aggregate(array_of_hash)
  array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }.
    map { |a| combine(a) }
end
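The condition passed to chunk_while in both versions can be tried on bare strings. The sketch below assumes Ruby 2.3+ and that the names are plain decimal strings; `names` and `runs` are illustrative variable names.

```ruby
names = %w[1 5 6 29 30]

# Keep two neighbours in the same chunk when the second is the
# string successor of the first.
runs = names.chunk_while { |a, b| b == a.succ }.to_a
# runs is [["1"], ["5", "6"], ["29", "30"]]

# String#succ carries digits over, so "9" is followed by "10".
after_nine = "9".succ
```

Because the comparison is done on strings, no conversion with to_i is needed, but differently padded values such as "09" and "9" would not be treated as equal.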

Explanation

For the first example (before array_of_hash[2] was modified), the steps are as follows.

e0 = array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fa25e022f30>:each>

See Enumerable#chunk_while, which made its debut in Ruby v.2.3.

This enumerator will generate the following elements to be passed to Enumerable#flat_map.

e0.to_a
  #=> [[{:name=>"1", :address=>"USA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}],
  #    [{:name=>"5", :address=>"UK",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
  #     {:name=>"6", :address=>"CANADA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"}],
  #    [{:name=>"29", :address=>"GERMANY",
  #      :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
  #     {:name=>"30", :address=>"CHINA",
  #      :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}]
  #   ] 

e0.flat_map { |a| a.chunk { |g| g[:collection] }.map { |_,b| combine(b) } }

returns the array of hashes obtained in the example. Consider the first element that e0 generates; flat_map passes it to the block, where it is assigned to the block variable a.

a = e0.next
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}] 

The block calculation is therefore

e1 = a.chunk { |g| g[:collection] }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fa25c857158>:each> 
e1.to_a
  #=> [[["LAND", "WATER", "OIL", "TREE", "SAND"],
  #     [{:name=>"1", :address=>"USA",
  #       :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}]
  #    ]
  #   ] 

_c,b = e1.next
  #=> [["LAND", "WATER", "OIL", "TREE", "SAND"],
  #    [{:name=>"1", :address=>"USA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}]
  #   ] 
  # _c
  #   #=> ["LAND", "WATER", "OIL", "TREE", "SAND"] 
  # b #=> [{:name=>"1", :address=>"USA",
  #         :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}] 
combine(b)
  #=> {:name=>"1", :address=>"USA",
  #    :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}

The remaining calculations are similar.
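The inner chunk step only makes a difference when hashes with consecutive names have different collections, as in the second example. A minimal sketch of that step in isolation, using made-up toy data (`mixed_run` and `uniform_run` are illustrative names):

```ruby
# Two hashes with consecutive names but different collections.
mixed_run = [
  { name: "5", collection: ["LAND"] },
  { name: "6", collection: ["dog"] }
]

# Two hashes with consecutive names and the same collection.
uniform_run = [
  { name: "5", collection: ["LAND"] },
  { name: "6", collection: ["LAND"] }
]

# chunk yields [key, elements] pairs; only the sizes are kept here.
mixed_sizes   = mixed_run.chunk { |h| h[:collection] }.map { |_key, hashes| hashes.size }
uniform_sizes = uniform_run.chunk { |h| h[:collection] }.map { |_key, hashes| hashes.size }
# mixed_sizes is [1, 1]; uniform_sizes is [2]
```

In the mixed case each hash ends up in its own chunk, which is why "5" and "6" stay separate in the second example.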

Cary Swoveland