Ruby: Merging nested array between each other depending on a condition

Question

What would be the best way to merge arrays nested in an array that share at least one element? Here's an example:

some_method([[1, 2], [2, 3], [4, 5]])
#=> [[1, 2, 3], [4, 5]]
some_method([[1, 2], [2, 3], [3, 4], [5,6]])
#=> [[1, 2, 3, 4], [5, 6]]

What should happen when the array is `[[1, 2], [2, 3], [4, 5], [2, 6]]`? – davegson, Jun 16 '16 at 12:02
I'd like it to merge (and call uniq on) all elements sharing at least one value => [[1, 2, 3, 6], [4, 5]] – David B., Jun 16 '16 at 12:06

Stefan · Accepted Answer · 2016-06-16T13:22:26.200

5

This would work:

def some_method(arrays)
  h = Hash.new { |h, k| h[k] = [] }
  arrays.each do |array|
    tmp = h.values_at(*array).push(array).inject(:|)
    tmp.each { |k| h[k] = tmp }
  end
  h.values | h.values
end

Examples:

some_method([[1, 2], [2, 3], [4, 5]])          #=> [[1, 2, 3], [4, 5]]    
some_method([[1, 2], [2, 3], [3, 4], [5, 6]])  #=> [[1, 2, 3, 4], [5, 6]]    
some_method([[1, 3], [3, 4], [2, 5], [4, 5]])  #=> [[1, 3, 4, 2, 5]]

I'm using a hash h to store the array that corresponds to a given element. For a key that doesn't exist yet, the hash's default block stores and returns [].

After inserting [1, 2], the hash looks like this:

{
  1 => [1, 2],
  2 => [1, 2]
}

When inserting [2, 3], the arrays for 2 and 3 are fetched via:

h.values_at(2, 3)
#=> [[1, 2], []]

then [2, 3] itself is added:

h.values_at(2, 3).push([2, 3])
#=> [[1, 2], [], [2, 3]]

and everything is combined with | (array union, which drops duplicates):

h.values_at(2, 3).push([2, 3]).inject(:|)
#=> [1, 2, 3]

This result is stored in tmp. It becomes the new value for the contained keys:

tmp.each { |k| h[k] = tmp }

Which is equivalent to:

h[1] = tmp
h[2] = tmp
h[3] = tmp

Afterwards, h looks like this:

{
  1 => [1, 2, 3],
  2 => [1, 2, 3],
  3 => [1, 2, 3]
}

At the end, the distinct values are returned via h.values | h.values.
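
To follow the walkthrough above on a whole input, here is a minimal sketch that runs the same steps and keeps a copy of the hash after each inserted array. The name some_method_trace and the snapshots array are made up for illustration; they are not part of the answer.

```ruby
# Hypothetical tracing variant of some_method (illustration only):
# same algorithm, but it records the hash after each input array.
def some_method_trace(arrays)
  h = Hash.new { |hash, k| hash[k] = [] }
  snapshots = []
  arrays.each do |array|
    # union of the groups already known for these elements, plus the array itself
    tmp = h.values_at(*array).push(array).inject(:|)
    # every element of the merged group now points at the same array
    tmp.each { |k| h[k] = tmp }
    snapshots << h.dup
  end
  [h.values | h.values, snapshots]
end

result, snapshots = some_method_trace([[1, 2], [2, 3], [4, 5]])
snapshots.each { |s| p s }
p result  #=> [[1, 2, 3], [4, 5]]
```

The first two snapshots should match the two hashes shown in the explanation; the third adds the keys 4 and 5, both pointing at [4, 5].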

edited Jun 16 '16 at 13:22

answered Jun 16 '16 at 13:16

Is there a particular reason why you use 'h.values | h.values' instead of '#uniq' ? – David B. Jun 16 '16 at 14:46
@DavidB. no, not really. Maybe it's even faster to compare by object identity, i.e. `h.values.uniq(&:object_id)`. – Stefan Jun 16 '16 at 14:55
Is there a particular reason why you use 'h.values | h.values' instead of 'h.values & h.values' (or `h.invert.keys`)? – Cary Swoveland Jun 16 '16 at 16:16
@CarySwoveland the only reason is that I've used `|` earlier in the method to remove duplicates. – Stefan Jun 17 '16 at 04:59
Yes, I had noticed. Nice answer. – Cary Swoveland Jun 17 '16 at 05:54
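
For reference, a small sketch comparing the de-duplication variants mentioned in these comments. The hash below is hand-built to have the same shape as h at the end of some_method (every key of a group points at the same array object); the variable names are illustrative only.

```ruby
# Hand-built stand-in for the final state of h (illustration only).
group_a = [1, 2, 3]
group_b = [4, 5]
h = { 1 => group_a, 2 => group_a, 3 => group_a, 4 => group_b, 5 => group_b }

by_union     = h.values | h.values          # as in the answer
by_uniq      = h.values.uniq                # compares by content
by_identity  = h.values.uniq(&:object_id)   # compares by object identity
by_intersect = h.values & h.values
by_invert    = h.invert.keys

p by_union  #=> [[1, 2, 3], [4, 5]]
p [by_uniq, by_identity, by_intersect, by_invert].all? { |v| v == by_union }  #=> true
```

On this input all five give the same groups, so the choice is mostly a matter of taste or speed.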

Aleksei Matiushkin · Answer 2 · 2016-06-16T14:04:12.207

2

arr = [[1, 2], [2, 3], [3, 4], [5, 6]]

arr.map(&:dup).sort.each_with_object([]) do |a, memo|
  (idx = memo.index { |m| !(m & a).empty? }) ? memo[idx] |= a : memo << a
end
#⇒ [[1, 2, 3, 4], [5, 6]]

or, more expressive:

arr.map(&:dup).sort.each_with_object([]) do |a, memo|
  (memo.detect { |m| !(m & a).empty? } << a).
    flatten!.uniq! rescue memo << a
end

The most robust solution, which works for any ordering of the input but takes more time, repeats the pass until the result stops changing:

loop.inject(arr.map(&:dup)) do |acc|
  result = (acc.each_with_object([]) do |a, memo|
    (idx = memo.index { |m| !(m & a).empty? }) ? memo[idx] |= a : memo << a 
  end)
  result == acc ? (break result) : result
end
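
The same idea spelled out as two small methods, as a sketch only. The names merge_pass and merge_until_stable are made up here; the first is one pass as in the snippets above, the second repeats it until nothing changes. The input is the one from the comments where a single sorted pass is not enough.

```ruby
# One merge pass: fold each array into the first group it overlaps with.
def merge_pass(arrays)
  arrays.each_with_object([]) do |a, memo|
    idx = memo.index { |m| !(m & a).empty? }
    if idx
      memo[idx] |= a   # builds a new array, the inputs are not mutated
    else
      memo << a
    end
  end
end

# Hypothetical wrapper: repeat the pass until the result is stable.
def merge_until_stable(arrays)
  current = arrays.map(&:dup)
  loop do
    merged = merge_pass(current)
    return merged if merged == current
    current = merged
  end
end

input = [[1, 3], [3, 4], [2, 5], [4, 5]]
p merge_pass(input.sort)     #=> [[1, 3, 4, 5], [2, 5]]
p merge_until_stable(input)  #=> [[1, 3, 4, 5, 2]]
```

A pass that merges nothing leaves every group disjoint from all the others, which is why stopping at the first unchanged result is enough.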

edited Jun 16 '16 at 14:04

answered Jun 16 '16 at 12:16

This returns `[[1, 2, 3, 4], [2, 3], [3, 4], [5, 6]]`, not `[[1, 2, 3, 4], [5, 6]]` – Tom Lord Jun 16 '16 at 12:18
@TomLord it returns `[[1, 2, 3, 4], [5, 6]]`, run it and make sure yourself. – Aleksei Matiushkin Jun 16 '16 at 12:20
But it returns `[[1, 2, 3], [3, 4]]` for `arr = [[1, 2], [3, 4], [2, 3]]`. You can't assume consecutive pairs. – Stefan Jun 16 '16 at 12:25
Ahhh ok sorry, I was checking the value of `arr`, not the value of the enumerator. (It's still worth noting, however, that you mutated the original object!) – Tom Lord Jun 16 '16 at 12:25
@Stefan indeed, fixed with `.sort`. – Aleksei Matiushkin Jun 16 '16 at 12:27
@TomLord added `.dup` to both variants to avoid mutating original array, thanks. – Aleksei Matiushkin Jun 16 '16 at 12:30
@mudasobwa Your solution is still fundamentally flawed. It needs to be a recursive method. For example, consider the input: `[[1, 3], [3, 4], [2, 5], [4, 5]]` -- your code will return `[[1, 3, 4, 5], [2, 5]]`, not `[[1,2,3,4,5]]` – Tom Lord Jun 16 '16 at 12:32
@TomLord ok, for cumbersome permutations `while` loop does the trick, there is still no need for recursion. Updated an answer. – Aleksei Matiushkin Jun 16 '16 at 12:56
Yeah, my point was that you'd need to repeat the procedure indefinitely until the result stabilised. Not necessarily with recursion, although it does seem like the most logical approach to me. – Tom Lord Jun 16 '16 at 13:09
@Stefan `while` returns `nil`, the `result` is stored in `res` variable. Updated. – Aleksei Matiushkin Jun 16 '16 at 13:33
@mudasobwa are you sure it works? I keep getting `nil` for `res`. – Stefan Jun 16 '16 at 13:38
@Stefan yes, I am sure it works here (ruby 2.1.8.) What input do you test it on? – Aleksei Matiushkin Jun 16 '16 at 13:42
@Stefan indeed, thanks. I seem to have `res` already set and was just re-running the snippet without changing anything. `while` loop is evil: I rewrote it with normal `loop` and now it seems to work smoothly. Sorry for bringing mess. http://ideone.com/2lIaX1 – Aleksei Matiushkin Jun 16 '16 at 13:57

Cary Swoveland · Answer 3 · 2016-06-16T23:54:52.607

Here's a very simple approach. The steps are as follows.

1. Beginning with an array a = arr.map(&:uniq), arr being the initial array of arrays, look for two arrays of a that share an element, among all combinations of two arrays of a. If none are found, return a (fini!); else go to step 2.
2. If a[i] and a[j] are found to contain a common element, a[i] becomes a[i].concat(a[j]).uniq and a[j] is deleted.
3. Repeat step 1.

def group_unique(arr)
  a = arr.map(&:uniq)
  loop do
    # x and y are a pair of groups; the block no longer shadows the outer a
    (_,i),(_,j) = a.each_with_index.to_a.combination(2).find { |(x,_),(y,_)| (x & y).any? }
    return a if i.nil?
    a[i] = a[i].concat(a.delete_at(j)).uniq
  end
end

arr = [[1,2], [5,6], [2,3], [4,5], [4,1], [7,8], [11,13], [8,10]]
group_unique(arr)
  #=> [[1, 2, 3, 4, 5, 6], [7, 8, 10], [11, 13]]

score 1 · Answer 4 · answered Jun 16 '16 at 12:34

It's a little verbose, but here is a recursive method that solves the problem properly:

def merge_shared_elements(list) 
  changed = false 
  result = list.each_with_object([]) do |item, new_list| 
    if existing_item = new_list.find {|new_item| !(new_item & item).empty?} 
      existing_item.concat(item).uniq! 
      changed = true 
    else 
      new_list << item.dup # copy, so concat/uniq! above never mutate the caller's arrays
    end 
  end 

  changed ? merge_shared_elements(result) : result 
end

The method recurses until a pass makes no further merges, so the order of the inputs is irrelevant.
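
A quick way to check the order claim, as a sketch: run the method on every permutation of an input and compare the groups after sorting them. The method is repeated here so the snippet runs on its own; it copies each item with dup before storing it, so the arrays passed in are not modified (otherwise the permutations would interfere with each other).

```ruby
def merge_shared_elements(list)
  changed = false
  result = list.each_with_object([]) do |item, new_list|
    existing = new_list.find { |new_item| !(new_item & item).empty? }
    if existing
      existing.concat(item).uniq!
      changed = true
    else
      new_list << item.dup # copy, so the caller's arrays stay untouched
    end
  end
  changed ? merge_shared_elements(result) : result
end

input = [[1, 3], [3, 4], [2, 5], [4, 5]]

# Normalise each result (sort inside and across groups) so that only the
# grouping itself is compared, not the element order.
normalized = input.permutation.map do |perm|
  merge_shared_elements(perm).map(&:sort).sort
end

p normalized.uniq  #=> [[[1, 2, 3, 4, 5]]]
```

All 24 orderings of this input collapse to the same single group.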
