13

Let's say I have an array of numbers, e.g.

ary = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]

I would like to split the array at the first point where a smaller number follows a larger one. My output should be:

[[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

I've tried slice_when and it comes quite close:

ary.slice_when { |i, j| i > j }.to_a
#=> [[1, 3, 6, 7, 10], [9, 11, 13], [7, 24]]

But it also splits between 13 and 7, so I have to join the remaining arrays:

first, *rest = ary.slice_when { |i, j| i > j }.to_a
[first, rest.flatten(1)]
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

which looks a bit cumbersome. It also seems inefficient to keep comparing items when the match was already found.

I am looking for a general solution based on an arbitrary condition. Having numeric elements and i > j is just an example.

Is there a better way to approach this?

Stefan
  • Just so I understand, should it only slice on the first occurrence and not the latter? – Anthony Oct 13 '17 at 15:29
  • @Anthony yes exactly. That's why I would like to avoid `slice_when`. Slicing the array into multiple parts just to undo everything but the first one doesn't feel right. – Stefan Oct 13 '17 at 15:32
  • Gotta say, I actually think your initial approach is pretty elegant @Stefan – SRack Oct 13 '17 at 15:42
  • @SRack well, the code is short. But the algorithm keeps comparing every pair of elements, although it could stop at the first match, which seems inefficient (what if the array is huge?). Just like `array.select { ... }.first` when you simply want `array.find { ... }` – Stefan Oct 13 '17 at 15:52
  • Very good point! Thanks for the response. – SRack Oct 13 '17 at 15:53
  • To join the remaining arrays, appending `.partition.with_index { |_,i| i.zero? }.map(&:flatten)` is an option, but there are probably better ways. – Sagar Pandya Oct 13 '17 at 16:29
  • Dear downvoter, would you leave a comment as to what is wrong with my question, please? – Stefan Oct 13 '17 at 16:47
  • @Stefan Maybe someone fat-fingered it, there's no confirmation required on down-votes. – tadman Oct 13 '17 at 17:11

7 Answers

10

There's probably a better way to do this but my first thought is...

break_done = false
ary.slice_when { |i, j| (break_done = i > j) unless break_done }.to_a
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]
Stefan
SteveTurczyn
  • Any chance to get rid of the temporary variable? – Stefan Oct 13 '17 at 15:35
  • I did try to work around that... initializing the variable in the block it doesn't persist through iterations. :( – SteveTurczyn Oct 13 '17 at 15:36
  • You can monkey patch it into the Array class: `class Array; def slice_when_once; break_done = false; slice_when { |i, j| break_done = yield(i, j) unless break_done }; end; end;`. Then you can do: `ary.slice_when_once { |i, j| i > j }.to_a`. – 3limin4t0r Oct 13 '17 at 16:39
  • If you don't like to smudge the default classes you can also create a module for the helper method, and extend the object: `ary.extend(EnumExtensions).slice_when_once { |i, j| i > j }.to_a`. – 3limin4t0r Oct 13 '17 at 16:45
  • I came up with `f = false; ary.slice_when { |i, j| !f && f ||= i > j }.to_a` which is pretty much the same idea, only avoiding the `unless`. – tadman Oct 13 '17 at 16:50
  • I like! Alternatively, `break_not_done = true` followed by `break_not_done && break_not_done = i <= j`. – Cary Swoveland Nov 12 '19 at 01:01
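
The module-based variant mentioned in the comments could look roughly like this. It is only a sketch: `EnumExtensions` and `slice_when_once` are the names used in the comments, not part of Ruby's standard library, and the example extends a copy of the array so the original object stays untouched.

```ruby
# Hypothetical helper module; the names come from the comments above,
# not from Ruby's standard library.
module EnumExtensions
  # Like slice_when, but the given block is no longer called
  # once it has returned a truthy value for the first time.
  def slice_when_once
    split_done = false
    slice_when { |i, j| split_done = yield(i, j) unless split_done }
  end
end

ary = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]

# Extend a single object instead of patching Array globally.
result = ary.dup.extend(EnumExtensions).slice_when_once { |i, j| i > j }.to_a
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]
```

Note that the returned enumerator keeps the `split_done` flag, so it should be consumed only once; calling `to_a` on the same enumerator a second time would no longer split.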
5

I'm not sure you'll find this more elegant, but it prevents the split-and-rejoin maneuver:

def split(ary)
  split_done = false
  ary.slice_when do |i, j|
    !split_done && (split_done = yield(i, j))
  end.to_a
end

Usage:

ary = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]    
split(ary){ |i, j| i > j }
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

Update:

Some may find this variant more readable. #chunk_while is the inverse of #slice_when, so I just applied De Morgan's law to the block.

def split(ary)
  split_done = false
  ary.chunk_while do |i, j|
    split_done || !(split_done = yield(i, j))
  end.to_a
end
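
To illustrate the inverse relationship the update relies on, here is a small check (a sketch using the plain `i > j` condition, without the `split_done` flag): `slice_when` splits where its block is truthy, `chunk_while` splits where its block is falsy, so negating the block turns one into the other.

```ruby
ary = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]

# slice_when splits between i and j when the block is truthy ...
by_slice = ary.slice_when  { |i, j| i > j }.to_a

# ... chunk_while splits between i and j when the block is falsy.
by_chunk = ary.chunk_while { |i, j| !(i > j) }.to_a

by_slice == by_chunk
#=> true
```

Applying the same negation to `!split_done && cond` gives `split_done || !cond`, which is the block used in the `chunk_while` variant.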
Stefan
hoffm
  • that is the longer version of @SteveTurczyn answer :| – z atef Oct 13 '17 at 15:40
  • Ah, he posted as I was writing mine. Mine is a bit more general as it provides a function that takes an arbitrary comparison condition as a block. – hoffm Oct 13 '17 at 15:41
  • Guess it serves me right for taking the time to do it test-first. ;) – hoffm Oct 13 '17 at 15:42
  • Actually, I like this better as the temporary variable is confined to the scope of the method. Also could be implemented as a patch to Array class. – SteveTurczyn Oct 13 '17 at 15:51
4

Here's another version. Not particularly elegant, but quite efficient (see comments).

break_point = ary.each_cons(2).with_index do |(a, b), idx|
  break idx if b < a # plug your block here
end + 1

[
  ary.take(break_point), 
  ary.drop(break_point)
] # => [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]
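
One way to "plug your block" into the code above is to wrap it in a method that yields each pair. This is a sketch, not part of the original answer: the name `split_at_first` is made up, and returning `[ary.dup, []]` when no pair matches is an assumption about the desired behaviour. Without that guard, the `+ 1` would raise when the loop finishes without hitting `break`.

```ruby
# Hypothetical helper: splits ary in two at the first adjacent pair
# for which the block returns a truthy value.
def split_at_first(ary)
  pair_index = ary.each_cons(2).with_index do |(a, b), idx|
    break idx if yield(a, b)
  end

  # Without a break, the call above does not return an Integer
  # (nil or the receiver, depending on the Ruby version).
  # Assumption: in that case everything goes into the first part.
  return [ary.dup, []] unless pair_index.is_a?(Integer)

  break_point = pair_index + 1
  [ary.take(break_point), ary.drop(break_point)]
end

split_at_first([1, 3, 6, 7, 10, 9, 11, 13, 7, 24]) { |a, b| a > b }
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]
```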
Sergio Tulentsev
4

One more alternative:

index = ary.each_cons(2).find_index { |i, j| i > j }
[ary[0..index], ary[index + 1..-1]]
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

I believe both space and time are O(n).
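
O(n) is the worst case; the search itself stops at the first match. As a rough illustration (the counter variables are only there for demonstration), counting the block calls shows that `find_index` stops at the first matching pair, whereas `slice_when` visits every adjacent pair:

```ruby
ary = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]

# Count how many adjacent pairs each approach compares.
find_index_calls = 0
ary.each_cons(2).find_index do |i, j|
  find_index_calls += 1
  i > j
end

slice_when_calls = 0
ary.slice_when do |i, j|
  slice_when_calls += 1
  i > j
end.to_a

find_index_calls #=> 5 (stops at the pair 10, 9)
slice_when_calls #=> 9 (one call per adjacent pair)
```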

Benchmark:

Warming up --------------------------------------
             anthony    63.941k i/100ms
             steve_t    98.000  i/100ms
              tadman   123.000  i/100ms
              sergio    75.477k i/100ms
               hoffm   101.000  i/100ms
Calculating -------------------------------------
             anthony    798.456k (± 4.0%) i/s -      4.028M in   5.053175s
             steve_t    985.736  (± 5.0%) i/s -      4.998k in   5.083188s
              tadman      1.229k (± 4.1%) i/s -      6.150k in   5.010877s
              sergio    948.357k (± 3.7%) i/s -      4.755M in   5.020931s
               hoffm      1.013k (± 2.9%) i/s -      5.151k in   5.089890s

Comparison:
              sergio:   948357.4 i/s
             anthony:   798456.2 i/s - 1.19x  slower
              tadman:     1229.5 i/s - 771.35x  slower
               hoffm:     1012.9 i/s - 936.30x  slower
             steve_t:      985.7 i/s - 962.08x  slower

code for the benchmark:

require 'benchmark/ips'

def anthony(ary)
  index = ary.each_cons(2).find_index { |i, j| i > j }
  [ary[0..index], ary[index + 1..-1]]
end

def steve_t(ary)
  break_done = false
  ary.slice_when { |i, j| (break_done = i > j) unless break_done }.to_a
end

def tadman(ary)
  ary.each_with_object([[],[]]) do |v, a|
    a[a[1][-1] ? 1 : (a[0][-1]&.>(v) ? 1 : 0)] << v
  end
end

def sergio(ary)
  break_point = ary.each_cons(2).with_index do |(a, b), idx|
    break idx if b < a # plug your block here
  end + 1

  [
    ary.take(break_point),
    ary.drop(break_point)
  ]
end

def split(ary)
  split_done = false
  ary.chunk_while do |i, j|
    split_done || !(split_done = yield(i, j))
  end.to_a
end

def hoffm(ary)
  split(ary) { |i, j| i > j }
end

ary = Array.new(10_000) { rand(1..100) }
Benchmark.ips do |x|
  # Configure the number of seconds used during
  # the warmup phase (default 2) and calculation phase (default 5)
  x.config(:time => 5, :warmup => 2)

  # Typical mode, runs the block as many times as it can
  x.report("anthony") { anthony(ary) }
  x.report("steve_t") { steve_t(ary) }
  x.report("tadman") { tadman(ary) }
  x.report("sergio") { sergio(ary) }
  x.report("hoffm") { hoffm(ary) }

    # Compare the iterations per second of the various reports!
  x.compare!
end

Fascinating that #take and #drop from @sergio's answer are slightly faster than Array#[] with ranges. They both use the same C method underneath, so I can't explain it.

Stefan
Anthony
2

It shouldn't be as hard a problem as it's proving to be. slice_when doesn't take a maximum number of slices as an argument, but if it did, this would be easy.

Here's one optimized around two partitions:

def slice_into_two(ary)
  ary.each_with_object([[],[]]) do |v, a|
    a[a[1][-1] ? 1 : (a[0][-1]&.>(v) ? 1 : 0)] << v
  end
end
tadman
  • You could use `each_with_object` to get rid of that stray `a`. – Stefan Oct 13 '17 at 17:06
  • @Stefan A good optimization there on this iteration of it, adjusted accordingly. – tadman Oct 13 '17 at 17:10
  • Or, if you want to use yield to pass a custom block: `a[a[1][-1] ? 1 : (a[0][-1] && yield(a[0][-1], v) ? 1 : 0)] << v`, so you can: `slice_into_two(ary) { |i, j| i > j }` – 3limin4t0r Oct 13 '17 at 18:05
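
Spelled out, the block-taking variant from the last comment might look like the sketch below. It is a rewrite for readability rather than the exact one-liner: it destructures the accumulator into `head` and `tail` (names chosen here) and checks `empty?` instead of the truthiness of the last element, so `nil` or `false` elements do not confuse it.

```ruby
def slice_into_two(ary)
  ary.each_with_object([[], []]) do |v, (head, tail)|
    # Keep filling head until the block reports a break between
    # head's last element and v; afterwards everything goes to tail
    # and the block is no longer called.
    if tail.empty? && (head.empty? || !yield(head.last, v))
      head << v
    else
      tail << v
    end
  end
end

slice_into_two([1, 3, 6, 7, 10, 9, 11, 13, 7, 24]) { |i, j| i > j }
#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]
```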
2

First approach

Just thought I'd post this approach: an enumerator within an enumerator, creating a partition. The first if-branch handles the case of an empty array (as others such as tadman have implemented).

arr = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]

Enumerator.new { |y|
  if arr.empty?
    y << []
  else
    enum = arr.each
    a = enum.next

    #collect elements until rule is broken
    arr1 = loop.with_object([a]) { |_,o|
      break o if enum.peek < a
      o << a = enum.next
    }

    #collect remainder of elements
    arr2 = loop.with_object([]) { |_,o| o << enum.next }

    # in case the rule is never met, just yield arr's elements
    if arr2.empty?
      arr.each { |e| y << e }
    else
      y << arr1
      y << arr2
    end
  end
}.entries

#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

Second approach

This is somewhat derived from tadman's approach, i.e. the partition is predefined, then emptied and filled as appropriate.

arr = [1, 3, 6, 7, 10, 9, 11, 13, 7, 24]

loop.with_object([[],arr.dup]) { |_,o|
  if o.last == []
    break o
  elsif o.last.size > 1 && o.last[0] < o.last[1]
    o.first << o.last.shift
  else
    o.first << o.last.shift
    break o
  end
}

#=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

This loops through the array (albeit a duplicate) and returns a partitioned array as soon as the rule is broken.

Sagar Pandya
  • I think the way I'm dealing with an empty array is a bit clumsy but hey ho. Using loops was interesting enough though. – Sagar Pandya Oct 14 '17 at 08:47
2
first, *last = ary
first = [first]
while last.any? && first.last <= last.first do
  first << last.shift 
end
[first, last]
  #=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]]

Another way:

f = ary.lazy.slice_when { |i, j| i > j }.first
  #=> [1, 3, 6, 7, 10] 
[f, ary[f.size..-1]]
  #=> [[1, 3, 6, 7, 10], [9, 11, 13, 7, 24]] 
Cary Swoveland