2

I'm using Ruby 2.4. How do I check whether at least 80% of the elements in an array match a certain pattern? I want to test each element against the regex:

/\d\d?\s*-\s*\d\d?/
Andrey Deineko
Dave

3 Answers

2

You can use Enumerable#grep in conjunction with simple math:

array.grep(/\d\d?\s*-\s*\d\d?/).size / array.size.to_f >= 0.8

To shorten this further you can use Numeric#quo or Numeric#fdiv:

array.grep(/\d\d?\s*-\s*\d\d?/).size.quo(array.size) >= 0.8
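As a quick sanity check of the approach above, here is a minimal sketch using `fdiv`. The helper name `match_ratio`, the constants, and the sample data are made up for illustration, and returning 0.0 for an empty array is an assumption (without the guard, `0.fdiv(0)` is NaN).

```ruby
# Hypothetical helper: fraction of elements that match `pattern`, as a Float.
def match_ratio(array, pattern)
  return 0.0 if array.empty?   # assumption: treat an empty array as 0% matching
  array.grep(pattern).size.fdiv(array.size)
end

RANGE_PATTERN = /\d\d?\s*-\s*\d\d?/

# Made-up sample data: 4 of the 5 strings contain a "number - number" range.
SAMPLES = ['1-2', '10 - 12', '3 -4', '7- 15', 'n/a'].freeze

puts match_ratio(SAMPLES, RANGE_PATTERN)          # prints 0.8
puts match_ratio(SAMPLES, RANGE_PATTERN) >= 0.8   # prints true
```

`quo` would work here as well; on two Integers it returns a Rational, which still compares correctly against 0.8.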
Andrey Deineko
1

If performance matters, you don't need to check every element to know whether at least 80% of them match a condition.

With Ruby 2.3, this implementation is a bit faster than the count solution and twice as fast as the grep solution:

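def pareto_match(array, proportion: 0.8)
  # Thresholds are rounded to 4 decimals to absorb floating-point noise.
  min_success  = (array.count * proportion).round(4)
  max_failures = (array.count * (1 - proportion)).round(4)
  success = 0
  failure = 0
  array.each do |element|
    if yield(element)
      success += 1
      # Enough matches already: no need to look at the rest.
      return true if success >= min_success
    else
      failure += 1
      # Too many misses: the proportion can no longer be reached.
      return false if failure > max_failures
    end
  end
end

pareto_match(array) { |e| e =~ /\d\d?\s*-\s*\d\d?/ }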
Thomas
  • 1
    Beware of floating-point arithmetic. For 10 elements, `min_success` will be `8.0`, whereas `max_failures` will be `1.9999999999999996`. – Stefan Jan 10 '17 at 07:55
  • Indeed, I've edited the answer. I usually use Bignum for this kind of problem. – Thomas Jan 10 '17 at 10:19
  • Since you are counting array elements, you could just round the numbers to integers. – Stefan Jan 10 '17 at 10:20
  • With `to_i` instead of `round(x)`, this case does not work: `pareto_match(%w(1 2 3 a)){|e| e =~ /\d/}` – Thomas Jan 10 '17 at 10:22
  • `round(x)` doesn't make much sense, because an array can't contain a fractional number of elements. I'd use `min_success = (array.count * proportion).round` and `max_failures = array.count - min_success`. You could also use `ceil` or `floor` instead of `round`. – Stefan Jan 10 '17 at 10:30
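The integer-threshold idea from the last comment could be sketched as follows. This is not the answer's code: the name `pareto_match_int` is hypothetical, `ceil` is chosen here so that "at least 80%" is never rounded down, and the `round(10)` step is an added assumption to absorb float noise before taking the ceiling.

```ruby
# Sketch of an early-exit check with whole-number thresholds.
# `pareto_match_int` is a hypothetical name, not part of the original answer.
def pareto_match_int(array, proportion: 0.8)
  # Smallest whole number of matches that reaches the proportion.
  # round(10) is an assumption: it removes float noise such as
  # 10 * 0.7 evaluating to 7.000000000000001 before ceil is applied.
  min_success  = (array.size * proportion).round(10).ceil
  max_failures = array.size - min_success
  success = 0
  failure = 0
  array.each do |element|
    if yield(element)
      success += 1
      return true if success >= min_success
    else
      failure += 1
      return false if failure > max_failures
    end
  end
  # Only reached when min_success is 0 (e.g. an empty array);
  # treating that as a pass is an assumption.
  true
end

puts pareto_match_int(%w(1 2 3 a)) { |e| e =~ /\d/ }     # prints false (3 of 4 is 75%)
puts pareto_match_int(%w(1 2 3 4 a)) { |e| e =~ /\d/ }   # prints true  (4 of 5 is 80%)
```

Because both thresholds are integers that sum to the array size, the loop always returns before it finishes whenever `min_success` is positive.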
0

I would write:

array.count { |item| item =~ /\d\d?\s*-\s*\d\d?/ }.fdiv(array.size) >= 0.8
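One caveat with any count-based ratio in Ruby: `Integer#/` truncates, so the count has to be converted with `fdiv` or `to_f` before it is compared against 0.8. A small sketch of the difference, where the constants, method names, and sample data are all made up for illustration:

```ruby
# Made-up data: 4 of the 5 strings match the pattern.
ITEMS    = ['1-2', '3 - 4', '10-11', '5 -6', 'x'].freeze
RANGE_RE = /\d\d?\s*-\s*\d\d?/

# Integer / Integer truncates toward zero, so this is 0 unless every item matches.
def truncated_ratio(items, pattern)
  items.count { |item| item =~ pattern } / items.size
end

# fdiv performs floating-point division and gives the real proportion.
def float_ratio(items, pattern)
  items.count { |item| item =~ pattern }.fdiv(items.size)
end

puts truncated_ratio(ITEMS, RANGE_RE)   # prints 0
puts float_ratio(ITEMS, RANGE_RE)       # prints 0.8
```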
hirolau