11

At least in Ruby 1.9.3, Enumerable objects do not have a length attribute. Why is this?

kdbanman
  • 1
    There is no "Enumerable class". By "any Enumerable class", do you mean any class for which the `Enumerable` module has been mixed in? Such classes (and others) have a `length` (aka `size`) method. – Cary Swoveland Mar 03 '15 at 18:25
  • 4
    There's [`Enumerable#count`](http://ruby-doc.org//core-1.9.3/Enumerable.html#method-i-count). – cremno Mar 03 '15 at 18:32
  • @ChrisHeald's answer reminded me that, where I said "Such classes have a `length` method", I failed to mention that I was referring to built-in classes, but even of that I am now unsure. Does anyone know of a built-in class that mixes in `Enumerable` that does not have a `length` method? – Cary Swoveland Mar 03 '15 at 18:59
  • @CarySwoveland Range – Max Mar 03 '15 at 19:02
  • @Max, ...or `size`, e.g., [Range#size](http://ruby-doc.org//core-2.2.0/Range.html#method-i-size). :-) – Cary Swoveland Mar 03 '15 at 19:05
  • @CarySwoveland `IO` and `Dir` – Max Mar 03 '15 at 19:13
  • @cremno, thank you. That's the answer I was looking for. How silly of me. – kdbanman Mar 03 '15 at 19:30
  • @CarySwoveland thanks, edited to reflect what I meant. My brain got a little tangled between inheritance and composition while I was writing that. – kdbanman Mar 03 '15 at 19:33
  • I'm not sure if "Enumerable things" is much of an improvement over "Enumerable class". :-) – Cary Swoveland Mar 03 '15 at 19:37
  • 1
    Thanks, @Max. I think there are quite a few (e.g., `Integer`, `Numeric`, `Proc`, `CSV`, `Matrix`); perhaps a class defining `length` is an exception rather than a rule. – Cary Swoveland Mar 03 '15 at 19:42
  • @CarySwoveland. I think it is. Say class `List` includes `Enumerable`, and `list` is an instance of `List`. Then this holds: `list.is_a?(Enumerable) == true`. Hence, I think it's semantically clean to call `list` an enumerable thing. – kdbanman Mar 03 '15 at 20:30

3 Answers

18

Enumerable has the count method, which is usually going to be the intuitive "length" of the enumeration.

But why not call it "length"? Well, because it operates very differently. In Ruby's built-in data structures like Array and Hash, length simply retrieves the pre-computed size of the data structure. It should always return instantly.

For Enumerable#count, however, there's no way for it to know what sort of structure it's operating on and thus no quick, clever way to get the size of the enumeration (this is because Enumerable is a module, and can be included in any class). The only way for it to get the size of the enumeration is to actually enumerate through it and count as it goes. For infinite enumerations, count will (appropriately) loop forever and never return.
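The difference can be seen with a minimal sketch. The class name `TrackedItems` and its `yields` counter are hypothetical, introduced only to make the iteration visible; the point is that `count` has to drive `each` to completion, and that including Enumerable does not add a `length` method.

```ruby
# Hypothetical class for illustration: it yields three items and records
# how many times it has yielded, so we can watch what #count does.
class TrackedItems
  include Enumerable

  attr_reader :yields

  def initialize
    @yields = 0
  end

  # The only method Enumerable needs from the including class.
  def each
    %w[a b c].each do |item|
      @yields += 1
      yield item
    end
  end
end

items = TrackedItems.new
total = items.count                      # walks #each from start to finish
has_length = items.respond_to?(:length)  # Enumerable does not add #length

puts total        # 3
puts items.yields # 3
puts has_length   # false
```

Every call to `count` repeats the full walk, which is why it costs linear time on a structure like this.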

Max
  • Thanks. I feel pretty silly for overlooking that. I'm glad I asked, though. I wouldn't have realized there was such a difference in behaviour. – kdbanman Mar 03 '15 at 19:32
  • `Enumerator` on the other hand has a `size` method and it works as expected for infinite enumerations: `loop.size #=> Infinity` – Stefan Mar 03 '15 at 19:55
  • @Stefan, ...and for finite enumerators it needs to be done lazily. For example, `enum = [1,2,3].to_enum; enum.size #=> nil`. – Cary Swoveland Mar 03 '15 at 20:02
  • @CarySwoveland you're right, the enumerator has to return its size. But it seems to work for the built-in methods: `[1,2,3].each.size #=> 3` – Stefan Mar 03 '15 at 20:35
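The cases raised in these comments can be collected into one short sketch. It assumes Ruby 2.0 or later, where `Enumerator#size` exists; the local variable names are only for illustration.

```ruby
# Enumerator#size is available from Ruby 2.0 onward.
infinite   = loop.size                 # Kernel#loop without a block returns an Enumerator
unknown    = [1, 2, 3].to_enum.size    # plain to_enum carries no size information
known      = [1, 2, 3].each.size       # Array#each supplies its own size
from_block = [1, 2, 3].to_enum(:each) { 3 }.size  # block computes the size on demand

p infinite   # Infinity
p unknown    # nil
p known      # 3
p from_block # 3
```

A `nil` result means "unknown", not zero; the enumerator simply was not told how to work out its size.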
3

Enumerables are not guaranteed to have lengths. The only requirement for a class that mixes in Enumerable is that it defines #each, which yields each item in the series in turn. Methods such as #sort, #min, and #max additionally require the yielded items to respond to #<=> so that they can be compared. Methods like #sort will enumerate the entire collection over the course of sorting, but may not know the bounds of the set ahead of time. Consider:

class RandomSizeEnumerable
  include Enumerable
  def each
    value = rand 1000
    while value != 500
      yield value
      value = rand 1000
    end
  end

  # No <=> is defined here. #sort, #min, and #max compare the yielded
  # values using the values' own <=> (Integer#<=> in this example), so
  # the including class does not need to provide one.
end

Enumeration continues until rand produces the value 500, at which point #each returns. Calling #sort collects every yielded value into an array and sorts it. A #length method is meaningless in this context, however, because the length is unknowable until the iterator has been exhausted!

We can call #length on the result of things like #sort, since they return an array, though:

p RandomSizeEnumerable.new.sort.length # 321
p RandomSizeEnumerable.new.sort.length # 227
p RandomSizeEnumerable.new.sort.length # 299

Conventionally, #length is used when the length is known and can be returned in constant time, whereas #count (and sometimes #size) tends to be used when the length may not be known ahead of time and needs to be computed by iterating the result set (thus taking linear time). If you need the size of the result set provided by an Enumerable, try using #count.
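When a class does know its size up front, it can follow that convention by defining #length itself. The sketch below is an assumption-laden illustration, not part of the original answer: `Playlist` is a hypothetical wrapper around an Array, so its #length can delegate to Array#length, while #count still comes from Enumerable.

```ruby
# Hypothetical wrapper: it stores its items in an Array, so the size is
# known up front and #length can answer without iterating.
class Playlist
  include Enumerable

  def initialize(*tracks)
    @tracks = tracks
  end

  # Enumerable builds everything else on top of this method.
  def each(&block)
    @tracks.each(&block)
    self
  end

  # Constant time: delegates to Array#length.
  def length
    @tracks.length
  end
end

playlist = Playlist.new("intro", "verse", "outro")
by_length = playlist.length                          # answered directly
by_count  = playlist.count                           # found by iterating
with_o    = playlist.count { |t| t.include?("o") }   # counting needs a full pass

p by_length # 3
p by_count  # 3
p with_o    # 2
```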

Chris Heald
  • 1
    But there is [`Enumerable#count`](http://ruby-doc.org/core-2.2.0/Enumerable.html#method-i-count) which is almost the same as `length`. – mu is too short Mar 03 '15 at 18:48
  • Generally, `#count` is O(n), whereas `#length` is O(1); the documentation and provided sample make this clear, since you would have to iterate the enumerable to discover how many invocations it takes to terminate. – Chris Heald Mar 03 '15 at 18:59
  • But `count` is still preferred over `to_a.length` – Max Mar 03 '15 at 19:00
  • Fair point. I think the more complete answer would be "Enumerable doesn't support #length because it can't provide that answer in constant time". – Chris Heald Mar 03 '15 at 19:01
  • Good answer. A detail: defining `<=>` is only a requirement if `Enumerable` methods that need it are to be used. There are many `Enumerable` methods (such as `count`) that do not use `<=>`. – Cary Swoveland Mar 03 '15 at 19:19
  • Yup - I just included it because it is an expected part of the Enumerable "interface", even though it's not used in this case. – Chris Heald Mar 03 '15 at 19:26
  • ...and other `Enumerable` methods might be used where `<=>` applies to objects of another class (e.g, @Renato Zannon's example of `map`). Consider a clarification. – Cary Swoveland Mar 03 '15 at 19:28
0

Enumerable isn't really a class, it's a module - a collection of cross-cutting functionality that is used by multiple classes.

For example, Array, Set and Hash all include it - you can call any of the Enumerable methods on them.

Enumerable is notable in that it requires very little of the "host" class. All you need to do is define the each method and include Enumerable, and you get all those methods for free! Example:

class CountUntil
  def initialize(number)
    @number = number
  end

  include Enumerable

  def each
    current = 0
    while current < @number
      yield current
      current += 1
    end
  end
end

# Usage:

CountUntil.new(10).map { |n| n * 5 }
# => [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]

As you can see, I never defined CountUntil#map, but I got that for free from including Enumerable.

To address your question about length: not all classes that include Enumerable define length, even though most do. For example, Enumerator can be used to create infinite streams.
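As a rough sketch of such a stream (the `evens` name is only illustrative): the enumerator below never finishes on its own, so there is no meaningful length to report, yet methods that stop early still work.

```ruby
# An endless stream of even numbers. Enumerator.new is given no size
# argument here, so #size reports nil (unknown).
evens = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n
    n += 2
  end
end

first_five = evens.first(5)  # stops after five items
size = evens.size            # no size was supplied

p first_five # [0, 2, 4, 6, 8]
p size       # nil
# evens.count would never return, because it has to reach the end.
```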

Renato Zannon
  • 1
    You might want to address the existence of [`Enumerable#count`](http://ruby-doc.org/core-2.2.0/Enumerable.html#method-i-count). – mu is too short Mar 03 '15 at 18:46