23

Suppose I have a stream of [acme] objects that I want to expose via an API. I have two choices, callbacks and iterators.

API #1: Callbacks

// API #1
// This function takes a user-defined callback 
// and invokes it for each object in the stream.
template<typename CallbackFunctor>
void ProcessAcmeStream(CallbackFunctor &callback);

API #2: Iterators

// API #2
// Provides the iterator class AcmeStreamIterator.
AcmeStreamIterator my_stream_begin = AcmeStreamIterator::begin();
AcmeStreamIterator my_stream_end   = AcmeStreamIterator::end();

API #1 takes the control flow of the program from the user's hand and will not return until the entire stream is consumed (forgetting exceptions for the moment).

API #2 retains the control flow in the user's hand, allowing the user to move forward the stream on his own.

API #1 feels more higher level, allowing the users to jump to the business logic (the callback functor) right away. On the other hand, API #2 feels more flexible, allowing the users lower-level of control.

From a design perspective, which one should I go with? Are there more pros and cons that I have not seen yet? What are some support/maintenance issues down the future?

kirakun
  • 2,770
  • 1
  • 25
  • 41

5 Answers5

9

The iterator approach is more flexible, with the callback version being easily implemented in terms of the first one by means of existing algorithms:

std::for_each( MyStream::begin(), MyStream::end(), callback );
David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
  • I had actually considered mentioning that the callback version could easily be provided on top of the iterator one, but dismissed it, because (as we all know from `std::string`) an API providing multiple ways to do the same thing is just too confusing. I'd recommend deciding for one way. – sbi Mar 01 '11 at 20:33
6

IMO, the second is clearly superior. While I can (sort of) understand your feeling that it's lower level, I think that's incorrect. The first defines its own specific idea of "higher level" -- but it's one that doesn't fit well with the rest of the C++ standard library, and ends up being relatively difficult to use. In particular, it requires that if the user wants something equivalent to a standard algorithm, it has to be re-implemented from the ground up rather than using existing code.

The second fits perfectly with the rest of the library (assuming you implement your iterators correctly) and gives the user an opportunity for dealing with your data at a much higher level via standard algorithms (and/or new, non-standard algorithms that follow the standard patterns).

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • Yes, iterators are indeed more idiomatic. However, is it really _better_? I doubt it. C++ idea of iterators pairs are not the easiest way to tackle iteration, and I believe the only way this route was chosen was the ease with which pointers fit into this. In hindsight I consider this decision wrong, ranges would have been much better. (Compared to the herculean task of inventing the STL in the first place this is a minor critique, though.) – sbi Mar 01 '11 at 20:37
  • @sbi: Windows, for one example, uses callbacks a great deal. I've used both iterators and callbacks a great deal, and see essentially *no* room for question that yes, iterators are a *lot* better. I'd like ranges, but their real improvement over iterators is fairly small -- the improvement going from callbacks to iterators is *much* greater. – Jerry Coffin Mar 01 '11 at 21:36
5

One advantage of callbacks over iterators is that users of your API can't mess up iteration. It's easy to compare the wrong iterators, or use the wrong comparison operation or fail in some other way. The callback API prevents that.

Canceling enumeration is easily done using a callback, BTW: Just let the callback return a bool and continue only as long as it returns true.

sbi
  • 219,715
  • 46
  • 258
  • 445
  • 2
    I've encountered this kind of issue during maintenance. I needed to prioritize the order in which items were processed. Instead of just changing the class so that everyone got the change for free, I had to find all the code that used it, and change that. – Tim Mar 01 '11 at 16:26
  • Also, if you want to add multi-threaded capability, it's easier to do if you've hidden the implementation, than if you've provided iterators into your internals. – Tim Mar 01 '11 at 16:27
  • 3
    IMO, the reasoning here is flawed. If you're going to assume a C++ programmer doesn't know how to use iterators, you probably also need to assume he doesn't know how to write a function or functor to use as a call-back. Conversely, if he knows C++, then he *does* know iterators. – Jerry Coffin Mar 01 '11 at 18:23
  • But I like the point about hiding how the iteration is done via the callback approach, and be able to provide benefits for future enhancements such as the multithreading mentioned by Tim. – kirakun Mar 01 '11 at 18:23
  • @Kirakun: locking *could* be added via iterators without modifying the calling code, although as Tim says it's easier with the Process function. The trick is for iterators to hold the lock. It's often a bad idea to silently add locking, though. If the caller doesn't know that a lock is being taken, he might create a locking inversion by taking another lock in the "wrong order" from the callback. Changing the order of iteration of course is also possible when you write the iterators, but might again be harder than doing it in a Process function if computing the order is complex. – Steve Jessop Mar 01 '11 at 18:53
  • So Tim has fair points, but the difference is that it's probably *easier* to make certain changes in `ProcessAcmeStream` than to make the equivalent changes to an iterator, not that the latter can't be done. – Steve Jessop Mar 01 '11 at 18:56
  • @Jerry: Your argument, if valid, could be applied to `std::for_each()` as well. Yet, most C++ programmers whose opinion I value see merits in `std::for_each()` once we have lambda functions. Note that I'm not saying programmers don't know how to deal with iterators. But the fact is that all of us got it wrong once in a while. Which is where seemingly superfluous algorithms like `std::for_each()` come in. – sbi Mar 01 '11 at 20:30
  • @Tim: To be fair, a change in the order of iteration can also often be achieved by changing either the class providing the iterators or by changing the iterators themselves. However, I suppose there are cases where it is much easier to change an iteration function 8which calls a callback) than that, and I can't think of a setup where it's the other way around. – sbi Mar 01 '11 at 20:40
  • @sbi: How exactly is `std::for_each` related to anything here? I have to admit that I've yet to see a situation where I found it of much use, but I'm not sure what that has to do with anything either. – Jerry Coffin Mar 01 '11 at 21:33
  • @Jerry: "IMO, the reasoning here is flawed. If you're going to assume a C++ programmer doesn't know how to use iterators, [...]. Conversely, ifhe knows C++, then he does know iterators." If that was true, why do we have `std::for_each()`? – sbi Mar 02 '11 at 07:05
  • @sbi: IMO, `std::for_each` belongs in nearly the same category as exception specifications. Some people thought it would be a good idea at the time, but they were mistaken. In fairness, it does have a *few* uses when you use the fact that the functor is returned, so you can use it as an accumulator. Sometimes that can work out a little simpler than `std::accumulate` (but its still too rare to justify its inclusion, IMO). – Jerry Coffin Mar 02 '11 at 07:41
  • @Jerry: In that, I disagree. I have too many hours spent hunting down bugs that turned out as someone (which includes me) not comparing their iterators right. If I see a loop while looking for a bug, I need to look at it closely. If I see `std::for_each()` I know there's no need to go into it and look whether it's right. – sbi Mar 02 '11 at 08:27
  • @sbi: hmmm...I find exactly the opposite -- `std::for_each` is nearly *always* a sign that something's wrong. At least 90% of the time, it's used as a clumsy (and yes, often incorrect) way to implement something that should have been done with `std::copy`. – Jerry Coffin Mar 02 '11 at 14:32
  • @Jerry: Then you would have to look at the functor passed to `std::for_each()`, but not the loop inside it. That's equivalent to the loop's body being wrong (which I hadn't even mentioned), so no difference there between either way of doing things. – sbi Mar 02 '11 at 14:45
  • @sbi: I think you're mostly playing with words now. When it's obvious code needs to be thrown out and rewritten, how much does it matter which part of the code was more worthless? – Jerry Coffin Mar 02 '11 at 15:40
  • @Jerry: No, I'm not. If I see `std::for_each(v.begin(),v.end(),do_it)` there's only `do_it` that can be wrong. Replace that by a `for` loop and just the definition of the loop variable by itself might exceed an 80 char line length limit, making it really hard to see whether the loop is right. (And that's skipping over the question whether repeatedly calling `v.end()` is a performance hit or not. And is this the same for all container types? How about our own container type over there?) All this is of no concern with `std::for_each()`. – sbi Mar 02 '11 at 18:11
  • @sbi: This is getting pointless. None of that matters, because as I've already pointed out, it's pretty much a given that using `std::for_each` is the wrong thing to start with. It may do the wrong thing efficiently and accurately, but it's still doing the wrong thing. – Jerry Coffin Mar 02 '11 at 20:43
4

C++ standard library idiom is to provide iterators. If you provide iterators, then ProcessAcmeStream is a simple wrapper around std::for_each. Maybe worth the trouble of writing, maybe not, but it isn't exactly boosting your caller into a radical new world of usability, it's a new name for an application of a standard library function to your iterator pair.

In C++0x, if you also make the iterator pair available through std::begin and std::end then caller can use range-based for, which takes them into the business logic just as quickly as ProcessAcmeStream does, perhaps quicker.

So I'd say, if it's possible to provide an iterator then provide it - the C++ standard does inversion of control for you if the caller wants to program that way. At least, for a case where the control is as simple as this it does.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • +1. It is the c++ way, also, by providing an iterator api, you're allowing the user to use all of http://www.cplusplus.com/reference/algorithm/ and http://www.cplusplus.com/reference/numeric/ on your acme objects. – matiu Feb 08 '15 at 11:21
2

From a design perspective, I would say that the iterator method is better, simply because it's easier and also more flexible; it's really annoying to make callback functions for without lambdas. (Now that C++0x will have lambda expressions, though, this may become less of a concern, but even still, the iterator method is more generic.)

Another issue with callbacks is cancellation. You can return a boolean value to indicate whether you'd like to cancel enumeration, but I always feel uneasy when the control is out of my hands, since you don't always know what might happen. Iterators don't have this issue.

And of course, there's always the issue that iterators can be random-access whereas callbacks aren't, so they're more extensible as well.

user541686
  • 205,094
  • 128
  • 528
  • 886