How to efficiently and generally implement multiple dispatch?

Question

Multiple dispatch is a useful generalization of the traditional single dispatch found in any OOP language. However, it's not nearly as clear how it can be implemented simply and efficiently. Single dispatch is easy: any website or textbook can tell you that objects have a vtable with a list of methods that you index into at runtime. On the other hand, there are a number of resources that explain how to use multiple dispatch and why it's useful, but none give a satisfactory explanation of how it works. My question is, then, how do we do it?
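For reference, here is a minimal sketch of what I mean by vtable-based single dispatch. It is written in Python purely for brevity, and the names (`SPEAK`, `Obj`, the `*_VTABLE` lists) are made up for illustration; a real compiler would lay this out in memory rather than in lists.

```python
# Hypothetical illustration: a vtable is a list of function pointers, and each
# method name is compiled down to a fixed slot number in that list.
SPEAK = 0  # slot index, assumed to be fixed at compile time


def animal_speak(self):
    return "..."


def cat_speak(self):
    return "meow"


ANIMAL_VTABLE = [animal_speak]
CAT_VTABLE = [cat_speak]  # same layout as Animal, slot 0 overridden


class Obj:
    def __init__(self, vtable):
        self.vtable = vtable  # every object carries a pointer to its vtable


def speak(obj):
    # Single dispatch: one indexed load plus one indirect call, so O(1).
    return obj.vtable[SPEAK](obj)
```

The caller only needs to know the slot number, never the concrete type, which is why new subclasses work without recompiling the caller.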

Constraints

I think a satisfactory solution to the problem has a few minimally necessary traits:

It can handle external types. So, I could have an Animal class in a library, and in separate code, I could create a Cat subclass that could be used in any functions in that library without the library needing modification or recompilation.
It can handle external method overrides. If there is an attack(Animal predator, Animal prey) function in the library, then I can make an attack(Cat predator, Mouse prey) in my own code that will be called if anyone, me or the library or anyone else, calls attack(new Cat(), new Mouse()).
It can handle subtypes properly in dispatch. If I make a Lion class but don't override attack(Cat predator, Mouse prey), then attack(new Lion(), new Mouse()) will call attack(Cat predator, Mouse prey).

And ideally, an efficient implementation would have the following, although these may not always be possible:

Multiple dispatch is not O(n) in the number of types or the number of method overrides, which would make dispatch prohibitively slow as they increase.
If a method is dispatched only on a single parameter, method dispatch is O(1) like vtable-based single dispatch.

If a multiple dispatch implementation has these traits, then I would say that it has at least the same amount of power as vtable-based single dispatch and could be called a satisfactory implementation.

Invalid solutions

There are a few solutions that I have seen thrown around that don't satisfy the above:

The visitor pattern. This fails the first necessary trait, namely that it can handle external types. I have to have a separate method for every type I want to dispatch on. If I want more types, I need more methods, forcing recompilation of the library.
An N-dimensional array with types on each dimension. So, if Animal = 0, Cat = 1, Lion = 2, and Mouse = 3, then attack(new Lion(), new Mouse()) will look up attack_vtable[2][3], which contains a pointer to attack(Cat predator, Mouse prey). Again, this presupposes a fixed number of types, since it requires assigning a unique, consecutive number to each type and having a fixed-size array.
Do a linear search over the available methods and find the best match. This is not efficient, and is difficult to reconcile with the second trait, since the library doesn't know about the attack(Cat predator, Mouse prey) in our code.
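
To make the N-dimensional array approach above concrete, here is a rough two-argument sketch in Python (chosen only for brevity). The type numbers come from the example above; the function names and the `NUM_TYPES` constant are invented for illustration, and types are passed as plain integers to keep it short.

```python
# Type numbers from the example above, assumed fixed when the library is built.
ANIMAL, CAT, LION, MOUSE = 0, 1, 2, 3
NUM_TYPES = 4


def attack_animal_animal(predator, prey):
    return "attack(Animal, Animal)"


def attack_cat_mouse(predator, prey):
    return "attack(Cat, Mouse)"


# 4x4 table, every cell defaulting to the most general method.
attack_vtable = [[attack_animal_animal] * NUM_TYPES for _ in range(NUM_TYPES)]
# Cat, and Lion as a subtype of Cat, both resolve to the override.
attack_vtable[CAT][MOUSE] = attack_cat_mouse
attack_vtable[LION][MOUSE] = attack_cat_mouse


def attack(predator_id, prey_id):
    # Two indexed loads and a call: O(1), but only for the four known types.
    return attack_vtable[predator_id][prey_id](predator_id, prey_id)
```

Dispatch itself is fast, but a fifth type defined outside the library has no row or column here, which is exactly the limitation I described.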

I've also found some papers about multiple dispatch and tried to work through them, but they are, at best, very dense, and at worst, incomprehensible to anyone without a background in theoretical computer science.

Is there any way to implement multiple dispatch that can satisfy these constraints, or at least most of them?

"*I can define an implementation in my own code that will be called if anyone calls the declared method*" - how do you want to prevent multiple different implementations for the same type combination (defined by multiple people in different locations)? This is a tough problem to avoid. — Bergi, Jul 02 '23 at 20:46
If you declare a method `attack(Cat predator, Rodent prey)`, which is overridden for `attack(Lion predator, Rodent prey)` and also for `attack(Cat predator, Mouse prey)`, which implementation should be called if someone calls `attack(new Lion(), new Mouse())`? — Bergi, Jul 02 '23 at 20:49
Those are problems, but I intentionally left them out to keep the question simpler. For simplicity, assume that those will never happen because they will always be caught at compile time/link time/runtime/whatever, or, if you prefer, favor the left-hand arguments over the right-hand ones, so `attack(Lion predator, Rodent prey)` would be called, and duplicate methods are ignored. — v-rob, Jul 02 '23 at 21:06

score 0 · Answer 1 · answered Jul 03 '23 at 01:21

I don't think you will get around the multidimensional array (or variations of it, like nested arrays, possibly even referenced by the vtable).

However, you presuppose that it needs to be a fixed-size array. This is not true: the array does not need to be generated at compile time of each separate module. You can create it at link time (which of course requires support in the linker) or even dynamically initialise it at load time. The array indices to use (the numbers assigned to the types) can be stored as an entry in the vtable of each type, so that the compiled code can use them without knowing their values.
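
A minimal sketch of the load-time variant, in Python for brevity and not a definitive implementation. Everything named here (`register_type`, `MultiMethod`, `link`, `type_index`) is hypothetical. A class attribute stands in for the vtable entry that holds the type's index, and ties between equally applicable methods are broken by favouring the left-hand argument, as suggested in the comments on the question.

```python
TYPES = []  # every type known to the program, in registration order


def register_type(cls):
    # The index lives on the class itself; this stands in for an extra slot
    # in the type's vtable, so callers never hard-code the number.
    cls.type_index = len(TYPES)
    TYPES.append(cls)
    return cls


class MultiMethod:
    def __init__(self):
        self.impls = {}    # (type1, type2) -> implementation, as registered
        self.table = None  # dense 2-D table, filled in by link()

    def register(self, t1, t2):
        def decorator(fn):
            self.impls[(t1, t2)] = fn
            return fn
        return decorator

    def resolve(self, a, b):
        # Most specific applicable implementation; the left argument wins ties.
        best_key, best_fn = None, None
        for (t1, t2), fn in self.impls.items():
            if issubclass(a, t1) and issubclass(b, t2):
                key = (a.__mro__.index(t1), b.__mro__.index(t2))
                if best_key is None or key < best_key:
                    best_key, best_fn = key, fn
        return best_fn

    def link(self):
        # "Load time": run once, after every module has registered itself.
        self.table = [[self.resolve(a, b) for b in TYPES] for a in TYPES]

    def __call__(self, x, y):
        # Two index loads and one indirect call, independent of table size.
        return self.table[type(x).type_index][type(y).type_index](x, y)


# --- "library" code ---
@register_type
class Animal: pass

@register_type
class Mouse(Animal): pass

attack = MultiMethod()

@attack.register(Animal, Animal)
def attack_animal_animal(predator, prey):
    return "attack(Animal, Animal)"


# --- "user" code, added without touching the library ---
@register_type
class Cat(Animal): pass

@register_type
class Lion(Cat): pass

@attack.register(Cat, Mouse)
def attack_cat_mouse(predator, prey):
    return "attack(Cat, Mouse)"


attack.link()  # build the table once all types and overrides are known
```

In this sketch a call is O(1) regardless of how many types or overrides exist; the cost moves to `link()`, which does the subtype resolution up front and needs memory quadratic in the number of types for a two-argument method.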
