4

I'm searching for an algorithm (or an argument for the existence of such an algorithm) in a functional style which is faster than an imperative one.

I like functional code because it's expressive and mostly easier to read than its imperative counterparts. But I also know that this expressiveness can cost runtime overhead. Thanks to techniques like tail recursion that's not always the case - but functional implementations are often slower.

While programming I don't think about the runtime cost of functional code, because nowadays PCs are very fast and development time is more expensive than runtime. Furthermore, for me readability is more important than performance. Nevertheless my programs are fast enough, so I rarely need to solve a problem in an imperative way.

There are some algorithms which in practice should be implemented in an imperative style (like sorting algorithms), because otherwise they are in most cases too slow or require lots of memory. In contrast, thanks to techniques like pattern matching, a whole program such as a parser written in a functional language may be much faster than one written in an imperative language, because the compiler has more opportunities to optimize the code.
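To sketch what I mean (my own toy example, in Scala): a match over plain character literals is a pure description of the cases, so the compiler is free to turn it into a jump table just like a hand-written switch - the `@switch` annotation even lets you verify that it did.

```scala
import scala.annotation.switch

// A toy token classifier from a parser. Because all cases are plain
// literals, scalac compiles this match to a tableswitch/lookupswitch.
def tokenKind(c: Char): Int = (c: @switch) match {
  case ' ' | '\t' | '\n' => 0 // whitespace
  case '('               => 1
  case ')'               => 2
  case _                 => 3 // anything else
}
```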

But are there any algorithms which are faster when written in a functional style, or is there a way to construct an argument that such an algorithm exists?

Willem
  • 917
  • 7
  • 19
kiritsuku
  • 52,967
  • 18
  • 114
  • 136
  • 3
    What do you mean by "functional" or "imperative" algorithm? Take any imperative one, perform an SSA-transform on it, translate basic blocks into a set of mutually recursive functions, and you'll get a purely functional version of the very same algorithm with an equal performance profile. Reverse translation is even more trivial. – SK-logic Jan 26 '11 at 11:51
  • And here I thought that was the reason for all the functional programming hype. If there is no affirmative answer to this question, that confirms my suspicion that it's just a fad :-) – phkahler Jan 26 '11 at 13:35
  • 9
    @phkahler - Functional algorithms are easier to reason about, just like C is easier to reason about than assembly language. Thus, even though compiled C is never faster than assembly could be in principle, C is not "just a fad". – Rex Kerr Jan 26 '11 at 14:45
  • 9
    You're setting up a false dichotomy. Algorithms are not divided into "functional" and "imperative". – Apocalisp Jan 26 '11 at 17:29
  • 1
    "Development time" is **NOT** "more expensive than runtime". If only because code will spend, for any decent program, orders of magnitude more time executing than being written/maintained. Esp. if you factor in a large number of users (any number of whose time might be more valuable than yours). – Lawrence Dol Jan 27 '11 at 09:44
  • More time spent does not imply more dollars spent. If the decision is between having to pay five developers for a month or having customers wait 0.5s longer for a request, what would you choose? – Raphael Jan 27 '11 at 10:26
  • @Raphael: I should have said "is NOT automatically". But, for argument's sake, assume your scenario equals 900 man-hours (40*5*4.5): it would only take 1000 customers 6,480 executions to break even in terms of *time*. The question is more who is willing to pay and how much. And how much are customers paying for new hardware to run sloppy software? Now, why is it that I require no more of my word processor today than I did 10 years ago, and yet it now takes *longer* to load on my PC which is 1000x faster in every way? – Lawrence Dol Jan 27 '11 at 16:51
  • Still, users don't charge you money for minor imperfections. Most won't even notice the difference between functionally and imperatively coded programs if both are programmed properly. Oh, wait, maybe: you won't get segfaults. – Raphael Jan 27 '11 at 16:56
  • @Raphael: True, users don't *charge* for minor imperfections, but we sure as hell pay for them hand over fist. I just spent 45 minutes waiting for 1500 files to delete off a network server over a VPN because someone at MS could not design an efficient file-system protocol to save his life. But sure, it was *fast enough* when he tested it... on his fibre-connected SAN or 10 MB (at the time) LAN. What I wouldn't give for MS to have invested 5 extra man-months into their network protocol design to squeeze out every last byte. – Lawrence Dol Jan 27 '11 at 17:24
  • @Raphael: Oh, and the 45 minutes? Well, that was just the estimate; an hour later we're 1/2 way through and now the remaining time est. is 2 hours (I expect it will actually be done in 1). – Lawrence Dol Jan 27 '11 at 18:32
  • Your time wasted is a shame, but besides the point. Commercial software has bugs *by commercial decision* because it is too expensive to fix them (and that includes suboptimal implementations). A profit-oriented company will always only fix those bugs that decrease profit. – Raphael Jan 27 '11 at 18:44

6 Answers

14

A simple line of reasoning. I don't vouch for the terminology, but it seems to make sense.

  • A functional program, to be executed, will need to be transformed into some set of machine instructions.
  • All machines (I've heard of) are imperative.
  • Thus, for every functional program, there's an imperative program (roughly speaking, in assembly language) equivalent to it.

So, you'll probably have to be satisfied with 'expressiveness', until we get 'functional computers'.
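As a concrete sketch of that equivalence (my own example, in Scala, which happens to support both styles): the same GCD algorithm written functionally and imperatively. With `@tailrec`, the compiler in fact turns the recursive version into essentially the loop shown below.

```scala
import scala.annotation.tailrec

object Gcd {
  // Functional style: pure, no mutation. @tailrec makes the compiler
  // verify that this recursion can be compiled into a loop.
  @tailrec
  def gcdFunctional(a: Int, b: Int): Int =
    if (b == 0) a else gcdFunctional(b, a % b)

  // Imperative style: roughly the shape the functional version is
  // compiled down to.
  def gcdImperative(a0: Int, b0: Int): Int = {
    var a = a0
    var b = b0
    while (b != 0) { val t = a % b; a = b; b = t }
    a
  }

  // Gcd.gcdFunctional(12, 18) == Gcd.gcdImperative(12, 18) == 6
}
```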

Nikita Rybak
  • 67,365
  • 22
  • 157
  • 181
  • 1
    @Nikita: Don't withdraw the [Lisp machine](http://en.wikipedia.org/wiki/Lisp_machine) from an account! ;-) – YasirA Jan 26 '11 at 11:21
  • @Yasir Fantastic :) It seems from the description, though, that the Lisp Machine is as imperative as any other, just 'optimized' for Lisp. But I do agree that there could be a 'functional computer' someday (or even now). Thanks for the link! – Nikita Rybak Jan 26 '11 at 11:24
  • @Yasir: physically, even a Lisp machine follows the imperative model. It has a state (location of electrons etc), which changes over time. It seems likely therefore that at some level it can be described in terms that would yield an equivalent imperative program on the same hardware: admittedly you might have to heavily hack the machine to supply that program for execution. If you can perform your computation on a Platonic ideal of a functional computer, though - that is, just grab the computer containing the result directly from an infinite set of all computer states - then you're in business ;-) – Steve Jessop Jan 26 '11 at 13:50
  • Is it strictly accurate to describe a multi-core CPU, or even an HPC cluster, as imperative? The paradigm seems to be based on a strict order of evaluation, which just doesn't happen in these cases... For that matter, don't out-of-order optimizations and pipelining break it to some extent? – Kevin Wright Jan 26 '11 at 14:01
  • @Kevin Pipelining is done by the CPU regardless of whether the currently executed instructions are compiled from a Lisp program, a C program or pure assembler. – Nikita Rybak Jan 26 '11 at 14:45
  • After reading this answer I'm not sure whether it would be better to narrow the question to "are there functional algorithms which are faster after compilation than imperative ones because of optimizing compilers"? A compiler has a lot of opportunities to optimize the source code, and it does not introduce errors into the code (given that it has no bugs itself). – kiritsuku Jan 26 '11 at 17:19
  • @Antoras: compiler quality of implementation depends less on the language and more on the time spent on it, and there imperative languages are leading the way... though gcc (for example) recently switched to an SSA intermediate representation because it's easier to reason about :) – Matthieu M. Jan 26 '11 at 18:52
  • @nikita - Quite! So in both cases this means that the opcodes aren't seen by the core of the CPU in strictly imperative order, meaning that it's dodging the truth a bit to claim that all machines are imperative. The very fact that we need memory barriers and trickery with volatile variables is a testament to this. – Kevin Wright Jan 26 '11 at 18:56
  • @Kevin The question of whether modern computers are 'imperative' or 'functional' inside is irrelevant here (and more philosophical than scientific anyway): they work on transistors and are therefore analog devices in essence. The important thing is, all the instructions they accept can be called 'imperative'. That is enough for the argument in my answer. – Nikita Rybak Jan 26 '11 at 19:18
  • @Kevin Besides, the whole imperative-multithreaded (or multicore) contraposition seems made up. – Nikita Rybak Jan 26 '11 at 19:22
  • One might categorize CPU instructions as "functional" or "imperative" depending on whether they modify a register that was involved in the computation, or perform a computation and store the result in an unrelated register. – Knut Arne Vedaa Jan 26 '11 at 20:27
  • 1
    @Knut Or one might categorize them as "red" and "blue", that really makes no difference. Frankly, as **Apocalisp** notes, this whole dichotomy is made up. – Nikita Rybak Jan 26 '11 at 20:37
  • It makes a difference if you want to create a "functional" CPU. – Knut Arne Vedaa Jan 26 '11 at 21:01
  • @Knut: for what you say to become a definition of a "functional" CPU, it would have to store the result not just in an unrelated register, but in an unrelated register *which has never been used before* (or which has been used and garbage-collected). Otherwise it's changing the value in the register, which is the imperative model. Machines have a "small" number of registers (i.e. not billions), so really it would have to operate memory->memory, not register->register. This level of abstraction for functional programs is better implemented in software than hardware, I suspect. – Steve Jessop Jan 28 '11 at 16:23
  • @Knut: that said, with some modification you might be able to re-state your definition for a stack-based CPU, since it does put results in memory that's either new or has been released since its last use (when the stack popped). I do not know how you would implement memoization in hardware under such a definition, and any tail-recursion optimization would be stretching the limits of the definition of the hardware as somehow "not imperative". It seems easier to permit the hardware to "be imperative", since it's all a definition game anyway with no practical benefit. – Steve Jessop Jan 28 '11 at 16:28
5

The short answer:

Anything that can be easily made parallel because it's free of side-effects will be quicker on a multi-core processor.

QuickSort, for example, scales up quite nicely when used with immutable collections: http://en.wikipedia.org/wiki/Quicksort#Parallelization

All else being equal, if you have two algorithms that can reasonably be described as equivalent, except that one uses pure functions on immutable data, while the second relies on in-place mutations, then the first algorithm will scale up to multiple cores with ease.
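As a rough sketch of what this looks like (my own example, using `scala.concurrent.Future`, not code from any of the linked material): a pure quicksort over immutable lists, where the two recursive calls share no mutable state and can therefore safely run on different cores.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ParQuicksort {
  // Pure quicksort: no in-place swaps, so the two halves can be sorted
  // concurrently without locks or races.
  def qsort(xs: List[Int]): Future[List[Int]] = xs match {
    case Nil => Future.successful(Nil)
    case pivot :: rest =>
      val (smaller, larger) = rest.partition(_ < pivot)
      val left  = Future(smaller).flatMap(qsort) // each half is sorted as a
      val right = Future(larger).flatMap(qsort)  // separate task on the pool
      for (l <- left; r <- right) yield l ::: pivot :: r
  }

  def main(args: Array[String]): Unit = {
    // prints List(1, 1, 3, 4, 5)
    println(Await.result(qsort(List(3, 1, 4, 1, 5)), Duration.Inf))
  }
}
```

In practice you'd fall back to a sequential sort below some size threshold, but the point stands: it's the purity that makes the parallel split trivially safe.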

It may even be the case that your programming language can perform this optimization for you, as with the scalaCL plugin that will compile code to run on your GPU. (I'm now wondering whether SIMD instructions make this a "functional" processor.)

So given parallel hardware, the first algorithm will perform better, and the more cores you have, the bigger the difference will be.

Kevin Wright
  • 49,540
  • 9
  • 105
  • 155
  • 2
    While true, that doesn't really answer the question. Quicksort can be done in parallel if implemented correctly. In-place quicksort is not even functional (swapping modifies state) but can be run in parallel. – phkahler Jan 26 '11 at 14:16
  • @phkahler I thought that the very nature of imperative-ness is sequential execution of commands. Doesn't that mean anything parallel can't be imperative? – AnnanFay Apr 17 '12 at 00:01
  • @Annan: Parallel is something different from the imperative/functional discussion. Each thread is still imperative. – phkahler Apr 17 '12 at 14:32
3

FWIW there are Purely functional data structures, which benefit from functional programming.

There's also a nice book on Purely Functional Data Structures by Chris Okasaki, which presents data structures from the point of view of functional languages.
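Here's a taste of the style the book presents - a minimal sketch (my own simplification, not code from the book) of a batched queue in Scala: two immutable lists give amortized O(1) enqueue and dequeue without any mutation.

```scala
// Batched queue: `front` is read from, `back` is written to; the O(n)
// reverse happens rarely enough that operations are amortized O(1).
final case class BatchedQueue[A](front: List[A], back: List[A]) {
  def enqueue(a: A): BatchedQueue[A] = BatchedQueue(front, a :: back)

  def dequeue: Option[(A, BatchedQueue[A])] = front match {
    case x :: rest => Some((x, BatchedQueue(rest, back)))
    case Nil => back.reverse match {
      case Nil       => None // queue is empty
      case x :: rest => Some((x, BatchedQueue(rest, Nil)))
    }
  }
}

// BatchedQueue[Int](Nil, Nil).enqueue(1).enqueue(2).dequeue
//   == Some((1, BatchedQueue(List(2), Nil)))
```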

Another interesting article is Announcing Intel Concurrent Collections for Haskell 0.1, about parallel programming, in which they note:

Well, it happens that the CnC notion of a step is a pure function. A step does nothing but read its inputs and produce tags and items as output. This design was chosen to bring CnC to that elusive but wonderful place called deterministic parallelism. The decision had nothing to do with language preferences. (And indeed, the primary CnC implementations are for C++ and Java.)

Yet what a great match Haskell and CnC would make! Haskell is the only major language where we can (1) enforce that steps be pure, and (2) directly recognize (and leverage!) the fact that both steps and graph executions are pure.

Add to that the fact that Haskell is wonderfully extensible and thus the CnC "library" can feel almost like a domain-specific language.

The article doesn't say anything about performance yet – they promise to discuss some of the implementation details and performance in future posts – but Haskell, with its "purity", fits nicely into parallel programming.

Cœur
  • 37,241
  • 25
  • 195
  • 267
YasirA
  • 9,531
  • 2
  • 40
  • 61
1

One could argue that all programs boil down to machine code.

So, if I disassemble the machine code (of an imperative program) and tweak the assembler, I could perhaps end up with a faster program. Or I could come up with an "assembler algorithm" that exploits some specific CPU feature, and therefore really is faster than the imperative-language version.

Does this situation lead to the conclusion that we should use assembler everywhere? No: we decided to use imperative languages because they are less cumbersome. We write pieces in assembler only when we really need to.

Ideally we should also use FP algorithms because they are less cumbersome to code, and use imperative code when we really need to.

Willem
  • 917
  • 7
  • 19
  • No need to stop your reasoning at machine code. Everything changes once you look at CPU optimizations, and then again once you get to quantum scales (by which point it's seriously weird) – Kevin Wright Jan 26 '11 at 18:59
0

Well, I guess you meant to ask whether there is an implementation of an algorithm in a functional programming language that is faster than another implementation of the same algorithm in an imperative language. By "faster" I mean that it performs better in terms of execution time or memory footprint on some inputs, according to some measurement that we deem trustworthy.

I do not exclude this possibility. :)

Artyom Shalkhakov
  • 1,101
  • 7
  • 14
0

To elaborate on Yasir Arsanukaev's answer, purely functional data structures can be faster than mutable data structures in some situations because they share pieces of their structure. Thus in places where you might have to copy a whole array or list in an imperative language, you can get away with a fraction of the copying, because you can change (and copy) only a small part of the data structure. Lists in functional languages are like this -- multiple lists can share the same tail, since nothing can be modified. (This can be done in imperative languages, but usually isn't, because within the imperative paradigm people aren't used to talking about immutable data.)
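For instance, here's a quick Scala sketch (my own illustration) of that tail sharing: prepending to an immutable list allocates a single new cell and reuses everything else.

```scala
val shared = List(2, 3, 4)
val a = 1 :: shared // List(1, 2, 3, 4) -- allocates one new cell
val b = 0 :: shared // List(0, 2, 3, 4) -- allocates one new cell

// Both lists reference the very same cells of `shared`; nothing is copied.
println(a.tail eq b.tail) // true
```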

Lazy evaluation in functional languages (particularly Haskell, which is lazy by default) can also be very advantageous, because it eliminates code execution when the results won't actually be used. (One can, however, be careful not to run such code in the first place in an imperative language.)
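Haskell gets this for free; in Scala you opt in explicitly, which makes for an easy illustration (mine, not part of the original answer):

```scala
object LazyDemo extends App {
  def expensive(): Int = {
    println("computing...") // side effect so we can see when it runs
    42
  }

  lazy val result = expensive() // declaring it computes nothing yet

  val needed = false
  if (needed) println(result) // expensive() is never run
}
```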

Ken Bloom
  • 57,498
  • 14
  • 111
  • 168