
Here are what seem to be the two biggest things I can take away from the How to Design Programs (simplified Racket) course I just finished, straight from the lecture notes of the course:

1) Tail call optimization, and the lack thereof in non-functional languages:

Sadly, most other languages do not support TAIL CALL OPTIMIZATION. Put another way, they do build up a stack even for tail calls.

Tail call optimization was invented in the mid-70s, long after the main elements of most languages were developed. Because they do not have tail call optimization, these languages provide a fixed set of LOOPING CONSTRUCTS that make it possible to traverse arbitrarily sized data.
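
(To make that concrete, here is a minimal sketch in Python, one of the non-TCO languages I mention below; the names are my own, not from the notes. The recursive call is in tail position, yet CPython still pushes a frame per call, while the looping-construct version runs in constant stack space.)

    def sum_to_rec(n, acc=0):
        # Tail call in form, but CPython pushes a frame per call,
        # so a large n raises RecursionError (default limit ~1000).
        if n == 0:
            return acc
        return sum_to_rec(n - 1, acc + n)

    def sum_to_loop(n):
        # The looping-construct replacement: constant stack space.
        acc = 0
        while n > 0:
            acc += n
            n -= 1
        return acc

    print(sum_to_loop(100_000))   # 5000050000
    # sum_to_rec(100_000)         # RecursionError in CPython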

a) What are the equivalents to this type of optimization in procedural languages that don't feature it? b) Does using those equivalents mean we avoid building up a stack in similar situations in languages that don't have it?

2) Mutation and multicore processors

This mechanism is fundamental in almost any other language you program in. We have delayed introducing it until now for several reasons:

  • despite being fundamental, it is surprisingly complex

  • overuse of it leads to programs that are not amenable to parallelization (running on multiple processors). Since multi-core computers are now common, the ability to use mutation only when needed is becoming more and more important

  • overuse of mutation can also make it difficult to understand programs, and difficult to test them well

But mutable variables are important, and learning this mechanism will give you more preparation to work with Java, Python and many other languages. Even in such languages, you want to use a style called "mostly functional programming".
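
(To get a feel for what "mostly functional" might mean in such a language, here is a small Python sketch of my own, not from the notes: the same computation in a mutating style and in a mostly-functional style.)

    # Mutating style: edits the caller's list in place.
    def discount_in_place(prices, pct):
        for i in range(len(prices)):
            prices[i] *= (1 - pct)

    # Mostly-functional style: builds a new list and leaves the
    # original intact, which makes the function easier to test and
    # safe to run over shared data in parallel.
    def discounted(prices, pct):
        return [p * (1 - pct) for p in prices]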

I learned some Java, Python, and C++ before taking this course, so I came to take mutation for granted. Now all of that has been thrown in the air by the statement above. My questions are:

a) Where could I find more detailed information regarding what is suggested in the 2nd bullet, and what to do about it? And b) what kinds of patterns would emerge from a "mostly functional programming" style, as opposed to the more careless style I probably would have developed had I continued with those other languages instead of taking this course?

Will Ness
kcoul

3 Answers


As Leppie points out, looping constructs manage to recover the space savings of proper tail calling, for the particular kinds of loops that they support. The only problem with looping constructs is that the ones you have are never enough, unless you just hurl the ball into the user's court and force them to model the stack explicitly.

To take an example, suppose you're traversing a binary tree using a loop. It works... but you need to explicitly keep track of the "ones to come back to." A recursive traversal in a tail-calling language allows you to have your cake and eat it too, by not wasting space when not required, and not forcing you to keep track of the stack yourself.
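
Here's a rough sketch of that bookkeeping (Python for illustration; the tree shape and names are assumed, not from any particular course): the explicit `pending` list is precisely the hand-managed stack of "ones to come back to".

    class Node:
        def __init__(self, value, left=None, right=None):
            self.value, self.left, self.right = value, left, right

    def tree_sum_loop(root):
        # Loop traversal: we must remember subtrees to revisit ourselves.
        total, pending = 0, [root]
        while pending:
            node = pending.pop()
            if node is not None:
                total += node.value
                pending.append(node.left)    # "come back to" these later
                pending.append(node.right)
        return total

    print(tree_sum_loop(Node(1, Node(2), Node(3))))  # 6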

Your question on parallelism and concurrency is much more wide-open, and the best pointers are probably to areas of research, rather than existing solutions. I think that most would agree that there's a crisis going on in the computing world; how do we adapt our mutation-heavy programming skills to the new multi-core world?

Simply switching to a functional paradigm isn't a silver bullet here, either; we still don't know how to write high-level code that compiles into blazingly fast, non-mutating, concurrently-running code. Lots of folks are working on this, though!

John Clements
  • "but you need to explicitly keep track of the 'ones to come back to.'" This does not sound like a tail call. A tail call should not have things "to come back to". So such a case would not be tail-call-optimized in a functional language anyway – newacct Dec 15 '11 at 11:45
  • @newacct: That's precisely John's point: those cases are _not_ tail calls. If you use a loop-only paradigm, you'd have to track the return context. And if you use a recursion-only paradigm, you'd always eat up call frames even in tail calls. By having what Scheme has (tail calls where possible, recursive calls otherwise), you get the best of both worlds. – C. K. Young Dec 15 '11 at 14:51
  • @Chris Jester-Young: but the situation John described can be *exactly* duplicated with looping constructs in a procedural language without tail-call optimization: the places where you would make a tail recursive call in Scheme (i.e. along the right-most path of the binary tree), use the loop to traverse that; and the other places where you would make non-tail calls in Scheme, which eat up stack frames anyway, just make a plain recursive call – user102008 Dec 17 '11 at 21:32
  • @user102008: Correct. John's point is that in Scheme, you don't have to separate out those two cases with different-looking code for each: Scheme will do it for you, depending on whether the call is in tail position or not. And visually, both kinds of code will look exactly the same. Consistency for the win. – C. K. Young Dec 18 '11 at 18:44

To expand on the "mutability makes parallelism hard" concept, when you have multiple cores going, you have to use synchronisation if you want to modify something from one core and have it be seen consistently by all the other cores.

Getting synchronisation right is hard. If you over-synchronise, you have deadlocks, slow (serial rather than parallel) performance, etc. If you under-synchronise, you have partially-observed changes (where another core sees only a portion of the changes you made from a different core), leaving your objects observed in an invalid "halfway changed" state.
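
(A minimal sketch of under-synchronisation, in Python with names of my own; how often the race actually fires varies by runtime and version, but the read-modify-write below is not atomic, so updates can be lost.)

    import threading

    counter = 0

    def bump(times):
        global counter
        for _ in range(times):
            counter += 1   # read-modify-write: concurrent updates can be lost

    threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)  # may be well under 400000: a "halfway changed" outcome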

It is for that reason that many functional programming languages encourage a message-queue concept instead of a shared state concept. In that case, the only shared state is the message queue, and managing synchronisation in a message queue is a solved problem.
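
(A minimal sketch of that style in Python, with hypothetical names: queue.Queue is internally synchronised, so the queue is the only shared state and the hard synchronisation work is already done for you.)

    import queue
    import threading

    inbox = queue.Queue()          # the only shared object

    def worker():
        while True:
            msg = inbox.get()      # blocks until a message arrives
            if msg is None:        # sentinel value: shut down
                break
            print("processed", msg)

    t = threading.Thread(target=worker)
    t.start()
    for i in range(5):
        inbox.put(i)               # communicate by message, not by shared mutation
    inbox.put(None)
    t.join()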

C. K. Young

a) What are the equivalents to this type of optimization in procedural languages that don't feature it? b) Does using those equivalents mean we avoid building up a stack in similar situations in languages that don't have it?

Well, the significance of a tail call is that it can evaluate another function without adding to the call stack, so anything that builds up the stack can't really be called an equivalent.

A tail call behaves essentially like a jump to the new code, using the language trappings of a function call and all the appropriate detail management. So in languages without this optimization, you'd use a jump within a single function. Loops, conditional blocks, or even arbitrary goto statements if nothing else works.
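
(To illustrate, a sketch of my own in Python: a pair of mutually tail-recursive definitions collapsed into one loop, where each "tail call" becomes a jump back to the top and a state variable records which function we would be in.)

    # Tail-recursive spec (would grow the stack in a non-TCO language):
    #   is_even(n) = True  if n == 0 else is_odd(n - 1)
    #   is_odd(n)  = False if n == 0 else is_even(n - 1)

    def is_even(n):
        state = "even"
        while True:                # each iteration is one "tail call"
            if n == 0:
                return state == "even"
            n -= 1
            state = "odd" if state == "even" else "even"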

a) Where could I find more detailed information regarding what is suggested in the 2nd bullet, and what to do about it?

The second bullet sounds like an oversimplification. There are many ways to make parallelization more difficult than it needs to be, and overuse of mutation is just one.

However, note that parallelization (splitting a task into pieces that can be done simultaneously) is not entirely the same thing as concurrency (having multiple tasks executed simultaneously that may interact), though there's certainly overlap. Avoiding mutation is incredibly helpful in writing concurrent programs, since immutable data avoids a lot of race conditions and resource contention that would otherwise be possible.
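
(For example, in Python, with names assumed: an immutable value can be handed to any number of threads without locks, because no reader can ever observe it half-updated; "updates" build new values instead.)

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Point:
        x: int
        y: int

    p1 = Point(1, 2)
    p2 = replace(p1, x=10)   # "update" by building a new value; p1 is untouched
    # A thread still holding p1 keeps seeing a consistent value forever.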

b) what kinds of patterns would emerge from a "mostly functional programming" style, as opposed to the more careless style I probably would have developed had I continued with those other languages instead of taking this course?

Have you looked at Haskell or Clojure? Both are heavily inclined to a very functional style emphasizing controlled mutation. Haskell is more rigorous about it but has a lot of tools for working with limited forms of mutability, while Clojure is a bit more informal and might be more familiar to you since it's another Lisp dialect.

C. A. McCann