There are at least three well-known approaches to creating concurrent applications:

  1. Multithreading and memory synchronization through locking (.NET, Java). Software Transactional Memory is another approach to synchronization.

  2. Asynchronous message passing (Erlang).

I would like to learn whether there are other approaches, and to discuss the pros and cons of each when applied to large distributed applications. My main focus is on simplifying the programmer's life.

For example, in my opinion, using multiple threads is easy when there are no dependencies between them, which is pretty rare. In all other cases, thread synchronization code becomes quite cumbersome and hard to debug and reason about.
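To make the dependent case concrete, here is a minimal sketch in Python (chosen only for brevity; the function name `run_counter` and its parameters are made up for illustration). Several threads update one shared counter, and the read-modify-write has to be wrapped in a lock to avoid lost updates:

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    count = 0
    lock = threading.Lock()

    def worker():
        nonlocal count
        for _ in range(increments):
            # Without the lock, this read-modify-write could interleave
            # with another thread's and silently lose updates.
            with lock:
                count += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return count
```

Even in this tiny case the correctness argument depends on every access going through the same lock, which is the part that tends to get hard to reason about as the shared state grows.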

  • Add one more to your list: software transactional memory (http://en.wikipedia.org/wiki/Software_transactional_memory). There is also a Java version available: http://multiverse.codehaus.org/ – Neeme Praks Oct 24 '10 at 18:15
  • Added as a subitem to 1). It doesn't look like a conceptually new way of writing concurrent applications, but rather like an alternative to locking constructs. –  Oct 25 '10 at 15:03
  • @Serge: I'm not sure I see why it is less conceptually different than message passing. – jalf Oct 25 '10 at 15:04
  • @Neeme: Also re STM, see DeuceSTM http://sites.google.com/site/deucestm/ . – andersoj Oct 25 '10 at 15:18
  • @jalf: I just don't see how it would change application architecture in any drastic way. It still relies on threads and memory synchronization. I am not sure transaction semantics make working with shared state much easier than locks. Personally, I am very interested in concepts that try to avoid shared state. –  Oct 25 '10 at 15:27
  • @Serge: true, it still works with shared state, but in my experience it *does* make it fundamentally much easier to deal with and reason about, in that your shared state is always updated in user-defined atomic, and not least, *composable*, transactions. – jalf Oct 25 '10 at 16:59
  • I guess the most accurate way to look at it is that STM is a different paradigm for *synchronization*, but not for *concurrency* (which STM is pretty much agnostic about). Most STM implementations expect you to use threads, but there's nothing in STM that inherently requires this. – jalf Oct 25 '10 at 17:02

4 Answers

I'd strongly recommend looking at this presentation by Rich Hickey. It describes an approach to building high-performance concurrent applications which I would argue is distinct from lock-based or message-passing designs.

Basically it emphasises:

  • Lock-free, multi-threaded concurrent applications
  • Immutable persistent data structures
  • Changes in state handled by Software Transactional Memory

It also talks about how these principles influenced the design of the Clojure language.
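As a rough illustration of the "immutable values plus retried state changes" idea, here is a toy sketch in Python. It is not STM and not how Clojure implements anything: it shows only the optimistic pattern of reading an immutable snapshot, computing a new value, and committing only if nothing changed in the meantime. The names `Ref`, `swap` and `demo` are hypothetical, and an internal lock stands in for a hardware compare-and-set. Python tuples are immutable but are copied on each update rather than sharing structure the way persistent data structures do.

```python
import threading


class Ref:
    """Hypothetical holder for an immutable value with compare-and-set."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()  # stands in for a hardware CAS

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        # Install `new` only if nobody replaced the snapshot we started from.
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


def swap(ref, fn):
    """Apply the pure function fn to the current value, retrying on conflict."""
    while True:
        old = ref.get()
        new = fn(old)  # builds a new value; never mutates `old`
        if ref.compare_and_set(old, new):
            return new


def demo(num_threads=4, appends=200):
    """Have several threads append (tag, i) pairs to one shared tuple."""
    ref = Ref(())

    def worker(tag):
        for i in range(appends):
            swap(ref, lambda old: old + ((tag, i),))

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ref.get()
```

Readers never block and never see a half-updated value, because each version is immutable. The cost is that a writer may have to redo its computation when it loses a race.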

mikera

Read Herb Sutter's Effective Concurrency column, and you too will be enlightened.

Frédéric Hamidi
  • Great find. While I appreciate how deeply he digs into thread synchronization, he is mostly focused on threading, locking, and atomic operations, which is just one of many approaches to concurrency. He mentions asynchronous messaging in a couple of articles, but IMHO doesn't go very far. Also, his articles show how hard it is to get multi-threading right. I am still having nightmares about various memory models. –  Oct 25 '10 at 14:51
  • Yeah, Sutter's columns have a definite focus on making traditional lock-based synchronization more manageable, rather than exploring *alternative* techniques. Still a good read though. – jalf Oct 25 '10 at 15:06

With the Java 5 concurrency API, concurrent programming in Java doesn't have to be cumbersome and difficult, as long as you take advantage of the high-level utilities and use them correctly. I found the book Java Concurrency in Practice by Brian Goetz to be an excellent read on this subject. At my last job, I used the techniques from this book to make some image processing algorithms scale to multiple CPUs and to pipeline CPU-bound and disk-bound tasks. I found it to be a great experience and we got excellent results.

Or if you are using C++ you could try OpenMP, which uses #pragma directives to make loops parallel, although I've never used it myself.
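The "high-level utilities" style mentioned above can be sketched roughly as follows. This is a Python analogue using `concurrent.futures`, not the Java API itself, and `brighten` and `process_all` are made-up names standing in for a real image-processing step. The pool owns the threads, and each task works only on its own input, so no explicit locking appears in user code:

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixels, delta=10):
    """Hypothetical per-image task: return a brightened copy, clamped to 255."""
    return [min(255, p + delta) for p in pixels]


def process_all(images, workers=4):
    """Run brighten over every image using a fixed-size thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands out tasks to the pool and yields results in input order.
        return list(pool.map(brighten, images))
```

For example, `process_all([[0, 250], [100]])` returns `[[10, 255], [110]]`. In standard CPython a thread pool mostly helps with I/O-bound work, so CPU-bound tasks like this one would normally go to a process pool instead; the structure of the code stays the same.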

balexand
  • Note that parallel and concurrent are not synonyms. tbb also provides `parallel_for`, `parallel_map` and `parallel_reduce` but does not help with concurrency. Similarly, there have been languages such as newsqueak that were concurrency oriented, but didn't provide parallel processing. – Dustin Oct 24 '10 at 19:18
  • +1, good book and advice - it's surprisingly rare that you *need* to write fiddly, complex thread-safe code using low-level primitives in Java nowadays. Reviewing and reasoning about code that uses these newer APIs is much easier, too. – SimonJ Oct 24 '10 at 19:19
  • Microsoft released TPL with .NET 4.0, which is a set of higher-level abstractions over threads similar to Java's. While it does simplify coding of embarrassingly parallel problems, it doesn't help much with managing shared state and rather gives you a false impression of simplicity. –  Oct 25 '10 at 14:46

In Erlang and OTP in Action, the authors present four process communication paradigms:

  • Shared memory with locks

    A construct (lock) is used to restrict access to shared resources. Hardware support, in the form of special instructions, is often required from the memory system. Possible drawbacks of this approach include overhead, points of contention in the memory system, and difficulty debugging, especially with a huge number of processes.

  • Software Transactional Memory

    Memory is treated as a database, where transactions decide what to write and when. The main problems here are possible contention and the number of failed transaction attempts.

  • Futures, promises and similar

    The basic idea is that a future is the result of a computation that has been outsourced to a different process (potentially on a different CPU or machine) and that can be passed around like any other object. Problems can arise in case of network failures.

  • Message passing

    Synchronous or asynchronous, in Erlang style.
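The last two paradigms can be sketched together in a few lines of Python. This is only an illustration with threads and a queue, not Erlang semantics, and `counter_actor` and `demo` are hypothetical names. The "process" keeps its state private and reacts to messages from its mailbox, and a future carries the reply back to the sender:

```python
import queue
import threading
from concurrent.futures import Future


def counter_actor(mailbox):
    """Owns `total` privately; the only way to affect it is to send a message."""
    total = 0
    while True:
        msg = mailbox.get()
        if msg[0] == "add":
            total += msg[1]
        elif msg[0] == "get":
            msg[1].set_result(total)  # reply through the future in the message
        elif msg[0] == "stop":
            return


def demo():
    mailbox = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox,))
    actor.start()

    for n in (1, 2, 3):
        mailbox.put(("add", n))  # asynchronous send: does not wait for the actor

    # Futures are normally created by executors; one is built by hand here
    # only to keep the sketch small.
    reply = Future()
    mailbox.put(("get", reply))
    total = reply.result(timeout=5)  # blocks until the actor answers

    mailbox.put(("stop",))
    actor.join()
    return total
```

Here `demo()` returns 6. No lock appears in user code because only the actor thread ever touches `total`; the queue is the single synchronization point.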

Roberto Aloi