Communicating Sequential Processes (CSP) and the Multicore World

More than 30 years ago Tony Hoare and colleagues developed CSP, a mathematically sound model of how to structure software so as to deal as safely as possible with concurrency. Parallelism was less of an issue then than it is now, after the start of the Multicore Revolution; when CSP was developed, very few computers offered applications more than one processor.

Now, as we enter a period of massive change in computer hardware structures (uniprocessor architectures are giving way to multiprocessor architectures), it is becoming increasingly clear that shared-memory multi-threading cannot be the software architecture of future applications. Correctly managing locks, mutexes and monitors, as shared-memory multi-threading requires, is just too complicated to get right.

Erlang has been the major proponent of using lightweight processes and message passing over the last 25+ years. Its successes in the telecoms industry have generally been ignored by the wider computing industry as it rushed headlong into the quagmire that is shared-memory multi-threading. Joe Armstrong has noted in the past that, although Erlang does not directly realize the Actor Model or CSP, they were very strong inspirations in the development of the language.

The HPC (high performance computing) community have been working with parallel systems since almost the beginning of computing. Many models have been tried over the years, but the current winner is a combination of SPMD (single program multiple data) architecture with annotations to handle local thread management: MPI and OpenMP are the tools for handling parallelism, with C, C++ and Fortran the programming languages of choice. No Actor Model, and certainly no CSP. Given the unwillingness of the HPC community to rewrite any of the codes they have developed over the last 40+ years, it seems likely that the traditional HPC community will become an increasingly irrelevant backwater of computing innovation.

On the users mailing list for Go (the programming language being developed by Google), Rob 'Commander' Pike reminded us in one thread that although the goroutine concept in Go appears to have some inspiration from CSP, most of the ideas were developed independently during the development of the Squeak, Newsqueak, Alef and Limbo programming languages. He also pointed out that occam and Erlang were separate developments, with no interchange of ideas between them.

Parallel and independent emergence of essentially the same idea is a strong indicator that some version of that idea is a great innovation. This has been shown many times in many different fields of human endeavour.

In the data mining arena, there is increasing dissatisfaction with SQL-based approaches; they are too slow to be useful. A dataflow software architecture (an event-based approach based on lightweight independent operators, which is basically a subset of CSP) is revolutionizing the approach to data mining. SQL-based approaches gain no benefit from parallel processors; dataflow approaches gain huge benefit and are now massively outperforming SQL-based approaches.

Thus it seems there is an "emergent property" appearing to come from all this: independent lightweight processes with no shared state are a good model of applications programming in a world increasingly dominated by parallel computers. Actor Model, dataflow and CSP are, increasingly, the future for software. The JVM world is heading that way: Scala has the Actor Model, and GPars provides Actor Model, dataflow and CSP to Groovy and Java programmers. Even the Python world is heading that way, cf. Python-CSP.
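To make "independent processes with no shared state" a little more concrete, here is a minimal sketch in Python using only the standard library. It is not the Python-CSP API, and it is not real CSP: threads stand in for lightweight processes, bounded `queue.Queue` objects stand in for channels (a true CSP channel is a synchronous rendezvous, which a one-slot queue only approximates), and the names `DONE`, `producer`, `squarer` and `run_pipeline` are invented for this illustration. The point is only the shape of the program: each process owns its own data and communicates solely by sending and receiving messages, so there are no locks to manage in application code.

```python
import queue
import threading

# Sentinel marking the end of a stream (a convention of this sketch only).
DONE = object()


def producer(out, n):
    """Send the integers 0..n-1 on `out`, then signal completion."""
    for i in range(n):
        out.put(i)
    out.put(DONE)


def squarer(inp, out):
    """Receive values on `inp` and send their squares on `out`.

    The process holds no state shared with any other process.
    """
    while True:
        item = inp.get()
        if item is DONE:
            out.put(DONE)  # propagate end-of-stream downstream
            return
        out.put(item * item)


def run_pipeline(n):
    """Wire producer -> squarer and collect the results in the caller."""
    a = queue.Queue(maxsize=1)  # one-slot queues approximate channels
    b = queue.Queue(maxsize=1)
    workers = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for w in workers:
        w.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(5))  # prints [0, 1, 4, 9, 16]
```

Adding another stage means writing one more function and one more channel; none of the existing processes needs to change, which is much of the appeal of this style.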

CSP is not a _silver bullet_ (there are none in computing), but it is likely one of the best ways forward for structuring software in the massively parallel world of applications development that follows the Multicore Revolution.

Copyright © 2005–2020 Russel Winder - Creative Commons License BY-NC-ND 4.0