Yesterday evening (starting 2012-01-31T18:34) I presented a session with the above title at Skills Matter. It was scheduled in their "In the Brain" series, so I assumed licence to be very personal and idiosyncratic. The overall aim of the session was to convince people that although Fortran and C++ are seen as the standard languages of high performance computing (HPC), Python has a place. I opened with the promise of a pair of arguments that then came together as a form of "proof of case".
The first argument addressed "high performance" as a term implying the ability to create correct (and fully tested) solutions to problems quickly and with small amounts of code. The implication is that high performance is about the ability of programmers to use their tools to great effect. I emphasized "ceremony", in particular the distinction between "low ceremony" and "high ceremony", as a major factor: "ceremony" here means code that has to be written to satisfy the compiler but contributes nothing to the code that actually executes. Python is very definitely a "low ceremony" language, a consequence of its being a dynamic language. Traditionally, statically typed languages have been "high ceremony" and dynamically typed languages "low ceremony". The question is whether the drive to type inference in all the major statically typed languages (D, C++, Java, Scala, Haskell) reduces their level of ceremony to equal that of the dynamically typed languages.
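As a trivial illustration of what "low ceremony" means in practice (my own example, not one from the talk): a Python function needs no type declarations at all, yet works for any types that support the operations it uses.

```python
def total(items):
    # No type declarations, no generics, no interfaces: duck typing means
    # this works for any non-empty sequence of mutually addable things.
    result = items[0]
    for item in items[1:]:
        result = result + item
    return result

print(total([1, 2, 3]))    # 6
print(total(['a', 'b']))   # ab
```

The equivalent in a traditional "high ceremony" statically typed language would need explicit type parameters and constraints before the compiler would accept it; type inference narrows, but does not yet close, that gap.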
This led to the issue of meta-object protocols (MOPs). All the major dynamically typed languages (Python, Ruby, Groovy, Lisp, Clojure) have run-time MOPs in one guise or another, and this gives them great capabilities in creating domain specific languages (DSLs) as internal languages. C++ and D, even though they are statically typed languages, have MOPs; it is just that they are compile-time MOPs rather than run-time ones. So there can be no dynamism to the DSL, but they are still very capable of creating DSLs, just static ones. Although not stated explicitly, the issue of internal DSLs leads directly to the idea of coordination languages, and in the context of HPC to parallelism coordination languages.
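A minimal sketch of the kind of thing Python's run-time MOP enables (the `Query` class and field names here are invented for illustration): by intercepting attribute access with `__getattr__` and overloading operators, an internal DSL can accept field names that were never declared anywhere.

```python
class Query:
    """A toy internal DSL for building filter expressions. The run-time
    meta-object protocol (__getattr__) intercepts attribute access, so
    field names need not be declared in advance."""

    def __init__(self):
        self._clauses = []

    def __getattr__(self, field):
        # Any attribute not found normally becomes a DSL field reference.
        return _Field(self, field)

    def __str__(self):
        return ' and '.join(self._clauses)


class _Field:
    def __init__(self, query, name):
        self._query = query
        self._name = name

    def __eq__(self, value):
        # Overloading == records a clause rather than comparing values.
        self._query._clauses.append('{} == {!r}'.format(self._name, value))
        return self._query


q = Query()
q.name == 'python'
q.version == 3
print(q)  # name == 'python' and version == 3
```

A compile-time MOP, as in C++ templates or D's metaprogramming, can achieve a similar surface syntax, but the set of fields and operations is fixed when the program is compiled.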
I then switched tack to address computational performance as the focus of "high performance"; arguably the more traditional interpretation of the term. Python performs badly compared to C++ and Fortran, generally about 50 to 200 times slower, at least using the CPython interpreter. PyPy performs somewhat better, being 10 to 20 times faster than CPython on the small microbenchmarks I showed. As ever for me, this was calculating a quadrature approximation to the value of π. The problem is small, embarrassingly parallel, and never ceases to be useful for showing issues associated with parallelism in any number of languages. (All the code for this problem is available in a Bazaar branch here: feel free to browse or branch, and if you branch please let me know of any amendments or additions.) Both CPython and PyPy have a global interpreter lock (GIL), which means no parallel execution of threads at all. This can be got round in three fundamental ways:
- Remove the GIL. This is unlikely to happen in CPython, but the PyPy folk are experimenting with software transactional memory (STM) as a way of being able to remove the GIL.
- Use the multiprocessing package (or Parallel Python) to make use of multiple Python virtual machines (PVMs) each of which runs a single threaded process.
- Use native code PVM extensions for the parallel computations.
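The multiprocessing route can be sketched for the π quadrature problem as follows (the midpoint-rule formulation is the standard one for this benchmark; the particular n and worker count are illustrative, not the values used in the talk):

```python
from multiprocessing import Pool

def partial_sum(args):
    """Sum one slice of the midpoint-rule quadrature of 4/(1+x^2) over [0,1]."""
    start, end, delta = args
    return sum(4.0 / (1.0 + ((i - 0.5) * delta) ** 2)
               for i in range(start, end))

def pi_parallel(n=1000000, workers=4):
    # Each worker is a separate Python process with its own PVM and its
    # own GIL, so the slices genuinely run in parallel.
    delta = 1.0 / n
    slice_size = n // workers
    slices = [(1 + i * slice_size, 1 + (i + 1) * slice_size, delta)
              for i in range(workers)]
    with Pool(workers) as pool:
        return delta * sum(pool.map(partial_sum, slices))

if __name__ == '__main__':
    print(pi_parallel())  # approximately 3.141592653589...
```

Because each process has its own interpreter, there is no shared-state contention, but also no cheap shared memory: the cost is serializing the work descriptions and results between processes, which is why this approach suits coarse-grained, embarrassingly parallel problems like this one.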
Using Python for the bulk of the code and C for the core loop meant that execution was about as fast as using C for the whole application. Thus Python is a high performance programming language. OK, Python is not going to be used for the computationally intensive part, but it can be the coordination language.
Skills Matter videoed the session; see here.