I will be leading a sprint session at PyCon UK 2016 entitled "Supercomputer in a case". This post is the first of a series trying to explain the idea, and thus introduce the sprint.
I suspect the first question is: what is a supercomputer? Rapidly followed by the second question: what is a case?
What is a Supercomputer?
If we ask Wikipedia, we get the answer:
A supercomputer is a computer with a high-level computational capacity compared to a general-purpose computer.
which, sadly, raises so many questions. However, that article has a lot of good material on what a 'supercomputer' is compared to a 'computer', and it introduces a few examples from over the years.
Trying to summarize in a different way: supercomputers are computers that compute hugely bigger problems vastly quicker than your laptop or desktop ever could. Supercomputers are generally massive, require huge amounts of electricity, and thus cost more than anyone except governments and huge companies can afford. In the early days (1960s and 1970s, e.g. Cray-1) supercomputers were specially designed to be significantly faster, but were otherwise not that dissimilar from ordinary computers. Very rapidly though supercomputers became parallel systems, i.e. they were effectively many computers able to communicate with each other. In effect they were clusters, albeit special clusters with very high speed communications.
Desktops, and some high-end laptops, are now able to have one, two, or more processor chips, but supercomputers comprise hundreds, often thousands, of processor chips. With the rise of multicore processors over the last decade, desktops, laptops, and even phones have been able to have 2, 4, or 8 cores in each processor chip, each core effectively being a processor (this core/processor issue is subtle and too deep for this article, so let's not go into it). Also over the last decade graphics processors have evolved: instead of just being dedicated to graphics, they have become specialized computers that do graphics very well, but can also be used for some specific forms of general computation (GPGPU). Supercomputers have taken all this on board and are now huge arrays of multicore multiprocessors, often with integrated GPGPUs, e.g. Sunway TaihuLight.
What is a Case?
Originally the title for the sprint was "Supercomputer in a briefcase" but then it was realized that briefcases are a bit small and that maybe suitcases should be allowed so as to get a bigger supercomputer. Hence "Supercomputer in a case". Obviously we aren’t going to create a supercomputer like Orac (from Blake’s 7). We are not going to attempt to create a computer able to compete with Blue Gene. In fact, we are not going to create a supercomputer at all. It is just not possible in 2016 to create the computational capability of a modern supercomputer in something as small as a briefcase or even a suitcase.
So Why Discuss This?
All modern computers (supercomputers, desktops, laptops, even smartphones) are parallel computers in one way or another, i.e. a computer these days always has more than one processor and/or core. Yet creating parallel programs to solve problems is seen as a specialist thing that only people working with supercomputers do. But if every computer is a parallel computer, aren’t we missing something fundamental? Supercomputers have just returned to being very big, very expensive versions of ordinary computers. Supercomputers are no longer quite as special in computing terms as they were in the 1980s and 1990s: with the advent of multicore, every computer is just a tiny little supercomputer. Consider that most people’s smartphones have more computing capability than 1970s supercomputers such as the Cray-1.
Most people being taught how to program are taught about sequential programs: programs that do one and only one thing at a time. Parallelism and concurrency are treated as advanced topics. Very 1960s.
A note on jargon: In the English language 'parallel' means two lines that never meet, a meaning taken from mathematics; and 'concurrent' means "at the same time". In computing, 'parallel' means "at the same time"; and 'concurrent' labels a way that programs may behave. A book would have to be written to do this justice, and indeed there are many excellent books on the subject, so this article is not the place to delve deeper.
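To make the sequential versus parallel distinction a little more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not sprint material: the function names, the choice of problem (a sum of squares), and the default of four workers are all assumptions made for this example.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(numbers):
    """Sequential version: one core works through the whole sequence."""
    return sum(n * n for n in numbers)


def parallel_sum_of_squares(numbers, workers=4, executor_cls=ProcessPoolExecutor):
    """Parallel version: split the data into one chunk per worker.

    Each worker process sums its own chunk, and the partial sums are
    added together at the end. The names and defaults are illustrative.
    """
    chunks = [numbers[i::workers] for i in range(workers)]
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))


if __name__ == "__main__":
    data = list(range(100_000))
    # Both versions should agree; only the way the work is done differs.
    print(sum_of_squares(data) == parallel_sum_of_squares(data))
```

For a problem this small the cost of starting processes and shipping data to them is likely to outweigh any gain, which is itself one of the first lessons of parallel programming: splitting the work is easy, making it pay is the hard part.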
Whilst it is feasible to practically study some aspects of concurrency and parallelism on today’s laptops and workstations, it is very restricted and doesn’t open up the world in the way big computers and especially supercomputers do. So is it possible to create a small machine cheaply that is nonetheless able to allow us to write concurrent and parallel programs to solve problems as we would on a supercomputer: can we create a tiny supercomputer-like thing in a case?
What’s the Gig?
The idea is to enable people to bring together a collection of really cheap computers to create a cluster that is effectively a miniature supercomputer. Can we create a supercomputer by bringing together Raspberry Pis, laptops, and smartphones? There are already 'backplanes' for putting together Raspberry Pis to create a cluster, but this is a specialist bit of hardware. The idea of this sprint is to do something ad hoc. Just connect some computers together with standard networking and, voilà, we have a cluster that is a miniature supercomputer. It is about being cheap and easy to assemble and disassemble.
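As a rough sketch of what "connect some computers together with standard networking" might look like in Python, the example below farms a computation out over plain TCP connections using the standard library's multiprocessing.connection module. This is one assumed approach among many, not the design the sprint will use: the shared key, the function names, and the sum-of-squares task are all hypothetical. So that it can be run on a single machine, each "computer" here is a thread listening on localhost; on a real cluster each worker would run on its own Raspberry Pi and the coordinator would be given their network addresses.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"case-cluster-demo"  # hypothetical shared secret for this sketch


def serve_one_task(listener):
    """Worker side: accept one connection, sum the squares sent, reply."""
    with listener.accept() as conn:
        numbers = conn.recv()
        conn.send(sum(n * n for n in numbers))


def farm_out(chunks, addresses):
    """Coordinator side: send one chunk per worker, then add up the replies."""
    conns = [Client(address, authkey=AUTHKEY) for address in addresses]
    try:
        for conn, chunk in zip(conns, chunks):
            conn.send(chunk)
        return sum(conn.recv() for conn in conns)
    finally:
        for conn in conns:
            conn.close()


def demo_on_localhost(numbers, worker_count=2):
    """Stand-in for a real cluster: each 'machine' is a thread on this host."""
    # Port 0 asks the operating system for any free port.
    listeners = [Listener(("localhost", 0), authkey=AUTHKEY)
                 for _ in range(worker_count)]
    threads = [threading.Thread(target=serve_one_task, args=(listener,))
               for listener in listeners]
    for thread in threads:
        thread.start()
    chunks = [numbers[i::worker_count] for i in range(worker_count)]
    total = farm_out(chunks, [listener.address for listener in listeners])
    for thread in threads:
        thread.join()
    for listener in listeners:
        listener.close()
    return total


if __name__ == "__main__":
    print(demo_on_localhost(list(range(10))))
```

The connection objects send Python objects by pickling them, so this style is only suitable between machines that trust each other, which is a fair assumption for a handful of Raspberry Pis on a private network in one room.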
So the idea of the sprint at PyCon UK 2016 is to get people to bring some Raspberry Pis, connect them to a local network, and see if we can turn them into a single cluster on which we can run some programs. The idea was first mooted (partly as a joke) in a tweet by Zeth. He and I worked it up a bit using direct messages, and then I took it on as a non-joke project. It has excited the PyCon UK committee, and I hope it excites people who come to PyCon UK 2016. I would hate to be in a room on my own with only one Raspberry Pi.