Russel Winder's Website

Ad Hoc Clusters of RaspberryPis


A sprint is planned for Monday 2016-09-19 at PyConUK 2016 to create and use a "Supercomputer in a Briefcase". There will be an opportunity to do some pre-planning in a short session scheduled for Thursday 2016-09-15T10:30+01:00.


Trying to create a supercomputer using RaspberryPis is clearly impossible. Yet people have tried it before (sort of). For example:

All of these are projects clustering RaspberryPis together in a network to create a parallel processing computer – not a supercomputer per se, but a small-scale parallel computer. The clusters in these cases have been created with fixed infrastructure. This makes certain aspects of creating a cluster easier; however, it requires more hardware and it ties up the RaspberryPis.

Many places (schools, etc.) have a number of RaspberryPis normally set out with monitors, keyboards, and mice as individual workstations. Usually these workstations are connected to the Internet, or at least to an internal network, so they are effectively already in a cluster. This raises the question: is it possible to use this collection of workstations as a single parallel computer? Clearly the answer must be yes – there is no effective difference between a collection of networked workstations and a dedicated cluster. When the machines are used as a cluster, the monitors, keyboards, and mice are redundant. So can we then pack the RaspberryPis and networking bits and pieces into a briefcase so we have "a supercomputer in a briefcase"? Definitely.

What is going to be done at the sprint?

On the assumption that we have some RaspberryPis and laptops and a private network with all the devices connected, we have to write some software to discover what the nodes on the network are. We will also need some software to push bits of the overall job out to the nodes, and collect the results. We are also going to need some programs to try out.
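To make those pieces concrete, here is a minimal Python sketch of one way they might look. None of this is the sprint's actual design. The UDP probe/reply messages, the function names, and the `run_on_node` callable (which stands in for whatever mechanism really ships work to a node) are all assumptions made for illustration.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor

PROBE = b"briefcase-probe"  # hypothetical message: "who is out there?"
REPLY = b"briefcase-node"   # hypothetical message: "I am a node"


def answer_probes(sock):
    """Runs on every node: reply to each probe so the sender learns our address."""
    while True:
        data, sender = sock.recvfrom(1024)
        if data == PROBE:
            sock.sendto(REPLY, sender)


def discover(address, port, wait=1.0):
    """Send one probe to address:port; return the hosts that reply within `wait` seconds."""
    nodes = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Allow the probe to be sent to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(PROBE, (address, port))
        deadline = time.monotonic() + wait
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, (host, _port) = sock.recvfrom(1024)
            except socket.timeout:
                break
            if data == REPLY:
                nodes.add(host)
    return sorted(nodes)


def split(items, parts):
    """Cut a list into `parts` nearly equal slices, one per node."""
    size, extra = divmod(len(items), parts)
    slices, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices


def scatter_gather(nodes, items, run_on_node):
    """Push one slice of `items` to each node and collect the results in node order."""
    slices = split(items, len(nodes))
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # run_on_node(node, slice) is an assumed callable supplied by the caller.
        return list(pool.map(run_on_node, nodes, slices))
```

On a real private network, each node would presumably bind a UDP socket to an agreed port and run `answer_probes`, while the coordinating machine calls `discover` with the network's broadcast address and then hands the resulting list of nodes to `scatter_gather`. How the slices actually travel to the nodes is left open here; that is exactly the part the sprint would need to decide.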

We might want to try executing the code on a single RaspberryPi and then on the cluster to see how much faster (or slower!) the parallel version runs compared to the sequential (run on a single processor) version.
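The comparison itself needs very little code. The sketch below is one possible shape for it, assuming a deliberately slow prime-counting job as the test program; `count_primes`, `chunked`, `timed`, and `speedup` are hypothetical names chosen for illustration, not anything the sprint has settled on.

```python
import math
import time


def count_primes(start, stop):
    """A deliberately slow, CPU-bound job: count the primes in [start, stop)."""
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def chunked(limit, parts):
    """Split [0, limit) into at most `parts` contiguous ranges, one per node."""
    step = -(-limit // parts)  # ceiling division
    return [(low, min(low + step, limit)) for low in range(0, limit, step)]


def timed(function, *args):
    """Call function(*args) and return (result, elapsed seconds)."""
    begin = time.perf_counter()
    result = function(*args)
    return result, time.perf_counter() - begin


def speedup(sequential_seconds, parallel_seconds):
    """Above 1.0 the parallel run was faster; below 1.0 it was slower."""
    return sequential_seconds / parallel_seconds
```

The sequential measurement would be something like `timed(count_primes, 0, limit)` on one RaspberryPi. The parallel measurement would wrap whatever function sends the ranges from `chunked(limit, number_of_nodes)` out to the cluster and sums the counts that come back. Both runs should give the same count, and dividing the two times with `speedup` gives the figure we are after. On a small job the network traffic can easily make that figure less than 1.0.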

The reason…

The thinking behind all this is that it would be good for schools to be able to introduce students to some of the more straightforward ideas of parallelism in computing and programming. The aim is not to have students try to do "supercomputing" or "Big Data" per se, but to introduce them to some of the core ideas in those topics. Data science, business intelligence, etc. all depend on processing large amounts of data, and doing that in reasonable time requires parallelism. With the post-2014 computer science curriculum, topics that were once seen as belonging to the final year of university study can probably be introduced at sixth form level and then taken much further in university courses.


In principle, anything can be a node as long as it is on the network and can run programs. So we can use laptops, high-end smartphones, etc. as well as RaspberryPis.

And Finally…

Come and join in.

Copyright © 2017 Russel Winder