I made a cluster.
I had pretty much all the pieces I needed, but they were scattered across other projects and experiments, and it was time to pull them together. For science! Or at least for my own professional education.
I’ve thought for a long time that small single-board computers like Raspberry Pis are the best learning tools around. They’re cheap, easy to configure, and easy to re-configure if you destroy something. Because they’re not particularly fast and don’t have huge amounts of memory or storage, they sometimes force you to consider things we always used to have to consider: memory and storage use, algorithmic efficiency, proper choice of data structures, etc.
I’ve recently submitted a conference talk proposal titled “Can you do it on a Raspberry Pi?” pointing out exactly those reasons for teaching things on this and other limited platforms. Now I’m out to prove not only that you can learn most things this way, but that you will learn them better this way. After many months of working with neural networks that do require heavy-duty hardware to run well, going back to basics is refreshing.
It’s easy to go to a geeky tech conference like SCaLE, then go hog-wild building things you saw. Lots of companies have figured out that Pis are a great way to display whatever they’re doing on a scale that can fit entirely into a display booth. Whether it’s big data solutions, dev-ops tools or anything else, there was a vendor there showing how their stuff works on a really great looking cluster. The Postgres people opted for Pine64 boards running CentOS Linux, Docker and a suite of monitoring tools, but everybody else was using Raspberry Pis. My favorite? The 12-unit cluster running Kubernetes and Docker Swarm that SaltStack used to show off their automation tools in their booth.
There’s a lot to be said for the other types of boards available and the versatility or power some of them bring to the table, but the Pis have become the de-facto standard in low-cost Unix-based single-board computers. They won’t be ideal if you want to mimic a CentOS/Docker production environment, but for the goal of learning and demonstrating Docker Swarm, Hadoop, or any other package that uses distributed servers, they’re more than adequate and their fairly-standard Debian-based Linux is familiar to most of us.
Besides, I have five Pis sitting around, and a sixth (the new 3B+ version just announced this week on Pi Day) is on order. So a Raspberry Pi cluster it will be. The four Pis in the cluster are all fairly recent Pi 3Bs, and each is currently equipped with a 32GB micro-SD card. I wired them to a five-port USB power supply that can provide over 2 amps at 5V to each of them. Those and the network switch are all held together with velcro and duct tape. Maybe not quite “presentation ready” but neat and more than enough for my purposes.
There are a handful of companies making full-blown cluster enclosures for Raspberry Pis and they’ll sell a fully built and tested one if you want. While the five-unit cluster kits from PicoCluster look great, spending over $300 on something I plan to use for my personal education would be pretty crazy. (The cost of the cluster enclosure with power supply and network switch exceeds the cost of the five Raspberry Pis it contains, including the cost of 32GB micro-SD cards!) PicoCluster does include pre-configured software of your choice, but in my case the whole point is learning to configure and run it myself, so that would be a bit of a waste. Also, their documentation leaves much to be desired. It’s hard to tell exactly what’s included or what their software configurations really are. And despite being founded as a Utah-based company, the English on their current website says “Shenzen” not “Salt Lake.”
I mounted the Pis into a four-level stacking rack I purchased on Amazon some time back. It’s a modular thing that theoretically would allow me to stack more of them just by adding layers to the stack. One thing I like about this arrangement is that the Pi boards are suspended 5mm above the plastic boards that hold them in place, leaving decent ventilation space for the circuitry on the bottom of the board. There’s an 18mm gap between the board and the bottom of the next layer, which leaves decent space for a heat sink. The CPUs on Pi 3s have a reputation for running a bit hot, so I’ve ordered appropriate heat sinks from Adafruit, the US source of all things Pi, and I’ll add them when they come in.
I was planning to use one of my old indestructible metal Netgear five-port switches, using four ports for the cluster and one to connect to my router. But as I was going through my box of leftover network equipment, I noticed that one of my other switches used a 5V power supply rather than the more standard 12V models. With a bit of creative wiring, this would allow me to power the switch off the same power block as the Pis and save me a power connection. USB wiring is well documented, so all I needed to do was cut the end off the power-supply cable and splice it onto the end of a USB cable.
Took about 10 minutes of soldering and a bit of heat shrink tubing and I now have a setup that looks reasonably professional and is just about the right length to wrap around from the front of the power block to the back of the switch.
Beyond that, it was very simple. The Trendnet “green” switch serves as the base. The stack of Pis is attached to it with velcro that holds it in place but not so permanently that I’ll break things if I try to access anything that needs an update. (The way I have things set up, I have to remove the stack of Pis to access the Micro SD cards.) The power brick is attached to the switch with double-sided tape as I can’t think of any reason I’d need to regularly move it. Short USB power cables attach to the Pis from the front and short network cables wrap around from the back of the switch to the side of the Pis. A few zip-ties keep things nice and orderly.
It’s been a while since any of these ran and they were used for different things so they were not consistently set up. I started by imaging Raspbian onto the four 32GB micro-SD cards that I had lying around. At some point I think I bought a 10-pack and with the OS only taking a bit more than 4GB, they should be more than adequate. If not, I guess I can supplement the space by plugging in USB storage. In fact, I’m sure that’s exactly what I’ll do as I move along.
The installation of Raspbian Linux is well enough documented on the Raspberry Pi site and elsewhere around the internet so I won’t address it much here. As always, Raspbian is delivered with UK specs as a standard. So the first thing to do is change the keyboard settings, otherwise lots of special keys like the “@” and “~” are in the wrong places! Once that’s done, I ran
and went through the rest of the configuration setup. Many of these settings can be changed from the Preferences panel or elsewhere later, but it’s nice to get them right in the first place.
In addition to this, once each Pi was up and running I ran rpi-update to bring the firmware up to the latest spec. This ensures they’re all pretty much the same in all respects. Not strictly necessary but never hurts. I then ran an apt-get update/upgrade on all of them to bring the software 100% current and correct for anything in the new firmware that the operating system might care about. That took the better part of a couple of hours, mostly sitting back and watching things run.
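For reference, the update sequence described above might be sketched as a small script. This is only an illustration: the helper names and the DRY_RUN switch are made up for this sketch (they are not Raspbian features), and DRY_RUN defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the per-Pi update sequence.
# Assumption: DRY_RUN, run and update_pi are hypothetical names for this example.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

update_pi() {
    run sudo rpi-update          # bring the firmware up to date
    run sudo apt-get update      # refresh the package lists
    run sudo apt-get -y upgrade  # upgrade the installed packages
}

update_pi
```

Setting DRY_RUN=0 runs the commands for real. rpi-update normally asks for a reboot before the new firmware takes effect, so a reboot between the firmware step and the package steps may be needed.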
Networking and SSH
I named the four Pis in the cluster rpi0 through rpi3 (this is done during initial configuration). I assigned each of them a dedicated address on my network, 192.168.1.200-203. To make it easier to address one from the other, I added the following to the /etc/hosts file on each Pi:
192.168.1.200 rpi0 rpi0.lan rpi0.local
192.168.1.201 rpi1 rpi1.lan rpi1.local
192.168.1.202 rpi2 rpi2.lan rpi2.local
192.168.1.203 rpi3 rpi3.lan rpi3.local
192.168.1.126 3nix 3nix.lan 3nix.local
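Since the Pi entries follow a simple pattern, they could also be generated rather than typed four times. A minimal sketch, assuming the addresses stay sequential starting at 192.168.1.200; the function name gen_hosts is hypothetical, and the 3nix line would still be added by hand.

```shell
#!/bin/sh
# Print /etc/hosts entries for a run of sequentially numbered Pis.
# Assumption: gen_hosts is a made-up helper, and each node's address
# is one higher than the previous node's.
gen_hosts() {
    prefix=$1   # network prefix, e.g. 192.168.1
    first=$2    # last octet of the first node, e.g. 200
    count=$3    # number of nodes
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s.%s rpi%s rpi%s.lan rpi%s.local\n' \
            "$prefix" "$((first + i))" "$i" "$i" "$i"
        i=$((i + 1))
    done
}

# Prints the four rpi lines; review them before adding them to /etc/hosts.
gen_hosts 192.168.1 200 4
```

Piping the output through "sudo tee -a /etc/hosts" on each Pi would append the lines, though checking them by eye first is probably wise.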
That final line is a small PC I made from spare parts that’s running Ubuntu Linux. I’m isolating this experiment from my work environment, but the little PC (4th generation Pentium with 8GB of RAM and a big disk) could make for an interesting member of the cluster at some point. Maybe use it as a controller and let all four Pis be nodes. But I’m probably getting ahead of myself. In the meantime, it’s just a convenient place from which to connect to the Pis via SSH.
SSH makes it possible to control one computer while logged into another. Login is simplified if you share RSA authentication keys, which let the computer you’re trying to control definitively identify you as authorized. This may be overkill in my little one-person tabletop cluster, but it is good practice and worth being familiar with.
I generated SSH keys on each of the five computers and shared each key to all the others. That makes it possible for me to SSH from any of them into any other depending on where I’ve got the monitor and keyboard connected right now. I am not sharing these keys with anything else on my network, so they’re relatively isolated. This is fairly easy. First I executed:
ssh-keygen -t rsa -C pi@<computer name>
That generates the key for each computer. After I generated keys on all of them (including the non-Pi computer), I ran
cat ~/.ssh/id_rsa.pub | ssh pi@<computer name> 'cat >> .ssh/authorized_keys'
using the name of every other computer in the cluster. So in my case, I ran it from each of the computers four times, once for each of the others. I should note that I’m using the standard “pi” user name on all the members of the cluster and it would need to change if you use something else. Note that you’ll have to type in the password for the computer you’re logging into just the one time. Once the key is shared, it’s no longer necessary.
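Running that command by hand four times on each of five machines adds up to twenty repetitions, so a loop is tempting. The following is a sketch only: the HOSTS list, the DRY_RUN switch, and the share_key function are assumptions made for this example, and it defaults to printing what it would do.

```shell
#!/bin/sh
# Sketch: push this machine's public key to every other computer in the cluster.
# Assumptions: HOSTS, DRY_RUN and share_key are hypothetical names; the "pi"
# user exists everywhere and ~/.ssh already exists on each target.
HOSTS=${HOSTS:-"rpi0 rpi1 rpi2 rpi3 3nix"}
DRY_RUN=${DRY_RUN:-1}

share_key() {
    me=$1   # name of the machine we are running on
    for host in $HOSTS; do
        [ "$host" = "$me" ] && continue   # no need to copy the key to ourselves
        if [ "$DRY_RUN" = "1" ]; then
            echo "would copy ~/.ssh/id_rsa.pub to pi@$host"
        else
            # Same command as above; asks for the password once per host.
            cat ~/.ssh/id_rsa.pub | ssh "pi@$host" 'cat >> .ssh/authorized_keys'
        fi
    done
}

share_key "$(uname -n)"
```

With DRY_RUN=0 the loop still prompts for each target's password once. OpenSSH also ships ssh-copy-id, which does a similar job and may be the simpler choice where it is installed.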
There’s more information about passwordless SSH access on the Raspberry Pi site, and while it isn’t specifically written with other systems in mind, it’s identical to the process on my Ubuntu-based computer.
Also, this is really overkill. In all likelihood I will only use either the PC or rpi0 as my interface points to the rest of the cluster.
So now I have five computers all talking to each other nicely, each of which allows you to take control of any other just by typing
ssh pi@<computer name>
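With keys in place, the same idea extends to running one command on every node without typing a password. A rough sketch, assuming the node names from the hosts file above; NODES, DRY_RUN, and on_all are hypothetical names, and the script only prints by default.

```shell
#!/bin/sh
# Sketch: run one command on every Pi in the cluster over SSH.
# Assumptions: NODES, DRY_RUN and on_all are made-up names for this example.
NODES=${NODES:-"rpi0 rpi1 rpi2 rpi3"}
DRY_RUN=${DRY_RUN:-1}

on_all() {
    for node in $NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run on $node: $*"
        else
            ssh "pi@$node" "$@"   # relies on the shared keys set up earlier
        fi
    done
}

on_all uptime
```

Setting DRY_RUN=0 makes each node report its own uptime in turn.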
Enough for one night. More software fun in the coming days.