Raspberry Pi Cluster Node – 10 More Advanced Connection Handling

This post builds on my previous posts in the Raspberry Pi Cluster series by improving the connection code so it won't crash when the master or slave disconnects.

Using Exceptions to handle socket issues

The improvement to the cluster code adds a custom exception to the communication code. It is raised whenever there is an issue with a socket, such as a client disconnecting.

Currently the master and slave will crash if there are communication issues. Without proper handling they won't be reliable in the cluster. We need to be sure that wherever communication can fail, the scripts handle it properly.

By having all communication code throw a standard exception, we can catch it at a higher level and handle it appropriately. All connection code then fails in the same way, by throwing the new exception, so the connection handling code only needs to catch that single exception.

Creating an exception in Python

To create a basic exception in Python you need to create a class that inherits from Exception or any subclass of it. I can create my new exception as follows.

class DisconnectionException(Exception):
    pass

This inherits from the base Python exception class and provides no further methods or properties. This can now be raised and caught like any standard Python exception.
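As a quick illustration of raising and catching the new exception, here is a minimal sketch. The check_connection and describe helpers are hypothetical and exist only for this example; they are not part of the cluster code.

```python
class DisconnectionException(Exception):
    """Raised when a socket fails or the other side disconnects."""


def check_connection(connected):
    # Hypothetical helper: raises the exception when the flag is False
    if not connected:
        raise DisconnectionException("Client has disconnected")
    return True


def describe(connected):
    # The custom exception is caught exactly like a built-in one
    try:
        check_connection(connected)
        return "still connected"
    except DisconnectionException as e:
        return "caught: " + str(e)


print(describe(True))
print(describe(False))
```

Because the class inherits from Exception, str(e) gives back the message passed in when it was raised.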

Improving the Data Packager to throw Disconnection Exceptions

Now that there is an exception to throw, I am going to update the Data Packager module to raise the newly created DisconnectionException.

Currently the slave and master use socket.send to send all messages. I am going to create a new method in the data packager to handle the connection issues.

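import socket

def send_message(clientsocket, payload):
    """Send payload, raising DisconnectionException if the socket fails."""
    try:
        # sendall keeps writing until the whole payload is sent; send may stop early
        clientsocket.sendall(payload)
        return True
    except socket.error:
        raise DisconnectionException("Failed to send message")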

With this new send_message method the master and slave can now use the same function to send data to each other. If there is an issue with the socket the DisconnectionException is raised. This will be handled by the method caller.
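To try this out without a real cluster, the sketch below uses socket.socketpair to create two connected sockets locally, standing in for the master and slave. It repeats the definitions so it runs on its own. The demo function is hypothetical, and the failure is simulated by closing the sending socket before the second send.

```python
import socket


class DisconnectionException(Exception):
    """Raised when a socket fails or the other side disconnects."""


def send_message(clientsocket, payload):
    try:
        clientsocket.sendall(payload)  # keeps writing until every byte is sent
        return True
    except socket.error:
        raise DisconnectionException("Failed to send message")


def demo():
    # socketpair returns two already-connected sockets
    master_side, slave_side = socket.socketpair()
    results = []
    try:
        results.append(send_message(slave_side, b"hello"))
        results.append(master_side.recv(512))
        slave_side.close()  # simulate a broken connection on the sending side
        send_message(slave_side, b"hello again")
    except DisconnectionException as e:
        results.append(str(e))
    finally:
        master_side.close()
        slave_side.close()
    return results


print(demo())
```

The caller only ever sees DisconnectionException, whatever the underlying socket error was.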

The get_message method now also wraps the receive call in a try block and throws the new exception if there is an issue.

try:
    data = clientsocket.recv(512)  # Get at most 512 bytes of data from the client
except socket.error:  # If we failed to get data, assume they have disconnected
    raise DisconnectionException("Failed to receive messages, client has disconnected")
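One case the try block alone does not cover: when the other side closes the connection cleanly, recv does not raise an error but returns an empty bytes object. Below is a minimal sketch of a receive helper that treats both cases as a disconnection. The name receive_data and the second error message are my own for this example, and whatever else the real get_message does with the data is left out.

```python
import socket


class DisconnectionException(Exception):
    """Raised when a socket fails or the other side disconnects."""


def receive_data(clientsocket, max_bytes=512):
    try:
        data = clientsocket.recv(max_bytes)
    except socket.error:
        raise DisconnectionException("Failed to receive messages, client has disconnected")
    if not data:
        # An empty result means the peer closed the connection cleanly
        raise DisconnectionException("Client closed the connection")
    return data


def demo():
    a, b = socket.socketpair()  # two locally connected sockets
    a.sendall(b"ping")
    first = receive_data(b)
    a.close()  # the peer goes away without sending anything else
    try:
        receive_data(b)
        second = "no exception"
    except DisconnectionException as e:
        second = str(e)
    finally:
        b.close()
    return first, second


print(demo())
```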

Modifying the slave and master to handle Disconnection Exceptions

Now that the message sending and receiving primitives throw these exceptions, the slave and master need to be modified to handle them.

For the slave we wrap the socket communication code with a try block, and then print out an error if an issue occurs.

try:
    logger.info("Sending an initial hello to master")
    # Message handling code described in previous tutorial 
except DisconnectionException as e:
    logger.info("Got disconnection exception with message: " + str(e))
    logger.info("Shutting down slave")

This simple implementation currently only prints out if there is an issue, but it ensures that the slave doesn’t crash out.

On the master, the disconnection is handled in the RpiClusterClient class. Here the run method wraps its inner code in a try/except block and again handles the disconnect gracefully by printing out a message.

try:
    # Message handling code described in previous tutorial
except DisconnectionException as e:
    logger.info("Got disconnection exception with message: " + str(e))
    logger.info("Shutting down slave connection handler")
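One possible extension, which is not part of the cluster code described here, is to close the socket in a finally block so the connection is released whether the handler finishes normally or through a disconnection. In this sketch handle_connection, handle_messages and the logger name are hypothetical stand-ins for the real run method and its message handling code.

```python
import logging
import socket

logger = logging.getLogger("cluster")  # hypothetical logger name


class DisconnectionException(Exception):
    """Raised when a socket fails or the other side disconnects."""


def handle_connection(clientsocket, handle_messages):
    """Run handle_messages(clientsocket); return True on a clean finish, False on disconnect."""
    try:
        handle_messages(clientsocket)
        return True
    except DisconnectionException as e:
        logger.info("Got disconnection exception with message: %s", e)
        logger.info("Shutting down slave connection handler")
        return False
    finally:
        # Runs on both paths, so the socket is always released
        clientsocket.close()


def failing_handler(sock):
    # Stand-in for message handling that hits a disconnection
    raise DisconnectionException("simulated disconnect")


a, b = socket.socketpair()
print(handle_connection(a, failing_handler))
b.close()
```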

Summary of changes to the Rpi Cluster

These changes add a send_message primitive that handles communication problems and update the slave and master code to use it. Because both sending and receiving are abstracted into the Data Packager, the master and slave can share the same code.

In addition, the slave and master now handle disconnections properly, so they will not crash when the other side disconnects. This is an important feature for a cluster, where hardware might fail.

In the next tutorial we will look at improving the slave so that if the master goes down, it will rejoin at the next opportunity.

The full code is available on GitHub. Any comments or questions can be raised there as issues or posted below.
