Earlier this week, I came across this question on softwareengineering.stackexchange.com. This has some relevance for me, since at work two of our main projects sit on opposite sides of this design question: one project is a more monolithic application, and the other follows more of a microservices model (i.e. many applications). There are reasons for each choice, which I will now attempt to explain.
Option 1: Monolithic Application
The first project I will describe is the one built around a more monolithic application. First of all, a brief overview of how this project works. Custom hardware (running Linux) collects information from sensors (both built-in and, often, third-party sensors over Modbus), then aggregates and classifies it. The classification follows from the nature of the project: sensor data falls into one of several classes (temperature, voltage, etc.). This information is saved periodically to a local database, and then synchronized with a custom website for viewing. The project, for the most part, can be split into these three main parts:
- Data collector
- Web viewer
- Local viewer (separate system, talks over Ethernet)
Due to the nature of the hardware, there is no web interface on the hardware itself; the web viewer is hosted on a cloud server.
Now, the data collector application is a mostly monolithic application. However, it is structured similarly to the Linux kernel in that we have specific ‘drivers’ that talk with different pieces of equipment. The core parts of the application don’t know what hardware they are talking to; they only talk to specific interfaces that we have defined.
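To make the driver idea a bit more concrete, here is a minimal C++ sketch of what such an interface could look like. Every name in it (`SensorDriver`, `Reading`, `Collector`, `FakeTemperatureDriver`) is made up for illustration and is not taken from the actual project; the only point is that the core polls drivers through one abstract interface.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sensor classes; the post mentions temperature and voltage.
enum class SensorClass { Temperature, Voltage };

// Hypothetical reading: which class it belongs to, plus the value.
struct Reading {
    SensorClass sensor_class;
    double value;
};

// Hypothetical driver interface. The core only ever sees this type.
class SensorDriver {
public:
    virtual ~SensorDriver() = default;
    virtual std::string name() const = 0;
    virtual std::vector<Reading> poll() = 0;
};

// Stand-in driver that returns a fixed value instead of touching hardware.
class FakeTemperatureDriver : public SensorDriver {
public:
    std::string name() const override { return "fake-temperature"; }
    std::vector<Reading> poll() override {
        return {Reading{SensorClass::Temperature, 21.5}};
    }
};

// Core of the collector: polls every registered driver without knowing
// what equipment sits behind it.
class Collector {
public:
    void add_driver(std::unique_ptr<SensorDriver> driver) {
        drivers_.push_back(std::move(driver));
    }

    std::vector<Reading> collect() {
        std::vector<Reading> all;
        for (auto& driver : drivers_) {
            const std::vector<Reading> readings = driver->poll();
            all.insert(all.end(), readings.begin(), readings.end());
        }
        return all;
    }

private:
    std::vector<std::unique_ptr<SensorDriver>> drivers_;
};

// Example wiring: the collector is handed a driver and simply polls it.
inline void driver_example() {
    Collector collector;
    collector.add_driver(std::make_unique<FakeTemperatureDriver>());
    const std::vector<Reading> readings = collector.collect();
    assert(readings.size() == 1);
    assert(readings[0].sensor_class == SensorClass::Temperature);
}
```

Supporting a new piece of equipment in a layout like this means writing another `SensorDriver` subclass; `Collector` does not change.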
In this case, why did we choose to go with a monolithic application? Well, there are a few reasons and advantages.
Reason 1: As primarily a data collector device, there’s no real need to have different applications send data to each other.
Reason 2: The development of the system is much easier, since you don’t need to debug interactions between different programs.
Reason 3: Following from the first two, we often need to talk with multiple devices on the same serial link using Modbus. This traffic has to be funneled through a single point of entry to avoid contention on the bus, since you can only have one Modbus message in flight at a time.
Reason 4: All of the data comes in on one processor, so there is no need to talk with another processor. Note that this is not the same as talking with other devices.
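As a rough illustration of the single point of entry from Reason 3, the sketch below serializes every request/response exchange behind one mutex. `BusGateway`, `Frame`, and the `transport` callback are hypothetical names; the callback stands in for whatever code would actually write to and read from the serial port, and no real Modbus library is used here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Raw bytes of one request or reply (hypothetical alias).
using Frame = std::vector<std::uint8_t>;

// Hypothetical gateway that owns the serial link. Every caller goes
// through transact(), so only one exchange is in flight at a time.
class BusGateway {
public:
    // `transport` is a placeholder for the code that sends a request
    // and waits for the matching reply.
    explicit BusGateway(std::function<Frame(const Frame&)> transport)
        : transport_(std::move(transport)) {}

    // Blocks until the bus is free, then performs one full exchange.
    Frame transact(const Frame& request) {
        std::lock_guard<std::mutex> lock(bus_mutex_);
        return transport_(request);
    }

private:
    std::mutex bus_mutex_;
    std::function<Frame(const Frame&)> transport_;
};

// Example with a fake transport that echoes the request plus one byte.
inline void bus_gateway_example() {
    BusGateway gateway([](const Frame& request) {
        Frame reply = request;
        reply.push_back(0xFF);
        return reply;
    });
    const Frame reply = gateway.transact(Frame{0x01, 0x03});
    assert(reply.size() == 3);
    assert(reply.back() == 0xFF);
}
```

Inside one process this is just a lock around a function call. With several separate applications, the same guarantee would need some form of inter-process arbitration for the port.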
Reason 5: It’s a lot simpler to pass data around and think about it conceptually when it is all in the same process.
Now that we have some reasons, what are some disadvantages to this scheme?
Disadvantage 1: Bugs. Since our application is in C++ (the ability to use C libraries is important), a single segfault can crash the entire application.
Disadvantage 2: The build can take a long time. Incremental builds and linking aren’t bad, but a clean build can take a few minutes. A test build on Jenkins takes more than 10 minutes, and it can still take several minutes to compile on a dev machine if you don’t run make in parallel.
Overall, the disadvantages are not show-stoppers (except for number 1: there is some bad memory management happening somewhere, but I haven’t figured out where yet). Splitting the project into three basic parts (data collection, local GUI, web GUI) gives us a good separation of concerns. We do blend in a little bit of option 2 by running more than one application, but that is to keep certain core functionality working even if the main application is down. Specifically, a separate application talks with our local cell modem. Given that the data collection hardware may not be easily accessible, it is important that cellular communications are insulated from bugs in our main application.
Option 2: Multiple Applications
If you don’t want to make a monolithic application, you may decide to build many small applications instead. One of my other primary projects uses this approach, and the reason is the nature of the hardware and how things need to interact.
In our project with multiple applications, we have both multiple compute units and very disparate sensor readings that we are taking in. Unlike the monolithic application, where data is easily classified into categories, this project takes in many different kinds of data that don’t fit a small set of classes. This data can come in on any processor, so there is no ‘master’ application per se. This data also needs to be replicated to all displays, which may (or may not) be smart displays. We also want to insulate ourselves from failure in any one application. A single bug should not take down the entire system.
To handle this, we essentially have a common data bus that connects all of the processors together. We don’t use RabbitMQ, but the concept is similar to its federation plugin, in that you can publish a message on any processor and it will be replicated to all connected processors. This makes adding new processors extremely easy. All of the data flow basically follows a producer/consumer model.
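The publish-and-replicate idea can be sketched in a few lines of C++. This is only an in-process toy, not how the real bus works: `BusNode` is a hypothetical stand-in for one processor’s endpoint, topics and payloads are plain strings, and replication only goes one hop, so it assumes every node is directly connected to every other node.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the bus endpoint on one processor.
class BusNode {
public:
    using Handler = std::function<void(const std::string& payload)>;

    // Register a consumer for a topic on this node.
    void subscribe(const std::string& topic, Handler handler) {
        handlers_[topic].push_back(std::move(handler));
    }

    // Link two nodes so a message published on either reaches both.
    static void connect(BusNode& a, BusNode& b) {
        a.peers_.push_back(&b);
        b.peers_.push_back(&a);
    }

    // Deliver locally, then replicate to directly connected peers.
    void publish(const std::string& topic, const std::string& payload) {
        deliver(topic, payload);
        for (BusNode* peer : peers_) {
            peer->deliver(topic, payload);
        }
    }

private:
    void deliver(const std::string& topic, const std::string& payload) {
        const auto it = handlers_.find(topic);
        if (it == handlers_.end()) {
            return;  // nobody on this node consumes the topic
        }
        for (const Handler& handler : it->second) {
            handler(payload);
        }
    }

    std::map<std::string, std::vector<Handler>> handlers_;
    std::vector<BusNode*> peers_;
};

// Example: a producer on one node, a consumer on another.
inline void bus_node_example() {
    BusNode sensor_node;
    BusNode display_node;
    BusNode::connect(sensor_node, display_node);

    std::vector<std::string> shown;
    display_node.subscribe("temperature", [&shown](const std::string& payload) {
        shown.push_back(payload);
    });

    sensor_node.publish("temperature", "21.5");
    assert(shown.size() == 1);
    assert(shown[0] == "21.5");
}
```

The producer never names the consumer. It only publishes on a topic, which is why adding another consumer on another node needs no change on the producing side.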
Advantage 1: Program resiliency. With multiple applications running, a bug in one application will not cause the others to exit.
Advantage 2: You can easily add more processors. This is not really a problem for us, but since data is automatically synchronized between processors, adding a new consumer of data becomes very simple.
Advantage 3: Data can come and go from any connected system; you need not know in advance which processor is giving out information.
This design is not without some caveats though.
Disadvantage 1: Debugging becomes much harder. Since you can have more than one processor in the system, your producer and your consumer can be on different processors, or you could have multiple consumers.
Disadvantage 2: Because this is a producer/consumer system (it’s the only way that I can see to scale effectively), there’s no way to get data directly from an application (i.e. there is no easy way to make a remote procedure call over the network).
There are two very different use cases for these two designs. From my experience, here’s a quick rundown:
Monolithic application:
- Generally easier to develop, since you don’t have to figure out program<->program interactions
- Often needed if you need to control access to a resource (e.g. physical serial port)
- Works best if you only have to run on one computer at a time
Multiple applications:
- Harder to develop due to program<->program interactions
- Better at scaling across multiple computers
- Individual applications are generally simpler
Due to the nature of engineering, there’s no one way to do this that is best. There are often multiple ways to solve a given problem, and very rarely is one of them unequivocally the best solution.