Apr 30, 2020
 

For some reason, I've been seeing a lot of people chiming in and predicting that monolithic software architecture is going to make a comeback starting this year. It probably doesn't help that once I started reading the first article about companies moving from services back to monoliths, Google kept highlighting more, but I think the impetus for all this was Kelsey Hightower's unpopular opinion segment arguing that most companies that switch to service-oriented architectures end up creating "distributed monoliths" that are the worst of both worlds. By the way, you've likely noticed that I'm using the term "service-oriented architecture" (OK, "SOA", because I'm lazy) for both service-oriented architectures and microservices. That's just to make my life easier; as far as I'm concerned, a microservice is just an SOA service with a very small scope.

So why would anyone still want to build a monolith?

The idea behind monolithic architectures is that they're simpler, and better suited for situations where you want all your developers to be able to work effectively on all parts of your software. Another idea is that the code is easier to understand, because you only have one codebase to wrap your head around. That one codebase also makes debugging, bugfixing, and testing easier: you're not simulating or mocking a bunch of other services, and you're not syncing service versions in a testing environment, because there's only one codebase to change and push.
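To make that concrete, here's a minimal sketch (the function and test are made up for illustration) of what testing looks like when the thing you depend on is just another function in the same codebase instead of a separate service:

```python
# In a monolith, "calling the billing component" is just a function call,
# so a test exercises the real code path directly -- no HTTP mocks, no
# coordinating deployed service versions. (All names here are hypothetical.)

def calculate_invoice_total(line_items: list) -> float:
    """Lives in the same codebase as the checkout code that calls it."""
    return sum(item["unit_price"] * item["quantity"] for item in line_items)

def test_calculate_invoice_total():
    items = [
        {"unit_price": 10.0, "quantity": 2},
        {"unit_price": 5.0, "quantity": 1},
    ]
    assert calculate_invoice_total(items) == 25.0

if __name__ == "__main__":
    test_calculate_invoice_total()
    print("ok")
```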

Of course, if your application gets big enough, then there are still going to be parts of the codebase that people aren't familiar with simply because they don't spend any time in them, effectively creating delineated segments of the codebase that, for all intents and purposes, have different maintainers (you know, the types of things that get broken out into services later). With a monolith, the theory is that any developer on the team can update any part of the whole application codebase (as opposed to a service-oriented approach, where developers only really know the codebase of the service they work on). The fact of the matter is, there's a limit to how much code any person can develop enough expertise in to be considered an…well, expert. Once you get past that point, you're going to end up breaking up parts of your monolith and divvying the work out to different teams. Whether or not you decide to break those parts out into separate deployments, or even formal services, is up to you.

Big enough monoliths (e.g. monoliths so big you're divvying different sections of the code out to different teams, see above) can sometimes mean that you now have to coordinate between multiple teams to implement new features, but that's an issue you run into with service-oriented architectures too. Another potential concern is coordinating between the different teams working on your monolith for each release to make sure they're ready to go. The scenario here is that a team has finished a few stories towards a new feature or epic but isn't completely done, so the feature isn't ready to ship, and pushing it to production would actually be a problem. This can be solved with good source code discipline, but it's something you're going to have to enforce on the project.
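To make that a little more concrete, one common form of that discipline is feature flags: unfinished work can merge and even deploy, but stays dark until every team's pieces are done. Here's a minimal sketch, with hypothetical flag and function names, of what that guard looks like:

```python
import os

def flag_enabled(name: str) -> bool:
    # Flags are read from environment variables here for simplicity; a real
    # system might use a config service or database. (The flag name used
    # below is hypothetical.)
    return os.environ.get(f"FEATURE_{name.upper()}", "off") == "on"

def legacy_checkout_flow(cart: list) -> str:
    return f"charged {len(cart)} items via the legacy flow"

def new_checkout_flow(cart: list) -> str:
    return f"charged {len(cart)} items via the new flow"

def checkout(cart: list) -> str:
    # Half-finished work can ship to production but stays dark until the
    # flag is flipped, so a release never has to wait on an unfinished epic.
    if flag_enabled("NEW_CHECKOUT_FLOW"):
        return new_checkout_flow(cart)
    return legacy_checkout_flow(cart)

if __name__ == "__main__":
    print(checkout(["book", "mug"]))  # legacy flow unless the flag is on
```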

One of the biggest arguments people make against monolithic architectures is that a problem in one part of your monolith impacts every other part of your application. That's true, at least when the problem crashes the server running your application or starves it of resources. The big complaint is that if a monolithic application goes down, everything is down, regardless of where in the monolith the problem originated. As easy and glib as it is to say that you should make sure your application never goes down, the fact is that everyone has outages, and a monolithic application's biggest weakness is that when it comes to uptime, it's all or nothing. Its second-biggest weakness is that when it comes to scaling, it's also all or nothing.

As badly as they're maligned, there's nothing inherently wrong with monoliths, particularly for projects with a small number of developers or organizations that only have one development team. Probably the best-known monolithic application running right now is StackOverflow, which handles heavy traffic with little to no downtime and is regularly updated. It's a very successful software application (there's a good bit of grumbling about how it's moderated, but that has nothing to do with its architecture or the underlying software). So while there are a lot of people who swear by service-oriented architectures, and a lot to be said for them, don't discount monoliths out of hand.

So what’s so special about service-oriented architectures anyways?

One of the biggest advantages of service-oriented architectures is that if you have multiple teams working on loosely-related portions of your application, you can break things up into different codebases and let teams work independently with minimal need for coordination. Now you don't have to worry about having a process that prevents partially-completed work from going out when you deploy your updates, nor do you need to coordinate with anyone other than your own team on when to go to production (I'm assuming you have proper, cross-functional teams for these services, including operations). Because you're deploying these components independently, you can also scale them independently (as opposed to monoliths, where you have to scale everything).

One of the nice things about a service-oriented architecture is that it lets you break your codebase into logically-grouped repositories. This generally leads to any individual repository being smaller and easier for new developers to understand than a monolith codebase. Everything else is explicitly abstracted away in black-box fashion: "call this API." The downside to this approach is that service teams not only need to write the service, but if anyone else is going to use it (and if not, why are you writing this service again?), they also have to write and maintain an API (which is not easy to do well), write and maintain documentation for that API, and, to be really useful to other developers, provide at least one SDK (for each lingua franca in your company). Oh, and you'll have to at least become familiar with the APIs of every other service that your service interacts with, as well as keep an eye and ear out for any changes to those APIs. In other words, service-oriented architectures don't save you manpower, and they don't save you work either.
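As a rough illustration of what that SDK obligation looks like (the service, endpoint, and field names here are all invented), a minimal Python client for a hypothetical product service might be something like this:

```python
import json
import urllib.request

class ProductClient:
    """Thin SDK around a hypothetical product service's REST API.

    The service team has to write, version, and document this on top of
    the service itself -- the work doesn't disappear, it just moves.
    """

    def __init__(self, base_url: str, timeout: float = 2.0):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout

    def get_product(self, product_id: str) -> dict:
        url = f"{self.base_url}/v1/products/{product_id}"
        with urllib.request.urlopen(url, timeout=self.timeout) as resp:
            return json.load(resp)

# Usage (assumes a service is actually listening at this address):
#   client = ProductClient("http://products.internal:8080")
#   print(client.get_product("sku-123")["display_name"])
```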

One of the biggest selling points of service-oriented architectures is the line that a failure in one service won't impact the others. That's true…to a point. The part where a service crashing won't take any other services down with it is true. However, just because you broke your code into services doesn't mean that there's no dependency whatsoever between any given service and every other service your organization is running. At some point, something is going to need to call out to a downed service, and when it does, that calling service will be impacted (you're checking for this, right?). Service-oriented architectures reduce the blast radius of a catastrophic failure; they don't eliminate it. Now, the nice thing about service-oriented architectures is that by treating that service dependency as an external dependency (there shouldn't be any difference between calling another service built and run by your company vs. a service built and run by another company), it's psychologically easier for us to remember to wrap the calls in proper error handling and fallback logic. Ideally, that makes your software more resilient to failure overall, assuming you're not in the habit of blindly trusting external code running somewhere over the network.
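Here's a hedged sketch of what that looks like in code; the endpoint, timeout, and fallback values are invented for illustration, but the shape (bounded timeout, explicit error handling, degraded fallback) is the point:

```python
import json
import urllib.request

# What we render if the user service is down or slow -- a degraded answer
# instead of a cascading failure. (All names and values are hypothetical.)
FALLBACK_PROFILE = {"display_name": "(unavailable)", "avatar_url": None}

def fetch_user_profile(user_id: str, base_url: str) -> dict:
    """Call the user service the way you'd call a third party: with a
    bounded timeout, explicit error handling, and a fallback."""
    url = f"{base_url}/v1/users/{user_id}"
    try:
        with urllib.request.urlopen(url, timeout=1.0) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts (URLError is a
        # subclass); ValueError covers malformed JSON in the response.
        return FALLBACK_PROFILE
```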

One of the downsides of using services is that you can end up duplicating data. This seems to fly in the face of service-oriented architecture. After all, each service should have its own database and be the canonical source of truth for its data. Other services shouldn't be duplicating that data; they should be keeping IDs so they can efficiently query the host service for the details. But because we're talking about an external (to us) service, we have to assume that it's unreliable. That means we need more than IDs, so that if the service is down (or very slow), we have enough data in our own database to at least load the basic information in a timely manner, even if the user can't then drill down for more details. This means storing your own copy of things like a user-friendly name, and anything else that needs to appear on the screen when the user first references an object returned by your service. If the underlying data changes in the original service (for example, someone renames an item in a product service for an e-commerce site), your local copy of the data is going to be out of sync. To fix that, you'll need some sort of mechanism for picking up on these changes: either querying for updates in the background and refreshing the UI when (or if) they come, or listening for change events so that your service can update itself behind the scenes.
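Here's a minimal sketch of that pattern; the IDs, field names, and event shape are all hypothetical (in practice the events might arrive over a message bus), but it shows the two halves: render from the local copy, and apply change events to drift back into sync:

```python
# An order service keeps a denormalized copy of the product fields it needs
# to render a screen, so a downed product service doesn't blank the UI.
local_order_items = {
    # order_item_id -> cached display data plus the ID for drill-downs
    "item-1": {"product_id": "sku-123", "display_name": "Blue Mug"},
}

def render_order_item(item_id: str) -> str:
    # Renders from the local copy even if the product service is down.
    item = local_order_items[item_id]
    return f"{item['display_name']} (product {item['product_id']})"

def on_product_renamed(event: dict) -> None:
    """Apply a (hypothetical) rename event from the product service,
    refreshing any cached copies so they come back into sync."""
    for item in local_order_items.values():
        if item["product_id"] == event["product_id"]:
            item["display_name"] = event["new_name"]

if __name__ == "__main__":
    print(render_order_item("item-1"))  # Blue Mug
    on_product_renamed({"product_id": "sku-123", "new_name": "Azure Mug"})
    print(render_order_item("item-1"))  # Azure Mug
```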

By the way, all these services lead to a much more complicated architecture for what the user considers to be the application they're using than a monolith would have. That's to be expected: by breaking things up into services, you're adding more pieces. That complexity also extends to local development. For all the hype about services being self-contained, at some point they need to talk with other services (you know, in order to provide a coherent user experience). That means that to get code running on your machine, you'll likely need either local copies of the other services, or at least a shared set of development services that you can point your local environment to for all remote calls.

Docker helps here, as you can set up a docker-compose file that runs all your ancillary or dependent services in separate containers that your service can ping (treating them as remote, even though they're on your machine), grouping the dependent services together with the code being developed. You're still dependent on other teams to have up-to-date Docker images, but that's a problem that can be solved by your build machine. The key point is that you need to be able to easily connect your development environment to any upstream services it interacts with in order to do any real development.
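One small habit that makes this workable no matter where the dependencies live (the variable names here are made up): resolve every upstream service's address from configuration, so the same code can point at docker-compose containers locally, shared development services, or production without changes:

```python
import os

# Defaults point at ports published by a local docker-compose setup; a
# shared dev environment just overrides the environment variables.
# (Service names and ports are hypothetical.)
DEFAULTS = {
    "PRODUCT_SERVICE_URL": "http://localhost:8081",
    "USER_SERVICE_URL": "http://localhost:8082",
}

def service_url(name: str) -> str:
    return os.environ.get(name, DEFAULTS[name])

# Locally: use the defaults above.
# Against shared dev services:
#   export PRODUCT_SERVICE_URL=http://products.dev.internal:8080
if __name__ == "__main__":
    print("product service ->", service_url("PRODUCT_SERVICE_URL"))
```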

So what architecture should I be using for all of my projects again?

So now we get to the root of the issue: which architecture should you commit yourself to in perpetuity? DHH swears by monoliths, but companies like Netflix and Google do services, and they say it lets them innovate. Well, first and foremost, you're choosing an architecture for a specific project, not marrying someone, so let go of the idea of a lifetime commitment.

Look at how all the people who are going to build and maintain this endeavor are organized. Do you have a bunch of teams that are intended to operate largely independently, with different priorities and schedules, that can benefit from separate release cadences? If so, do you have the manpower to support building, running, and monitoring multiple applications (QA, documentation, and operations, ideally all dedicated to individual teams, not separate teams handling their respective work for all the independent development teams)? If so, do your developers have the discipline needed to treat their code as APIs first, including being willing and able to deploy and support multiple versions until other teams can upgrade (at their own pace, since they have their own schedules and priorities), to guarantee thorough and complete documentation, and to provide tools for other teams to call their API? Then a service-oriented approach is probably a good fit for you.

If that didn't describe your organization, then consider this. Do you have one team that's responsible for all the moving parts of your software? Does your project not break out into logical components that would best be served by treating them as entirely different products handled by entirely different teams? Do you neither need nor want to provide an API for your application to other developers (regardless of whether or not they work for the same company you do)? Then what you want is a monolith.

Odds are neither of the scenarios I described sounds like your organization, which means a purist approach isn't the best way. DHH tweeted about an update to the Majestic Monolith, which he referred to as the "citadel architecture," I'm guessing as a play on his old "Majestic Monolith" post. Personally, I don't think that's a particularly good description, so I'm going to refer to it as "solar system architecture." In the solar system architecture, you have a "sun" that's the center of everything. Odds are it's the initial monolith that your company started with (the only people who started with service-oriented architectures are the new kids on the block). If it is a service, that service had better be the beating heart of what makes your company, your company. Whatever you're defining as your "sun," it should be the most important part of your business because, as you may have guessed, everything else revolves around it.

So what about the rest of the solar system? After all, ours has planets, asteroids, comets, moons, and all sorts of other stuff. Well, those would be everything else you deploy. Again, those other deployments should "revolve" around whatever you've defined as your "sun." In practical terms, that means you shouldn't put anything in production that doesn't make the core of your business better. This could range from tools that make development and deployment easier, to components that are deployed separately (because they need to be able to scale, or fail, independently of your main application) but aren't actual independent services, to full-fledged services (complete with their own cross-functional teams, APIs, and everything). Yes, that means I'm saying it's possible to have both a monolith and services, living together in harmony. That's because what matters isn't architectural purity, but picking a strategy that your team is suited to execute, and that fits the business and technical requirements.

When it comes to monoliths vs. service-oriented architectures, it's not so much a question of which is technically superior as it is a question of which best matches your organizational structure. Almost every complaint about the quality of either architecture is best handled through better software engineering discipline rather than completely changing how your code is organized and deployed. Once you have that discipline in place, you can identify places where it makes sense to break parts of the code out into separate deployables, or to combine pieces into one more coherent deployment. Whatever architecture you pick, make sure you're not using it as a substitute for good software development practices, and don't fall into the trap of thinking that you have to go "all-in" on one or the other.
