The Rise of the Monorepo

Tags: development

One Repository to Rule Them All

The power of the One Ring in Tolkien’s The Lord of the Rings trilogy was so potent that most are assumed to be vulnerable to its corrupting influence simply by touching it or being in its presence. That’s what it feels like with monorepo technology.

I’m fascinated and tempted by the allure of the monorepo: one place where all the code lives for the interconnected systems you’re building. One place where deployment and automation are spooled out and orchestrated. One place to collaboratively build and develop your new product.

Recently, I worked on a project that met with a lot of success in integrating its frontend and backend repositories into a larger monorepository. I also ran into some problems that you might be able to avoid in your own projects.

Is it as powerful as a dark lord’s essence? Is it as useful as it’s cracked up to be?

The Risk of Power

One of the defining themes of The Lord of the Rings is the corrupting nature of power. Peter Jackson leans harder into this theme than Tolkien may have, but the core point that I take out of it is that power has a destructive element to it.

The saying goes

Power corrupts. Absolute power corrupts absolutely.

It’s no different with monorepos.

With the monorepo, searching for interconnected systems, confirming code contracts, and building off of established interfaces is powerful and easy. Most IDEs will pick up on the declarations and definitions within the monorepo, and most workstations have the horsepower to deal with the mass of information.

In reality, the addition of all the power consolidated into one repository is a risk, and some specific pitfalls tend to be more apparent when using the monorepo.

I’ll list some of the pain points that a monorepo will exacerbate.

Mournful now and slow: DOOM!

With any large project, the speed that an individual developer can move at (safely) is reduced. Monorepositories that grow and grow must also slow.

This isn’t obvious from the architecture at the outset. In fact, monorepositories often speed up the immediate delivery and coordination for complex projects. But as more and more components are added to the project, or, as inevitably happens, the project adds more and more features, that early speed fades. Even repositories that stand on their own slow down as they age.

Monorepos amplify the growing pains, as they often include many teams or many projects. Now, with each pull request, build, or bugfix, there’s the potential for breakdowns to occur between the components. Mature software teams will often handle this with a rigorous code review process, but if a team doesn’t have one in place there’s more chance for the changes introduced to cause unforeseen consequences. More often, as large systems grow and outgrow the teams that originally built them, the problems compound and you slow down. Either you’re fixing problems, or trying desperately not to retread old code.

Monorepos are not the source of this problem.

This happens whenever you have multiple projects and codebases that need to work together. I’ll call it the interface problem, but I’m sure that other authors have more descriptive names.

One Place To Find Them

I think the only problem that monorepos actually exacerbate of their own accord is finding things. I remember working on large code repositories, and back in the day grepping for a declaration (if your IDE couldn’t figure out where the code was coming from) was not always a viable option. Some codebases span hundreds (or thousands) of files, and mature ones span years (sometimes decades!). It is unrealistic for one single developer to know all of the models, methods, and agreed shorthands that may have gone in and out of style as team members, trends, and the needs of the customer ebb and flow. In monorepos, and especially in monorepos that span a large organization, the hidden, non-code information that’s sometimes invaluable will be impossible to find unless your group or organization also takes the time to organize and write helpful documentation.

With a monorepo, the difficulty of finding the things you need to get a job done, and especially of finding the right things, goes up.

The problem of information hunting isn’t unique to the monorepo, or even a consequence of it, but piling everything together makes it easy to lose helpful and necessary information.

Forged In the Fire - Slowly

Creating new things in software is often easy, quick, and “painless.”

A new problem arises that your software does not currently handle… What to do but make another piece of software? Repeat forever.

When a team takes this approach, you’ll end up with a confused mess of tools and domains, many of them crossing over and intersecting. In more disciplined teams this tendency is nipped in the bud and instead features or fixes are applied appropriately, and new projects are only opened when the necessary resources are available to support those projects.

In monorepos, the decision to make new stuff can be muddled and confused. When a team is tasked with a new purpose and a new challenge, the question of where such a tool should go often isn’t obvious. Unless you’re also taking the microservices route (which I’m not sure about, YMMV), it gets difficult to know whether to make something new or upgrade something existing.

When mistakes are made, they compound and the ripples may reach all ends of the project. Both making a new project when it should have been joined to another, and grafting on an unwanted extension bring their own problems.

Unfortunately, mistakes are often only known in hindsight.

If I must make a recommendation: when a new feature could plausibly be its own project, it’s often best to keep things split into smaller sub-projects. The big opportunity that monorepos open up is coordinated builds and releases. It’s usually simpler and easier to know how to coordinate and release co-dependent pieces of software. Unfortunately, the monorepo does not relieve the pain of removing pieces that should not have been set together.

Corruption and Immense weight

This article talks a bit about the unavoidable downsides of monorepos (if your code base gets as huge as Twitter’s). There’s also a downside that when monorepos get too big, then you can’t actually hold the entire repo on your computer!

We also run into some rebuttals of the potential upsides. Personally, I think the benefits of monorepos can be worth the time and resource investment, and the benefits for small to medium projects are substantial. Plus, Airbnb apparently does a duorepo, where portions of the backend and frontend are in their own sub-monorepos, which cuts down on the over-coupling. It should also be said that Conway’s Law is probably bidirectional; that is, implementing a flat monorepo should encourage a flat organization, while implementing a multirepo structure would encourage many divisions and sub-projects.

Whether this is a good thing is up to the reader.

Rebuttals aside, immense monorepos have serious downsides: working with a massive amount of code and running tests and builds for larger and larger systems often becomes unnecessarily cumbersome. If you have this problem, though, you’d better also be making enough money to employ some of the smarter people in the industry. It seems like only giants such as Facebook and Google really run into this problem.

If you’re the next Facebook or Google then watch out, I guess.
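
For the rest of us, one common way to keep tests and builds from ballooning is to only re-run them for the packages a change can actually reach. Here’s a minimal sketch of that idea, not how any particular tool implements it. The `PackageGraph` shape, the `affectedPackages` function, and the package names are all hypothetical, made up for illustration.

```typescript
// Hypothetical shape: each package name maps to the packages it depends on.
type PackageGraph = Record<string, string[]>;

// Returns the changed packages plus everything that depends on them,
// directly or transitively. Those are the packages worth re-testing.
function affectedPackages(graph: PackageGraph, changed: string[]): Set<string> {
  // Invert the graph so we can walk from a dependency to its dependents.
  const dependents = new Map<string, string[]>();
  for (const [pkg, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      const list = dependents.get(dep) ?? [];
      list.push(pkg);
      dependents.set(dep, list);
    }
  }

  // Breadth-first walk outward from the changed packages.
  const affected = new Set<string>(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return affected;
}

// Made-up example repository.
const exampleRepo: PackageGraph = {
  "shared-types": [],
  backend: ["shared-types"],
  frontend: ["shared-types"],
  docs: [],
};
```

With this example, a change to `shared-types` marks `backend` and `frontend` as affected but leaves `docs` alone, so its tests can be skipped.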

A Power Beautiful and Terrible

It can’t be overstated that the monorepo is a powerful organization and collaboration technique. Google was one of the (popular) pioneers of the practice, and it finds co-practitioners in Twitter, Airbnb (in a way), and Uber.

Semaphore wrote a great summary of monorepos and how some of these companies organize themselves using this technique.

Speed

One of the huge benefits of the monorepo is a shared library of solid components. If you’ve worked on large web projects, this can be very helpful because you usually don’t want to be rewriting the same simple components, especially if you’re trying to present a consistent facade for your brand or company.

This article on Microsoft’s blog has a case study which talks in-depth about the speed and consistency that resulted from using the monorepository.

Furthermore, when launching new applications it was noted that they “started to onboard sooner than expected.” This case study also made use of GraphQL and NodeJS microservices to support these quickly spun-up applications.

Yet another benefit to the monorepo method is reuse of deployment and build tools. If you’re making a new web app and your monorepo already contains at least one other app that uses a CI/CD script to get up and running… well, then you can do a little copy+paste to get your new web app running. Some even more advanced deployment systems (e.g. Terraform-style infrastructure-as-code systems) make the deployment even simpler.

Cooperation

One of the first applications that I moved to the monorepo style was a TypeScript project with both a frontend and backend. The benefits were more than I could have expected, and the immediate collaboration of front and backend code was something I’d never personally set up.

Traditionally, I would have the backend code as a repository that’s running off of NodeJS. Simultaneously I’d make a React frontend app. Both are TypeScript-based and both use Prettier. In all ways but the actual execution of the JavaScript, the code is pretty similar. The trickiest part in organizing this dance is making sure that the frontend and the backend will talk to each other politely. It’s not the hardest thing in the world, but making the API contract and maintaining the code that agrees with that backend isn’t 0 work.

Now, with my newfound power of unification, both frontend and backend share the same code. They agree automatically on the types that should be emitted and accepted. They both agree on the types that make up the JSON payloads going in and out of the API and the frontend.
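
Here’s a minimal sketch of what that agreement can look like. It is not the code from my project: the `CreateUserRequest` and `CreateUserResponse` types, the handler, and the in-memory round trip are all hypothetical, and in a real monorepo the shared types would live in their own module that both sides import.

```typescript
// Hypothetical shared contract, e.g. a shared types module in the monorepo.
interface CreateUserRequest {
  name: string;
  email: string;
}

interface CreateUserResponse {
  id: number;
  name: string;
  email: string;
}

// Backend side: the handler must return the shared response type,
// so changing the contract breaks the build here first.
let nextId = 1;
function handleCreateUser(body: CreateUserRequest): CreateUserResponse {
  return { id: nextId++, name: body.name, email: body.email };
}

// Frontend side: consumes the very same type, so a renamed or removed
// field is a compile error rather than a runtime surprise.
function describeUser(user: CreateUserResponse): string {
  return `#${user.id} ${user.name} <${user.email}>`;
}

// Simulated round trip: the payload crosses the "wire" as JSON.
function roundTrip(request: CreateUserRequest): string {
  const wire = JSON.stringify(handleCreateUser(request));
  const parsed = JSON.parse(wire) as CreateUserResponse;
  return describeUser(parsed);
}
```

Note that the cast after `JSON.parse` is only a compile-time promise. The types keep the two sides in sync with each other, but they don’t validate what actually arrives at runtime.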

As the project continued, it became easier and easier to add more functions using the cooperative powers of the monorepo. Furthermore, when API changes were published, the frontend would ship nearly simultaneously. More importantly, I would know that it would work (at least at the interface level; I still wrote bugs).

This level of magical cooperation wasn’t something that I’d played with before, and it’s a great feeling.

Consistency

Another benefit is all-in-one testing and tooling.

I’ve become accustomed to a certain style of programming that’s served me well. While setting up new projects was never a problem for me, it does tend to drag. There are templating systems like Hygen that speed this up, but it’s also nice to only have to do it once for a multitude of projects.

With a monorepo, setting up linting, formatting, and integration testing between modules is pretty easy. There are tools like Nx, Bazel, and Turbo that help with coordinating these activities, but for the JavaScript ecosystem they’re not strictly necessary.
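
To give a feel for the kind of coordination involved, here’s a minimal sketch that works out an order to build or test packages so that each one comes after its dependencies. It’s a plain depth-first topological sort, not how Nx, Bazel, or Turbo are actually implemented, and the `DependencyGraph` type, the `buildOrder` function, and the package names are hypothetical.

```typescript
// Hypothetical shape: each package name maps to the packages it depends on.
type DependencyGraph = Record<string, string[]>;

// Returns the packages in an order where dependencies come first.
// Throws if the packages depend on each other in a cycle.
function buildOrder(graph: DependencyGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (pkg: string): void => {
    const current = state.get(pkg);
    if (current === "done") return;
    if (current === "visiting") {
      throw new Error(`Dependency cycle detected at ${pkg}`);
    }
    state.set(pkg, "visiting");
    // Visit dependencies first so they land earlier in the order.
    for (const dep of graph[pkg] ?? []) visit(dep);
    state.set(pkg, "done");
    order.push(pkg);
  };

  for (const pkg of Object.keys(graph)) visit(pkg);
  return order;
}

// Made-up workspace resembling a small frontend/backend monorepo.
const workspace: DependencyGraph = {
  frontend: ["shared-types", "ui-components"],
  backend: ["shared-types"],
  "ui-components": [],
  "shared-types": [],
};
```

Calling `buildOrder(workspace)` puts `shared-types` and `ui-components` ahead of `frontend`, and `shared-types` ahead of `backend`. A lint, test, or build script could then loop over that list and run the same command in each package.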

I’m a fan of consistent toolchains because it helps to keep me focused on the things that matter: delivering code, solving tough problems, picking the right solutions. It also helps future me read what I’ve written!

Should You Monorepo?

If you can plan for such a thing - that is, if you know you’ll have a collection of subsystems that must be propped up separately, and must work off of each other, then the clear answer is almost always yes.

Monorepos are not infallible cure-alls: other authors have written about the pitfalls and upsides, and the jury is out on figuring out exactly when to use them.

In my opinion, they’re helpful for some select cases:

  1. Enforcing a uniform style
  2. Establishing a collaborative co-development environment
  3. Coordinating closely-coupled components

The main roadblocks you’ll run across when trying to implement your monorepo future are going to come from constraints on what technologies you’re allowed to use. As a toy example, if you’re going to set up an AI service using Python (as is currently popular), then your stack now has to incorporate some Python. Does that mean that your backend should be Django, at the risk of bifurcating your code base?

This isn’t an easy question to answer, but I hope I’ve illuminated some of the benefits and pitfalls.

I personally am a fan of the technique, and I’d encourage you to consider it and try it out.