Self-hosting vs Managed Services: Deciding how to host your database
Like all good things in infrastructure, picking whether or not to self-host your database is full of trade-offs.
On the one hand, you have the absolute freedom to do whatever it is you want with your database - whether it's adding a useful Postgres extension, or experimenting with new technologies. On the other hand, you now have to dedicate resources to keeping your database reliably online.
This article aims to provide an unbiased look at the benefits of each option, as well as the tasks you'll have to handle either way, to help you decide whether self-hosting is worth it in your circumstances.
Note that while I mention certain services, it's purely out of familiarity - I'm not paid for mentioning them.
You might even decide a traditional database isn't worth the hassle, and opt for an entirely different approach (like MongoDB or CockroachDB) - though that's for another article.
Table of Contents
- What I mean by self-hosting and managed services
- Benefits of self-hosting
- Benefits of managed services
- Things you'll have to do regardless of which option you pick
What I mean by self-hosting and managed services
Not everyone means the same thing when they say "self-hosting" and "managed services", so I just want to be clear on what this article is talking about.
When I say self-hosting, I'm talking about running database software on VMs that you have control over. In other words, running the database on AWS EC2, Google Compute Engine, Azure Virtual Machines, or a similar VPS provider.
While you could take the term further, and have "run your own physical hardware" fall under self-hosting, I have no intention of discussing that here.
When I say managed services, I'm talking about a "database-as-a-service" type platform, such as AWS RDS, Cloud SQL, or Azure Database.
Benefits of self-hosting
One of the biggest factors driving folks to self-host their database is price.
At the low end, the cheapest tier for AWS RDS (the AWS managed database service) is around $15 USD per month. In comparison, installing PostgreSQL or MySQL on your application server is "free". In doing so, you lose the ability to scale parts of your application independently, but for smaller projects this is fine.
Once your application needs a bit more CPU and RAM, and you start looking for, say, a 4 vCPU/16GB RAM server (db.m4.xlarge on AWS RDS) with 1TB of data, you're looking at around $762 USD per month on AWS RDS. Over on AWS EC2, that configuration (t4g.xlarge) would cost you only around $200 USD per month.
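To put that gap in perspective, here's a minimal sketch of the arithmetic. It uses the two monthly prices quoted above. The hourly engineer cost is a made-up assumption of mine, there only to show how you might weigh the savings against the time spent looking after the database yourself.

```python
def monthly_savings(managed_usd: float, self_hosted_usd: float) -> float:
    """Monthly difference between the managed and self-hosted bills."""
    return managed_usd - self_hosted_usd


def break_even_hours(savings_usd: float, hourly_rate_usd: float) -> float:
    """Hours of engineer time per month that the savings would pay for."""
    return savings_usd / hourly_rate_usd


# Figures quoted above: ~$762/month on AWS RDS vs ~$200/month on AWS EC2.
savings = monthly_savings(762, 200)

# ASSUMPTION: $75/hour is a hypothetical engineer cost, not a figure from this article.
hours = break_even_hours(savings, hourly_rate_usd=75)

print(f"Savings: ${savings:.0f}/month (${savings * 12:.0f}/year)")
print(f"Break-even: {hours:.1f} hours of database upkeep per month")
```

If looking after the database takes your team fewer hours per month than the break-even figure, self-hosting comes out ahead on cost alone. Swap in your own prices and hourly rate before drawing any conclusions.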
By self-hosting, you also minimise vendor lock-in. Say one day you notice that your 4 vCPU/16GB RAM/1TB storage configuration costs only $80 USD per month at VPS hosts like Hetzner. Moving your self-hosted setup is as simple as running your setup scripts again and restoring from backup.
Of course, using a managed database service doesn't prevent you from moving to another provider. It's just a lot easier when your database setup can be quickly spun up on any commodity Linux VM.
Pick your own adventure! By self-hosting, you get to decide:
- Which OS to run (and how it's configured, so your database comes back up after a restart)
- How often to update/patch your OS
- Which other software runs on the same server as your database
- Which database extensions to run (for example, TimescaleDB took years to be supported by major managed service providers)
- How often to upgrade/update/patch your database software
- How often to run backups
- Which disk configuration to use (to get more performance), and where your logs are written (so they don't fill up your data directory)
- and more!
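As an example of what owning one of these decisions looks like, here's a rough sketch of a backup script for a self-hosted PostgreSQL database. It assumes `pg_dump` is installed on the server and that something like cron runs the script on whatever schedule you choose. The database name and backup directory are placeholders, not recommendations.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_backup_command(db_name: str, backup_dir: str, now: datetime) -> list[str]:
    """Build a pg_dump invocation that writes a timestamped, custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_dir) / f"{db_name}-{stamp}.dump"
    return ["pg_dump", "--format=custom", f"--file={target}", db_name]


def run_backup(db_name: str, backup_dir: str) -> None:
    """Run the dump; raises CalledProcessError if pg_dump exits non-zero."""
    command = build_backup_command(db_name, backup_dir, datetime.now(timezone.utc))
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # ASSUMPTION: "app_db" and "/var/backups/postgres" are placeholder names.
    # Printing the command rather than running it keeps this sketch safe to try.
    print(build_backup_command("app_db", "/var/backups/postgres", datetime.now(timezone.utc)))
```

Building the command separately from running it makes the script easy to check without touching a real database. A real setup would also need credentials, retention, and a copy of each dump somewhere other than the database server.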
Benefits of managed services
With managed services, you pick the version of the database you want to run, your desired instance size, click "Create", and that's it.
Backups, minor updates, maintenance tasks and more are all run automatically, leaving you to focus on building your application. You do of course pay for the convenience of having these tasks automated, but it frees up your engineers to work on features of your product, rather than keeping the lights on.
If your core business isn't running a database, why are you wasting your focus on it, when it can be outsourced?
One of the biggest benefits of managed services is the ease of scaling your database up or down at a moment's notice.
With the press of a button (and a couple of minutes of downtime, depending on how you've set up your database), you can go from the smallest database tier to whichever size your workload needs, and back down again once you understand its resource requirements.
You also only pay for the resources you've used - so if you decide you need a significantly beefier machine for 12 hours, you only pay for those 12 hours.
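Here's a quick sketch of how that pay-for-what-you-use arithmetic works out. It assumes billing is proportional to hours used and treats a month as 730 hours. As rough stand-ins for a small and a large instance, it reuses the two AWS RDS monthly figures from earlier in this article. Your provider's actual hourly rates will differ.

```python
HOURS_PER_MONTH = 730  # common approximation: 365 days * 24 hours / 12 months


def cost_for_hours(monthly_price_usd: float, hours: float) -> float:
    """Pro-rata cost of running an instance for a number of hours."""
    return monthly_price_usd / HOURS_PER_MONTH * hours


def burst_extra_cost(small_monthly_usd: float, large_monthly_usd: float, hours: float) -> float:
    """Extra spend from running the larger instance instead of the smaller one."""
    return cost_for_hours(large_monthly_usd, hours) - cost_for_hours(small_monthly_usd, hours)


# ASSUMPTION: $15 and $762 are the monthly RDS figures quoted earlier, used as stand-ins.
extra = burst_extra_cost(small_monthly_usd=15, large_monthly_usd=762, hours=12)
print(f"Extra cost of 12 hours on the larger instance: ${extra:.2f}")
```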
When you decide to self-host, the best free support you're likely to get is on mailing lists and forums. With a managed service, part of the cost covers basic access to dedicated support staff who specialise in your database.
While they won't be able to give you free bespoke advice on how to architect your application, they can assist with root cause analysis when incidents happen, and provide general "this is how most people use our databases" advice.
Especially in larger organisations, being able to assign blame to the vendor when things go wrong can save a lot of stress.
In some organisations this doesn't give you a "free pass" - after all, if you chose to outsource to a managed service and it goes down, you can still be blamed for that choice. At the same time, plausible deniability via "nobody got fired for buying IBM" is still a thing: if you picked a reputable party, how were you to know that they'd screw up?
(While I personally wouldn't use a managed service provider as a scapegoat, there are those that would, so it is worth mentioning.)
Things you'll have to do regardless of which option you pick
Whether you decide to self-host your database, or use a managed service, you'll still need to:
- Secure access to your database
- Monitor basic metrics like CPU usage, RAM usage, and disk usage, as well as slow queries
- Ensure your backups are running and tested for recovery (and kept off-site as well, if you want to be particularly strict about your data)
- Handle major updates: while most managed services will perform minor updates for you during your scheduled maintenance window, major updates are still yours to handle
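To make the monitoring and backup items a little more concrete, here's a minimal sketch of two checks you could run on a schedule: how full a disk is, and how old the newest file in a backup directory is. The thresholds and directory layout are assumptions of mine. On a managed service you'd read disk usage from your provider's metrics rather than from the filesystem.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def disk_used_percent(path: str) -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def newest_backup_age_hours(backup_dir: str, now: Optional[float] = None) -> Optional[float]:
    """Age in hours of the most recently modified file, or None if there are no files."""
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    if not files:
        return None
    newest_mtime = max(p.stat().st_mtime for p in files)
    current = time.time() if now is None else now
    return (current - newest_mtime) / 3600


def find_problems(data_dir: str, backup_dir: str,
                  max_disk_percent: float = 80.0,
                  max_backup_age_hours: float = 26.0) -> list[str]:
    """Return human-readable problems; an empty list means both checks passed."""
    # ASSUMPTION: the 80% and 26-hour defaults are illustrative, not recommendations.
    problems = []
    used = disk_used_percent(data_dir)
    if used > max_disk_percent:
        problems.append(f"disk is {used:.0f}% full")
    age = newest_backup_age_hours(backup_dir)
    if age is None:
        problems.append("no backups found")
    elif age > max_backup_age_hours:
        problems.append(f"newest backup is {age:.0f} hours old")
    return problems
```

Note that a fresh backup file only tells you the backup job ran. You still have to restore from it periodically to know it actually works.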
Are you a startup, or a business where you'd prefer all of your engineers to be working on features, rather than keeping a database operational? Comfortable with paying extra to make the problem go away?
Then managed database services such as AWS RDS might be right for you.
Are you an established business with engineers working at a steady pace (i.e. not moving fast and breaking things)? Looking to build something cost-efficient in the long term, that's optimised specifically for the needs of your application?
Then self-hosting with your own employees managing and operating the database may be a better choice.
If you're a large enterprise with a stable product that uses managed services, it might be worth considering whether to bring your database in-house by self-hosting, as the cost savings can be significant.
Want to discuss this article? Let me know on Twitter @rozenmd