On server and web app reliability, and the rise of the Site Reliability Engineer (SRE)

03 October, 2018

Author: Max Rozen

On server and web app reliability, and the rise of the Site Reliability Engineer (SRE)

What is SRE?

Google defines SRE as what would happen if you tasked software engineers with the day-to-day running of a web application (operations). As Andrew Widdowson, a veteran Site Reliability Engineer at Google notes, it's all about keeping your services super fast, and super reliable, all the time - "We change the tires of a race car as it’s going 100mph".

OnlineOrNot for SRE

If you're reading this, chances are, you're not operating at Google's planet-wide scale, nor do you intend to. So what's the deal with OnlineOrNot? With OnlineOrNot, I intend to provide Site Reliability Engineering (that is, availability, latency, performance, and capacity) monitoring for general internet business owners that just want an alert when things go wrong.

While I can't handle the emergency response of getting your web application back up and running, I can alert you at any time via Email, Text Message or Slack that there are issues with your website (assuming you haven't set Do Not Disturb hours).

Side-note

If you've ever heard a new term, and then suddenly saw it everywhere, you know how I felt in 2015, finishing off my Honours thesis on Web Application Performance Testing and seeing Site Reliability Engineering (SRE) suddenly becoming a thing (Google seems to have come up with the term way back in 2003, funnily enough).

So there I was essentially creating and measuring my own Denial of Service attacks against my own WordPress server to figure out how resilient it was, and how many concurrent users it could support (10 - it ran on a used laptop) while industry had come across this issue about 12 years earlier.

SRE vs DevOps - Search Volume

It seems however that DevOps (which I consider similar in function to SRE) as a search term only took off at the start of 2011, which sort of coincides with the sort of simultaneous realisation everyone from Google to Atlassian had, that you could just host your own software and let users access it via a web application and you'd have to invest less in customer support calls since installing and licensing the software was less of an issue.

Enjoyed this post? Receive the next one in your inbox!