How OnlineOrNot uses OnlineOrNot to run OnlineOrNot

Jumping into monitoring software for the first time can be pretty overwhelming. If you're not in an exploring mood, it's easy to get lost without being entirely sure what all the knobs and buttons do.

To make OnlineOrNot feel less overwhelming, I thought it might be useful to share how I use OnlineOrNot to monitor OnlineOrNot (as part of running OnlineOrNot day to day).

You might think it's silly for an uptime monitoring service to monitor its own site. However, as our monitoring infrastructure is kept separate from our marketing website and web app, I actually get notified when the site goes down.

To start with, I monitor the main landing page, its sitemap.xml file, and the API.

As OnlineOrNot's marketing site is mainly static HTML (not powered by anything server-based like WordPress), monitoring the main landing page covers almost every page that could go down. I also monitor the sitemap as it's generated by a script at build time, and that script has failed in the past.

Settings for Landing Pages

I use the following settings for both my main landing page and the sitemap.xml file.

To start with, I have OnlineOrNot check its own landing page every minute:

OnlineOrNot landing page monitoring settings

Things can get noisy on the internet, and it's possible for a website to "go offline" for a minute or two without it being a particular drama (assuming it's not a regular occurrence). As a result, I only want to be notified if my landing page check fails 5 times in a row:

OnlineOrNot landing page monitoring advanced settings
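
To make the "fails 5 times in a row" idea concrete, here is a minimal sketch of a consecutive-failure counter. It is an illustration only, not OnlineOrNot's actual implementation, and the `FailureTracker` name and its fields are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class FailureTracker:
    """Tracks consecutive failed checks and decides when to alert.

    Hypothetical helper for illustration; not part of OnlineOrNot.
    """

    threshold: int = 5  # failures in a row required before alerting
    consecutive_failures: int = 0

    def record(self, check_passed: bool) -> bool:
        """Record one check result; return True when an alert should fire."""
        if check_passed:
            # Any success resets the streak, so a brief blip never alerts.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire once, at the moment the streak reaches the threshold.
        return self.consecutive_failures == self.threshold


tracker = FailureTracker(threshold=5)
# Two failures, a recovery, then five failures in a row.
results = [False, False, True, False, False, False, False, False]
alerts = [tracker.record(ok) for ok in results]
```

With this input, only the final check triggers an alert: the first two failures are forgotten as soon as a check passes.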

To be sure it's actually the page I expect that OnlineOrNot is checking, I also set the 'Text to search for' to look for part of my main heading.
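
As a rough sketch of why the text search matters: a page can respond successfully while serving the wrong content. The function below assumes a plain, case-sensitive substring match, which may not be exactly how OnlineOrNot matches text, and the heading used is a made-up placeholder.

```python
def landing_page_ok(status_code: int, body: str, expected_text: str) -> bool:
    """Pass only if the response succeeded AND contains the expected text.

    Assumes a case-sensitive substring match (an assumption for this sketch).
    """
    return 200 <= status_code < 300 and expected_text in body


# Hypothetical heading text, for illustration only.
HEADING = "Example main heading"

real_page = f"<html><body><h1>{HEADING}</h1></body></html>"
wrong_page = "<html><body><h1>Domain parked</h1></body></html>"

real_result = landing_page_ok(200, real_page, HEADING)
wrong_result = landing_page_ok(200, wrong_page, HEADING)
```

Here `wrong_page` returns a successful status code but still fails the check, because the heading is missing.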

Settings for APIs

For APIs, things are a little bit different. If the API check fails, I know something is wrong and needs investigating immediately.

I have OnlineOrNot check its own API every minute:

OnlineOrNot API monitoring settings

To make OnlineOrNot actually check the API correctly, I have OnlineOrNot make an API request as a real (test) user, with a valid GraphQL query.

I set the following HTTP request settings:

OnlineOrNot API HTTP Request settings
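
For readers who want to see what such a request looks like outside the UI, here is a sketch using Python's standard library. The URL, token, and query are placeholders, and the bearer-token header is an assumption: the real API's authentication scheme may differ. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical values for illustration only.
API_URL = "https://api.example.com/graphql"
TEST_USER_TOKEN = "test-user-token"
QUERY = "query { viewer { id } }"


def build_graphql_request(url: str, token: str, query: str) -> urllib.request.Request:
    """Build a POST request carrying a GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; substitute whatever your API expects.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_graphql_request(API_URL, TEST_USER_TOKEN, QUERY)
```

The same three pieces (method, headers, JSON body) are what you would enter in the HTTP request settings of a check.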

As GraphQL APIs can return 200 OK even when things are going horribly wrong, it's important to set assertions to check the data you queried is coming back correctly:

OnlineOrNot API assertion settings
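
To illustrate the point about 200 OK responses, here is a minimal sketch of the kind of assertion that catches a broken GraphQL response. It relies on the usual GraphQL response shape (a `data` object plus an optional `errors` list). The `viewer` and `id` field names are hypothetical, and this is not how OnlineOrNot evaluates assertions internally.

```python
import json
from typing import Any


def graphql_check_passed(status_code: int, body: str, required_path: list[str]) -> bool:
    """Pass only if the response is 200, has no errors, and the data exists."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    # A GraphQL server can report failures in "errors" while still returning 200.
    if not isinstance(payload, dict) or payload.get("errors"):
        return False
    # Walk down the required path and make sure every level is present.
    node: Any = payload.get("data")
    for key in required_path:
        if not isinstance(node, dict) or node.get(key) is None:
            return False
        node = node[key]
    return True


healthy = '{"data": {"viewer": {"id": "123"}}}'
broken = '{"data": null, "errors": [{"message": "database unavailable"}]}'

healthy_result = graphql_check_passed(200, healthy, ["viewer", "id"])
broken_result = graphql_check_passed(200, broken, ["viewer", "id"])
```

Both responses arrive with a 200 status code, but only the healthy one passes, which is exactly the case a status-code-only check would miss.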

Finally, in advanced settings, I set the check to monitor from a location close to my database (for fastest results), and set it to only alert me if two checks in a row fail.

As I'm already checking the response via Assertions, I don't set 'Text to search for' for my API check.

OnlineOrNot API monitoring advanced settings

Alert settings

As I check my phone (way too much), I find email notifications to work quite well when things go wrong (no additional settings required).

For added redundancy though, I also have alerts sent to Slack and Discord, which I've added as integrations for my account:

OnlineOrNot alert settings
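
The redundancy idea can be sketched as a simple fan-out: try every channel, and don't let one failing channel stop the rest. The channel functions below are stand-ins invented for this example; they don't call any real email, Slack, or Discord API.

```python
from typing import Callable

Notifier = Callable[[str], None]


def send_to_all(message: str, notifiers: dict[str, Notifier]) -> dict[str, bool]:
    """Attempt delivery on every channel and report which ones succeeded."""
    delivered: dict[str, bool] = {}
    for name, notify in notifiers.items():
        try:
            notify(message)
            delivered[name] = True
        except Exception:
            # One broken channel must not block the others.
            delivered[name] = False
    return delivered


# Stand-in channels for illustration: two record the message, one fails.
inbox: list[str] = []
discord_log: list[str] = []


def broken_slack(message: str) -> None:
    raise ConnectionError("Slack is unreachable")


outcome = send_to_all(
    "Landing page check failed 5 times in a row",
    {"email": inbox.append, "slack": broken_slack, "discord": discord_log.append},
)
```

Even with one channel unreachable, the alert still lands in the other two, which is the whole point of adding more than one.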
