How to get alerted when your EC2 instance shuts down

Some of your most critical infrastructure runs on AWS EC2, so it's pretty damn important to know when your EC2 instances shut down.

Sure, chances are someone in your organisation will start kicking and screaming within 30 minutes of a particularly important instance shutting down, but we can do better than that.

When it comes to monitoring and customers (whether inside your org or outside), being proactive wins you a lot of points.

AWS Native

There are a few ways to track EC2 instance shutdowns natively within AWS:

CloudWatch

Your first idea might be to set up a CloudWatch alarm that fires when CPU Utilization sits at 0% for 5 minutes. That's close, but it doesn't do the job: a stopped instance doesn't report 0% CPU, it stops reporting CPU Utilization altogether, so the alarm has no data points to compare against the threshold.

What you actually want is to tell CloudWatch to treat missing data as breaching the threshold.

You can see the official AWS CloudWatch docs for how to do that.
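If you'd rather script it than click through the console, here is a minimal sketch of such an alarm using boto3. The function names, the alarm name, and the choice of a single 5-minute evaluation period are my assumptions for illustration; the important part is the TreatMissingData setting.

```python
def build_shutdown_alarm(instance_id: str, sns_topic_arn: str) -> dict:
    """Build the keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        # Hypothetical naming scheme, pick whatever suits your setup.
        "AlarmName": f"ec2-{instance_id}-not-reporting",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds; one 5-minute window
        "EvaluationPeriods": 1,
        "Threshold": 0.0,
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        # The key setting: a window with no data counts as breaching.
        "TreatMissingData": "breaching",
        # Assumes an SNS topic that already exists and has subscribers.
        "AlarmActions": [sns_topic_arn],
    }


def create_shutdown_alarm(instance_id: str, sns_topic_arn: str) -> None:
    """Create (or update) the alarm. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_shutdown_alarm(instance_id, sns_topic_arn))
```

Calling create_shutdown_alarm with your instance ID and topic ARN should be enough to get a first alarm going; tune the period and evaluation count to how quickly you want to hear about it.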

EventBridge

Using AWS EventBridge, you can get notified directly when your EC2 instance changes state.

Basically, you create an SNS topic with a subscription for the channel you want to be notified on (email, SMS, and so on), then add an EventBridge rule that matches your EC2 instance's state changes, and point it at your SNS topic.

You can see the official AWS EC2/EventBridge docs for how to do that.
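As a rough sketch of the rule side, this is what it could look like with boto3. The function names, the rule name you pass in, and the "notify-sns" target ID are placeholders I made up; the event pattern follows the shape of EC2's instance state-change events. It assumes the SNS topic already exists.

```python
import json


def build_state_change_pattern(instance_ids, states=("stopped", "terminated")):
    """Build an EventBridge event pattern for EC2 instance state changes."""
    detail = {"state": list(states)}
    if instance_ids:
        # Restrict the rule to specific instances; omit to match all of them.
        detail["instance-id"] = list(instance_ids)
    return {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": detail,
    }


def create_state_change_rule(rule_name: str, sns_topic_arn: str, instance_ids) -> None:
    """Create the rule and point it at the SNS topic. Needs boto3 and credentials."""
    import boto3  # imported here so the pattern builder works without boto3

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_state_change_pattern(instance_ids)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "notify-sns", "Arn": sns_topic_arn}],
    )
```

One thing to watch for: the topic's access policy has to allow EventBridge to publish to it, otherwise the rule matches but nothing arrives.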

(don't use) CloudTrail

You might be tempted to use AWS CloudTrail for this, but it's not the right tool for the job. CloudTrail is an audit log. It's more for tracking who in your organization ran which API call against your AWS resources, a while after the fact.

It won't catch your instance deciding to shut itself down, since a shutdown started from inside the operating system isn't an API call, and events can take several minutes to show up.

Why you shouldn't focus on server status

This doesn't directly answer your question (how do you get alerted when an EC2 instance shuts down), but you should probably reconsider your approach. Ideally, a human shouldn't get involved unless you actually need human intervention to restart your application.

What you should do instead is set up EC2 Auto Scaling with health checks so that the system recovers automatically whenever an instance shuts down (EC2 has docs for this).
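To give a feel for it, here is a minimal boto3 sketch of an Auto Scaling group that replaces unhealthy instances. Everything you pass in (group name, launch template name, subnet IDs, target group ARN) is assumed to exist already, and the sizes and the 300-second grace period are arbitrary values I picked for illustration.

```python
def build_asg_params(name, launch_template_name, subnet_ids, target_group_arn):
    """Build keyword arguments for Auto Scaling's create_auto_scaling_group call."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        # Keep at least one instance alive; sizes are illustrative only.
        "MinSize": 1,
        "MaxSize": 2,
        "DesiredCapacity": 1,
        # Comma-separated subnet IDs the group may launch into.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "TargetGroupARNs": [target_group_arn],
        # Use the load balancer's health check, not just the EC2 status check.
        "HealthCheckType": "ELB",
        # Seconds to wait after launch before health checks count.
        "HealthCheckGracePeriod": 300,
    }


def create_self_healing_group(name, launch_template_name, subnet_ids, target_group_arn):
    """Create the group. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    autoscaling = boto3.client("autoscaling")
    autoscaling.create_auto_scaling_group(
        **build_asg_params(name, launch_template_name, subnet_ids, target_group_arn)
    )
```

With the health check type set to ELB, an instance that stops answering the load balancer's health check gets replaced, not just one that has shut down.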

Basically: stop looking for infrastructure issues, and actually worry about application availability instead.

Once your application can heal itself by adding resources as needed, you can start to worry about getting alerts when your application becomes unreachable.

External monitoring services

I'm not going to sugar-coat it: I run an external monitoring service, and that's why this article exists. But even if I didn't, I would still recommend this route by default.

If your application runs an HTTP service, this is the quickest win by far.

Set up an external check to visit your application's HTTP endpoint/URL, and you'll be notified by SMS, email, Slack, pager, or whichever channel you prefer if the application stops being available on the internet.

You can see a quick guide on how to set that up.

Health checks and heartbeat monitors

If your application doesn't expose an HTTP server, you can still use an external monitoring service.

You would just need to set up your application to ping a healthcheck endpoint on a regular basis (either at the instance level using something like cron, or from within your application).
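The ping itself can be tiny. Here's a sketch using only Python's standard library; the heartbeat URL is a made-up placeholder, so swap in whatever URL your monitoring service gives you.

```python
import urllib.error
import urllib.request

# Hypothetical placeholder; use the URL your monitoring service provides.
HEARTBEAT_URL = "https://monitoring.example.com/heartbeat/your-check-id"


def send_heartbeat(url: str = HEARTBEAT_URL, timeout: float = 10.0) -> bool:
    """Ping the heartbeat endpoint; return True only on a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Network errors, timeouts, HTTP error statuses, and malformed URLs
        # all count as a failed heartbeat. Never crash the caller over it.
        return False
```

Call send_heartbeat() from a small script that cron runs every few minutes, or at the end of each successful job inside your application. If the instance shuts down, the pings stop, and the monitoring service notices the silence.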

There's also a quick guide for setting that up.

Interested in reading more about monitoring?

I send one email every month with an article like this one, to help improve how you and your team monitor your website.

Lots of folks in DevOps and SRE like them, and I'd love to hear what you think. You can always unsubscribe.
