API monitoring: A practical guide
Last updated: March 03, 2026
Your API is the backbone of your product. Mobile apps, integrations, webhooks - they all depend on it working.
When it goes down, everything goes down with it.
The frustrating part is that API failures are often invisible. Your website might load fine, but the API returning data to your mobile app? Down for hours without anyone noticing.
Until customers start complaining.
Table of contents
- What is API monitoring?
- Why you need it
- What to monitor
- Setting up API monitoring
- Going beyond uptime
- Common mistakes
What is API monitoring?
API monitoring is checking that your API endpoints are working correctly - not just that they respond, but that they respond with the right data, fast enough.
At its simplest, you're making requests to your API at regular intervals and checking:
- Does it respond at all?
- Does it return the correct status code?
- Is the response body what you expect?
- How long did it take?
If any of those checks fail, you get an alert.
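To make those four questions concrete, here's a minimal sketch of a single check using only Python's standard library. The names (CheckResult, run_check, evaluate) and the default thresholds are my own assumptions for illustration, not any monitoring tool's API.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Optional
from urllib.error import HTTPError


@dataclass
class CheckResult:
    responded: bool
    status: Optional[int]
    body: str
    elapsed_ms: float


def run_check(url: str, timeout: float = 10.0) -> CheckResult:
    """Make one GET request and record what came back."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", "replace")
    except HTTPError as err:
        # A 4xx/5xx is still a response, just not the one we wanted.
        status = err.code
        body = err.read().decode("utf-8", "replace")
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return CheckResult(False, None, "", (time.monotonic() - start) * 1000)
    return CheckResult(True, status, body, (time.monotonic() - start) * 1000)


def evaluate(result, expected_status=200, must_contain="", max_ms=1000.0):
    """Return a list of failure reasons; an empty list means the check passed."""
    if not result.responded:
        return ["no response"]
    failures = []
    if result.status != expected_status:
        failures.append(f"status {result.status}, expected {expected_status}")
    if must_contain and must_contain not in result.body:
        failures.append(f"body missing {must_contain!r}")
    if result.elapsed_ms > max_ms:
        failures.append(f"took {result.elapsed_ms:.0f}ms, limit {max_ms:.0f}ms")
    return failures
```

In practice you'd call something like evaluate(run_check("https://api.example.com/health")) on a schedule (the URL is a placeholder) and alert whenever the returned list isn't empty.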
It's the same idea as uptime monitoring for websites, with a few extra considerations for APIs: HTTP methods, request headers, and assertions on the response body.
Why you need it
There are a few scenarios where API monitoring saves you:
Silent failures - Your API starts returning 500 errors, but your website still loads because it degrades gracefully. Users see broken features, but you don't see errors in your logs because the frontend swallows them.
Slow degradation - Response times creep up from 200ms to 2 seconds over a few days. Not enough to trigger an error, but enough to make your mobile app feel sluggish. Users churn, and you don't know why.
Third-party dependencies - You call another service's API, and they have an outage. Your API technically works, but returns empty data or errors for certain requests.
Regional issues - Your API works fine from your office, but users in Europe are timing out because of a CDN misconfiguration.
In all these cases, monitoring catches the problem before your users do.
What to monitor
Not every endpoint needs the same level of monitoring. Here's how I think about it:
Critical endpoints
If these endpoints break, your business stops working:
- Authentication (login, token refresh)
- Core business logic (checkout, payments, data submission)
- Public APIs your customers integrate with
Monitor these frequently (every 30 seconds to 1 minute) from multiple regions. Set up aggressive alerting - phone calls, not just Slack messages.
Important endpoints
Endpoints that matter, but won't immediately break everything:
- User profile and settings
- Search and filtering
- Non-critical data fetches
Monitor these every few minutes. Slack alerts are usually fine.
Everything else
Dashboard data, analytics, internal tools. If these break, it's annoying but not catastrophic.
Monitor less frequently, or don't monitor at all. Focus your attention on what matters.
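One way to write these tiers down is as a small lookup table. This is only a sketch: the endpoint paths and region names are hypothetical, and I've taken "every few minutes" to mean 5 minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    interval_seconds: int
    regions: tuple
    page_on_failure: bool  # True = phone call, False = Slack only


# "every few minutes" is assumed to be 5 minutes here.
TIERS = {
    "critical": Tier(30, ("us-east", "eu-west", "ap-southeast"), True),
    "important": Tier(300, ("us-east",), False),
}

# Hypothetical paths; anything not listed counts as "everything else".
ENDPOINT_TIERS = {
    "/auth/login": "critical",
    "/auth/refresh": "critical",
    "/checkout": "critical",
    "/users/me": "important",
    "/search": "important",
}


def plan_for(path):
    """Return the monitoring tier for a path, or None if it isn't monitored."""
    return TIERS.get(ENDPOINT_TIERS.get(path))
```

Returning None for unlisted paths keeps the "don't monitor at all" option explicit rather than quietly giving everything a default check.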
Setting up API monitoring
Here's how to set up basic API monitoring with OnlineOrNot:
1. Start with your most critical endpoint
Pick one endpoint - probably your main API health check or authentication endpoint. Don't try to monitor everything at once.
2. Create the check
You'll need:
- The URL to monitor
- The HTTP method (GET, POST, etc.)
- Any required headers (like Authorization or Content-Type)
- The expected response (status code, maybe specific JSON fields)
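If you want to see what those pieces look like together, here's a rough sketch of a check definition as a Python dataclass. To be clear, this is not OnlineOrNot's configuration format. The class, its field names, and the example URL are assumptions for illustration.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CheckDefinition:
    url: str
    method: str = "GET"
    headers: dict = field(default_factory=dict)
    body: Optional[dict] = None  # sent as JSON when present
    expected_status: int = 200
    interval_seconds: int = 30

    def to_request(self) -> urllib.request.Request:
        """Build the HTTP request this check would send."""
        data = None
        if self.body is not None:
            data = json.dumps(self.body).encode("utf-8")
        return urllib.request.Request(
            self.url, data=data, headers=self.headers, method=self.method
        )


# Hypothetical endpoint and token, for illustration only.
orders_check = CheckDefinition(
    url="https://api.example.com/v1/orders",
    method="POST",
    headers={"Authorization": "Bearer test-token", "Content-Type": "application/json"},
    body={"sku": "demo"},
)
```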
3. Set the check interval
For critical APIs, every 30 seconds is a good starting point. You can always adjust later.
4. Add assertions
This is where API monitoring gets more useful than basic uptime checks. You can verify:
- Status code is 200
- Response contains specific JSON fields
- Response time is under a threshold
- Response body matches a pattern
For example, if you're monitoring a /health endpoint that returns {"status": "ok"}, you'd assert that the response contains that exact JSON.
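Here's a small sketch of what body assertions might look like if you wrote them yourself. The function names are hypothetical. I'm parsing the JSON and comparing fields rather than comparing raw strings, so a change in whitespace or key order doesn't trigger a false alarm.

```python
import json
import re


def json_field_failures(body, expected):
    """Compare top-level JSON fields against expected values."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return ["body is not valid JSON"]
    if not isinstance(data, dict):
        return ["body is not a JSON object"]
    failures = []
    for key, want in expected.items():
        if key not in data:
            failures.append(f"missing field {key!r}")
        elif data[key] != want:
            failures.append(f"{key!r} is {data[key]!r}, expected {want!r}")
    return failures


def body_matches(body, pattern):
    """True if the regular expression matches anywhere in the body."""
    return re.search(pattern, body) is not None
```

For the /health example, json_field_failures(body, {"status": "ok"}) returns an empty list when the check passes and a list of reasons when it doesn't.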
5. Configure alerts
Choose where alerts go. For critical endpoints, I recommend at least two channels:
- Immediate: Phone/SMS/PagerDuty for wake-you-up urgency
- Awareness: Slack/Email for the rest of the team
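As a sketch of the two-channel idea, here's a tiny router that fans a critical alert out to both a pager and Slack, and everything else to Slack only. The channel names and the senders mapping are assumptions. In real life each sender would wrap whatever your paging or chat tool provides.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    check_name: str
    severity: str  # "critical", or anything else for awareness-only
    message: str


# Critical alerts go to both channels; everything else is awareness-only.
ROUTES = {"critical": ["pager", "slack"]}
DEFAULT_ROUTE = ["slack"]


def route(alert, senders):
    """Send the alert to each channel for its severity; return the channels used.

    senders maps a channel name to a callable that takes the alert.
    """
    delivered = []
    for channel in ROUTES.get(alert.severity, DEFAULT_ROUTE):
        send = senders.get(channel)
        if send is not None:
            send(alert)
            delivered.append(channel)
    return delivered
```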
6. Add more endpoints gradually
Once your first check is running smoothly, add the next most critical endpoint. Build up your coverage over time.
Going beyond uptime
Basic "is it up?" monitoring is table stakes. Here are some things worth tracking as you mature:
Response time trends
A slow API is almost as bad as a down API. Track your p50, p95, and p99 response times over time. If your p95 suddenly jumps from 500ms to 2 seconds, something changed.
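If you're collecting response times yourself, percentiles are easy to compute. This sketch uses the nearest-rank method, which is one of several common definitions. The function names and the "doubled" threshold are my own assumptions.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples_ms):
    """Return the p50, p95, and p99 of a list of response times in milliseconds."""
    return {name: percentile(samples_ms, p) for name, p in (("p50", 50), ("p95", 95), ("p99", 99))}


def jumped(previous, current, factor=2.0):
    """True if a percentile grew by at least `factor` between two windows."""
    return current >= previous * factor
```

Comparing this hour's p95 against yesterday's with jumped() is a crude but useful way to notice a 500ms to 2 second shift without staring at graphs.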
Error rate
What percentage of requests are returning errors? Even if the API is "up", a 5% error rate means 1 in 20 requests is failing, and the users behind those requests are having a bad time.
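Here's a minimal sketch of the calculation. What counts as an "error" is a judgment call: I'm assuming 5xx responses and requests that got no response at all (recorded as None), and leaving 4xx out because those are usually the client's fault.

```python
def error_rate(status_codes):
    """Fraction of requests that failed: 5xx, or no response at all (None)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code is None or code >= 500)
    return errors / len(status_codes)
```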
Multi-region checks
Your API might work perfectly from AWS us-east-1 but be slow or broken from Europe. Running checks from multiple locations catches regional issues.
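Once you have results from several locations, the useful question is whether a failure is everywhere or only in some regions. A rough sketch, assuming each region reports a list of failure reasons (empty meaning the check passed), with made-up region names:

```python
def summarize_regions(results):
    """results maps region name -> list of failure reasons (empty list = passed)."""
    failing = sorted(region for region, failures in results.items() if failures)
    if not failing:
        return "ok"
    if len(failing) == len(results):
        return "down everywhere"
    return "regional: " + ", ".join(failing)
```

A "regional" result points you at networking, DNS, or CDN configuration rather than your application code.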
Authenticated checks
Your public health endpoint might return 200, but what about authenticated requests? Sometimes auth middleware breaks while the rest of the API is fine.
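A sketch of how you might pair a public check with an authenticated one. I'm assuming a Bearer token in the Authorization header, which is common but not universal. The function names are hypothetical.

```python
import urllib.request


def authenticated_request(url, token):
    """Build a GET request carrying a Bearer token (assumed auth scheme)."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def diagnose(public_status, authed_status):
    """Compare the public health check with an authenticated request."""
    if public_status != 200:
        return "api down"
    if authed_status in (401, 403):
        # Valid credentials were rejected, so auth is the likely culprit.
        return "auth broken"
    if authed_status != 200:
        return "authenticated endpoint failing"
    return "ok"
```

Use a dedicated monitoring account with the least access it needs, and keep its token out of your source code.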
Dependency health
If your API depends on a database or third-party service, monitor those too. When something breaks, you want to know if it's your code or a dependency.
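One simple way to do this is to run a named probe per dependency and report each one separately. In this sketch the probes are just callables that raise on failure. What goes inside them (a trivial database query, a request to a provider's status endpoint) is up to you, and the names are assumptions.

```python
def check_dependencies(probes):
    """probes maps a dependency name to a zero-argument callable that raises on failure."""
    report = {}
    for name, probe in probes.items():
        try:
            probe()
            report[name] = "ok"
        except Exception as exc:
            # Record the failure and keep going so one bad dependency doesn't hide the others.
            report[name] = f"failed: {exc}"
    return report
```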
Common mistakes
A few things I've seen teams get wrong:
Only monitoring the health endpoint
/health returning 200 doesn't mean your API works. I've seen health checks succeed while the actual API was completely broken because the database was down.
Monitor real endpoints that exercise real functionality.
Ignoring response content
Checking for a 200 status isn't enough. An API can return 200 with an error message in the body, or with empty data when there should be results.
Use assertions to verify the response makes sense.
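As a sketch, here's a check for the two cases above: a 200 response carrying an error, and a 200 response with empty results. The "error" and "results" field names are assumptions, so swap in whatever your API actually returns.

```python
import json


def suspicious_200(body, results_field="results"):
    """Return reasons a 200 response still looks wrong; an empty list means it looks fine."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return ["body is not valid JSON"]
    if not isinstance(data, dict):
        return []
    reasons = []
    if data.get("error"):
        reasons.append(f"error in body: {data['error']}")
    if results_field in data and not data[results_field]:
        reasons.append(f"{results_field!r} is empty")
    return reasons
```

Only treat empty results as a failure for queries you know should return something, like a search for a fixture record you control.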
Too many alerts
If every alert goes to the same place with the same priority, your team will start ignoring them. Differentiate between "wake someone up" and "look at this tomorrow".
I wrote more about this in "saving your team from alert fatigue".
Not testing from where users are
If all your monitors run from the same region as your servers, you'll miss regional issues. Users in Australia don't care that your API is fast from Virginia.
Forgetting about dependencies
Your API might be working perfectly, but if Stripe is down, your checkout is broken. Consider monitoring critical third-party APIs too, or at least having alerts for when they have issues.
API monitoring isn't complicated, but it does require some thought about what matters most to your business.
Start with your most critical endpoint. Get that working well. Then expand from there.
The goal isn't to monitor everything - it's to make sure you find out about problems before your users do.
