Incidents & severities

Understand how monitors and heartbeats resolve into incidents, lifecycle states, and severity levels.

What is an incident?

An incident represents a service disruption that requires attention. Incidents are created automatically when a monitor or heartbeat exceeds its alert threshold, and resolved automatically when the service recovers. You can also create incidents manually for planned maintenance or issues not yet detected by monitors.

Incident detection pipeline

The diagram below shows how a monitor failure flows through the system until an incident is opened and an on-call notification is sent.

Incident lifecycle

TRIGGERED → ACKNOWLEDGED → RESOLVED

State          Meaning
Triggered      Monitor failed beyond threshold. Notifications sent to configured channels.
Acknowledged   A team member is working on the issue. Escalation pauses (if configured).
Resolved       Service recovered. Recovery notification sent. Uptime data updated.
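The lifecycle above can be read as a small state machine. The sketch below is illustrative only: the names `IncidentState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical and not part of any product API. It assumes a triggered incident may resolve without being acknowledged first, since incidents resolve automatically on recovery.

```python
from enum import Enum


class IncidentState(Enum):
    TRIGGERED = "triggered"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Assumed transition table (hypothetical). TRIGGERED -> RESOLVED is allowed
# because a monitor can recover before anyone acknowledges the incident.
ALLOWED_TRANSITIONS = {
    IncidentState.TRIGGERED: {IncidentState.ACKNOWLEDGED, IncidentState.RESOLVED},
    IncidentState.ACKNOWLEDGED: {IncidentState.RESOLVED},
    IncidentState.RESOLVED: set(),
}


def transition(current: IncidentState, target: IncidentState) -> IncidentState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

In this model a resolved incident is terminal: a later failure opens a new incident rather than reopening the old one.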

Example timeline

14:32  Monitor DOWN: Production API (status_code=500)
14:32  Alert dispatched · Slack #ops, Email on-call
14:34  Acknowledged by Sarah (on-call)
14:39  Recovered · 7 min downtime
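The durations in this timeline can be checked with a few lines of arithmetic. This is a minimal sketch; the helper `minutes_between` is hypothetical, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM timestamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Values taken from the example timeline.
downtime = minutes_between("14:32", "14:39")     # DOWN to recovered
time_to_ack = minutes_between("14:32", "14:34")  # DOWN to acknowledged
```

Here `downtime` is 7, matching the "7 min downtime" shown at recovery, and `time_to_ack` is 2.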

Auto-resolution

When a monitor returns to a healthy state after a failure, the incident resolves automatically. A recovery notification is sent to the same channels that received the original alert. Incidents can also be resolved manually at any time.

Severities

Each incident has a severity that controls escalation timing and notification channels:

Severity   Use case
Critical   Full outage: immediate page
High       Significant degradation: escalates within minutes
Medium     Partial failure: standard on-call routing
Low        Minor issue: no on-call page by default

Severity thresholds are configured per monitor in Settings → Monitors.

Automatic incident triggers

An incident is triggered when:

  • HTTP monitor returns a status outside the expected range (default: non-2xx)
  • Monitor times out — no response within the configured timeout
  • SSL certificate is expired or invalid
  • Heartbeat has not pinged within interval + grace_period
  • Response body does not contain the required keyword
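As a rough illustration, the conditions above can be expressed as checks on a single result. Everything here is a sketch under assumptions: `CheckResult`, `http_failure_reason`, and `heartbeat_missed` are hypothetical names, the order in which conditions are tested is arbitrary, and the expected status range is fixed to the default (2xx). The functions only classify one check as failed; whether an incident opens still depends on the monitor's alert threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    status_code: Optional[int]  # None when no response arrived
    timed_out: bool
    ssl_valid: bool
    body: str


def http_failure_reason(
    result: CheckResult, required_keyword: Optional[str] = None
) -> Optional[str]:
    """Return a short reason string if the check failed, else None."""
    if result.timed_out:
        return "timeout"
    if not result.ssl_valid:
        return "ssl"
    # Default expectation from the list above: any non-2xx status is a failure.
    if result.status_code is None or not 200 <= result.status_code < 300:
        return "status"
    if required_keyword is not None and required_keyword not in result.body:
        return "keyword"
    return None


def heartbeat_missed(
    last_ping: float, now: float, interval: float, grace_period: float
) -> bool:
    """True once more than interval + grace_period seconds have passed."""
    return now - last_ping > interval + grace_period
```

For example, with an interval of 300 seconds and a grace period of 60 seconds, a heartbeat counts as missed only once more than 360 seconds have passed since the last ping.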

Manual incidents

Create a manual incident for situations not captured by monitors — for example, a database migration that causes degraded performance, or a known upstream provider issue.

  1. Go to Incidents in the sidebar.
  2. Click + Manual Incident.
  3. Select the affected monitor(s) and write a description.
  4. The incident appears on the status page immediately.

Incident notes

While working on an incident, add internal notes to track your investigation:

  • Notes are internal only — not visible on the public status page.
  • Markdown formatting is supported.
  • Use notes for root-cause analysis, team communication, and post-mortem documentation.

Status page visibility

Active incidents appear automatically on your public status page. To add a customer-visible message:

  1. Open the incident.
  2. Fill in the Public message field.
  3. Post updates as the incident progresses.

Incident updates appear in chronological order on the status page.

Correlation: multiple monitors, one incident

When multiple monitors fail at the same time, the platform can group them into a single correlated incident rather than flooding your team with individual alerts. Configure this under Monitors → Correlation Groups.
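One way to picture correlation is as clustering failures by group and time. The sketch below is not the platform's algorithm: the function `correlate`, the `group_of` mapping, and the 60-second window are all assumptions made for illustration. Monitors that belong to no group each get their own incident.

```python
from typing import Dict, List, Tuple


def correlate(
    failures: List[Tuple[str, float]],
    group_of: Dict[str, str],
    window_seconds: float = 60.0,
) -> List[List[str]]:
    """Group (monitor, failed_at) pairs into incidents, one list per incident."""
    incidents: List[List[str]] = []
    # Per group: start time of the current cluster and its member list.
    open_cluster: Dict[str, Tuple[float, List[str]]] = {}
    for monitor, failed_at in sorted(failures, key=lambda f: f[1]):
        group = group_of.get(monitor)
        if group is None:
            incidents.append([monitor])  # ungrouped: individual incident
            continue
        cluster = open_cluster.get(group)
        if cluster is not None and failed_at - cluster[0] <= window_seconds:
            cluster[1].append(monitor)  # joins the existing incident
        else:
            members = [monitor]
            open_cluster[group] = (failed_at, members)
            incidents.append(members)
    return incidents
```

With this model, two monitors in the same group failing 10 seconds apart produce one incident, while a failure several minutes later starts a new one.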

Alert threshold and deduplication

The alert threshold on a monitor controls how many consecutive failures are required before an incident opens. Once an incident is open, additional check failures for the same monitor do not open new incidents — they are deduplicated into the existing one.

A new incident can only open for a monitor after the previous incident for that monitor has been resolved.
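The threshold and deduplication rules can be sketched as a per-monitor counter. This is a minimal model under assumptions, not the platform's implementation: the class `MonitorTracker` and its return strings are hypothetical, and an incident is assumed to open when the consecutive failure count reaches the threshold.

```python
from typing import Optional


class MonitorTracker:
    """Tracks consecutive failures and at most one open incident per monitor."""

    def __init__(self, alert_threshold: int) -> None:
        self.alert_threshold = alert_threshold
        self.consecutive_failures = 0
        self.open_incident_id: Optional[int] = None
        self._next_id = 1

    def record_check(self, healthy: bool) -> str:
        if healthy:
            self.consecutive_failures = 0
            if self.open_incident_id is not None:
                self.open_incident_id = None  # auto-resolution on recovery
                return "resolved"
            return "ok"
        self.consecutive_failures += 1
        if self.open_incident_id is not None:
            return "deduplicated"  # folded into the existing incident
        if self.consecutive_failures >= self.alert_threshold:
            self.open_incident_id = self._next_id
            self._next_id += 1
            return "opened"
        return "pending"  # failing, but still below the threshold


tracker = MonitorTracker(alert_threshold=3)
events = [tracker.record_check(h) for h in [False, False, False, False, True, False]]
```

In this run, `events` is `["pending", "pending", "opened", "deduplicated", "resolved", "pending"]`: the third failure opens the incident, the fourth is deduplicated, and a new incident can start accumulating only after the recovery.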

History retention

Resolved incidents are retained in the incident history according to your plan. You can filter the history by monitor, date range, or status from the Incidents view.
