On-call & schedules
Understand on-call schedules, rotation layers, overrides, and how they integrate with escalation policies.
Overview
On-call management ensures that when an incident opens, the right person is paged at the right time. The system is built from three composable parts:
- Schedules — define who is on-call during which time window
- Escalation policies — define what happens if the on-call person does not respond
- Alert routing — direct alerts from specific monitors to specific escalation policies
Schedules
A schedule is a calendar of on-call shifts. It is built from one or more rotation layers. Each layer covers a time window and has a set of team members who rotate through it.
Rotation types
| Type | Coverage window | Rotation cadence |
|---|---|---|
| Daily | 24 hours | Rotates every day |
| Weekly | 7 days (Mon–Sun) | Rotates every week |
| Business Hours | Mon–Fri, 09:00–17:00 | Rotates daily or weekly |
| Nights | Outside business hours | Rotates daily or weekly |
| Weekends | Sat–Sun | Rotates weekly |
| Custom Hours | Any start/end + active days | Rotates on configured cadence |
Layers can overlap. When two layers cover the same time, the layer with the higher priority takes precedence. Use multiple layers to model complex patterns such as:
- Primary on-call (24/7) with a backup layer that only activates on weekends
- Day shift and night shift with different engineers
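To make the precedence rule concrete, here is a minimal Python sketch of how overlapping layers could be resolved. It is an illustration under stated assumptions, not the platform's implementation: the `Layer` class, its fields, the `resolve` function, and the convention that a larger number means higher priority are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


def always_active(at: datetime) -> bool:
    """Coverage window for a 24/7 layer."""
    return True


def weekends_only(at: datetime) -> bool:
    """Coverage window for a Sat-Sun layer (weekday() is 5 or 6)."""
    return at.weekday() >= 5


@dataclass
class Layer:
    name: str
    priority: int              # assumption: larger number = higher priority
    members: list[str]
    start: datetime            # when the first member's first shift begins
    shift: timedelta           # rotation cadence (1 day, 7 days, ...)
    is_active: Callable[[datetime], bool] = always_active

    def on_call(self, at: datetime) -> Optional[str]:
        """Return this layer's on-call member at `at`, or None if not covering."""
        if not self.members or at < self.start or not self.is_active(at):
            return None
        completed_shifts = (at - self.start) // self.shift
        return self.members[completed_shifts % len(self.members)]


def resolve(layers: list[Layer], at: datetime) -> Optional[str]:
    """Pick the on-call person from the highest-priority layer covering `at`."""
    covering = [layer for layer in layers if layer.on_call(at) is not None]
    if not covering:
        return None
    return max(covering, key=lambda layer: layer.priority).on_call(at)


monday = datetime(2024, 1, 1)  # a Monday
layers = [
    Layer("primary", 1, ["alice", "bob"], monday, timedelta(days=7)),
    Layer("weekend", 2, ["carol", "dave"], monday, timedelta(days=7), weekends_only),
]
```

With these example layers, `resolve(layers, datetime(2024, 1, 3, 12))` returns `"alice"` because only the primary layer covers a Wednesday, while `resolve(layers, datetime(2024, 1, 6, 12))` returns `"carol"` because the higher-priority weekend layer takes over on Saturday.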
Overrides
An override lets you replace the on-call person for a specific time window without changing the underlying rotation. A common use case is an engineer going on holiday while a colleague covers their shift.
Create an override from the schedule detail page by clicking Add override, selecting the replacement person, and defining the time window.
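Conceptually, an override sits on top of the rotation: inside its time window the replacement person is on-call, and outside it the rotation applies unchanged. The sketch below illustrates that idea in Python. The `Override` class, the `effective_on_call` function, the half-open window (start inclusive, end exclusive), and the "first listed override wins" tie-break are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Override:
    replacement: str
    start: datetime  # inclusive (assumed)
    end: datetime    # exclusive (assumed)


def effective_on_call(scheduled: str, overrides: list[Override], at: datetime) -> str:
    """Return the override's replacement if one covers `at`, else the scheduled person."""
    for override in overrides:
        if override.start <= at < override.end:
            return override.replacement
    return scheduled


# Alice is on holiday for a week; Bob covers her shifts.
holiday = Override("bob", datetime(2024, 7, 1), datetime(2024, 7, 8))
```

Here `effective_on_call("alice", [holiday], datetime(2024, 7, 3))` returns `"bob"`, and the same call for `datetime(2024, 7, 8)` returns `"alice"` again because the window has ended. The rotation itself is never modified.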
Escalation policies with schedules
When you add a Schedule as a notification target in an escalation policy step, the platform resolves the current on-call person from that schedule at the moment the step fires. You therefore do not need to update your escalation policies when rotations change; the schedule supplies the current person each time a step fires.
Step 1: notify schedule "Primary On-Call" → resolves to whoever is on-call now
Step 2: notify user "[email protected]" → direct user, always the same person
Step 3: notify channel "ops-slack" → Slack channel, always the same
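The difference between a schedule target and a fixed target can be sketched in Python as follows. This is a hypothetical model of the three steps above, not the platform's API: the `POLICY` list, the `resolve_target` function, the `team-lead` identifier, and the in-memory `current_on_call` dictionary are all invented for illustration.

```python
from typing import Callable

# Each step is (target kind, target name), mirroring the three steps above.
POLICY = [
    ("schedule", "Primary On-Call"),
    ("user", "team-lead"),        # hypothetical user identifier
    ("channel", "ops-slack"),
]


def resolve_target(step: tuple[str, str], on_call_lookup: Callable[[str], str]) -> str:
    """Resolve a step's target at the moment the step fires.

    Schedule targets are looked up at call time; user and channel
    targets are returned unchanged.
    """
    kind, name = step
    if kind == "schedule":
        return on_call_lookup(name)
    return name


# Hypothetical in-memory stand-in for the platform's schedule state.
current_on_call = {"Primary On-Call": "alice"}


def lookup(schedule_name: str) -> str:
    return current_on_call[schedule_name]
```

If the rotation moves on and `current_on_call["Primary On-Call"]` becomes `"bob"`, the next call to `resolve_target(POLICY[0], lookup)` returns `"bob"` even though `POLICY` was never edited.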
Alert routing
Alert routing rules determine which escalation policy receives alerts from which monitors. Rules are evaluated in order; the first matching rule wins.
A rule can match on:
- Monitor name or tags
- Incident severity
- Time of day or day of week
If no rule matches, the default routing rule applies (all unmatched alerts go to the default escalation policy).
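First-match-wins evaluation with a default fallback can be sketched in a few lines of Python. The `Alert` and `Rule` classes, the `route` function, and the policy names are assumptions for this example; the platform's actual rule format may differ.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Alert:
    monitor: str
    tags: frozenset[str]
    severity: str


@dataclass(frozen=True)
class Rule:
    matches: Callable[[Alert], bool]  # predicate over monitor, tags, severity, ...
    policy: str                       # escalation policy selected on match


def route(alert: Alert, rules: list[Rule], default_policy: str) -> str:
    """Return the policy of the first matching rule, else the default policy."""
    for rule in rules:
        if rule.matches(alert):
            return rule.policy
    return default_policy


# Order matters: the severity rule is checked before the tag rule.
RULES = [
    Rule(lambda a: a.severity == "critical", "page-primary"),
    Rule(lambda a: "database" in a.tags, "db-team"),
]
```

A critical alert tagged `database` matches both rules but is routed to `page-primary` because that rule comes first. Because order decides the outcome, more specific rules generally belong above broader ones.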
Summary: full alert flow
Monitor fails → threshold exceeded → incident opens
→ alert routing evaluates rules → selects escalation policy
→ step 1 fires (immediate) → notifies current on-call from schedule
→ 5 min, no ACK → step 2 fires → notifies team lead
→ incident acknowledged → escalation stops
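The timing in the flow above can be modelled with a short Python sketch that reports which steps fire before an acknowledgement arrives. The `fired_steps` function and its parameters are hypothetical, and the choice that an acknowledgement arriving exactly when a step is due does not prevent that step is an assumption of this example.

```python
from typing import Optional


def fired_steps(delays_min: list[int], ack_at_min: Optional[float]) -> list[int]:
    """Return the 1-based numbers of the steps that fire before acknowledgement.

    delays_min[i] is the wait before step i+1 fires, measured from the
    previous step (0 = immediate). ack_at_min is minutes after the incident
    opened, or None if nobody acknowledges.
    """
    fired = []
    elapsed = 0
    for number, delay in enumerate(delays_min, start=1):
        elapsed += delay
        if ack_at_min is not None and ack_at_min < elapsed:
            break  # acknowledged before this step was due: escalation stops
        fired.append(number)
    return fired
```

With the delays from the flow above, `fired_steps([0, 5], 3)` returns `[1]` (acknowledged after 3 minutes, so step 2 never fires), while `fired_steps([0, 5], None)` returns `[1, 2]`.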