
Heartbeats

Understand passive checks — how heartbeats monitor cron jobs, background workers, and scheduled tasks.

What is a heartbeat?

A heartbeat is a passive check. Instead of the platform actively probing your service, your service reports in to the platform. At the end of each scheduled job run, your script sends an HTTP request (typically a POST) to a unique ping URL. The platform records the ping and resets the timer. If the next ping does not arrive within interval + grace_period, the platform opens an incident.

This inverted model is ideal for monitoring things that cannot be reached from outside your network: cron jobs, internal workers, batch processes, or any task that runs on a schedule.


How the timing works

T=0                          Job starts
T=N                          Job completes successfully → sends ping
T=N+interval                 Platform expects next ping
T=N+interval+grace_period    No ping received → incident opens

The grace period is an intentional buffer to absorb jobs with variable runtime. Set it to at least 20–30% of the expected runtime variance.
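The timeline above can be worked through with concrete numbers. The sketch below assumes a last ping at 02:05, a 24-hour interval, and a 30-minute grace period. These values and the helper name `deadlines` are illustrative only, not platform defaults.

```python
from datetime import datetime, timedelta


def deadlines(last_ping: datetime, interval: timedelta, grace_period: timedelta):
    """Return (expected_at, incident_at) for the ping that follows last_ping."""
    expected_at = last_ping + interval          # T = N + interval
    incident_at = expected_at + grace_period    # T = N + interval + grace_period
    return expected_at, incident_at


# Hypothetical nightly job whose last ping arrived at 02:05.
last_ping = datetime(2024, 1, 1, 2, 5)
expected_at, incident_at = deadlines(
    last_ping, timedelta(hours=24), timedelta(minutes=30)
)
print(expected_at.isoformat())  # 2024-01-02T02:05:00
print(incident_at.isoformat())  # 2024-01-02T02:35:00
```

With these numbers, a ping arriving at 02:20 the next day is late but still inside the grace period, so no incident opens.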


Heartbeat statuses

Status  Meaning
New     Created, no pings received yet
Up      Latest ping arrived within interval + grace_period
Late    Interval exceeded but within grace period; no alert yet
Down    Grace period exceeded; incident opened, alert sent
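One way to read the status table is as a function of the time elapsed since the last ping. The following is a minimal sketch of that reading, not the platform's implementation. The function name `heartbeat_status` and the choice to treat the exact boundary as still inside the window are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def heartbeat_status(
    last_ping: Optional[datetime],
    now: datetime,
    interval: timedelta,
    grace_period: timedelta,
) -> str:
    """Classify a heartbeat as New, Up, Late, or Down (illustrative only)."""
    if last_ping is None:
        return "New"                      # created, no pings received yet
    elapsed = now - last_ping
    if elapsed <= interval:
        return "Up"                       # next ping is not due yet
    if elapsed <= interval + grace_period:
        return "Late"                     # overdue, but no alert yet
    return "Down"                         # grace period exceeded


t0 = datetime(2024, 1, 1, 0, 0)
hour, ten_min = timedelta(hours=1), timedelta(minutes=10)
print(heartbeat_status(t0, t0 + timedelta(minutes=65), hour, ten_min))  # Late
```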

Ping URL format

Each heartbeat gets a unique URL of the form:

POST /api/v1/hb/{heartbeat_id}

Both GET and POST are accepted. The request body is ignored and the response body can be discarded: a ping is identified by its URL. A 200 OK response confirms the ping was recorded.
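As a rough illustration of the URL format and the 200 OK check, the sketch below sends a ping using only the Python standard library. The host `monitor.example.com`, the ID `abc123`, and the helper names are placeholders invented for this example. The `opener` parameter exists only so the function can be exercised without a network.

```python
import urllib.request


def build_ping_url(base_url: str, heartbeat_id: str) -> str:
    """Join a base URL and a heartbeat ID into .../api/v1/hb/{heartbeat_id}."""
    return f"{base_url.rstrip('/')}/api/v1/hb/{heartbeat_id}"


def send_ping(ping_url: str, timeout: float = 5.0,
              opener=urllib.request.urlopen) -> bool:
    """POST an empty body to the ping URL; return True only on 200 OK."""
    request = urllib.request.Request(ping_url, data=b"", method="POST")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


# Placeholder host and ID; substitute the URL shown on your heartbeat page.
url = build_ping_url("https://monitor.example.com", "abc123")
print(url)  # https://monitor.example.com/api/v1/hb/abc123
```

Returning a boolean instead of raising keeps a monitoring failure from crashing the job being monitored; whether that trade-off suits a given job is a design choice.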

The ping URL is a secret. If it is ever exposed (committed to a public repo, logged in a public place), rotate it from the heartbeat detail page.


AI Heartbeats (Premium)

On Premium plans, you can upgrade a heartbeat to AI mode. Instead of a fixed interval and grace period, the platform's ML model learns the natural cadence of your job from at least 7 days of historical pings — schedule, frequency, and typical payload size.

When a heartbeat deviates significantly from the learned pattern, you receive an early warning notification before the grace period expires. This is particularly useful for jobs with highly variable runtime (ML training jobs, large data exports) where a fixed interval would generate false positives.
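The platform's model is not described here, so the following is only a toy illustration of the general idea of learning a cadence from history, not the algorithm the platform uses. It flags a gap between pings as unusual when it exceeds the median historical gap by more than `k` median absolute deviations. The function name, the threshold `k`, and the one-second floor are all assumptions.

```python
from statistics import median


def is_unusual_gap(history_gaps: list[float], current_gap: float,
                   k: float = 5.0) -> bool:
    """Return True if current_gap (seconds) is far above the typical gap."""
    typical = median(history_gaps)
    # Median absolute deviation: a spread estimate that resists outliers.
    spread = median(abs(gap - typical) for gap in history_gaps)
    # Floor the spread at 1 second so perfectly regular jobs still get a band.
    return current_gap > typical + k * max(spread, 1.0)


# Gaps between past pings for a roughly hourly job, in seconds.
gaps = [3600, 3550, 3700, 3620, 3580]
print(is_unusual_gap(gaps, 3650))  # False
print(is_unusual_gap(gaps, 5400))  # True
```

A job with widely varying gaps produces a larger spread and therefore a wider tolerance band, which is the intuition behind avoiding false positives from a single fixed interval.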


History retention

Plan        Ping history
Free        7 days
Standard    30 days
Premium     90 days
Enterprise  1 year

History is visible on the heartbeat detail page as a ping timeline chart and a log table.


Common patterns

Job success only

/path/to/job.sh && curl -sS -X POST "$PING_URL"

Separate success from failure

if /path/to/job.sh; then
  curl -sS -X POST "$PING_URL"
else
  echo "Job failed, skipping heartbeat ping" >&2
  exit 1
fi

Python context manager

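import contextlib
import os

import requests

@contextlib.contextmanager
def heartbeat(ping_url: str):
    # If the body raises, the exception propagates from `yield` and the
    # ping below is skipped, so only successful runs are reported.
    yield
    requests.post(ping_url, timeout=5)

# Read the URL from the environment; "$PING_URL" is not expanded in Python.
with heartbeat(os.environ["PING_URL"]):
    run_my_job()  # placeholder for your own job function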
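A single failed ping request, for example during a brief network outage, looks the same to the platform as a missed job. One possible mitigation is to retry the ping a few times with exponential backoff. The sketch below assumes a caller-supplied `send` function that returns True when the ping was accepted. The name `ping_with_retries`, the attempt count, and the delays are illustrative choices, not platform recommendations.

```python
import time
from typing import Callable


def ping_with_retries(
    ping_url: str,
    send: Callable[[str], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send(ping_url) up to `attempts` times, doubling the delay each time."""
    for attempt in range(attempts):
        if send(ping_url):
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... between tries
    return False


# Demonstration with a fake sender that succeeds on the third call.
calls = []
delays = []

def flaky_send(url: str) -> bool:
    calls.append(url)
    return len(calls) >= 3

print(ping_with_retries("PING_URL_PLACEHOLDER", flaky_send, sleep=delays.append))  # True
print(delays)  # [1.0, 2.0]
```

In real use, `send` could wrap `requests.post(ping_url, timeout=5)` and report whether the status code is 200. Keep the total retry time well below the grace period so that retries cannot themselves push the ping past the deadline.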
