Glossary

Plain-language definitions for every CloudLine term you'll see in the dashboard, the docs, and the alert messages. Each term is defined once here; for full context, follow the link to its dedicated page.

Heartbeat

A small HTTP POST your bot sends to CloudLine that says "I'm alive, and here are my numbers." The SDK does this automatically every 30 seconds by default (configurable per plan). CloudLine considers your bot offline when no heartbeat arrives within the configured threshold.

The body of a heartbeat carries everything the dashboard shows — gateway ping, RAM, CPU, slash command timing, and your custom metrics. See Raw HTTP for the full payload spec.
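The offline rule can be sketched as a single comparison. This is a minimal illustration, not CloudLine's server code: the function name, the 90-second threshold, and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def is_offline(last_heartbeat_at: datetime, now: datetime, threshold: timedelta) -> bool:
    """True when more than `threshold` has passed since the last heartbeat."""
    return now - last_heartbeat_at > threshold

# Hypothetical values: a 90-second threshold and a heartbeat last seen 2 minutes ago.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
last_seen = now - timedelta(minutes=2)
print(is_offline(last_seen, now, timedelta(seconds=90)))  # True
```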

Heartbeat secret

The long clb_live_… string used to authenticate heartbeats. Looks like:

clb_live_a8f3k2x9p1m7q4r6t5v8w2y4z6c8e1f3

Acts as a password. Anyone with it can post heartbeats as your bot. Rotate it from the dashboard at any time; the old secret is invalidated immediately. See Heartbeat secret.

Bot ID

The short identifier in the URL /dashboard/bots/<bot-id>. Pass this as botId / bot_id to the SDK. Not secret — safe to commit to code, share in support requests, etc.

Incident

A continuous offline window. Starts at the first missed heartbeat that crosses your offline threshold; ends when a heartbeat lands again. Each incident gets an auto-computed severity:

  • SEV1 — at least 30 minutes (long outage).
  • SEV2 — at least 5 minutes (notable).
  • SEV3 — under 5 minutes (brief blip).

Severity is duration-based, not status-based: a "degraded but online" window is not an incident. Incidents have a short reference like CL-A1B2C3D that appears in alert messages, so you can quote it when paging a colleague.
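The severity thresholds above map onto a small lookup. The function name is hypothetical; only the 30-minute and 5-minute cut-offs come from this glossary.

```python
from datetime import timedelta

def incident_severity(duration: timedelta) -> str:
    """Map an incident's duration to SEV1 / SEV2 / SEV3."""
    minutes = duration.total_seconds() / 60
    if minutes >= 30:
        return "SEV1"  # long outage
    if minutes >= 5:
        return "SEV2"  # notable
    return "SEV3"      # brief blip

print(incident_severity(timedelta(minutes=12)))  # SEV2
```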

Uptime %

Time-weighted:

uptime % = online seconds ÷ total observed seconds × 100

A bot that was offline for 1 hour out of 24 has 95.83% uptime, regardless of how that hour was distributed. Sixty 1-minute blips and one 1-hour outage produce the same number.
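A short sketch of the formula, reproducing the 1-hour-in-24 example. The function and variable names are illustrative only.

```python
def uptime_percent(online_seconds: float, total_observed_seconds: float) -> float:
    """Time-weighted uptime: online seconds / total observed seconds * 100."""
    if total_observed_seconds <= 0:
        raise ValueError("total_observed_seconds must be positive")
    return online_seconds / total_observed_seconds * 100

day = 24 * 3600
one_long_outage = uptime_percent(day - 3600, day)   # one 1-hour outage
sixty_blips = uptime_percent(day - 60 * 60, day)    # sixty 1-minute blips
print(round(one_long_outage, 2))  # 95.83
```

Both calls receive the same number of offline seconds, so they return the same value.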

MTTR

Mean Time To Recovery — average length of incidents in the visible date range. Lower is better.

Computed only over closed incidents. An ongoing outage doesn't have a recovery time yet, so it doesn't affect MTTR until it resolves.
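As a rough sketch, MTTR is the mean duration of the closed incidents in the range. Representing an incident as a (start, end) pair with end set to None while it is ongoing is an assumption made for this example, not CloudLine's data model.

```python
from datetime import datetime, timedelta
from typing import Optional

Incident = tuple[datetime, Optional[datetime]]  # (start, end); end is None while ongoing

def mttr(incidents: list[Incident]) -> Optional[timedelta]:
    """Mean duration of closed incidents, or None when none have closed."""
    durations = [end - start for start, end in incidents if end is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)

t0 = datetime(2024, 1, 1)
sample = [
    (t0, t0 + timedelta(minutes=10)),
    (t0 + timedelta(hours=1), t0 + timedelta(hours=1, minutes=20)),
    (t0 + timedelta(hours=2), None),  # ongoing: ignored until it resolves
]
print(mttr(sample))  # 0:15:00
```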

MTBF

Mean Time Between Failures — average gap between consecutive incident starts. Higher is better.

Less meaningful with fewer than 2–3 incidents in the window — a single incident gives you "MTBF = full window duration" which doesn't tell you much. The trend across multiple windows is what to watch.
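MTBF can be sketched as the mean gap between sorted incident start times. Unlike the single-incident behavior described above, this illustrative version returns None when there are fewer than two incidents instead of falling back to the window duration; that choice and the function name are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

def mtbf(incident_starts: list[datetime]) -> Optional[timedelta]:
    """Mean gap between consecutive incident starts, or None with fewer than two."""
    ordered = sorted(incident_starts)
    if len(ordered) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)

t0 = datetime(2024, 1, 1)
starts = [t0, t0 + timedelta(hours=6), t0 + timedelta(hours=18)]
print(mtbf(starts))  # 9:00:00
```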

SLA budget

The downtime your uptime target allows over the visible range. With a 99% target on a 30-day range, the budget is:

30 × 24 × 60 × 0.01 = 432 minutes = 7.2 hours

The dashboard shows remaining budget; negative means you're already over for the period.
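The budget arithmetic, as a small sketch. The function names are hypothetical, and the 500 minutes of downtime is a made-up figure used to show a negative remainder.

```python
def sla_budget_minutes(target_percent: float, range_days: float) -> float:
    """Downtime allowed by an uptime target over a date range, in minutes."""
    return range_days * 24 * 60 * ((100 - target_percent) / 100)

def remaining_budget_minutes(target_percent: float, range_days: float,
                             downtime_minutes: float) -> float:
    """Budget left for the period; negative means the target is already missed."""
    return sla_budget_minutes(target_percent, range_days) - downtime_minutes

budget = sla_budget_minutes(99, 30)                 # 432 minutes, i.e. 7.2 hours
remaining = remaining_budget_minutes(99, 30, 500)   # hypothetical 500 min of downtime
print(budget / 60, remaining < 0)
```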

Reliability tier

CloudLine's at-a-glance health label, based on uptime over the selected window:

  • Excellent ≥ 99% — well-maintained bot, pro hosting + careful deploys.
  • Good ≥ 97% — typical hobby or personal bot.
  • At-risk ≥ 93% — frequent issues, attention needed.
  • Critical < 93% — actually broken for end users.

Thresholds are static and Discord-bot-tuned — more forgiving than strict SRE "nines." A hobbyist bot on free hosting shouldn't get a "Critical" badge for day-to-day deploys + ISP blips. See Reliability tiers.
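The tier thresholds listed above can be expressed as a simple cascade. The function name is illustrative; the cut-offs are the ones in this glossary.

```python
def reliability_tier(uptime_percent: float) -> str:
    """Map an uptime percentage to a reliability tier label."""
    if uptime_percent >= 99:
        return "Excellent"
    if uptime_percent >= 97:
        return "Good"
    if uptime_percent >= 93:
        return "At-risk"
    return "Critical"

print(reliability_tier(95.83))  # At-risk
```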

Latency (gateway ping)

How long Discord's WebSocket gateway takes to acknowledge the bot's heartbeat. Reported by the bot itself:

  • client.ws.ping in discord.js
  • bot.latency * 1000 in discord.py

Normal: 30–80 ms. Sustained > 200 ms means your host's connection to Discord is degrading, or Discord itself is having issues.

Slash / Component / Autocomplete percentiles

Per-heartbeat samples of how long the bot took to respond to user-triggered interactions:

  • Slash p50 / p95 — chat-input commands + user/message context menus + modal submits.
  • Component p50 / p95 — buttons + select menus.
  • Autocomplete p50 / p95 — slash-command autocomplete (per-keystroke).

p50 is the median — the typical case. p95 is the slow tail — 5% of interactions are slower than this number.

Both percentiles are reset and recalculated on every heartbeat. See Slow slash commands for what to do if these are high.
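To make p50 and p95 concrete, here is a nearest-rank percentile over one heartbeat's worth of samples. The glossary does not say which percentile method the SDK uses, so nearest-rank is an assumption, and the sample timings are invented.

```python
import math

def percentile(samples_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical response times (ms) collected between two heartbeats.
timings = [float(ms) for ms in range(10, 210, 10)]  # 10, 20, ..., 200
print(percentile(timings, 50), percentile(timings, 95))  # 100.0 190.0
```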

Event-loop lag

How long the Node.js / Python asyncio event loop is blocked between scheduled timers. Above ~100 ms means the bot's main thread is busy doing synchronous work and can't process new events in time.

Idle bot: 0–5 ms. Brief GC spikes: 20–50 ms. Sustained > 100 ms is worth investigating.
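One common way to estimate event-loop lag is to schedule a timer and measure how late it fires. The sketch below does that with asyncio; it illustrates the idea and is not necessarily how the CloudLine SDK measures lag. The function names and the 50 ms probe interval are assumptions.

```python
import asyncio
import time

async def measure_lag_ms(interval: float = 0.05) -> float:
    """Sleep for `interval` seconds and return how late the wake-up was, in ms."""
    loop = asyncio.get_running_loop()
    start = loop.time()
    await asyncio.sleep(interval)
    return max(0.0, (loop.time() - start - interval) * 1000)

async def demo() -> tuple[float, float]:
    idle = await measure_lag_ms()               # nothing else is running
    probe = asyncio.create_task(measure_lag_ms())
    await asyncio.sleep(0)                      # let the probe start its timer
    time.sleep(0.2)                             # synchronous work: blocks the loop
    blocked = await probe                       # the timer fires late
    return idle, blocked

idle_ms, blocked_ms = asyncio.run(demo())
print(f"idle: {idle_ms:.1f} ms, blocked: {blocked_ms:.1f} ms")
```

The blocked measurement lands well above the 100 ms mark because the 200 ms synchronous sleep prevents the 50 ms timer from firing on time.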

Gateway zombie

The bot's process is alive (heartbeats are landing), but its Discord gateway connection is stale — the underlying TCP / WebSocket connection is dead, so the bot receives no events and serves no commands. The bot looks online to CloudLine but isn't actually working.

CloudLine detects this via the SDK's gateway_ok / gateway_stale_sec fields. See Zombie state.

Shard down

A bot using AutoShardedClient / AutoShardedBot lost one or more shards. Guilds on those shards stop being served until the shard reconnects.

CloudLine detects this when shards_connected < shards_total. Single-process (non-sharded) bots report null shard counts, so this alert doesn't apply to them.
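The shard check, including the null case for non-sharded bots, might look like this. The function name is hypothetical; the two field names come from the rule above.

```python
from typing import Optional

def shard_down(shards_connected: Optional[int], shards_total: Optional[int]) -> bool:
    """True when a sharded bot reports fewer connected shards than total shards."""
    if shards_connected is None or shards_total is None:
        return False  # non-sharded bot: the alert does not apply
    return shards_connected < shards_total

print(shard_down(3, 4), shard_down(None, None))  # True False
```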

Custom metrics

User-supplied values pushed via the SDK's monitor.gauge(name, value) (point-in-time) or monitor.counter(name, delta) (drained per heartbeat). Appear under the Custom metrics section of the Telemetry panel.

Naming: ASCII letters + digits + dot + underscore + dash, 1–64 chars, max 32 names per bot. See Custom metrics.
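If you want to validate names before sending them, the naming rules translate into a regular expression plus a count check. This is a client-side sketch with hypothetical helper names, not part of the SDK.

```python
import re

METRIC_NAME = re.compile(r"[A-Za-z0-9._-]{1,64}")  # ASCII letters, digits, dot, underscore, dash
MAX_NAMES_PER_BOT = 32

def is_valid_metric_name(name: str) -> bool:
    """True when the whole name matches the allowed characters and length."""
    return METRIC_NAME.fullmatch(name) is not None

def can_register(existing_names: set[str], name: str) -> bool:
    """True when the name is valid and fits within the per-bot name limit."""
    if not is_valid_metric_name(name):
        return False
    return name in existing_names or len(existing_names) < MAX_NAMES_PER_BOT

print(is_valid_metric_name("queue.depth"), is_valid_metric_name("bad name"))  # True False
```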

Status codes

Five possible values for a bot's current status:

  • online — heartbeats arriving on schedule.
  • degraded — heartbeats arriving but slow.
  • offline — too many heartbeats missed.
  • paused — you manually paused monitoring.
  • unknown — bot was just created or has no data yet.

Full details + transitions in Bot status codes.

Quiet hours

A daily window during which non-critical alerts are suppressed. You set the start and end times (and your timezone offset) on the Alerts tab. Offline alerts can still fire during quiet hours by default — toggle quietHoursAllowOffline to mute them too.
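The suppression logic can be sketched as follows. Several things here are assumptions: `now` is already in your local time, the window may wrap past midnight, only offline alerts count as critical, and the `allow_offline` parameter mirrors quietHoursAllowOffline with a default of True. The function names are hypothetical.

```python
from datetime import time

def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """True when `now` falls inside [start, end), allowing windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end

def should_suppress(is_offline_alert: bool, now: time, start: time, end: time,
                    allow_offline: bool = True) -> bool:
    """Decide whether an alert is muted by quiet hours."""
    if not in_quiet_hours(now, start, end):
        return False
    if is_offline_alert and allow_offline:
        return False  # offline alerts still fire by default
    return True

quiet_start, quiet_end = time(22, 0), time(7, 0)
print(should_suppress(False, time(23, 30), quiet_start, quiet_end))  # True
print(should_suppress(True, time(23, 30), quiet_start, quiet_end))   # False
```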

Branding fields (Business)

Four optional customizations for how alerts look:

  • customBrandColor — hex color for Discord embed accent + email header.
  • customLogoUrl — uploaded image, shown in embed and email.
  • customReplyTo — email reply-to address.
  • customFooterText — up to 200 chars, shown under every alert.

Each can be scoped per channel (e.g. brand color on Discord only). See Alert thresholds → Custom branding.