Autonomous Ops Monitors — Index
Last updated: 2026-03-20
This document provides a quick reference for all AINL autonomous operations monitors deployed in OpenClaw.
Table
| Monitor | Schedule | Purpose | Key Metrics | Memory Schema | Runner | Envelope | Status |
|---------|----------|---------|-------------|---------------|--------|----------|--------|
| infrastructure_watchdog | every 5 min | Checks caddy, cloudflared, maddy, CRM; auto-restarts down services | service statuses, restart count 24h | ops.infrastructure.restart (event) | scripts/run_infrastructure_watchdog.py | v1.0 | Active |
| tiktok_sla_monitor | every 15 min | TikTok pipeline SLA: reports freshness, video processed, backup freshness | recent_count, video_fresh, backup_fresh, breaches_24h | ops.tiktok_sla.breach (event) | scripts/run_tiktok_sla_monitor.py | v1.0 | Active |
| canary_sampler | every 5 min | HTTP endpoint canary; slow response detection | any_breach, per-target slow, slow_24h | ops.canary.slow (event) | scripts/run_canary_sampler.py | v1.0 | Active |
| token_cost_tracker | hourly | OpenRouter token spending vs budget | daily cost, weekly cost % of budget | workflow.token_cost_state (daily summary) | scripts/run_token_cost_tracker.py | v1.0 | Active |
| lead_quality_audit | daily 2 AM | Lead data completeness (phone, website, rating, reviews) | daily counts, 7-day rolling averages, drop flags | workflow.lead_quality_audit.daily (daily summary) | scripts/run_lead_quality_audit.py | v1.0 | Active |
| token_budget_tracker | hourly | Rolling 7-day token cost vs weekly budget | week cost, week tokens, pct used | reads workflow.token_cost_state | scripts/run_token_budget_tracker.py | v1.0 | Active |
| session_continuity | every 2 hours | Extract user preferences from recent sessions; append daily log | sessions considered, preferences captured | daily_log.note (append), long_term.user_preference (prefs) | scripts/run_session_continuity.py | v1.0 | Active |
| memory_prune | daily 3 AM | Physical deletion of expired memory records | pruned_records, before/after stats | - | scripts/run_memory_prune.py | v1.0 | Active |
| meta_monitor | every 15 min | Watchdog for the monitors themselves; alerts if any monitor is stale | monitors_ok, monitors_stale, stale_details | reads cache keys monitor_heartbeat.* | scripts/run_meta_monitor.py | v1.0 | Active |
Bridge layer: token budget & weekly trends (distinct from scripts/run_*.py)
These run through openclaw/bridge/run_wrapper_ainl.py and append to OpenClaw daily markdown (~/.openclaw/workspace/memory/YYYY-MM-DD.md by default), not the SQLite memory adapter envelopes in the table above.
| Wrapper | Schedule (in .ainl) | Doc |
|---------|----------------------|-----|
| token-budget-alert | 0 23 * * * UTC | docs/openclaw/BRIDGE_TOKEN_BUDGET_ALERT.md |
| weekly-token-trends | 0 9 * * 0 | docs/operations/UNIFIED_MONITORING_GUIDE.md |
ZeroClaw bridge (zeroclaw/bridge/, zeroclaw-ainl-run) includes the same daily/weekly wrappers plus monthly-token-summary (0 3 1 * * UTC). See docs/ZEROCLAW_INTEGRATION.md.
Single operator guide: UNIFIED_MONITORING_GUIDE.md · OpenClaw bridge README: openclaw/bridge/README.md · ZeroClaw bridge README: zeroclaw/bridge/README.md.
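As an illustration of the daily markdown target described above, the following minimal sketch resolves the default path ~/.openclaw/workspace/memory/YYYY-MM-DD.md for a given day and appends a note to it. The helper names (daily_markdown_path, append_note) and the note text are hypothetical; the actual logic in openclaw/bridge/run_wrapper_ainl.py may differ.

```python
from datetime import date
from pathlib import Path

# Default location taken from the text above; everything else is illustrative.
DEFAULT_MEMORY_DIR = Path.home() / ".openclaw" / "workspace" / "memory"


def daily_markdown_path(day: date, memory_dir: Path = DEFAULT_MEMORY_DIR) -> Path:
    """Return the YYYY-MM-DD.md file for the given day (hypothetical helper)."""
    return memory_dir / f"{day.isoformat()}.md"


def append_note(text: str, day: date, memory_dir: Path = DEFAULT_MEMORY_DIR) -> Path:
    """Append a note to the day's markdown file, creating it if needed."""
    path = daily_markdown_path(day, memory_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Open in append mode so earlier entries for the same day are preserved.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(text.rstrip("\n") + "\n")
    return path
```

Passing memory_dir explicitly keeps the sketch easy to try against a scratch directory instead of the real workspace.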
Notes
- All monitors use the Standardized Health Envelope (version 1.0) for queue messages.
- Configuration can be externalized via memory records under config.<module>.
- Each monitor writes a heartbeat to cache with key monitor_heartbeat.<module> upon successful completion, enabling meta_monitor.
- Memory records use sensible TTLs (7-90 days) to bound retention; memory_prune enforces physical cleanup.
- Shared memory logic is now factored into include modules under modules/common/:
  - token_cost_memory.ainl for workflow monitor records
  - ops_memory.ainl for ops monitor records
  - generic_memory.ainl for namespace-aware records (session, long_term, intel, etc.)

  This keeps metadata/filter envelopes consistent across monitor programs and reduces drift.
- Runner scripts are located in scripts/run_*.py and are added to OpenClaw cron with the openclaw cron add command.
- For agents implementing or changing monitors: Follow docs/BOT_ONBOARDING.md and docs/OPENCLAW_IMPLEMENTATION_PREFLIGHT.md before coding.
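
The heartbeat convention in the notes above can be sketched as a staleness check of the kind meta_monitor performs. This is a minimal sketch, not the actual implementation: it assumes heartbeats are stored as epoch-second timestamps under monitor_heartbeat.<module>, and that a monitor counts as stale once it has missed roughly two scheduled runs. The GRACE_FACTOR value and the find_stale function are hypothetical; the intervals come from the Schedule column of the table, and the result keys reuse the monitors_ok and monitors_stale metric names.

```python
from typing import Dict, List

# Expected run intervals in seconds, derived from the Schedule column.
EXPECTED_INTERVAL_S: Dict[str, int] = {
    "infrastructure_watchdog": 5 * 60,
    "tiktok_sla_monitor": 15 * 60,
    "canary_sampler": 5 * 60,
    "token_cost_tracker": 60 * 60,
    "lead_quality_audit": 24 * 60 * 60,
    "token_budget_tracker": 60 * 60,
    "session_continuity": 2 * 60 * 60,
    "memory_prune": 24 * 60 * 60,
}

# Assumption: tolerate up to two intervals of silence before flagging.
GRACE_FACTOR = 2.0


def find_stale(heartbeats: Dict[str, float], now: float) -> Dict[str, List[str]]:
    """Split monitors into ok/stale based on their last heartbeat timestamp.

    `heartbeats` maps cache keys (monitor_heartbeat.<module>) to epoch seconds.
    A missing key is treated as stale.
    """
    ok: List[str] = []
    stale: List[str] = []
    for module, interval in EXPECTED_INTERVAL_S.items():
        last = heartbeats.get(f"monitor_heartbeat.{module}")
        if last is None or now - last > interval * GRACE_FACTOR:
            stale.append(module)
        else:
            ok.append(module)
    return {"monitors_ok": ok, "monitors_stale": stale}
```

Because a heartbeat is written only on successful completion, a monitor that runs but fails is flagged the same way as one that never ran.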
Implementation Docs
- General: openclaw/AUTONOMOUS_OPS_EXTENSION_IMPLEMENTATION.md
- Updates:
  - openclaw/TOKEN_COST_TRACKER_UPDATE.md
  - openclaw/INFRASTRUCTURE_WATCHDOG_UPDATE.md
  - openclaw/CANARY_SAMPLER_UPDATE.md
  - openclaw/LEAD_QUALITY_AUDIT_UPDATE.md
  - openclaw/TIKTOK_SLA_MONITOR_UPDATE.md
- New programs:
  - openclaw/MEMORY_PRUNE_IMPLEMENTATION.md
  - openclaw/META_MONITOR_IMPLEMENTATION.md
  - openclaw/SESSION_CONTINUITY_IMPLEMENTATION.md
  - openclaw/TOKEN_BUDGET_TRACKER_IMPLEMENTATION.md
Strict-Mode Roadmap
Current autonomous ops programs use strict_mode=False due to reliance on X ops and split R calls. A future conversion plan will migrate each to strict mode once the necessary IR features are stabilized. See openclaw/AUTONOMOUS_OPS_EXTENSION_IMPLEMENTATION.md for details.
