AI System Guardian
Self-healing monitoring daemon for multi-agent AI architectures. 21-pattern deep code scanner + 13 runtime health checks + auto-restart + pipeline stall recovery. Born from fixing 70+ production bugs across an 11-round debugging marathon.
How It Works
Architecture Diagram
┌─────────────────────────────────────────────────┐
│              Shadow Sentinel v2.0               │
│          (Main Patrol Loop, 10s cycle)          │
├────────────┬────────────┬───────────────────────┤
│ 13 Health  │ Auto-Heal  │ 21-Pattern Security   │
│ Checks     │ Engine     │ Deep Scan (24h)       │
├────────────┼────────────┼───────────────────────┤
│ PM2 Status │ pm2 restart│ eval/exec detection   │
│ HTTP Probe │ Task rollbk│ pickle/ctypes block   │
│ Pipeline   │ Log rotate │ Network allowlist     │
│ Disk/Budget│ Reassign   │ Secret file access    │
└─────┬──────┴─────┬──────┴──────────┬────────────┘
      │            │                 │
      ▼            ▼                 ▼
┌──────────┐ ┌──────────┐ ┌───────────────┐
│ Dashboard│ │ Discord/ │ │ Escalation    │
│ JSON     │ │ Telegram │ │ Task Queue    │
└──────────┘ └──────────┘ └───────────────┘
Use Cases
Multi-Agent AI Systems
When you run 3+ AI agents (orchestrator, workers, dispatcher) via PM2, crashes and infinite loops are inevitable. The guardian catches them in seconds, not hours.
AI-Generated Code Pipelines
If your agents generate and execute Python scripts, the 21-pattern scanner blocks dangerous code (eval, pickle, shell injection) before it runs.
Production Task Queues
Pipeline stall detection finds tasks stuck for 30+ minutes and auto-rolls them back, preventing the entire queue from blocking.
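A minimal sketch of stall detection and rollback, under stated assumptions: tasks are plain dicts with hypothetical `status` and `started_at` fields, and "rollback" means resetting a task to `pending`. The product's real task schema is not shown on this page, so treat the field names and helper functions as illustrative only.

```python
from datetime import datetime, timedelta, timezone

STALL_THRESHOLD = timedelta(minutes=30)  # the "30+ minutes" figure from the text


def find_stalled_tasks(tasks, now=None, threshold=STALL_THRESHOLD):
    """Return tasks that have been 'in_progress' for at least `threshold`."""
    now = now or datetime.now(timezone.utc)
    return [
        t for t in tasks
        if t["status"] == "in_progress" and now - t["started_at"] >= threshold
    ]


def roll_back(task):
    """Return a copy of the task reset to 'pending' so it can be picked up again."""
    rolled = dict(task, status="pending", started_at=None)
    rolled["rollbacks"] = task.get("rollbacks", 0) + 1  # hypothetical counter
    return rolled
```

Returning a copy rather than mutating the task keeps the rollback easy to test and to log before it is written back to the queue.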
Cost-Sensitive API Operations
Budget burn rate monitoring catches runaway API calls (e.g., infinite retry loops calling GPT-4) before they drain your credits.
Frequently Asked Questions
What is AI System Guardian?
AI System Guardian is a self-healing monitoring daemon for multi-agent AI architectures. It runs a 10-second patrol loop that executes 13 runtime health checks (including PM2 process status, HTTP probes, pipeline stall detection, disk space, and budget burn rate) and, every 24 hours, a deep code security scan with 21 patterns (such as eval, exec, pickle, shell injection, and unauthorized network calls). When it detects a failure, it automatically restarts processes, rolls back stuck tasks, or escalates to a human operator.
How does the self-healing work?
When a health check fails, for example because a PM2 process crashes, the sentinel issues a `pm2 restart` command automatically. A 5-minute cooldown prevents restart storms. If the process crashes again within the cooldown, the sentinel writes an escalation task and sends a Discord/Telegram alert. For pipeline stalls, it rolls back the stuck task and reassigns it to a different worker. For bloated logs, it rotates them automatically.
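The restart-with-cooldown behavior can be sketched roughly as follows. This is an illustration, not the sentinel's actual code: the `RestartPolicy` class and `heal` function are hypothetical names, and the sketch assumes that a crash inside the cooldown window leads to escalation in place of another restart. The only external command used is `pm2 restart <name>`.

```python
import subprocess
import time

RESTART_COOLDOWN_S = 5 * 60  # the 5-minute cooldown from the text


class RestartPolicy:
    """Decide between 'restart' and 'escalate' for a crashed process."""

    def __init__(self, cooldown_s=RESTART_COOLDOWN_S, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._last_restart = {}  # process name -> time of the last restart

    def decide(self, name):
        now = self.clock()
        last = self._last_restart.get(name)
        if last is not None and now - last < self.cooldown_s:
            return "escalate"  # crashed again inside the cooldown window
        self._last_restart[name] = now
        return "restart"


def heal(name, policy, run=subprocess.run):
    """Run `pm2 restart <name>` when allowed; otherwise signal escalation."""
    action = policy.decide(name)
    if action == "restart":
        run(["pm2", "restart", name], check=False)
    return action
```

Injecting `clock` and `run` keeps the policy testable without a live PM2 installation; the caller is expected to turn an `"escalate"` result into an escalation task and an alert.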
What are the 21 deep code scan patterns?
The scanner checks AI-generated scripts for: eval()/exec() (arbitrary code execution), __import__/importlib (dynamic imports), os.system/popen/exec*/spawn* (shell injection), subprocess.Popen, ctypes (native code loading), pickle/shelve (insecure deserialization), shutil.rmtree (recursive deletion), input() (blocking I/O), direct .key/.env/token file access, and outbound HTTP calls to non-allowlisted domains.
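One way such a scan could be implemented is with Python's standard `ast` module, which inspects code without executing it. The sketch below covers only a subset of the patterns listed above; the rule sets and the `scan_source` function are assumptions made for illustration, not the product's actual rules.

```python
import ast

# Illustrative subset of the listed patterns (assumed, not the shipped rule set).
BANNED_CALLS = {"eval", "exec", "__import__", "input"}
BANNED_ATTR_CALLS = {
    ("os", "system"), ("os", "popen"),
    ("subprocess", "Popen"), ("shutil", "rmtree"),
}
BANNED_MODULES = {"ctypes", "pickle", "shelve", "importlib"}


def scan_source(source):
    """Return a sorted list of (line, finding) tuples for risky constructs."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BANNED_MODULES:
                    findings.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BANNED_MODULES:
                findings.append((node.lineno, f"import {node.module}"))
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in BANNED_CALLS:
                findings.append((node.lineno, f"call {func.id}()"))
            elif (isinstance(func, ast.Attribute)
                  and isinstance(func.value, ast.Name)
                  and (func.value.id, func.attr) in BANNED_ATTR_CALLS):
                findings.append(
                    (node.lineno, f"call {func.value.id}.{func.attr}()"))
    return sorted(findings)
```

Name-based matching like this misses aliased imports (for example `import os as o`) and dynamically built calls, so it is best treated as a first filter in front of sandboxing rather than a complete defense.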
Do I need PM2 to run this?
PM2 is recommended because the sentinel monitors PM2-managed processes and uses pm2 restart for auto-healing. However, the health check patterns (HTTP probes, disk monitoring, log rotation, code scanning) work independently. You could adapt it to systemd, Docker health checks, or Kubernetes liveness probes.
Can I add custom health checks?
Yes. Each health check is a standalone function that returns (ok, message). To add a custom check, write a function following the same pattern and add it to the shadow_sentinel_scan() main loop. The sentinel will include it in the dashboard JSON and daily digest automatically.
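As an example of the `(ok, message)` pattern, here is a hedged sketch of a disk-space check plus a small runner. Both `check_disk_free` and `run_checks` are hypothetical names; how `shadow_sentinel_scan()` actually registers checks is not shown on this page, so the runner only illustrates the general shape.

```python
import shutil


def check_disk_free(path="/", min_free_ratio=0.10, usage=shutil.disk_usage):
    """Health check: fail when free space on `path` drops below `min_free_ratio`."""
    total, _used, free = usage(path)
    ratio = free / total
    ok = ratio >= min_free_ratio
    return ok, f"{path}: {ratio:.0%} free"


def run_checks(checks):
    """Run each check; a check that raises is recorded as a failure, not a crash."""
    results = {}
    for name, check in checks.items():
        try:
            ok, message = check()
        except Exception as exc:
            ok, message = False, f"check raised {exc!r}"
        results[name] = {"ok": ok, "message": message}
    return results
```

A call such as `run_checks({"disk": check_disk_free})` yields a dict that can be serialized directly to JSON for a dashboard.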
How does budget burn rate monitoring work?
The sentinel tracks API costs logged by the task pipeline. It calculates hourly and daily burn rates and projects monthly spend. If the projected spend exceeds your configured threshold, it sends an alert. This catches runaway retry loops or infinite task generation that can drain API credits in hours.
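The projection can be illustrated with a simple linear extrapolation. This sketch assumes cost events arrive as `(timestamp, cost_in_dollars)` pairs and that the monthly figure is the trailing 24-hour average scaled to a 30-day month; the sentinel's real log format and formula are not given here, so all names and parameters below are assumptions.

```python
from datetime import datetime, timedelta


def project_monthly_spend(cost_events, now, window_hours=24, days_in_month=30):
    """Project monthly spend from the average hourly cost in the trailing window."""
    cutoff = now - timedelta(hours=window_hours)
    recent = sum(cost for ts, cost in cost_events if cutoff <= ts <= now)
    hourly_rate = recent / window_hours
    return hourly_rate * 24 * days_in_month


def budget_alert(cost_events, now, monthly_budget):
    """Return (ok, message), the same shape as the other health checks."""
    projected = project_monthly_spend(cost_events, now)
    ok = projected <= monthly_budget
    return ok, f"projected ${projected:.2f}/month vs budget ${monthly_budget:.2f}"
```

A shorter window reacts faster to a runaway retry loop but is noisier; a real deployment would likely track both an hourly and a daily rate, as the text describes.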