A private research build — not publicly hosted. This write-up focuses on the architecture and the engineering, not a product pitch.
The problem
Most "AI pentester" demos are a single model improvising its way through a target — impressive in a clip, but non-auditable, unsafe to point at anything real, and incapable of learning from its own runs. I wanted the opposite: an autonomous offensive-security agent whose every decision is explicit, logged, safety-gated, and reusable as training signal.
The approach
ChakravyuhRift drives an LLM through a custom async tool-calling loop over a fleet of containerized security tools, and wraps every decision in a Plan-Do-Check-Act (PDCA) contract. Each tool call goes through four steps. PLAN: state the problem first (goal, candidate approaches mapped to real tools, the chosen tool and its expected outcome). DO: run the tool. CHECK: record status and signals. ACT: reflect on prediction versus actual, note falsifications, and write a short carry-forward packet that is injected into the next turn. The result is an agent that reasons in the open and leaves a structured, reviewable trail.
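To make the PDCA contract concrete, here is a minimal sketch of what one cycle record could look like and how it might be written as a single NDJSON line. Everything named here (`Plan`, `Cycle`, `run_cycle`, `to_ndjson_line`, and the field names) is a hypothetical illustration rather than the project's actual schema, and the sketch is synchronous for brevity even though the real loop is async.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Plan:
    # PLAN: problem-first statement (field names are assumptions)
    goal: str
    candidates: Dict[str, str]  # candidate approach -> tool name
    chosen_tool: str
    expected: str


@dataclass
class Cycle:
    plan: Plan
    status: str = ""                                   # CHECK: tool status
    signals: List[str] = field(default_factory=list)   # CHECK: observed signals
    prediction_held: Optional[bool] = None             # ACT: prediction vs. actual
    carry_forward: str = ""                            # ACT: packet for the next turn


def run_cycle(
    plan: Plan,
    run_tool: Callable[[str], Tuple[str, List[str]]],
    reflect: Callable[[Plan, str, List[str]], Tuple[bool, str]],
) -> Cycle:
    """Run one PLAN -> DO -> CHECK -> ACT pass and return the filled-in record."""
    cycle = Cycle(plan=plan)
    status, signals = run_tool(plan.chosen_tool)       # DO
    cycle.status, cycle.signals = status, signals      # CHECK
    cycle.prediction_held, cycle.carry_forward = reflect(plan, status, signals)  # ACT
    return cycle


def to_ndjson_line(cycle: Cycle) -> str:
    """Serialize one cycle as a single JSON object on one line (NDJSON)."""
    return json.dumps(asdict(cycle), sort_keys=True)
```

In this sketch `run_tool` and `reflect` are injected callables, so the same record shape works whether the reflection comes from an LLM call or from a human-in-the-loop driver.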
Architecture
- Agent core: a hand-rolled tool-calling loop (no LangChain) with budget enforcement, context compaction, and anti-repetition steering. A per-decision PDCA engine fires two LLM calls (PLAN, ACT) around each tool dispatch and serializes every cycle to NDJSON.
- Multi-provider LLM layer: one abstraction over Anthropic Claude, OpenAI, local OpenAI-compatible servers (vLLM / Ollama / llama.cpp / MLX), and a human-in-the-loop driver for tightly controlled runs.
- Tool fabric: 30+ containerized tool-runner microservices (nmap, nuclei, sqlmap, ZAP, …), each a small FastAPI service, exposed to LLM clients through an MCP-over-HTTP bridge.
- Reasoning frameworks: a mission-level stack layering Ishikawa, MITRE ATT&CK, Kill Chain, OWASP Top 10, and CVE/CVSS/KEV enrichment onto the loop.
- Safety: a 4-tier permission model (verify → scan → exploit → lab-only), CIDR/hostname allowlists, and hardened non-root, read-only containers.
- Living Map UI: a React/Vite front end that visualizes the attack graph and the co-pilot in real time.
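The safety layer described in the list above can be sketched as a single gate that runs before a tool is dispatched. This is a minimal illustration under stated assumptions: the tiers are treated as a strictly ordered escalation, and the names `Tier`, `target_allowed`, and `authorize` are hypothetical, not the project's actual API.

```python
import ipaddress
from enum import IntEnum
from typing import Iterable, Set


class Tier(IntEnum):
    # Assumed ordering: each tier implies the ones before it.
    VERIFY = 0
    SCAN = 1
    EXPLOIT = 2
    LAB_ONLY = 3


def target_allowed(target: str, cidrs: Iterable[str], hostnames: Set[str]) -> bool:
    """Return True if target is an IP inside an allowed CIDR or an allowed hostname."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        # Not an IP literal: fall back to an exact, case-insensitive hostname match.
        return target.lower() in hostnames
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


def authorize(tool_tier: Tier, granted_tier: Tier, target: str,
              cidrs: Iterable[str], hostnames: Set[str]) -> bool:
    """Deny by default: the tool's tier must be granted AND the target allowlisted."""
    if tool_tier > granted_tier:
        return False
    return target_allowed(target, cidrs, hostnames)
```

For example, with `cidrs=["10.0.0.0/8"]` and a run granted `Tier.SCAN`, a scan-tier tool aimed at `10.1.2.3` passes, while an exploit-tier tool against the same host is refused regardless of the allowlist.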
The self-improvement loop
The payoff of making every decision auditable: those PDCA traces become training data. They're routed per-specialist (recon / web / exploitation / lateral) and used to LoRA-fine-tune smaller models via Foundry. A contamination guard flags decisions that leaned on memorized facts instead of tool-observed evidence, keeping the training signal honest.
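As a rough illustration of the routing and the contamination guard, the sketch below buckets traces by specialist and sets aside any decision that cites a fact not present in its tool-observed evidence. The trace keys (`specialist`, `cited_facts`, `observed`), the exact-match rule, and the choice to hold flagged traces out of the training buckets are all assumptions made for the example; the real guard may work quite differently, and the fine-tuning step itself is not shown.

```python
from typing import Dict, List, Tuple

SPECIALISTS = ("recon", "web", "exploitation", "lateral")


def is_contaminated(cited_facts: List[str], observed: List[str]) -> bool:
    """Flag a decision if any fact it cites is absent from tool-observed evidence."""
    seen = set(observed)
    return any(fact not in seen for fact in cited_facts)


def route_traces(traces: List[dict]) -> Tuple[Dict[str, List[dict]], List[dict]]:
    """Split traces into per-specialist buckets plus a list of flagged traces."""
    clean: Dict[str, List[dict]] = {name: [] for name in SPECIALISTS}
    flagged: List[dict] = []
    for trace in traces:
        if is_contaminated(trace["cited_facts"], trace["observed"]):
            flagged.append(trace)  # held back for review (assumed policy)
        else:
            clean[trace["specialist"]].append(trace)
    return clean, flagged
```

A trace whose plan cites only "22/tcp open" after a scan reported that port would land in its specialist bucket; one that cites a credential no tool ever returned would be flagged.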
Honest scope
This is research, not a shipped product — the headline pilot numbers live in internal design docs, so I'm leaving them off here. What's real and durable is the engineering: a safe, auditable, self-improving agent architecture.