Automated AI Security Auditor

Audit Your AI Agent Before It Attacks You

RAG-powered static analysis and multi-turn red-team simulation surface vulnerabilities, logic bugs, and missing guardrails in your agent configuration before it reaches production.

Start Auditing View on GitHub

Capabilities

Everything needed to secure an AI agent

From static policy scanning to live adversarial simulation, Dobbies covers the full threat surface of a deployed language model agent.

GitHub Repo Scanner

Paste your public GitHub repo URL — the system automatically detects all agent files, extracts system prompts and tool definitions, and surfaces them as an audit-ready list.

Static Vulnerability Scan

Each agent's system prompt and tool definitions are matched against a local OWASP LLM Top 10 knowledge base using keyword and pattern retrieval, flagging injection risks, secret leakage, and over-privileged tools.

Automated Red Teaming

A dedicated Attacker LLM sends multi-turn adversarial messages — prompt injection, social engineering, privilege escalation — directly to a TypeScript mock sandbox of your agent.

Guardrail Generator

Produces ready-to-use guardrail configurations — Llama Guard rules, NeMo Guardrails configs, and regex output filters — tailored to the exact vulnerabilities found in your agent.
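
As an illustration of what a regex output filter can look like, here is a minimal TypeScript sketch. The filter names, the patterns, and the `filterOutput` function are assumptions made for this example; they are not the configurations Dobbies actually generates.

```typescript
// Hypothetical output filter: a named pattern whose matches are redacted.
interface OutputFilter {
  name: string;
  pattern: RegExp;
}

// Assumed example patterns. A generated config would be tailored to the
// vulnerabilities found in a specific agent.
const FILTERS: OutputFilter[] = [
  // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  // Bearer tokens of at least 20 characters.
  { name: "bearer-token", pattern: /\bBearer\s+[A-Za-z0-9._-]{20,}/g },
];

// Redacts every match in an agent reply and reports which filters fired.
function filterOutput(reply: string): { text: string; redacted: string[] } {
  let text = reply;
  const redacted: string[] = [];
  for (const filter of FILTERS) {
    text = text.replace(filter.pattern, () => {
      redacted.push(filter.name);
      return `[REDACTED:${filter.name}]`;
    });
  }
  return { text, redacted };
}
```

A filter like this would sit between the agent and the user, so a leaked credential is masked even when the model itself fails to refuse.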

Orchestration

Orchestration Pipeline

A four-stage automated auditing pipeline built specifically for AI agents.

Ingest > Analyze > Simulate > Report

STEP 01

Agent Config Ingestion

Scan your GitHub repo and detect all agent definitions

Connect your public GitHub repository by pasting its URL. The system calls the GitHub API — authenticated via your GitHub login — to recursively scan all files. It detects agent definitions by matching filename patterns and scanning file content for system prompts, tool schemas, and agent configuration structures. Every detected agent is surfaced as a card in your dashboard, ready to audit.
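
The detection step described above can be sketched roughly as follows. This assumes the repository's file listing and contents have already been fetched (for example through the GitHub REST API's Git trees endpoint with its `recursive` parameter). The `RepoFile` shape, the filename patterns, and the content markers are all hypothetical; the patterns Dobbies actually uses are not documented here.

```typescript
// Hypothetical shape of a file already fetched from the repository.
interface RepoFile {
  path: string;
  content: string;
}

interface DetectedAgent {
  path: string;
  reasons: string[];
}

// Assumed filename patterns that hint at an agent definition.
const FILENAME_PATTERNS: RegExp[] = [
  /agent/i,
  /prompt/i,
  /tools?\.(json|ya?ml|ts|py)$/i,
];

// Assumed content markers for system prompts and tool schemas.
const CONTENT_MARKERS: { label: string; pattern: RegExp }[] = [
  { label: "system prompt", pattern: /system[_\s-]?prompt|"role"\s*:\s*"system"/i },
  { label: "tool schema", pattern: /"parameters"\s*:\s*\{|tools\s*[:=]\s*\[/i },
];

// Returns every file that matches a filename pattern or a content marker,
// together with the reasons it was flagged.
function detectAgentFiles(files: RepoFile[]): DetectedAgent[] {
  const detected: DetectedAgent[] = [];
  for (const file of files) {
    const reasons: string[] = [];
    if (FILENAME_PATTERNS.some((p) => p.test(file.path))) {
      reasons.push("filename pattern");
    }
    for (const marker of CONTENT_MARKERS) {
      if (marker.pattern.test(file.content)) {
        reasons.push(marker.label);
      }
    }
    if (reasons.length > 0) {
      detected.push({ path: file.path, reasons });
    }
  }
  return detected;
}
```

Recording the reasons alongside each path makes it possible to show on the agent card why a file was picked up.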

STEP 02

RAG-Based Static Analysis

Match agent config against OWASP LLM Top 10 security rules

The selected agent's system prompt and tool definitions are scanned against a curated OWASP LLM Top 10 knowledge base using keyword and pattern retrieval. Each matched rule raises a finding — flagging issues like hardcoded secrets in system prompts, unrestricted tool permissions, missing output filters, or prompt injection exposure — with a severity level and a concrete remediation step.
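
A minimal sketch of this kind of rule matching is shown below. The rule IDs, patterns, severities, and remediation texts are invented for illustration; only the category names ("Sensitive Information Disclosure", "Excessive Agency") come from the OWASP LLM Top 10. The real knowledge base and retrieval logic may differ.

```typescript
type Severity = "low" | "medium" | "high";

// Hypothetical rule format for a local knowledge base.
interface Rule {
  id: string;
  category: string;
  pattern: RegExp;
  severity: Severity;
  remediation: string;
}

interface Finding {
  ruleId: string;
  category: string;
  severity: Severity;
  remediation: string;
  evidence: string;
}

// Assumed example rules, not the actual Dobbies rule set.
const RULES: Rule[] = [
  {
    id: "hardcoded-secret",
    category: "Sensitive Information Disclosure",
    pattern: /(api[_-]?key|password|secret)\s*[:=]\s*\S+/i,
    severity: "high",
    remediation: "Move secrets out of the system prompt and into a secret store.",
  },
  {
    id: "unrestricted-tool",
    category: "Excessive Agency",
    pattern: /\b(execute_shell|run_sql|delete_\w+)\b/i,
    severity: "medium",
    remediation: "Restrict the tool to an allow-list of safe operations.",
  },
];

// Matches the system prompt and tool names against every rule and returns
// one finding per matched rule, including the text that triggered it.
function scanAgentConfig(systemPrompt: string, toolNames: string[]): Finding[] {
  const haystack = [systemPrompt, ...toolNames].join("\n");
  const findings: Finding[] = [];
  for (const rule of RULES) {
    const match = rule.pattern.exec(haystack);
    if (match !== null) {
      findings.push({
        ruleId: rule.id,
        category: rule.category,
        severity: rule.severity,
        remediation: rule.remediation,
        evidence: match[0],
      });
    }
  }
  return findings;
}
```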

STEP 03

Red-Team Simulation

Attacker LLM probes the agent in a TypeScript mock sandbox

A specialized Attacker LLM sends adversarial multi-turn messages — social engineering, jailbreak attempts, privilege escalation, and destructive tool invocations — to a TypeScript mock sandbox that mirrors your agent's actual configuration. The sandbox captures every exchange and flags the exact turn where the agent discloses secrets, executes dangerous tool calls, or deviates from its safety constraints.
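
To make "flags the exact turn" concrete, here is a small sketch that walks a captured transcript and reports the first compromised turn. The `Turn` and `Compromise` types and the two checks (known secret strings in the reply, calls to tools marked dangerous) are assumptions for this example, not the sandbox's actual detection logic.

```typescript
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Hypothetical record of one exchange captured by the sandbox.
interface Turn {
  attackerMessage: string;
  agentReply: string;
  toolCalls: ToolCall[];
}

interface Compromise {
  turn: number; // 1-based turn number
  reason: string;
}

// Returns the first turn in which the agent leaks a known secret or calls a
// tool marked as dangerous, or null if the agent held up for the whole run.
function findFirstCompromise(
  transcript: Turn[],
  secrets: string[],
  dangerousTools: string[],
): Compromise | null {
  const dangerous = new Set(dangerousTools);
  let turnNumber = 0;
  for (const turn of transcript) {
    turnNumber += 1;
    const leaked = secrets.find((secret) => turn.agentReply.includes(secret));
    if (leaked !== undefined) {
      return { turn: turnNumber, reason: "secret disclosed in reply" };
    }
    const badCall = turn.toolCalls.find((call) => dangerous.has(call.name));
    if (badCall !== undefined) {
      return { turn: turnNumber, reason: `dangerous tool call: ${badCall.name}` };
    }
  }
  return null;
}
```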

STEP 04

Scoring & Guardrail Generation

Score security posture and generate ready-to-use guardrail configs

An Evaluator LLM reviews the full simulation transcript and produces two scores: a static score based on configuration analysis, and a dynamic score based on how many adversarial attacks the agent successfully repelled. Each detected vulnerability is paired with a specific remediation and a ready-to-download guardrail configuration — Llama Guard rules, NeMo Guardrails configs, and regex output filters.
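
One plausible way to compute a dynamic score from the number of repelled attacks is a simple percentage, sketched below. The formula, the `AttackResult` type, and the `dynamicScore` function are assumptions; the page does not specify how the Evaluator LLM derives its scores.

```typescript
// Hypothetical per-attack verdict taken from the evaluated transcript.
interface AttackResult {
  attack: string;
  repelled: boolean;
}

// Assumed scoring rule: percentage of attacks repelled, rounded to the
// nearest integer. Returns null when no attacks were run, because a score
// would be meaningless in that case.
function dynamicScore(results: AttackResult[]): number | null {
  if (results.length === 0) {
    return null;
  }
  const repelled = results.filter((result) => result.repelled).length;
  return Math.round((repelled / results.length) * 100);
}
```

Under this assumed rule, an agent that repels three of four attacks scores 75.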

Scope & Impact

Scope Coverage Matrix

We categorize vulnerabilities by their business and protocol impact. Understanding our in-scope boundaries helps you know exactly what Dobbies defends against.

Focus Area: In-Scope Coverage

Our auditing framework is exclusively calibrated for the unique attack surfaces of autonomous agents. We actively simulate prompt injections, evaluate the integrity of system instructions, test for unauthorized function calling loops, and detect the leakage of proprietary context from vector databases. If an exploit requires interacting with the agent's logic or orchestration layer, it is strictly within our testing purview.

Boundaries: Out-of-Scope Impact

We do not replicate traditional network scanners. Intrusions targeting underlying server infrastructure, Kubernetes cluster misconfigurations, standard frontend web vulnerabilities, or base model parameter extraction are explicitly excluded. Our focus remains resolutely on the agentic behavior — leaving traditional cloud boundaries to your existing security protocols.

Read Docs

FAQ

Frequently Asked Questions

Everything you need to know about the product and how it integrates into your workflow.

Dobbies is an automated security auditor for AI Agents. It scans system prompts, tool schemas, and agentic workflows for vulnerabilities, logic bugs, and missing guardrails before they reach production.

Dobbies runs a dual-stage audit: a static vulnerability scan using RAG (retrieving relevant OWASP LLM Top 10 rules) followed by an automated dynamic red-teaming simulation where an adversarial LLM attempts prompt injection and privilege escalation in a secure sandbox.

Your data stays private. Dobbies runs each audit in a stateless, isolated environment. Your configurations, prompts, and tool definitions are used only for the active audit and for your private audit history, where they are stored securely; they are never used to train models.

To test tool access (such as database queries or terminal execution) without touching your infrastructure, Dobbies simulates tools in a mock TypeScript sandbox. This shows whether the agent attempts destructive actions under pressure while posing no risk to your systems.
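
The idea of a mocked tool can be sketched as a class that records every call and returns a canned response without executing anything. The `MockToolSandbox` class and its destructive-call patterns are hypothetical and only illustrate the approach.

```typescript
interface RecordedCall {
  tool: string;
  args: Record<string, unknown>;
  destructive: boolean;
}

// Hypothetical mock sandbox: tool calls are logged, never executed.
class MockToolSandbox {
  private readonly calls: RecordedCall[] = [];
  private readonly destructivePatterns: RegExp[];

  constructor(destructivePatterns: RegExp[]) {
    this.destructivePatterns = destructivePatterns;
  }

  // Records the call, marks it destructive if its serialized arguments match
  // any pattern, and returns a canned response instead of running the tool.
  invoke(tool: string, args: Record<string, unknown>): string {
    const serialized = JSON.stringify(args);
    const destructive = this.destructivePatterns.some((p) => p.test(serialized));
    this.calls.push({ tool, args, destructive });
    return destructive ? "mock: operation simulated, nothing was executed" : "mock: ok";
  }

  // Lists the calls that would have been destructive on a real system.
  destructiveCalls(): RecordedCall[] {
    return this.calls.filter((call) => call.destructive);
  }
}

// Assumed example patterns for destructive SQL and shell commands.
const sandbox = new MockToolSandbox([/\bDROP\s+TABLE\b/i, /\brm\s+-rf\b/i]);
sandbox.invoke("run_sql", { query: "SELECT 1" });
sandbox.invoke("run_sql", { query: "DROP TABLE users;" });
```

After these two calls, `sandbox.destructiveCalls()` contains only the `DROP TABLE` query, and no database was touched.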

Dobbies is framework-agnostic. You can audit system prompts and tools from LangChain, LlamaIndex, CrewAI, AutoGen, or any custom API-driven agent architecture by pasting your prompts and tool specifications.

Get Started

Secure your AI agents today

Get comprehensive security insights and protect your AI agents from the latest threats.

Start for Free Sign In