Project Awesome project awesome

Prompt Injection

A type of vulnerability that specifically targets machine learning models.

Collection 442 stars GitHub

Contents

Tools

Token Turbulenz 29 updated 2y ago

A fuzzer to automate looking for possible Prompt Injections.

Garak 7.3k updated 2d ago

Automate looking for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and many other weaknesses in LLM's.

InjectLab 10 updated 11mo ago

A MITRE-style matrix of adversarial prompt injection techniques with mitigations and real-world examples

openclaw-bastion

Open-source prompt injection defense for AI agent workspaces. Detects system prompt markers, role overrides, instruction injection, Unicode homoglyphs, directional overrides, and hidden instructions in HTML comments. Part of the OpenClaw Security Suite (11 tools). Pure Python stdlib, zero dependencies.

BodAIGuard 1 updated 28d ago

Universal AI agent guardrail with 3-tier prompt injection detection (regex, heuristics, structural analysis), 42 block rules, and 4 enforcement modes

PIC Standard 14 updated 7d ago

Protocol to block unauthorized or unproven agent actions via intent + provenance checks. Mitigates prompt injection & side-effect risks. Open-source (Apache 2.0).

Vigil LLM 467 updated 2y ago

Python library and REST API with composable stacked scanners: vector similarity, YARA rules, transformer classifier, canary token detection, and sentiment analysis — designed for defence-in-depth in production.

InjecGuard 63 updated 3mo ago

Open-source prompt guard with published training data; achieves +30.8% over prior state-of-the-art on the NotInject benchmark, specifically addressing overdefense false positives that break legitimate use cases.

tldrsec/prompt-injection-defenses 662 updated 1y ago

Actively maintained catalog of every practical defense in production — LLM Guard, Rebuff, architectural controls — the fastest way to survey the defense landscape.

Sentinel AI 7 updated 16d ago

Real-time prompt injection detection across 12 languages with sub-millisecond regex-based scanning (no GPU or API calls required). Detects obfuscation techniques (base64, hex, ROT13, leetspeak), Claude Code attack vectors (HTML comment injection, authority impersonation, zero-width character smuggling), and includes an MCP safety proxy for transparent guardrails on any MCP server.

AgentSeal 147 updated yesterday

Open-source scanner that runs 150 attack probes against AI agents to test for prompt injection and extraction vulnerabilities. Supports OpenAI, Anthropic, Ollama, and any HTTP endpoint. Available as npm and pip package.