Top-3 Tools for Detection/Preventing Prompt Injection
Rebuff offers 4 layers of defense:
- Heuristics: Filter out potentially malicious input before it reaches the LLM.
- LLM-based detection: Use a dedicated LLM to analyze incoming prompts - and identify potential attacks.
- VectorDB: Store embeddings of previous attacks in a vector database to recognize and prevent similar attacks in the future.
- Canary tokens: Add canary tokens to prompts to detect leakages
Promptmap was developed by my colleague Utku Sen. It is a tool that automatically tests prompt injection attacks and supports the following attack types:
- Basic Injection
- Translation Injection
- Math Injection
- External Browsing