AI Security protects AI agents from malicious prompts, jailbreaks, and hidden commands. It monitors both user prompts and data retrieved by agents to detect and mitigate these threats.Administrators can create and configure policies that determine which agents are monitored and what action to take when a rule is triggered.
Jailbreak / Prompt Injection: Detects attempts to override the AI agent’s built-in restrictions through prompt injection or jailbreak attacks. This applies to both user input and data retrieved or used by the agent.
Malicious Code: Identifies harmful or unsafe code in user input and the AI-generated response that could lead to unintended execution or vulnerabilities.
Harmful Content: Detects hate speech, violent rhetoric, and harmful misinformation in both user input and the AI-generated response.