Tag

lang-en

red-team

Prompt Injection Defense in Agentic Systems

How we defend Gandalf, Gwaihir and Beorn from payloads hidden in banners, writeups and DNS responses. Instruction hierarchy, Spotlighting, StruQ and our Sentinel.

May 15, 2026
ai-ml

Constitutional AI for Offensive Agents

How we applied Constitutional AI and RLAIF to Gandalf CLI so that our offensive agents reject out-of-scope actions on their own, without relying on manual prompt engineering.

May 15, 2026