@mttaggart In the case of agentic stuff you can "just" pop calc, and in the case of natural language output ("say harmful things") the words by themselves are not dangerous. My bigger problem is: how do you define vulnerabilities in a system where the controls are usually just another non-deterministic pattern matcher? It is *bound* to let things slip!