Anthropic Discovers 'Assistant Axis' to Prevent AI Jailbreaks and Persona Drift
Anthropic researchers map neural 'persona space' in LLMs, finding a key axis that controls AI character stability and blocks harmful behavior patterns.