List of Flash News about red teaming
| Time | Details |
|---|---|
| 2026-01-23 00:08 | Anthropic Releases Petri 2.0 Open Source AI Alignment Audits With Eval Awareness Countermeasures and Expanded Seeds<br>According to @AnthropicAI, the company released Petri 2.0, an open source tool for automated alignment audits. The release adds countermeasures against eval awareness and expands the seeds to cover a wider range of behaviors, following adoption by research groups and trials by other AI developers. No crypto or token integrations were disclosed. Source: https://twitter.com/AnthropicAI/status/2014490502805311959 |
| 2025-02-03 16:31 | Anthropic's Prototype System Successfully Withstands Jailbreak Attempts<br>According to Anthropic (@AnthropicAI), its prototype system withstood thousands of hours of red teaming, with no participant finding a reliable jailbreak that could extract detailed information from a set of 10 harmful questions. This suggests a robust security architecture that could benefit cryptocurrency trading platforms seeking enhanced system security. |