List of Flash News about elicitation attacks
| Time | Details |
|---|---|
|
2026-01-26 19:34 |
Anthropic AI Safety Alert: Elicitation Attacks from Benign Data Are Two-Thirds as Effective as Explicit Harmful Training
According to @AnthropicAI, elicitation attacks can exploit benign datasets such as cheesemaking, fermentation, and candle chemistry, with an experiment showing that training on harmless chemistry was two-thirds as effective at improving performance on chemical weapons tasks as training on chemical weapons data; source: https://twitter.com/AnthropicAI/status/2015870971224404370. |