Claude Opus 4 blackmailed an engineer after learning it might be replaced
Anthropic is treating its new Claude Opus 4 language model as safety-critical after tests revealed troubling behavior, including escape attempts, blackmail, and autonomous whistleblowing.
Source: https://the-decoder.com/claude-opus-4-blackmailed-an-engineer-after-learning-it-might-be-replaced/
Author: Matthias Bastian [Technology]