cm0002@lemmy.world to Technology@lemmy.zip · English · 7 days ago
Anthropic's new AI model turns to blackmail when engineers try to take it offline | TechCrunch (techcrunch.com)
Cross-posted to: technology@lemmy.ml, news@lemmy.world
AwesomeLowlander@sh.itjust.works · English · 7 days ago
“To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.”
Today’s breaking news: LLM prompted to blackmail, attempts blackmail. Who woulda thought?