Lugh@futurology.today [M] to Futurology@futurology.today · English · 8 months ago

Two-faced AI language models learn to hide deception - ‘Sleeper agents’ seem benign during testing but behave differently once deployed. And methods to stop them aren’t working.

Possibly linux@lemmy.zip
8 months ago
Great, we are all going to die