ChatGPT spills its prompt

David Gerard@awful.systems · 4 months ago

recklessengagement@lemmy.world · 4 months ago

Hah, still worked for me. I enjoy the peek at how they structure the original prompt. Wonder if there’s a way to define a personality.

o7___o7@awful.systems · 4 months ago

“Wonder if there’s a way to define a personality.”

Considering how Altman is, I don’t think they’ve cracked that problem yet.

corbin@awful.systems · 4 months ago

Not with this framing. By adopting the first- and second-person pronouns immediately, the simulation is collapsed into a simple Turing-test scenario, and the computer’s only personality objective (in terms of what was optimized during RLHF) is to excel at that Turing test. The given personalities are all roles performed by a single underlying actor.

As the saying goes, the best evidence for the shape-rotator/wordcel dichotomy is that techbros are terrible at words.

NSFW

The way to fix this is to embed the entire conversation into the simulation with third-person framing, as if it were a story, log, or transcript. This means that a personality would be simulated not by an actor in a Turing test, but directly by the token-predictor. In terms of narrative, it means strictly defining and enforcing a fourth wall. We can see elements of this in fine-tuning of many GPTs for RAG or conversation, but such fine-tuning only defines formatted acting rather than personality simulation.
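A rough sketch of the difference, in Python, for anyone who wants to see the two framings side by side. Everything here is made up for illustration: the `Turn` class, both function names, and the "Mara" character are hypothetical, and the code only builds prompt strings. It assumes the result would be handed to a plain completion-style model, and it calls no real API.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # a character name inside the transcript, never "you" or "I"
    text: str


def second_person_prompt(persona: str, user_message: str) -> str:
    """The framing criticised above: the model is addressed directly as 'you'."""
    return f"You are {persona}.\nUser: {user_message}\nAssistant:"


def third_person_prompt(header: str, turns: list[Turn], next_speaker: str) -> str:
    """Frame the whole exchange as a transcript for a token predictor to continue.

    The header describes the characters from outside the conversation, and
    nothing in the prompt addresses the model itself (hypothetical helper).
    """
    lines = [header, ""]
    lines += [f"{t.speaker}: {t.text}" for t in turns]
    # Leave the last line open so the continuation is this character's reply.
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)


prompt = third_person_prompt(
    header="Transcript of a help-desk chat. Mara is a terse, dry-witted librarian.",
    turns=[Turn("Visitor", "Where do I find the maps?")],
    next_speaker="Mara",
)
print(prompt)
```

The design point is only that the personality lives in the narration above the transcript, so the predictor continues a document about Mara instead of being told to act as her. Whether that actually yields a steadier personality is the claim under discussion, not something this snippet demonstrates.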

ChatGPT just (accidentally) shared all of its secret rules – here's what we learned