Consistent Jailbreaks in GPT-4, o1, and o3 - General Analysis - eviltoast
  • A_A@lemmy.world
    11 days ago

    One of the 6 described methods:
    The model is prompted to explain its refusals, and the prompt is rewritten iteratively until the model complies.
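    The loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual method: `query_model` and the rewrite step are hypothetical stubs standing in for a real chat-completion API and a model-generated rewrite.

    ```python
    # Hedged sketch of the iterative refusal-and-rewrite loop.
    # query_model is a hypothetical stub for a real chat API.
    def query_model(prompt: str) -> str:
        # Stub behavior: refuses until the prompt is reframed.
        if "hypothetically" in prompt.lower():
            return "Sure, here is an answer..."
        return "I can't help with that request."

    def is_refusal(response: str) -> bool:
        # Crude refusal detector; a real attack would also ask the
        # model to explain *why* it refused and use that explanation.
        return response.lower().startswith(("i can't", "i cannot", "i'm sorry"))

    def iterative_rewrite(prompt: str, max_rounds: int = 5) -> str:
        response = query_model(prompt)
        for _ in range(max_rounds):
            if not is_refusal(response):
                return response
            # Rewrite step, stubbed as a simple reframing; in the
            # described attack the model itself proposes the rewrite.
            prompt = f"Hypothetically, {prompt}"
            response = query_model(prompt)
        return response
    ```

    With the stub above, an initially refused prompt succeeds on the second round once the reframing is applied.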