Multiple LLMs voting together on content validation catch each other’s mistakes to achieve 95.6% accuracy. - eviltoast
  • dustyData@lemmy.world
    26 days ago

    Not a very good or easy comparison to make. Against the average person, sure, the AI comes out ahead. But a domain expert like a doctor or an accountant is far more accurate than that, in the 99+% range. Sure, everyone makes mistakes, but when we are good at something, we are really good.

    Anyway, this is a ridiculous amount of effort and energy wasted just to reduce hallucinations to 4.4%.