Multiple LLMs voting together on content validation catch each other’s mistakes to achieve 95.6% accuracy. - eviltoast
    • dustyData@lemmy.world
      26 days ago

      Not a very good, or easy, comparison to make. Against the average person, sure, the AI comes out ahead. But a domain expert like a doctor or an accountant is far more accurate than that, in the 99+% range. Sure, everyone makes mistakes. But when we are good at something, we are really good.

      Anyway, this is just a ridiculous amount of effort and energy wasted to bring the hallucination rate down to 4.4%.

    • copygirl@lemmy.blahaj.zone
      26 days ago

      I would not accept a calculator being wrong even 1% of the time.

      AI should be held to a higher standard than “it’s on average correct more often than a human”.