GPT-4 is getting worse over time, not better. - eviltoast
  • Hello_there@kbin.social
    link
    fedilink
    arrow-up
    17
    ·
    1 year ago

    I asked for a list of words, and asked to remove any words ending in the letter a. It couldn’t do it. I could fight my way there, but the next revision added some back.

    • FaceDeer@kbin.social
      link
      fedilink
      arrow-up
      32
      ·
      edit-2
      1 year ago

      This weakness in ChatGPT and its siblings isn’t surprising when viewed in light of how these large language models work. They don’t actually “see” the things we type as a string of characters. Instead, there’s an intermediate step where what we type gets turned into “tokens.” Tokens are basically numbers that index words in a giant dictionary that the LLM knows. So if I type “How do you spell apple?” It gets turned into [2437, 466, 345, 4822, 17180, 30]. The “17180” in there is the token ID for “apple.” Note that that token is for “apple” with a lower case; the token for “Apple” is “16108”. The LLM “knows” that 17180 and 16108 usually mean the same thing, but that 16108 is generally used at the beginning of a sentence or when referring to the computer company. It doesn’t inherently know how they’re actually spelled or capitalized unless there was information in its training data about that.

      You can play around with OpenAI’s tokenizer here to see more of this sort of thing. Click the “show example” button for some illustrative text, using the “text” view at the bottom instead of “token id” to see how it gets chopped up.

    • Markimus@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      arrow-down
      3
      ·
      1 year ago

      For that to work, it would need to create a Python script that would remove all of the words ending with A, run that script, and then give you the results. I think this process of handling user requests is in the works.

      • TheFutureIsDelaware@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        10
        ·
        edit-2
        1 year ago

        It would not HAVE to do that, it just is much harder to get it to happen reliably through attention, but it’s not impossible. But offloading deterministic tasks like this to typical software that can deal with them better than an LLM is obviously a much better solution.

        But this solution isn’t “in the works”, it’s usable right now.

        Working without python:

        It left out the only word with an f, flourish. (just kidding, it left in unfathomable. Again… less reliable.)