Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world! - eviltoast
  • TheOubliette@lemmy.ml
    3 months ago

    Okay so both of those ideas are incorrect.

    As I said, many are literally Markovian, and the main discriminator is beam, which does not really matter for helping people understand my meaning, nor should it confuse anyone who understands this topic. I will repeat: there are examples that are literally Markovian. In your example, it would be me saying there are rectangular phones and you stepping in to say, “But look, those ones are curved! You should call it a shape, not a rectangle.” I’m not really wrong, and your point is a nitpick that makes communication worse.

    In terms of stochastic processes: no, that is incredibly vague, just as calling a phone a “shape” would not be more descriptive or communicate better. So many things follow stochastic processes that are nothing like a Markov chain, whereas LLMs are like Markov chains, either literally being them or being a modified version that uses derived tree representations.

    • howrar@lemmy.ca
      3 months ago

      I’m not familiar with the term “beam” in the context of LLMs, so that’s not factored into my argument in any way. LLMs generate text based on the history of tokens generated thus far, not just the last token. That is by definition non-Markovian. You can argue that an augmented state space would make it Markovian, but you can say that about any stochastic process. Once you start doing that, both become mathematically equivalent. Thinking about this a bit more, I don’t think it really makes sense to talk about a process being Markovian or not without a wider context, so I’ll let this one go.
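The augmented-state point can be made concrete with a toy sketch in Python. This is an illustration only, not how any real LLM works: the helper names `fit_counts` and `is_deterministic` and the repeating `aabb` sequence are assumptions made up for the example. The next token here depends on the previous two tokens, so the process is not Markovian over single tokens, but it becomes a (degenerate) Markov chain once the state is redefined as the last two tokens.

```python
from collections import defaultdict


def fit_counts(tokens, order):
    """Count next-token occurrences conditioned on the previous `order` tokens."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(order, len(tokens)):
        state = tuple(tokens[i - order:i])  # the (possibly augmented) state
        counts[state][tokens[i]] += 1
    return counts


def is_deterministic(counts):
    """True if every observed state is always followed by the same token."""
    return all(len(next_tokens) == 1 for next_tokens in counts.values())


# Toy sequence: a a b b a a b b ...
tokens = list("aabb" * 5)

# State = last token only: "a" is followed by both "a" and "b",
# so the last token alone does not determine what comes next.
single = fit_counts(tokens, order=1)

# State = last two tokens: every state has exactly one successor.
augmented = fit_counts(tokens, order=2)

print(is_deterministic(single))
print(is_deterministic(augmented))
```

The same trick works for any fixed history length, which is why "Markovian" only means something once the state space has been fixed.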

      “nitpick that makes communication worse”

      How many readers do you think know what “Markov” means? How many would know what “stochastic” or “random” means? I’m willing to bet that the former is a strict subset of the latter.

      • TheOubliette@lemmy.ml
        3 months ago

        The very first response I gave said you just have to reframe state.

        This is getting repetitive and I think it is because you aren’t really trying to understand what I am saying. Please let me know when you are ready to have an actual conversation.

        • howrar@lemmy.ca
          3 months ago

          “The very first response I gave said you just have to reframe state.”

          And I said “an augmented state space would make it Markovian”. Is that not what you meant by reframing the state? If not, then apologies for the misunderstanding. I do my best, but I understand that falls short sometimes.