Microsoft Needs So Much Power to Train AI That It's Considering Small Nuclear Reactors - eviltoast
    • FooBarrington@lemmy.world
      English · 1 upvote · 1 year ago

      I can’t imagine they are. What would the training data for those models be? Why would you train the model when the user sends a request? Why would you wait to respond to the request until the model is trained?

      • frezik@midwest.social
        English · 1 upvote · 1 year ago

        Often, these models run in a feedback loop. The input from one search query is itself training data that affects the result of the next query.

        • FooBarrington@lemmy.world
          English · 1 upvote · 1 year ago

          Sure, but that’s not done with the kind of model this thread is about (separate training and inference). You’re talking about classical ML models with continuous updates, which you wouldn’t run on this kind of GPU infrastructure.
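
          To make the distinction concrete, here is a rough toy sketch in plain Python. Everything in it is hypothetical (the class names `FrozenModel` and `OnlineModel`, the single-weight "model", the learning rate, the sample data) and it is nowhere near how a real system is built. It only illustrates the difference: with separate training and inference, serving a request never touches the weights, while a classical online learner takes a small update step after each observation.

```python
# Toy illustration only: all names and numbers here are made up.

class FrozenModel:
    """Separate training and inference: weights are fixed once deployed."""

    def __init__(self, weight: float):
        self.weight = weight

    def predict(self, x: float) -> float:
        # Serving a request only reads the weight; it never changes it.
        return self.weight * x


class OnlineModel:
    """Classical online learner: each observed (x, y) pair nudges the weight."""

    def __init__(self, weight: float = 0.0, lr: float = 0.1):
        self.weight = weight
        self.lr = lr

    def predict(self, x: float) -> float:
        return self.weight * x

    def update(self, x: float, y: float) -> None:
        # One stochastic gradient step on the squared error 0.5 * (pred - y)**2.
        error = self.predict(x) - y
        self.weight -= self.lr * error * x


def run_demo():
    """Feed the same stream of requests to both models; return final weights."""
    frozen = FrozenModel(weight=1.0)
    online = OnlineModel(weight=1.0, lr=0.1)
    stream = [(1.0, 2.0), (2.0, 4.0), (1.0, 2.0)]  # (input, observed outcome)
    for x, y in stream:
        frozen.predict(x)    # inference only
        online.predict(x)    # inference ...
        online.update(x, y)  # ... followed by an immediate update
    return frozen.weight, online.weight


if __name__ == "__main__":
    frozen_w, online_w = run_demo()
    print("frozen weight:", frozen_w)
    print("online weight:", online_w)
```

          The frozen model’s weight is the same after the stream as before it, while the online model’s weight drifts toward the value that fits the observed data. The update step here costs a handful of floating-point operations, which is why this style of continuous learning doesn’t need the kind of GPU infrastructure the article is about.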