How does Lemmy feel about "open source" machine learning, akin to the Fediverse vs Social Media? - eviltoast

Obviously there’s not a lot of love for OpenAI and other corporate API generative AI here, but how does the community feel about self hosted models? Especially stuff like the Linux Foundation’s Open Model Initiative?

I feel like a lot of people just don’t know there are Apache/CC-BY-NC licensed “AI” they can run on sane desktops, right now, that are incredible. I’m thinking of the most recent Command-R, specifically. I can run it on one GPU, and it blows expensive API models away, and it’s mine to use.

And there are efforts to kill the power cost of inference and training with stuff like matrix-multiplication free models, open source and legally licensed datasets, cheap training… and OpenAI and such want to shut down all of this because it breaks their monopoly, where they can just outspend everyone scaling , stealiing data and destroying the planet. And it’s actually a threat to them.

Again, I feel like corporate social media vs fediverse is a good anology, where one is kinda destroying the planet and the other, while still niche, problematic and a WIP, kills a lot of the downsides.

  • brucethemoose@lemmy.worldOP
    link
    fedilink
    arrow-up
    2
    ·
    4 months ago

    Cutting edge ones? Unfortunately, rarely. Right now there’s a sliding scale between “open and transparent” and “smart and performant” because they’re just so darn expensive to train.

    I think some of the closest ones to your requirements are Nvidia’s research models, excluding Mistral Nemo which isn’t as well documented (as its really a Mistral Model). And you can see a lot of the open “alternative” efforts like RWKV, openllama and such are severely underfunded and undertrained.

    The datasets are there, the highly optimized implementations are getting there, pieces are there, a lot of of models have detailed papers, fully open codebases, but the funding to actually do it is just too much to deal with most of the time.

    Another factor is that “closed” datasets like whatever Mistral, Facebook, Cohere and such use do seem to have an edge.