Self hosted LLM - eviltoast

Hello internet users. I have tried gpt4all and like it, but it is very slow on my laptop. I was wondering if anyone here knows of any solutions I could run on my server (debian 12, amd cpu, intel a380 gpu) through a web interface. Has anyone found any good way to do this?

  • grilledcheesecowboy@kbin.social
    link
    fedilink
    arrow-up
    4
    ·
    edit-2
    10 months ago

    I’ve had pretty good luck running llamafile on my laptop. The speeds aren’t super fast, and I can only use the models that are Mistral 7B and smaller, but the results are good enough for casual use and general R and Python code.

    Edit: my laptop doesn’t have a dedicated GPU, and I don’t think llamafile has support for Intel GPUs yet. CPU inference is still pretty quick.