Recommendations on running GPTs on Asahi - M1 Ultra?

plsnotracking@lemmy.world · 1 year ago


Guenther_Amanita@feddit.de · 1 year ago

I don’t know what your intention is.
I’m no expert or highly qualified in any way, so please correct me if I’m wrong, but I’m not sure that’s the right approach.

LLMs usually need lots of computing power, ideally in the form of a GPU.
I use GPT4All, and when I send a prompt, I notice the temps, fan speed and GPU usage jump to almost 100% instantly. If it’s a longer prompt, my PC sounds like a helicopter 😁

When hosting a server, you want hardware that’s just barely good enough for your service, e.g. running your cloud. That means far less power draw, which matters because the machine runs 24/7. Something powerful enough to run LLMs comfortably would likely draw lots of power, even on Apple Silicon.
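
To put the 24/7 point into numbers, here is a rough sketch of the arithmetic. The wattages and the electricity price are made-up placeholders, not measurements of any real machine, so plug in your own values.

```python
HOURS_PER_YEAR = 24 * 365  # a server that never sleeps


def yearly_kwh(average_watts: float) -> float:
    """Energy used in one year at a constant average draw, in kWh."""
    return average_watts * HOURS_PER_YEAR / 1000


def yearly_cost(average_watts: float, price_per_kwh: float) -> float:
    """Yearly electricity cost for that draw at a given price per kWh."""
    return yearly_kwh(average_watts) * price_per_kwh


# Hypothetical comparison: a small box vs. a machine busy with LLM work.
# Both wattages and the price are illustrative assumptions only.
small_box_watts = 10
busy_gpu_watts = 300
assumed_price_per_kwh = 0.30

difference = yearly_cost(busy_gpu_watts, assumed_price_per_kwh) - yearly_cost(
    small_box_watts, assumed_price_per_kwh
)
```

What counts for a server is the average draw over the whole year, so idle consumption usually matters more than the peak during a prompt.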

I think you’re better off just using GPT4All on your gaming PC when you need it.

I hope I’m wrong, and that M1s draw barely any power, especially at idle.
But even if I am, they can (almost) only run macOS, which wouldn’t be a good server OS.

c10l@lemmy.world · 1 year ago

On macOS I’ve been using Ollama. It’s very easy to set up, can run as a service, and exposes an API.

You can talk to it directly from the CLI (ollama run ) or via applications and plugins (like https://continue.dev) that consume the API.
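
For anyone curious what "consuming the API" looks like, here is a minimal sketch using only the Python standard library. It assumes Ollama is serving on its default local address (http://localhost:11434) and uses the /api/generate endpoint with streaming turned off. The helper names are my own, and the model name you pass in is whatever model you have already pulled.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(
    model: str, prompt: str, url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a POST request asking for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(raw: bytes) -> str:
    """Pull the generated text out of the JSON body of a non-streamed reply."""
    return json.loads(raw.decode("utf-8"))["response"]


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the model's answer."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_text(reply.read())
```

With the server running, something like `print(ask("llama2", "Why is the sky blue?"))` should print the answer, where `llama2` is only a placeholder for a model you have installed. Long answers on slow hardware may need a bigger timeout.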

It can run on Linux but I haven’t personally tried it.

https://ollama.ai/