RE: The future of AI – self hosted open sourced models?

You can actually run a very small Gemma 4 model with about 8 GB of VRAM on your computer, but I tried that on my old machine and it was very noisy and very slow (I have only 16 GB of RAM, Windows uses a big part of it, and there is no dedicated GPU), so it didn't work. I researched a bit, and these are the best options for hosting really big models locally:

  • AMD Ryzen AI Max+ 395 with 128 GB - in my opinion this is the superpower for locally hosted AI
  • NVIDIA DGX Spark, which has a similar design, also with 128 GB, but is slightly more expensive
  • Mac Studio, but it's difficult to find one with enough RAM, and high-RAM configurations get quite expensive.

All three machines are actually quite low in power consumption and can run pretty big models efficiently. Their design is similar: the CPU and GPU share one pool of RAM, so most of that memory can be used for the model.
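To get a feel for what "pretty big" means with 128 GB of shared RAM, here is a rough back-of-the-envelope sketch. It only uses the rule that the weights need about (number of parameters × bits per weight / 8) bytes. The function names and the 1.2 overhead factor (for context cache and runtime) are my own assumptions for illustration, not measured values.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def fits(params_billion: float, bits_per_weight: int, memory_gb: float,
         overhead: float = 1.2) -> bool:
    """Rough check whether a model fits in the given memory.

    `overhead` is an assumed safety factor for the context cache and
    runtime buffers; real usage depends on the runtime and context length.
    """
    needed_gb = estimate_weights_gb(params_billion, bits_per_weight) * overhead
    return needed_gb <= memory_gb


# Hypothetical examples: a 70B model at 4-bit and at 16-bit on a 128 GB machine.
for bits in (4, 16):
    print(bits, "bit:", estimate_weights_gb(70, bits), "GB of weights,",
          "fits in 128 GB:", fits(70, bits, 128))
```

This is only an estimate. The operating system also needs its share of the RAM, so the usable amount is less than the full 128 GB.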

I think with 16 GB of VRAM you could actually run gemma 4:e2b or even gemma 4:e4b. You would need to install Ollama and download the model through it. Then you could run it directly in Ollama and use it as a local LLM. It probably wouldn't be enough to run agents, but it would be enough for a local LLM. You would need to check if your computer can deal with the workload :-)
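Once Ollama is installed and the model is downloaded (with `ollama pull` followed by the model tag), you can also talk to it from a script instead of the terminal. Here is a minimal sketch in Python using Ollama's local HTTP API on its default port 11434. The tag `gemma4:e2b` is my guess at how the model is named in the Ollama library, so check the exact tag there. The helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the exact model tag may differ, check the Ollama library.
MODEL_TAG = "gemma4:e2b"


def build_request(prompt: str, model: str = MODEL_TAG,
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build the POST request; "stream": False asks for one JSON reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = MODEL_TAG) -> str:
    """Send the prompt to the local Ollama server and return the answer text."""
    # Generous timeout, because small machines can be slow to answer.
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

With the Ollama server running, `print(ask("Say hello in one sentence."))` should print the model's reply. If the machine is too weak, this is where you will notice it: the call just takes very long.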

1 comments

Ahh, it even seems to run on Linux, let's make an Ubuntu VM and play with it :)
