Mac Studios and even Mac Minis are a very popular option for LLMs because of how unified memory works. Nowhere else can you get ~188 GB of VRAM for less than the cost of even a single A100 40G.
I'm getting 23 tokens per second using the 5-bit Mixtral 2.7 model.