schizoidman@lemm.ee to Technology@lemmy.world · English · 5 days ago
DeepSeek's distilled new R1 AI model can run on a single GPU | TechCrunch (techcrunch.com)
Avid Amoeba@lemmy.ca · English · 5 days ago
It proved sqrt(2) irrational at 40 tok/sec on a 3090 here. The 32B R1 did it at 32 tok/sec, but it thought a lot longer.
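For context, the prompt both commenters used is the classic contradiction proof, which goes roughly:

```latex
Assume $\sqrt{2} = p/q$ with $p, q$ coprime integers and $q \neq 0$.
Then $p^2 = 2q^2$, so $p^2$ is even, hence $p$ is even: write $p = 2k$.
Substituting gives $4k^2 = 2q^2$, i.e. $q^2 = 2k^2$, so $q$ is even too,
contradicting the coprimality of $p$ and $q$. Hence $\sqrt{2}$ is irrational.
```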
Irdial@lemmy.sdf.org · English · edited · 5 days ago
On my Mac mini running LM Studio, it managed 1702 tokens at 17.19 tok/sec and thought for 1 minute. If accurate, high-performance models could run on consumer hardware more readily, I would use my 3060 as a dedicated inference device.
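If you want to reproduce these throughput numbers yourself, here is a rough sketch. It assumes an OpenAI-compatible local server of the kind LM Studio exposes (by default on localhost:1234); the endpoint URL and the model name are placeholders, so substitute whatever your server actually reports.

```python
import json
import time
import urllib.request


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Decode throughput: tokens generated divided by wall-clock time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / seconds


def measure_local_completion(prompt: str,
                             url: str = "http://localhost:1234/v1/completions",
                             model: str = "deepseek-r1-distill-qwen-8b") -> float:
    """Time one completion against a local OpenAI-compatible server.

    The URL and model name above are illustrative placeholders, not
    guaranteed defaults; check your server's own model listing.
    """
    body = json.dumps({"model": model,
                       "prompt": prompt,
                       "max_tokens": 2048}).encode()
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        out = json.load(resp)
    elapsed = time.monotonic() - start
    # OpenAI-style responses report generated-token counts under "usage".
    return tokens_per_second(out["usage"]["completion_tokens"], elapsed)
```

As a sanity check on the figures above: 1702 tokens at 17.19 tok/sec implies roughly 99 seconds of total generation, of which about a minute was reasoning tokens.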