Date: February 25th, 2026 12:27 AM Author: Jared Baumeister
with llama.cpp, not ollamashit. Shouldn't be possible, but it is. The terminal window says it's putting 16GB in system RAM, but it never actually uses more than 1GB. I could literally give it 2048KB of system RAM and 512K of swap, and it would still run a 70B parameter model on 32GB of VRAM at 23 t/s. Black magic
(http://www.autoadmit.com/thread.php?thread_id=5838189&forum_id=2#49693660)