[Cover image: Laptop workstation running local LLM inference with GPU acceleration]

Running local LLMs with llama.cpp

A practical llama.cpp setup note covering CUDA builds, server commands, MoE tuning flags, and benchmarking local LLM performance.

February 2026 · 4 min · Ryan Lupague