qwen 2.5
-
Proxmox
Qwen QwQ 2.5 32B Ollama Local AI Server Benchmarked w/ CUDA vs Apple M4 MLX
The new Qwen with Questions (QwQ) LLM, a fine-tune based on the popular Qwen 2.5 32B base, is a unique step in chain-of-thought reasoning, and it really is impressive! I was lucky enough to find some stats from an X poster on the tokens per second they get at Q8 on an Apple M4 Max, which gives a point of comparison for all those…