
Qwen QwQ 2.5 32B Ollama Local AI Server Benchmarked w/ CUDA vs Apple M4 MLX



The new Qwen with Questions (QwQ) LLM, a fine-tune based on the popular Qwen 2.5 32B base model, is a unique step forward in chain-of-thought reasoning, and it really is impressive! I was also lucky enough to find stats from an X poster with Q8 tokens-per-second numbers on an Apple M4 Max, for all those who have been asking for a comparison. This is a must-watch and frankly a fascinating model, demonstrating behaviors in its search for the best answer that are surprisingly human. I highly recommend pulling this model on your Ollama local AI server.
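If you want to try it yourself, a minimal sketch of pulling and running the model with the Ollama CLI looks like this (the `qwq` model tag is an assumption; check the Ollama library for the exact tag and quant you want, and note the default download is tens of GB):

```shell
# Pull the QwQ model from the Ollama library (tag assumed: qwq)
ollama pull qwq

# Run it interactively; --verbose prints timing stats after each reply,
# including the eval rate (tokens/s) used for comparisons like the ones
# in this video
ollama run qwq --verbose
```

The eval rate reported by `--verbose` is the generation speed, which is the number you would compare between a CUDA box like the quad-3090 build below and an Apple Silicon machine running MLX.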

AI Home Server Quad 3090 Build
AI Playlist

QUAD 3090 AI SERVER BUILD
GPU Rack Frame
RTX 3090 24GB GPU (x4)
Gigabyte MZ32-AR0 Motherboard
Kritical Thermal GPU Pads
256GB (8x32GB) DDR4 2400 RAM
PCIe4 Risers (x4)
AMD EPYC 7702p
iCUE H170i ELITE CAPELLIX
(sTRX4 fits SP3 and retention kit comes with the CAPELLIX)
ARCTIC MX4 Thermal Paste
CORSAIR HX1500i PSU
4i SFF-8654 to 4i SFF-8654 (x4)
HDD Rack Screws for Fans

Article on Qwen with Questions

Be sure to 👍✅Subscribe✅👍 for more content like this!

Join this channel

Please share this video to help spread the word and drop a comment below with your thoughts or questions. Thanks for watching!

Digital Spaceport Website
🌐

🛒Shop (Channel members get a 3% or 5% discount)
Check it out for great deals on hardware and merch.

*****
As an Amazon Associate I earn from qualifying purchases.

When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network.
*****
