Proxmox

Llama 3.2 Vision 11B LOCAL Cheap AI Server Dell 3620 and 3060 12GB GPU



We are testing a killer cheap AI home server built from a single RTX 3060 12GB GPU and a Dell Precision 3620. It is very low cost and surprisingly capable when paired with the new Llama 3.2 Vision 11B model, powered by Ollama and Open WebUI running in LXC containers on Proxmox.
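
If you want to poke at the model outside of Open WebUI, here is a minimal sketch of hitting the Ollama HTTP API directly with a photo. This is not from the video; it assumes Ollama's default port 11434, the llama3.2-vision tag from the Ollama library, and a placeholder image filename.

# Rough sketch: querying Llama 3.2 Vision through the Ollama HTTP API
# with a local photo. "snake.jpg" is just a placeholder test image.
import base64
import json
import urllib.request

with open("snake.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "llama3.2-vision",        # 11B is the default tag in Ollama
    "prompt": "What animal is in this picture?",
    "images": [image_b64],             # vision models take base64 images
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])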

Cheap AI Server
Dell Precision 3620 Tower
3060 12GB GPU
GPU 6 to 8 pin Power Adapter

Ai Server Playlist
Ollama Llama 3.2 Vision Model

Chapters
0:00 Cheap AI Server
1:13 Adding 3060 12GB GPU
2:36 Ollama Software Primer
6:20 Llama 3.2 Vision 11b Overview
9:11 Snake Picture Test
9:48 Kitten and Cat Test
11:12 LLM Product Recognition
12:01 GPU Parts Testing
13:08 Motherboard Parts Testing
16:14 LCD Screen Reading off Photos
18:22 Meme Understanding
19:12 Handwriting OCR recognition
20:20 AI Texas Toast
24:00 Untagged Product AI Vision
25:32 AI Cooking Vision Recognition
27:12 Well Hardware LLM
28:12 Conclusion

Be sure to 👍✅Subscribe✅👍 for more content like this!

Join this channel

Please share this video to help spread the word and drop a comment below with your thoughts or questions. Thanks for watching!

🌐 Digital Spaceport Website

🛒Shop (Channel members get a 3% or 5% discount)
Check it out for great deals on hardware and merch.

*****
As an Amazon Associate I earn from qualifying purchases.

When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network.
*****


25 Comments

  1. Thank you for your video. I will share it with other people and other work organizations and put you on our list of preferred content providers for those who want to do it themselves. Thank you again for your video. It is so easy to follow, and you're very detailed in the explanation, not only of the application deployment but also the hardware configuration.

  2. Could you please test this build with the localGPT vision repo on GitHub? That repo has several vision models to test with, and seeing how each model performs on RAG with a build like this could be really interesting, because this kind of RAG is really different: instead of image to text to vector, this system goes image to vector. A different architecture.

  3. I would guess, given that the LLM processes multiple questions all at once, that the vision side works the same way: it doesn't read left to right or right to left, it processes the entire sentence all at once. 29:14

  4. These "vision" models are so bad and unreliable for anything. They need to be way more specialized and fed many more samples to be of any value. Spatial relationships are completely wrong, and blob classification/recognition is weak. I don't see any use for this beyond very basic tasks, and I don't even know if any of this can be put into production due to the unreliability.

  5. Thanks, that's nearly my setup! Did you go with PCI passthrough to a VM or to an LXC?
    The card is pretty good for daily tasks and has fairly low power consumption.
    Also, 3.2 Vision is really good at the moment for what I use it for, although mine draws about 170W at full load 😅
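
For anyone wondering what the LXC route usually looks like: a rough sketch of the extra lines often added to the container's config on the Proxmox host (e.g. /etc/pve/lxc/<id>.conf) for an Nvidia card. The device major numbers below are placeholders and vary by driver version, so match them to what ls -l /dev/nvidia* reports on your host.

# hedged example, adjust device numbers and paths to your system
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 508:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file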

  6. 30:07 If you have the RAM, you can always throw up a RAMDisk and swap models out of CPU RAM and into VRAM much quicker than off a drive. A more advanced setup would use Memcached or Redis, but for something quick and dirty, RAMDisk all day.
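
A hedged sketch of the quick-and-dirty tmpfs variant of this idea. The mount point, size, and model path are assumptions for illustration; OLLAMA_MODELS is the environment variable Ollama reads for its model directory.

mount -t tmpfs -o size=24G tmpfs /mnt/ollama-ram
cp -r /usr/share/ollama/.ollama/models/. /mnt/ollama-ram/
OLLAMA_MODELS=/mnt/ollama-ram ollama serve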

  7. After seeing this video I had to download and try this model myself (also running Open WebUI in Dockge, with Ollama in a separate LXC container on Proxmox and a 20GB Nvidia RTX 4000 Ada passed through). I was blown away by how accurately the pictures were recognized! Even the numbers shown on my electricity meter's display were identified correctly. Wow … this is and will be fun to use more over the weekend 😉 Keep up the good work with these videos!

  8. Next time, try asking a new question in a new chat. Ollama by default uses a context size of 2k, and you are most probably exhausting it too quickly with pictures. And the GPU VRAM is too low to accommodate a higher context size without flash attention or smaller quants than the default 4-bit you downloaded.
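
A rough sketch of what raising the context window looks like per request, reusing the /api/generate payload from the sketch near the top of the page; num_ctx is the Ollama option name, and 8192 is an arbitrary example value that will eat more of the 12GB of VRAM.

# Same request body as the earlier sketch, plus an "options" field.
payload = {
    "model": "llama3.2-vision",
    "prompt": "Read every label on this motherboard photo.",
    "images": [image_b64],              # base64 photo, as in the earlier sketch
    "options": {"num_ctx": 8192},       # Ollama's default context is 2048
    "stream": False,
}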

  9. Hi. This is an awesome video showcasing Ollama on a 12GB GPU. I am currently using a 12GB 6750 XT, and I still find the speed very usable with models in the 18-24 GB range.

  10. Sounds like maybe you'll be doing a compilation video here soon, but if not, or if it's going to be a while, maybe you should add the guide videos to a playlist. You have so much great content out there that it's hard to figure out which videos to watch if you're starting from scratch.

  11. Interesting build. Funny that you made this video not long after I recycled a bunch of them. It would be nice if people found more uses for stuff older than 8th gen. These older machines are still perfectly usable.
