this post was submitted on 27 Jun 2025
23 points (96.0% liked)

LocalLLaMA

There's a lot more to this stuff than I thought there would be when starting out. I spent the day familiarizing myself with how to take apart my PC and swap GPUs, trying to piece everything together.

Apparently, in order for a PC to start up properly, it needs a display adapter. I thought the existence of an HDMI port on the motherboard implied the existence of onboard graphics, but apparently only certain CPUs have that capability (in Ryzen land, the "G"-suffix APUs), and my Ryzen 5 2600 doesn't. The Tesla P100 has no display output capability either, so I've hit a snag where the PC won't start up because it can't find a graphical output.

I'm going to try to run multiple GPUs together over PCIe. Hopefully I can mix the AMD RX 580 and the NVIDIA Tesla on the same board. Fingers crossed, please work.

My motherboard thankfully supports 4x4x4x4 bifurcation of the PCIe x16 slot, which is a very lucky break I didn't know about going into this 🙏

Strangely, other configurations for splitting the x16 lanes, like 8x8 or 8x4x4, aren't in my BIOS for some reason. So I'm planning to get a 4x4x4x4 bifurcation board, plug both cards in, and hope the AMD one is recognized!
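Once it boots, a quick way to check that both cards are actually enumerated is to look at the PCI device list. A minimal sketch, assuming a Linux host with `pciutils` installed; note that compute-only cards like the Tesla typically show up as a "3D controller" rather than a "VGA compatible controller":

```python
# List every display/compute adapter the kernel can see on the PCI bus.
# Assumes Linux with pciutils installed (provides the lspci command).
import subprocess

out = subprocess.run(["lspci"], capture_output=True, text=True, check=True).stdout
gpus = [line for line in out.splitlines()
        if "VGA compatible controller" in line or "3D controller" in line]
for gpu in gpus:
    print(gpu)  # expect one line for the RX 580 and one for the Tesla
```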

According to one source, the performance loss from running GPUs on x4 lanes for the kind of compute I'm doing is 10-15%, which is surprisingly tolerable actually.
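The bandwidth math makes that plausible for inference, where the model weights stay resident in VRAM. A back-of-the-envelope sketch, assuming PCIe 3.0 (the generation both of these cards use):

```python
# PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding,
# so each lane carries roughly 0.985 GB/s in each direction.
GB_PER_S_PER_LANE = 8 * (128 / 130) / 8  # ~0.985

for lanes in (16, 8, 4):
    print(f"x{lanes}: ~{lanes * GB_PER_S_PER_LANE:.1f} GB/s")
# x16: ~15.8 GB/s, x8: ~7.9 GB/s, x4: ~3.9 GB/s
```

Since single-user inference mostly moves activations (not weights) across the link, the x4 penalty stays small; loading the model into VRAM just takes proportionally longer.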

I never really had to think about how PCIe lanes work or how to allocate them properly before.

For now I'm using two power supplies: the one built into the desktop and a new Corsair 850e PSU. I chose this one as it should work with 2-3 GPUs while staying in my price range.

Also, the new 12V-2x6 port supports up to 600W, plenty for the Tesla, and its cable comes with a dual PCIe split, which the Tesla's power cable adapter requires. So it all worked out nicely for a clean wiring solution.
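For anyone sanity-checking a similar build, here's a rough peak power budget. This is only a sketch using nominal TDPs; the board/fan figure is my own estimate, and real draw usually sits below these peaks:

```python
# Nominal TDPs in watts; transient spikes can briefly exceed these.
parts = {
    "Tesla P100 (PCIe)": 250,
    "RX 580": 185,
    "Ryzen 5 2600": 65,
    "Motherboard/RAM/SSD/fans (estimate)": 75,
}
total = sum(parts.values())
print(f"Estimated peak draw: {total} W")  # 575 W
```

Even if a single 850W unit carried everything, that leaves roughly 30% headroom, and splitting the load across two supplies relaxes it further.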

Sadly I fucked up a little. The plastic PCIe release latch on the motherboard was brittle, and I fat-thumbed it too hard while having trouble removing the GPU initially, so it snapped off. I don't know if that's something fixable. Fortunately it doesn't seem to affect the security of the connection too badly. I intend to get a PCIe riser extension cable so there won't be much force on the now slightly loosened PCIe connection. I'll have the GPUs and bifurcation board laid out nicely on the homelab table while testing, and get them mounted somewhere properly once I have it all working.

I need to figure out an external GPU mounting system. I see people use server racks or nut-and-bolt metal chassis. Maybe I could get a thin copper plate the size of the desktop's glass window as a base/heatsink?

[–] afk_strats@lemmy.world 6 points 1 day ago* (last edited 13 hours ago) (1 children)

Yep. Vulkan is what's recommended for cross-vendor setups, most commonly where there's integrated graphics involved.

I actually had the Ti and XTX variants (a 3080 Ti plus a 7900 XTX), so VRAM was 12 + 24 GB = 36 GB. Vulkan is implemented across vendors, and running Vulkan-based llama.cpp yielded similar (though slightly worse) performance than CUDA on the 3080 Ti as a point of reference.

I don't have this well documented but, from memory, Llama 3.1 8B at Q4 could reliably get around 110 tok/s on CUDA and 100 on Vulkan on the same computer.
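If anyone wants to reproduce rough numbers like that, llama.cpp ships a llama-bench tool for proper measurements; a quick-and-dirty alternative is a sketch like this with the llama-cpp-python bindings (the model filename is a placeholder, and whether you get the CUDA or Vulkan backend depends on how the package was built):

```python
# Crude tokens-per-second measurement with llama-cpp-python.
import time
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3.1-8b-q4_k_m.gguf",  # placeholder: any local GGUF file
    n_gpu_layers=-1,  # offload every layer to the GPU(s)
    verbose=False,
)

start = time.perf_counter()
out = llm("Explain PCIe bifurcation in one paragraph.", max_tokens=256)
elapsed = time.perf_counter() - start

n = out["usage"]["completion_tokens"]
print(f"{n} tokens in {elapsed:.2f}s -> {n / elapsed:.1f} tok/s")
```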

I used this setup specifically to take advantage of the vastly increased VRAM from having two cards. I was able to run 32B Q4 models, which are outside the VRAM of either card alone, and tracked power and RAM usage with LACT. Performance seemed pretty great compared to my friend running the same models on a 4x 4060 Ti setup using just CUDA.
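For a sense of why a 32B model needs both cards, here's a back-of-the-envelope VRAM estimate. It's a sketch only, since real usage depends on the exact quant format, context length, and runtime overhead:

```python
# Rough weight footprint for a 32B model quantized to ~4.5 bits/weight
# (in the Q4_K ballpark). KV cache and overhead add several GB on top.
params = 32e9
bits_per_weight = 4.5
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~18 GB
```

That already exceeds the 12 GB card and crowds the 24 GB one once KV cache is added, which is where llama.cpp's layer splitting across devices (e.g. its --tensor-split option) comes in.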

If this is interesting to a lot of people, I could put the setup back together to answer more questions / do a separate post. I took it apart because it physically used more space than my case could accommodate, and I had the 3080 Ti literally hanging out of a riser.

[–] clothes@lemmy.world 5 points 23 hours ago

Wow, I had no idea! Nor did I know that Vulkan performs so well. I'll have to read more, because this could really simplify my planned build.

Count me as someone who would be interested in a post!