this post was submitted on 03 Oct 2025
397 points (98.5% liked)
Fuck AI
LLMs don't benefit from economies of scale. Usually, each successive generation of a technology gets cheaper to produce, or costs about the same while delivering much greater efficiency, power, or efficacy. For LLMs, each successive generation costs far more to produce for smaller and smaller gains.
For training, compute and memory scale do matter, including large networked GPU clusters. No money is made in training. For inference (where the money is actually made, or the benefit obtained), memory matters more, though compute is still extremely important. At the Skynet level, models over 512 GB are used, but at the consumer level, and at every level below that, smaller models are much faster. 16 GB, 24 GB, 32 GB, 96 GB, 128 GB, and 512 GB are each somewhat approachable thresholds, and each of them is its own version of scale.
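To put some rough numbers on those thresholds, here's a back-of-the-envelope sketch (Python; the precision and overhead figures are my own assumptions, not anything official):

```python
# Rough upper bound on how many parameters fit in a given amount of memory
# at common weight precisions. Assumed ~20% overhead for KV cache, activations,
# and runtime buffers; real deployments may need more headroom than this.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}  # assumed precisions
OVERHEAD = 1.2  # assumed overhead factor

def max_params_billions(mem_gb: float, precision: str) -> float:
    """Approximate parameter count (billions) that fits in mem_gb of memory."""
    usable_bytes = mem_gb * 1e9 / OVERHEAD
    return usable_bytes / BYTES_PER_PARAM[precision] / 1e9

for mem in (16, 24, 32, 96, 128, 512):
    fits = ", ".join(f"{p}: ~{max_params_billions(mem, p):.0f}B" for p in BYTES_PER_PARAM)
    print(f"{mem:>4} GB -> {fits}")
```

So a 16 GB card tops out around a 7B model at fp16 (more with aggressive quantization), while the 512 GB tier is where the very large models live.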
As for GPU makers' roadmaps (sticking to Nvidia for simplicity), Rubin is slated for roughly 5x the bandwidth, double the memory, and at least double the compute, for what is likely about 2x the cost and less than 2x the power. A big issue for the bubble is the fairly sharp depreciation that implies for existing leading-edge devices. And bigger local memory alone is always a faster overall solution than making up the difference with networking/interconnects.
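On why bigger local memory beats networking, a rough order-of-magnitude sketch (the bandwidth figures are assumptions for each hardware class, not exact product specs):

```python
# Illustrative comparison: time to read a large model's weights once from
# local memory vs. streaming them over an interconnect. Bandwidth numbers
# are assumed order-of-magnitude figures for each hardware class.

model_gb = 512  # hypothetical large model from the comment above

links_gb_per_s = {
    "local HBM": 5000,     # ~several TB/s on a current data-center GPU
    "NVLink-class": 900,   # ~hundreds of GB/s between GPUs in a node
    "400G network": 50,    # ~tens of GB/s over a fast data-center NIC
}

for name, bw in links_gb_per_s.items():
    print(f"Reading {model_gb} GB of weights over {name}: ~{model_gb / bw * 1000:.0f} ms")
```

The gap is roughly two orders of magnitude between local memory and the network, which is why adding memory per device pays off before adding more interconnect.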
Bigger-parameter models train more slowly on the same data set than smaller-parameter models. Skynet ambitions do involve ever-larger parameter counts, and of course more training data keeps getting added rather than any being removed. There is innovation across generations on the smaller/efficiency side too, though the Skynet funding goes to the former.
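A quick worked example of that slowdown, using the common ~6 × N × D FLOPs rule of thumb for training compute (the data set size and cluster throughput are assumed round numbers, just to show the roughly linear growth with parameter count):

```python
# Rule-of-thumb estimate: training compute ~= 6 * N * D FLOPs,
# where N = parameters and D = training tokens. At a fixed data set,
# cost grows roughly linearly with parameter count.

TOKENS = 15e12        # assumed fixed training set, ~15T tokens
CLUSTER_FLOPS = 4e18  # assumed sustained cluster throughput (FLOP/s)

for n_params in (8e9, 70e9, 400e9):
    flops = 6 * n_params * TOKENS
    days = flops / CLUSTER_FLOPS / 86400
    print(f"{n_params/1e9:>5.0f}B params: ~{flops:.1e} FLOPs, ~{days:.0f} days on the assumed cluster")
```

Same data, same cluster: the 400B model takes roughly 50x longer than the 8B one, before you even account for the extra data the big runs also pile on.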