this post was submitted on 07 Nov 2025

TechTakes



[–] Architeuthis@awful.systems 3 points 1 day ago (1 children)

> So if a company does want to use LLM, it is best done using local servers, such as Mac Studios or Nvidia DGX Sparks: relatively low-cost systems with lots of memory and accelerators optimized for processing ML tasks.

Eh, local LLMs don't really scale; you can't do much better than one user per machine unless usage is really sparse, and buying everyone a top-of-the-line GPU only works if they aren't currently on work laptops and VMs.

Spark-type machines will do better eventually, but for now they're supposedly geared more towards training than inference; it says here that running a 70B model returns around one word per second (three tokens), which is a snail's pace.
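For context on why single-user decode is so slow on these boxes: a common rule of thumb is that batch-1 generation is memory-bandwidth-bound, since every token requires streaming all model weights through memory, so tokens/sec ≈ memory bandwidth / model size in bytes. A hedged sketch, where the 4-bit quantization and the ~273 GB/s bandwidth figure are illustrative assumptions, not quoted specs:

```python
# Back-of-envelope, bandwidth-bound estimate of batch-1 decode speed.
# Assumption: each generated token streams all weights once, so
# tokens/sec ~= memory bandwidth / model size in bytes.
# The concrete numbers below are illustrative, not measured specs.

def decode_tokens_per_sec(params_billions: float,
                          bytes_per_param: float,
                          mem_bandwidth_gb_s: float) -> float:
    model_gb = params_billions * bytes_per_param  # total weight bytes, in GB
    return mem_bandwidth_gb_s / model_gb

# A 70B model quantized to ~4 bits (0.5 bytes/param) on a box with an
# assumed ~273 GB/s of unified memory bandwidth:
print(round(decode_tokens_per_sec(70, 0.5, 273), 1))  # upper-bound tok/s
```

This is a ceiling, not a prediction; real decode speed lands well below it once compute overhead, KV-cache traffic, and software inefficiency are counted, which is how you end up in the low-single-digit tokens/sec range the comment describes.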

[–] dgerard@awful.systems 3 points 23 hours ago (1 children)

yeah. LLMs are fat. Lesser ML works great tho.