this post was submitted on 30 Sep 2025
7 points (88.9% liked)
Hacker News
Posts from the RSS Feed of HackerNews.
The feed sometimes contains ads and posts that have been removed by the mod team at HN.
founded 1 year ago
So this is bolted on top of a model that cost six figures.
And DeepSeek is based on Llama, which cost more than six figures.
I'm not aware of any large-parameter LLM that isn't based on one that was absurdly expensive to train.
DeepSeek was trained from scratch. Only some variants used other LLMs.
This is a megaphone made from string, a squirrel, and a megaphone.