this post was submitted on 23 Jul 2024
44 points (90.7% liked)

LocalLLaMA

3221 readers
1 user here now

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped about the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members, i.e. no name-calling, no generalizing entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency, i.e. no comparing the usefulness of models to that of NFTs, no claiming the resource usage required to train a model is anything close to that of maintaining a blockchain or mining crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms, i.e. statements such as "LLMs are basically just simple text predictors like your phone keyboard's autocorrect, and they're still using the same algorithms from <over 10 years ago>."

Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.

founded 2 years ago
MODERATORS
 

Meta has released Llama 3.1. It seems to be a significant improvement over an already quite good model. It is now multilingual, has a 128k context window, has some sort of tool-calling support and, overall, performs better on benchmarks than its predecessor.

With this release, they also put out a 405B-parameter model, along with updated 70B and 8B versions.

I've been using the 3.0 version and was already satisfied, so I'm excited to try this.
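
If you want to try it yourself, here's a rough sketch using the transformers library. This assumes the gated meta-llama/Meta-Llama-3.1-8B-Instruct repo id and a transformers version recent enough to accept chat-style message lists, so treat it as a starting point rather than a recipe:

```python
# Rough sketch: load the 8B instruct model and chat with it.
# The repo is gated, so you need to accept Meta's license on Hugging Face first.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "What changed between Llama 3 and Llama 3.1?"}]
out = pipe(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])  # last message is the model's reply
```

The 128k context is in the config, but actually filling it takes a lot of memory for the KV cache, so start with short prompts.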

[–] chayleaf@lemmy.ml 8 points 11 months ago (1 children)

The code is FOSS, but the weights aren't. This is pretty common with, e.g., FOSS games; the only difference here is that weights are much costlier to remake from scratch than game assets.

[–] possiblylinux127@lemmy.zip 5 points 11 months ago (1 children)

The license has limitations and isn't something standard like Apache 2.0.

[–] pennomi@lemmy.world 5 points 11 months ago (1 children)

True, but it hardly matters for the source since the architecture is pulled into open source projects like transformers (Apache) and llama.cpp (MIT). The weights remain under the dubious Llama Community License, so I would only call the data “available” instead of “open”.
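
For example, once the 3.1 architecture is supported, a GGUF quant runs the same way 3.0 did. A minimal sketch with llama-cpp-python (the Python bindings for llama.cpp); the model filename below is just a placeholder for whatever quant you download:

```python
# Minimal sketch with llama-cpp-python; any Llama-3.1-compatible GGUF should work.
from llama_cpp import Llama

llm = Llama(
    model_path="./Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",  # placeholder filename
    n_ctx=8192,  # the model supports up to 128k, but a smaller window saves RAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's new in Llama 3.1?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```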

[–] possiblylinux127@lemmy.zip 4 points 11 months ago (1 children)
[–] bazsalanszky@lemmy.toldi.eu 1 points 11 months ago (1 children)

Are you using Mistral 7B?

I also really like that model and their fine-tunes. If licensing is a concern, it's definitely a great choice.

Mistral also has a new model, Mistral Nemo. I haven't tried it myself, but I heard it's quite good. It's also licensed under Apache 2.0 as far as I know.
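
If you'd rather not take my word on the license, the Hub API exposes what the model card declares. A small sketch, assuming the repo id is mistralai/Mistral-Nemo-Instruct-2407:

```python
# Sketch: read the declared license from the model card metadata on the Hub.
from huggingface_hub import model_info

info = model_info("mistralai/Mistral-Nemo-Instruct-2407")  # assumed repo id
print(info.card_data.license if info.card_data else info.tags)
```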

[–] possiblylinux127@lemmy.zip 3 points 11 months ago* (last edited 11 months ago) (1 children)
[–] bazsalanszky@lemmy.toldi.eu 2 points 11 months ago

Yes, you can find it here.