Mechanize

joined 2 years ago
[–] Mechanize@feddit.it 9 points 11 months ago (1 children)

I've never used oobabooga but if you use llama.cpp directly you can specify the number of layers that you want to run on the GPU with the -ngl flag, followed by the number.

So, as an example, a command (on linux) from the directory you have the binary, to run its server would look something like: ./llama-server -m "/path/to/model.gguf" -ngl 10

Another important flag that could interest you is -c for the context size.

This will put 10 layers of the model on the GPU, the rest will be on RAM for the CPU.

I would be surprised if you can't just connect to the llama.cpp server or just set text-generation-webui to do the same with some setting.

At worst you can consider using ollama, which is a llama.cpp wrapper.

But probably you would want to invest the time to understand how to use llama.cpp directly and put a UI in front of it, Sillytavern is a good one for many usecases, OpenWebUI can be another but - in my experience - it tends to have more half baked features and the development jumps around a lot.

As a more general answer, no, the safetensor format doesn't directly support quantization, as far as I know

[–] Mechanize@feddit.it 2 points 11 months ago

I watched the features trailer and it gives me the vibes of an attempt towards a weird Kenshi in a wuxia world, which is not bad per se, but there are really not a lot of info on how the grind (the meat of this type of games) really works.

For now it seems mostly a map based sandbox with lofty ambitions... but the graphic style is lovely!

I'll try to keep an eye on it, even if it probably will not find the light of release for a couple of years more (IMHO)

Good find!

[–] Mechanize@feddit.it 16 points 11 months ago (1 children)

If things haven't changed recently: remember that each time you get a giveaway game from the site you are (re)subscribing to the newsletter too

[–] Mechanize@feddit.it 15 points 1 year ago* (last edited 1 year ago) (1 children)

They have finally updated the Status Page

Not a lot of information, but better than nothing

[–] Mechanize@feddit.it 39 points 1 year ago (7 children)

Yeah, incredibly frustrating.
The only acknowledgement is from a volunteer mod on reddit that said an hour ago that "the team is aware and the status page will be updated shortly".

The fact I had to dig around to find that is really not a pleasing experience.

[–] Mechanize@feddit.it 9 points 1 year ago* (last edited 1 year ago)

~~Their systems currently report that everything's fine, which - to be fair - could be a misreporting and change at any moment~~
~~Anecdotally all their landing sites load fine for me~~

~~Tackling it from the other side: could it be a problem with your DNS?~~
~~Did you try with another network? Like wifi or mobile data~~

~~EDIT: Formatting~~

EDIT: Tried again and now the email service seems to not be loading

EDIT 2: It is being reported on other sites too, but currently there's nothing official I could find, not even on their Mastodon or Twitter various accounts

EDIT 3: On reddit the volunteer mod alex_herrero wrote an hour ago that "The team is aware and [the] status page will be updated shortly".

[–] Mechanize@feddit.it 3 points 1 year ago (1 children)

I've read good things about LTX, but I've never used it.

[–] Mechanize@feddit.it 11 points 1 year ago (1 children)

You'll die, just make it matter

[–] Mechanize@feddit.it 1 points 1 year ago

That's the bad thing about social media. If no one was doing it before, someone is now!

Jokes aside it's possible, but with the current LLMs I don't think there's really a need for something like that.

Malicious actors usually try to spend the least amount of effort possibile for generalized attacks, because you end up having to often restart when found out.

So they probably just feed an LLM with some examples to get the tone right and prompt it in a way that suits their uses.

You can generate thousands of posts while Lemmy hasn't even started to reply to one.

If you instead want to know if anyone is taking all the comments on lemmy to feed to some model training.. Yeah, of course they are. Federation makes it incredibly easy to do.

[–] Mechanize@feddit.it 20 points 1 year ago (1 children)

Probably I'm missing something but I've read the parent comment as a way to highlight the hypocrisy behind making extensive use of something while, simultaneously, wanting to bar others from using it

I don't see an insult in there, given the choice of words and the context, but maybe I'm missing something fundamental?

[–] Mechanize@feddit.it 2 points 1 year ago* (last edited 1 year ago)

Technically speaking, AFAIK, it is some years now that Ariane has started a project about a partially reusable rocket

EDIT: Reading the article I think it is the same project? I assume I'm misreading your comment

[–] Mechanize@feddit.it 4 points 1 year ago

AFAIK it is still a tuning of llama 3[.1], the new Base models will come with the release of 4 and the "Training Data" section of both the model cards is basically a copy paste.

Honestly I didn't even consider the fact they would not be giving Base models anymore before reading this post and, even now, I don't think this is the case. I went to search the announcements posts to see if there was something that could make me think about it being a possibility, but nothing came out.

It is true that they released Base models with 3.2, but there they had added a new projection layer on top of that, so the starting point was actually different. And 3.1 did supersede 3...

So I went and checked the 3.3 hardware section and compare it with the 3 one, the 3.1 one and the 3.2 one.

|3 | 3.1 | 3.2 | 3.3 | |


|


|


|


| | 7.7M GPU hours | 39.3M GPU hours | 2.02M GPU hours | 39.3M GPU hours |

So yeah, I'm pretty sure the base of 3.3 is just 3.1 and they just renamed the model in the card and added the functional differences. The instruct and base versions of the models have the same numbers in the HW section, I'll link them at the end just because.

All these words to say: I've no real proof, but I will be quite surprised if they will not release the Base version of 4.

Mark Zuckerberg on threadsLink to post on threads
zuck a day ago
Last big AI update of the year:
•⁠ ⁠Meta AI now has nearly 600M monthly actives
•⁠ ⁠Releasing Llama 3.3 70B text model that performs similarly to our 405B
•⁠ ⁠Building 2GW+ data center to train future Llama models
Next stop: Llama 4. Let's go! 🚀

Meta for DevelopersLink to post on facebook
Today we're releasing Llama 3.3 70B which delivers similar performance to Llama 3.1 405B allowing developers to achieve greater quality and performance on text-based applications at a lower price point.
Download from Meta: --

Small note: I did delete my previous post because I had messed up the links, so I had to recheck them, whoops

view more: ‹ prev next ›