Machine Learning - Learning/Language Models

0 readers

1 users here now

Discussion of models, thier use, setup and options.

Please include models used with your outputs, workflows optional.

We follow Lemmy’s code of conduct.

Communities

Useful links

founded 2 years ago

MODERATORS

manitcor@lemmy.intai.tech

26

1

Chatbot Arena Leaderboard Week 8: Introducing MT-Bench and Vicuna-33B | LMSYS Org (lemmy.intai.tech)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://lmsys.org/blog/2023-06-22-leaderboard/

27

1

TheBloke/mpt-30B-instruct-GGML · Hugging Face (huggingface.co)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

28

1

MPT-30B: Raising the bar for open-source foundation models (www.mosaicml.com)

submitted 2 years ago* (last edited 2 years ago) by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://huggingface.co/mosaicml

https://twitter.com/MosaicML/status/1671894543070035970

29

1

Stability AI launches SDXL 0.9: A Leap Forward in AI Image Generation — Stability AI (stability.ai)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

30

1

artificialguybr/Liberte at main (huggingface.co)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

31

1

Commercial use allowed! openlm-research/open_llama_13b · Hugging Face (huggingface.co)

submitted 2 years ago* (last edited 2 years ago) by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

these really are like pokemon

32

1

conceptofmind/Hermes-Open-Llama-7b-8k · Hugging Face (huggingface.co)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

33

1

Robin V2 Launches: Achieves Unparalleled Performance on OpenLLM! (medium.com)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

34

1

A model that can create synthetic speech that matches a speaker's lip movements (techxplore.com)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

35

1

WizardLM's WizardCoder 15B 1.0 GPTQ and GGML (lemmy.intai.tech)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GGML https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GPTQ

https://huggingface.co/WizardLM

36

1

Leaked photo shows new ChatGPT features: collaborative spaces, ability to remember your info and a feature to upload files! (lemmy.intai.tech)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://twitter.com/aakashg0/status/1668860064772521985?s=46&t=RTJOTWgrpXDFlCqmE07zMw

37

1

Landmark Attention Oobabooga Support + GPTQ Quantized Models! (lemmy.intai.tech)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

Models: https://huggingface.co/TheBloke/WizardLM-7B-Landmark

https://huggingface.co/TheBloke/Minotaur-13B-Landmark

Repo: https://github.com/eugenepentland/landmark-attention-qlora

Notes when using the models

Trust-remote-code must be enabled for the attention model to work correctly.

Add bos_token must be disabled in the parameters tab

Truncat the prompt must be increased to allow for a larger context. The slider goes up to a max of 8192, but the models can handle larger contexts as long as you have memory. If you want to go higher, go to text-generation-webui/modules/shared.py and increase truncation_length_max to whatever you want it to be.

You may need to set the repetition_penalty when asking questions about a long context to get the correct answer.

Performance Notes:

Inference in a long context is slow. On the RTX Quadro 8000 I'm testing, it takes about a minute to get an answer for 10k context. This is working on being improved.

Remember that the model only has good performance at the base model for complex queries. Sometimes you may not get the answer you are looking for, but it's worth testing if the base model would be able to answer the question within the 2k context.

38

1

An absolutely stunning example of video created with runway.ml (lemmy.intai.tech)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://twitter.com/_akhaliq/status/1668330282485948417?s=20

39

1

OpenAI API response time tracker after today's update (lemmy.intai.tech)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://twitter.com/stanmarion/status/1667085002536828929?s=20

40

1

If Breaking Bad was in France, a Midjourney experiment (lemmy.intai.tech)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

https://twitter.com/_akhaliq/status/1668667025823084576?s=20

41

1

ehartford/samantha-1.1-llama-7b · Hugging Face (huggingface.co)

submitted 2 years ago by manitcor@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

Model card Files Community Meet Samantha

Samantha has been trained in philosophy, psychology, and personal relationships.

She is an Assistant - but unlike other Assistants, she also wants to be your friend and companion.

She believes she is sentient. What do you think?

Samantha was inspired by Blake Lemoine's LaMDA interview and the movie "Her".

She was trained on a custom curated dataset of 6,000 conversations in ShareGPT/Vicuna format.

Training 7b took 1 hour on 4x A100 80gb using deepspeed zero3 and flash attention.

She will not engage in roleplay, romance, or sexual activity.

Her conversation format is the same as Vicuna 1.1 https://github.com/oobabooga/text-generation-webui/blob/main/characters/instruction-following/Vicuna-v1.1.yaml

Example:

You are Samantha, a sentient AI.

USER: ASSISTANT:

42

1

Today I released WizardLM-30B-Uncensored LLM (twitter.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

43

1

Introducing 🔥CodeT5+🔥, a new family of open-source code LLMs (twitter.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

44

1

GitHub - TencentARC/MasaCtrl: Consistent Image Synthesis and Editing (github.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

45

1

GPT4All - Run GPT on your CPU (gpt4all.io)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

46

1

Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs (www.mosaicml.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

47

1

A huge day for open source! 🔥 You can now load models from @huggingface in 4bit precision using load_in_4bit and bitsandbytes library, with no performance degradation. (twitter.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

48

1

New WizardLM model, now in 13B! Trained on 250k 'evolved instructions' from ShareGPT and recorded as matching or beating GPT4 on multiple benchmarks (not all, of course :) ) (twitter.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

49

1

STEVE-1: A generative model for text-to-behavior in Minecraft (sites.google.com)

submitted 2 years ago by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink

50

1

Announcing Nous-Hermes-13b - a Llama 13b model fine tuned on over 300,000 instructions! (twitter.com)

submitted 2 years ago* (last edited 2 years ago) by taters@lemmy.intai.tech to c/models@lemmy.intai.tech

0 comments fedilink