
TIL: There is an open source "Alexa replacement" project (www.openvoiceos.org)

submitted 1 day ago by cm0002@libretechni.ca to c/linux@programming.dev


As Snowden told us, the video and audio recording capabilities of your devices are NSA spying vectors. OSS/Linux is a safeguard against such capabilities. The massive datacenter investments in the US will be used to classify us all into a patriotic (for Israel)/Oligarchist social credit score, and every mega tech company can increase profits through NSA cooperation and is legally obligated to cooperate with all government orders.

Speech-to-text and speech automation are useful tech, though always-listening devices from state-sponsored terrorists are a path, even without NSA targeting, for sweeping future social credit classification of your past life.

Some small LLMs that can be used for speech to text: https://modal.com/blog/open-source-stt


[–] brucethemoose@lemmy.world 1 points 1 day ago* (last edited 1 day ago) (1 children)

It still uses memory bandwidth, unfortunately. There's no way around that, though NPU TTS would still be neat.

...Also, generally, STT responses can't be streamed, so you might as well use the iGPU anyway. TTS can be chunked I guess, but do the major implementations do that?
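To put rough numbers on the memory bandwidth point, here is a back-of-envelope sketch in Python. It assumes an idealized dense model where generating each token reads all the weights once, so decode speed is capped at bandwidth divided by model size; caches, KV-cache traffic, and compute limits are ignored. The function names and the figures in the usage note are hypothetical, not measurements.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Idealized upper bound on decode speed for a bandwidth-bound dense model.

    Assumption: every generated token requires reading all weights once.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


def llm_ceiling_with_background_job(
    total_bandwidth_gb_s: float,
    background_gb_s: float,
    model_size_gb: float,
) -> float:
    """Same bound, after another device (e.g. an NPU job) takes part of the shared bandwidth."""
    available = max(total_bandwidth_gb_s - background_gb_s, 0.0)
    return decode_tokens_per_second(available, model_size_gb)
```

With a hypothetical 100 GB/s of shared bandwidth and a 5 GB model, the ceiling is 20 tokens/s; if a background job takes 10 GB/s of that, the ceiling drops to 18 tokens/s, whichever device the job runs on.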

[–] fonix232@fedia.io 2 points 1 day ago (2 children)

Piper does chunking for TTS, and could utilise the NPU with the right drivers.

And the idea of running them on the NPU is not about memory usage but hardware capacity/parallelism. Although I guess it would have some benefits when I don't have to constantly load/unload GPU models.
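For anyone wondering what chunking for TTS looks like in general, here is a minimal Python sketch. It is not Piper's actual implementation: `split_into_chunks`, `stream_tts`, and the `synthesize` callback are hypothetical names, and the sentence splitting is a naive regex. The idea is just to cut the text at sentence boundaries and synthesize each piece on its own, so playback can start before the whole text is processed.

```python
import re
from typing import Callable, Iterator


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def stream_tts(
    text: str,
    synthesize: Callable[[str], bytes],
    max_chars: int = 200,
) -> Iterator[bytes]:
    """Yield audio one chunk at a time; synthesize is a stand-in for a real TTS call."""
    for chunk in split_into_chunks(text, max_chars):
        yield synthesize(chunk)
```

Smaller `max_chars` values give lower latency to first audio at the cost of more synthesis calls and possibly choppier prosody across chunk boundaries.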

[–] brucethemoose@lemmy.world 2 points 21 hours ago (1 children)

Oh, I forgot!

You should check out Lemonade:

https://github.com/lemonade-sdk/lemonade

It supports Ryzen NPUs via 2 different runtimes... though apparently not the 8000 series yet?

[–] fonix232@fedia.io 1 points 20 hours ago (1 children)

I've actually been eyeing Lemonade, but the lack of Dockerisation is still an issue... guess I'll just DIY it at some point.

[–] brucethemoose@lemmy.world 1 points 15 hours ago* (last edited 15 hours ago) (1 children)

It's all C++ now, so it doesn't really need docker! I don't use docker for any ML stuff, just pip/uv venvs.

You might consider Arch (dockerless) ROCm soon; it looks like 7.1 is in the staging repo right now.

[–] fonix232@fedia.io 2 points 8 hours ago

Due to the fact I am running UnRaid on the node in question, I kinda do need Docker. I want to avoid messing with the core OS as much as possible, plus a Dockerised app is always easier to restore.

[–] brucethemoose@lemmy.world 2 points 1 day ago* (last edited 1 day ago)

Yeah... Even if the LLM is RAM speed constrained, simply using another device so as not to interrupt it would be good.

Honestly, AMD's software dev efforts are baffling. They've focused on a few libraries precisely no one uses, like this: https://github.com/amd/Quark

While ignoring issues holding back entire sectors (like broken flash-attention) with devs screaming about it at the top of their lungs.

Intel suffers from corporate Game of Thrones, but at least they have meaningful contributions in the open source space here, like the SYCL/AMX llama.cpp code or the OpenVINO efforts.