this post was submitted on 09 Jan 2025
462 points (99.1% liked)

Opensource

A community for discussion about open source software! Ask questions, share knowledge, share news, or post interesting stuff related to it!

[–] FundMECFSResearch@lemmy.blahaj.zone 183 points 6 months ago (4 children)

I know people are gonna freak out about the AI part in this.

But as a person with hearing difficulties this would be revolutionary. So much stuff I usually just can't watch because OpenSubtitles doesn't have any subtitles for it.

[–] kautau@lemmy.world 110 points 6 months ago* (last edited 6 months ago) (1 children)

The most important part is that it’s a local ~~LLM~~ model running on your machine. The problem with AI is less about LLMs themselves, and more about their control and application by unethical companies and governments in a world driven by profit and power. And it’s none of those things, it’s just some open source code running on your device. So that’s cool and good.

[–] technomad@slrpnk.net 36 points 6 months ago (1 children)

Also the incessant amounts of power/energy that they consume.

[–] Sixtyforce@sh.itjust.works 0 points 6 months ago (1 children)

Curious how resource intensive AI subtitle generation will be. Probably fine on some setups.

Trying to use madVR (tweaker's video postprocessing) in the summer in my small office with an RTX 3090 was turning my office into a sauna. Next time I buy a video card it'll be a lower tier deliberately to avoid the higher power draw lol.
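The power-draw worry above is easy to put rough numbers on. A back-of-the-envelope sketch (the 350 W figure is an illustrative assumption for an RTX 3090 under sustained load, not a measurement):

```python
# Rough GPU energy estimate: power draw (watts) x time (hours) -> kWh.
# Illustrative only; real draw varies with load, model size, and settings.
def gpu_energy_kwh(power_watts: float, hours: float) -> float:
    """Energy consumed in kilowatt-hours."""
    return power_watts * hours / 1000.0

# e.g. transcribing video for 2 hours at an assumed 350 W:
print(gpu_energy_kwh(350, 2))  # 0.7 kWh
```

At typical residential electricity prices that's a few cents per session, though as the comment notes, all of it ends up as heat in the room.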

[–] kautau@lemmy.world 2 points 6 months ago

I think it really depends on how accurate you want it to be / what language you're transcribing. https://github.com/openai/whisper has multiple variations of their model, but they all pretty much require VRAM/graphics capability (or likely NPUs as they become more commonplace).
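For reference, the openai/whisper README lists approximate VRAM requirements per model size, so model choice is basically a fit-to-hardware decision. A small sketch of that tradeoff (VRAM figures rounded from the README; `largest_fitting_model` is a hypothetical helper, not part of the whisper package):

```python
# Approximate VRAM needs (GB) per Whisper model size, per the
# openai/whisper README. Actual usage varies by backend and precision.
WHISPER_VRAM_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "large": 10,
}

def largest_fitting_model(vram_gb: float) -> str:
    """Pick the largest Whisper model that fits in the given VRAM budget."""
    fitting = [m for m, need in WHISPER_VRAM_GB.items() if need <= vram_gb]
    return fitting[-1] if fitting else "tiny"  # fall back to the smallest

print(largest_fitting_model(8))   # medium
print(largest_fitting_model(24))  # large
```

Bigger models are noticeably better on accented speech and non-English languages, which is where the accuracy/resources tradeoff really bites.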

[–] mormund@feddit.org 40 points 6 months ago (1 children)

Yeah, transcription is one of the only good uses for LLMs imo. Of course they can still produce nonsense, but bad subtitles are better than none at all.

[–] kautau@lemmy.world 2 points 6 months ago* (last edited 6 months ago) (1 children)

Just an important note, speech to text models aren't LLMs, which are literally "conversational" or "text generation from other text" models. Things like https://github.com/openai/whisper are their own, separate types of models, specifically for transcription.
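To make the transcription-use-case concrete: Whisper's `transcribe()` returns a list of segment dicts (`"start"`/`"end"` in seconds plus `"text"`), which maps almost directly onto the SRT subtitle format. A minimal sketch of that conversion, using hand-written segments in place of real model output:

```python
# Sketch: converting Whisper-style transcription segments into SRT text.
# The dicts mirror the "segments" entries whisper.transcribe() returns.
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def segments_to_srt(segments) -> str:
    """Number each segment and emit SRT-formatted subtitle blocks."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> "
            f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

print(segments_to_srt([{"start": 0.0, "end": 2.5, "text": " Hello there."}]))
```

The model itself does the hard part; everything after that is plain string formatting, which is why local subtitle generation is such a natural fit.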

That being said, I totally agree, accessibility is an objectively good use for "AI"

[–] mormund@feddit.org 1 points 6 months ago

That's not what LLMs are, but it's a marketing buzzword in the end I guess. What you linked is a transformer-based sequence-to-sequence model, exactly the same principle as ChatGPT and all the others.

I wouldn't say it is a good use of AI, more like one of the few barely acceptable ones. Can we accept lies and hallucinations just because the alternative is nothing at all? And how much energy/CO2 emissions should we be willing to waste on this?

[–] hushable@lemmy.world 19 points 6 months ago

Indeed, YouTube has had auto-generated subtitles for a while now and they are far from perfect, yet I still find them useful.

[–] M137@lemmy.world 1 points 6 months ago* (last edited 6 months ago)

I agree that this is a nice thing, just gotta point out that there are several other good websites for subtitles. Here are the ones I use frequently:

https://subdl.com/
https://www.podnapisi.net/
https://www.subf2m.co/

And if you didn't know, there are two opensubtitles websites:
https://www.opensubtitles.com/
https://www.opensubtitles.org/

Not sure if the .com one is supposed to be a more modern frontend for the .org or something, but I've found different subtitles on each, so it's worth using both.