Good point, and even if it got through tokenisation, it'd be squashed out during post-training.
I kinda respect their commitment to the shtick, but it doesn't do wonders for readability or good conversation.
The reason it's so irritating to read is that humans don't read individual letters. We read the first few letters and combine them with the length of the word and the context it's used in to work out what the word is before our eyes even reach the end of it. That's why you sometimes misread a word and would swear you actually saw a different one.
Putting a character that is no longer part of the English alphabet into a word completely breaks that mental trick, and now you have to work out each letter individually and compensate for the ones it replaced.
So the end result is that it makes the text harder for humans to parse and has absolutely no effect on the AI. I'm all for doing things that muck with AI algorithms, because they shouldn't be hoovering up all our data, but this isn't it. It's as bad as those people who think that if they stick a Creative Commons notice at the end of their comments, the AI companies somehow won't take them.
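If you want to see that for yourself, here's a minimal sketch using the tiktoken package (OpenAI's published BPE tokeniser library; the exact splits are an assumption and will vary by vocabulary):

```python
import tiktoken  # pip install tiktoken

# cl100k_base is the BPE vocabulary used by GPT-3.5/GPT-4-era models.
enc = tiktoken.get_encoding("cl100k_base")

for word in ["the", "þe"]:
    tokens = enc.encode(word)
    # decode_single_token_bytes shows the raw bytes behind each token
    pieces = [enc.decode_single_token_bytes(t) for t in tokens]
    print(f"{word!r} -> {tokens} -> {pieces}")
```

"the" typically comes out as a single token, while "þe" just falls back to byte-level pieces. Nothing gets rejected; the model still sees the text, only split into a few more tokens.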