this post was submitted on 09 Jul 2025
52 points (96.4% liked)

Technology

PhilipTheBucket@ponder.cat 12 points 3 days ago

Grok responded to X users’ questions about public figures by generating foul and violent rape fantasies, including one targeting progressive activist and policy analyst Will Stancil. (Stancil has indicated he may sue X.)

When you fine-tune a coding AI on code that has deliberate flaws in it, and then switch it back to having conversations in English, it can start praising Hitler and producing other deliberately hateful content. It wouldn’t surprise me if fine-tuning Grok to be Nazi also led it to “generalize” some additional things that the operators didn’t intend.