this post was submitted on 21 May 2025

Technology

top 2 comments

Great article, thanks for sharing it OP.

For example, the Anthropic researchers who located the concept of the Golden Gate Bridge within Claude didn’t just identify the regions of the model that lit up when the bridge was on Claude’s mind. They took a profound next step: They tweaked the model so that the weights in those regions were 10 times stronger than they’d been before. This form of “clamping” the model weights meant that even if the Golden Gate Bridge was not mentioned in a given prompt, or was not somehow a natural answer to a user’s question on the basis of its regular training and tuning, the activations of those regions would always be high.

The result? Clamping those weights enough made Claude obsess about the Golden Gate Bridge. As Anthropic described it:

If you ask this “Golden Gate Claude” how to spend $10, it will recommend using it to drive across the Golden Gate Bridge and pay the toll. If you ask it to write a love story, it’ll tell you a tale of a car who can’t wait to cross its beloved bridge on a foggy day. If you ask it what it imagines it looks like, it will likely tell you that it imagines it looks like the Golden Gate Bridge.
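
For anyone curious what "clamping" looks like mechanically, here's a rough sketch of the idea in plain PyTorch. The layer index, feature index, and clamp value below are made up for illustration, and Anthropic's actual experiment steers a learned sparse-autoencoder feature rather than a single raw hidden unit, but the mechanism is the same: force one activation high on every forward pass, regardless of the prompt.

```python
import torch

# Hypothetical values for illustration only -- not Anthropic's real setup.
FEATURE_DIM = 1337   # pretend this is the "Golden Gate Bridge" direction
CLAMP_VALUE = 10.0   # pin the activation well above its normal range

def clamp_hook(module, inputs, output):
    # output: (batch, seq_len, hidden_size) activations from one layer.
    # Overwrite one dimension so the steered concept is always "on".
    output[..., FEATURE_DIM] = CLAMP_VALUE
    return output

# Attach to one layer of any PyTorch transformer-like model, e.g.:
# handle = model.layers[20].register_forward_hook(clamp_hook)
# ...generate text; the clamped concept now dominates the output...
# handle.remove()  # detach the hook to restore normal behavior
```

The point is that this is cheap and invisible to the user: nothing in the prompt or the training data changes, yet every answer gets pulled toward whatever concept the operator chose to clamp.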

Okay, now imagine you're Elon Musk and you really want to change hearts and minds on the topic of, for example, white supremacy. AI chatbots have the potential to fundamentally change how a wide swath of people perceive reality.

If we think the reality distortion bubble is bad now (the MAGAsphere, etc.), how bad will things get when people implicitly trust the output of these models while the underlying process that decides how to present information is weighted towards particular ideologies? The rest of the article explores how chatbots build a profile of each user and serve different content based on that profile; that makes it even easier to identify the people most susceptible to mis/disinformation and deliver it with a cheery tone.

How might we, as a society, create a process for overseeing these "tools"? We need a cohesive approach that can be explained to policymakers in a way that calls them to action on this issue.

[–] Dojan@pawb.social 5 points 15 hours ago* (last edited 15 hours ago)

Figures that a slop company’s CEO wouldn’t have words of his own and would rather have the machine generate slop for him, but this stuck out:

We can’t stop the bus, but we can steer it …

What a bullshit statement. It’s not that you can’t stop it, it’s that you won’t stop it. It’s an active choice you’re making, not a compulsion like a kleptomaniac’s. Machine learning isn’t some natural force; it’s entirely man-made, and we can stop whenever we want.

No, you’re not uncontrollable kleptomaniacs, you’re just doing the same shit the rich elite has always done; you exploit the world around you and you get away with it scot-free.

Also, you can stop a bus. It’s an integral part of how they fucking operate. Not that I’d expect a CEO to have ever been on one.