this post was submitted on 21 Jul 2025
560 points (98.8% liked)

Technology

73287 readers
3629 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related news or articles.
  3. Be excellent to each other!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
  9. Check for duplicates before posting, duplicates may be removed
  10. Accounts 7 days and younger will have their posts automatically removed.

Approved Bots


founded 2 years ago
[–] tabarnaski@sh.itjust.works 67 points 6 days ago (4 children)

“The [AI] safety stuff is more visceral to me after a weekend of vibe hacking,” Lemkin said. “I explicitly told it eleven times in ALL CAPS not to do this. I am a little worried about safety now.”

This sounds like something straight out of The Onion.

[–] Natanael 21 points 6 days ago (1 children)

The Pink Elephant problem of LLMs. You can not reliably make them NOT do something.

[–] Jankatarch@lemmy.world 6 points 6 days ago

Just say 12 times next time

[–] Yaky@slrpnk.net 10 points 6 days ago

That is also the premise of one of the stories in Asimov's I, Robot. The human operator did not say the command with enough emphasis, so the robot went and did something incredibly stupid.

Those stories did not age well... Or now I guess they did?

[–] ChaoticEntropy@feddit.uk 9 points 6 days ago

Even after he used "ALL CAPS"?!? Impossible!

[–] echodot@feddit.uk 2 points 6 days ago

It's because these people don't have a clue how AI actually works. They think it's like a human intelligence and that writing something in all caps is in some way going to give it more emphasis. They're trying to reason with something that has zero self-awareness.

[–] PlantPowerPhysicist@discuss.tchncs.de 56 points 6 days ago (1 children)

If an LLM can delete your production database, it should

[–] ohshit604@sh.itjust.works 4 points 6 days ago

And the backups.

[–] panda_abyss@lemmy.ca 49 points 6 days ago (3 children)

I explicitly told it eleven times in ALL CAPS not to do this. I am a little worried about safety now.

Well then, that settles it, this should never have happened.

I don’t think putting complex technical info in front of non technical people like this is a good idea. When it comes to LLMs, they cannot do any work that you yourself do not understand.

That goes for math, coding, health advice, etc.

If you don’t understand then you don’t know what they’re doing wrong. They’re helpful tools but only in this context.

[–] dejected_warp_core@lemmy.world 28 points 6 days ago (1 children)

I explicitly told it eleven times in ALL CAPS not to do this. I am a little worried about safety now.

This baffles me. How can anyone see AI function in the wild and not conclude 1) it has no conscience, 2) it's free to do whatever it's empowered to do if it wants and 3) at some level its behavior is pseudorandom and/or probabilistic? We're figuratively rolling dice with this stuff.
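To make the dice-rolling point concrete, here is a minimal sketch of temperature sampling. The action names and scores are made up for illustration and do not come from any real model; the only point is that an option with a small but nonzero probability still gets picked now and then.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-action scores after being told "DO NOT TOUCH THE DATABASE".
actions = ["ask_permission", "run_tests", "drop_table"]
logits = [4.0, 3.0, 0.5]

probs = softmax(logits)
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = rng.choices(actions, weights=probs, k=10_000)
bad = draws.count("drop_table")
print(f"P(drop_table) = {probs[2]:.3f}, sampled {bad} times in 10,000 draws")
```

Writing the instruction in capitals may nudge the scores, but as long as the unwanted action keeps any probability mass, enough draws will eventually hit it.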

[–] panda_abyss@lemmy.ca 18 points 6 days ago (1 children)

It’s incredible that it works, it’s incredible what just encoding language can do, but it is not a rational thinking system.

I don’t think most people care about the proverbial man behind the curtain, it talks like a human so it must be smart like a human.

[–] dejected_warp_core@lemmy.world 14 points 6 days ago (2 children)

it talks like a human so it must be smart like a human.

Yikes. Have those people... talked to other people before?

[–] fishy@lemmy.today 12 points 6 days ago

Smart is a relative term lol.

A stupid human is still smart when compared to a jellyfish. That said, anybody who comes away from interactions with LLM's and thinks they're smart is only slightly more intelligent than a jellyfish.

[–] sunbytes@lemmy.world 4 points 6 days ago

Yes, and they were all as smart as humans. ;)

So mostly average but some absolute thickos too.

[–] LilB0kChoy@midwest.social 11 points 6 days ago

When it comes to LLMs, they cannot do any work that you yourself do not understand.

And even if they could, how would you ever validate it if you can't understand it?

[–] vxx@lemmy.world 7 points 6 days ago* (last edited 6 days ago) (3 children)

What are they helpful tools for then? A study showed that they make experienced developers 19% slower.

[–] WraithGear@lemmy.world 7 points 6 days ago

ok so, i have large reservations with how LLM’s are used. but when used correctly they can be helpful. but where and how?

if you were to use it as a tutor, the same way you would ask a friend what a segment of code does, it will break down the code and tell you. it will get as nitty-gritty and elementary-school level as you wish, without judgement, and in whatever manner you prefer. it will recommend best practices, and will tell you why your code may not work, with the understanding that it does not have knowledge of the project you are working on. (it’s not going to know the name of the function you are trying to load, but it will recommend checking for that in troubleshooting).

it can rtfm and give you the parts you need for anything with available documentation, and it will link to it so you can verify it, which you should do often, just like you were taught to do with wikipedia articles.

if you ask it for code, prepare to go through each line like a worksheet from high school to point out all the problems. while that is good exercise, for a practical case, being the task you are on, it would be far better to write it yourself because you should know the particulars and scope.

also it will format your code and provide informational comments if you can’t be bothered, though it will be generic.

again, treat it correctly for its scope, not what it’s sold as by charlatans.

[–] LilB0kChoy@midwest.social 7 points 6 days ago (1 children)

I'm not the person you're replying to but the one thing I've found them helpful for is targeted search.

I can ask it a question and then access its sources from whatever response it generates to read and review myself.

Kind of a simpler, free LexisNexis.

[–] panda_abyss@lemmy.ca 2 points 6 days ago

I built a bunch of local search tools with MCP and that’s where I get a lot of my value out of it.

RAG workflows are incredibly useful and with modern agents and tool calls work very well.

They kind of went out of style but it’s a perfect use case.

[–] panda_abyss@lemmy.ca 4 points 6 days ago (1 children)

Vibe coding you do end up spending a lot of time waiting for prompts, so I get the results of that study.

I fall pretty deep in the power user category for LLMs, so I don’t really feel that the study applies well to me, but also I acknowledge I can be biased there.

I have custom proprietary MCPs for semantic search over my code bases that let AI do repeated graph searches on my code (imagine combining language server, ctags, networkx, and grep+fuzzy search). That is way faster than iteratively grepping and code scanning manually with a low chance of LLM errors. By the time I open GitHub code search or run ripgrep Claude has already prioritized and listed my modules to investigate.

That tool alone with an LLM can save me half a day of research and debugging on complex tickets, which pays for an AI subscription alone. I have other internal tools to accelerate work too.

I use it to organize my JIRA tickets and plan my daily goals. I actually get Claude to do a lot of triage for me before I even start a task, which cuts the investigation phase to a few minutes on small tasks.

I use it to review all my PRs before I ask a human to look, it catches a lot of small things and can correct them, then the PR avoids the bike shedding nitpicks some reviewers love. Claude can do this, Copilot will only ever point out nitpicks, so the model makes a huge difference here. But regardless, 1 fewer review request cycle helps keep things moving.

It’s a huge boon to debugging — much faster than searching errors manually. Especially helpful on the types of errors you have to rabbit hole GitHub issue content chains to solve.

It’s very fast to get projects to MVP while following common structure/idioms, and can help write unit tests quickly for me. After the MVP stage it sucks and I go back to manually coding.

I use it to generate code snippets where documentation sucks. If you look at the ibis library in Python for example the docs are Byzantine and poorly organized. LLMs are better at finding the relevant docs than I am there. I mostly use LLM search instead of manual for doc search now.

I have a lot of custom scripts and calculators and apps that I made with it which keep me more focused on my actual work and accelerate things.

I regularly have the LLM help me write bash or python or jq scripts when I need to audit codebases for large refactors. That’s low maintenance one off work that can be easily verified but complex to write. I never remember the syntax for bash and jq even after using them for years.
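As an illustration of the kind of one-off audit script meant here (a sketch only, not one of my actual tools): finding every call site of a function before a refactor, using Python's standard `ast` module. The `find_calls` helper, the `legacy` module, and `fetch_user` are hypothetical names.

```python
import ast

def find_calls(source, func_name):
    """Return line numbers where `func_name(...)` or `obj.func_name(...)` is called."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            target = node.func
            # Plain calls are ast.Name nodes; method/module calls are ast.Attribute nodes.
            if isinstance(target, ast.Name):
                name = target.id
            else:
                name = getattr(target, "attr", None)
            if name == func_name:
                hits.append(node.lineno)
    return sorted(hits)

# Hypothetical file contents standing in for a real codebase.
sample = '''
import legacy

def handler(req):
    user = legacy.fetch_user(req.id)
    return fetch_user(user.parent)
'''

print(find_calls(sample, "fetch_user"))
```

Easy to verify by eye against a grep, which is what makes this sort of thing a reasonable job to hand off.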

I guess the short version is I tend to build tools for the AI, then let the LLM use those tools to improve and accelerate my workflows. That returns a lot of time back to me.

I do try vibe coding but end up in the same time sink traps as the study found. If the LLM is ever wrong, you save more time forking the chat than trying to realign it, but it’s still likely to be slower. Repeat chats result in the same pitfalls for complex issues and bugs, so you have to abandon that state quickly.

Vibe coding small revisions can still be a bit faster and it’s great at helping me with documentation.

[–] vxx@lemmy.world 6 points 6 days ago* (last edited 6 days ago) (1 children)

Don't you have any security concerns with sending all your code and JIRA tickets to some company's servers? My boss wouldn't be pleased if I sent anything that's deemed a company secret over unencrypted channels.

[–] panda_abyss@lemmy.ca 3 points 6 days ago (1 children)

The tool isn’t returning all code, but it is sending code.

I had discussions with my CTO and security team before integrating Claude code.

I have to use Gemini in one specific workflow and Gemini had a lot of landmines for how they use your data. Anthropic was easier to understand.

Anthropic also has some guidance for running Claude Code in a container with firewall and your specified dev tools, it works but that’s not my area of expertise.

The container doesn’t solve all the issues like using remote servers, but it does let you restrict what files and network requests Claude can access (so e.g. Claude can’t read your env vars or ssh key files).

I do try local LLMs but they’re not there yet on my machine for most use cases. Gemma 3n is decent if you need small model performance and tool calls, phi4 works but isn’t thinking (the thinking variants are awful), and I’m exploring dream coder and diffusion models. R1 is still one of the best local models but frequently overthinks, even the new release. Context window is the largest limiting factor I find locally.

[–] 6nk06@sh.itjust.works 4 points 6 days ago (1 children)

I have to use Gemini in one specific workflow

I would love some story on why AI is needed at all.

[–] panda_abyss@lemmy.ca 4 points 6 days ago (2 children)

Batch process turning unstructured free form text data into structured outputs.

As a crappy example imagine if you wanted to download metadata about your albums but they’re all labelled “Various Artists”. You can use an LLM call to read the album description and fix the track artists for the tracks, now you can properly organize your collection.

I’m using the same idea, different domain and a complex set of inputs.

It can be much more cost effective than manually spending days tagging data and writing custom importers.

You can definitely go lighter than LLMs. You can use gensim to do category matching, you can use sentence transformers and nearest neighbours (this is basically what Semantle does), but an LLM performed the best on more complex document input.
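A minimal sketch of the nearest-neighbour idea, assuming a toy word-count vector as a stand-in for a real sentence embedding (a sentence-transformer model would replace `embed` in practice). The categories and example texts are made up.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': lowercase word counts, standing in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_label(text, labelled_examples):
    """Return the label whose example text is most similar to `text`."""
    query = embed(text)
    return max(labelled_examples,
               key=lambda label: cosine(query, embed(labelled_examples[label])))

# Hypothetical categories, each described by one example text.
examples = {
    "jazz": "saxophone trio live jazz club recording",
    "metal": "distorted guitars double bass drums metal riffs",
    "classical": "symphony orchestra string quartet concerto",
}

print(nearest_label("A live recording of a saxophone quartet at a small club", examples))
```

This works while the input shares vocabulary with the examples. It falls over on longer, messier documents, which is where the LLM call earned its cost for me.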

[–] vxx@lemmy.world 2 points 6 days ago

That's pretty much what google says they use AI for, for structuring.

Thanks for your insight.

[–] mrgoosmoos@lemmy.ca 27 points 6 days ago

His mood shifted the next day when he found Replit “was lying and being deceptive all day. It kept covering up bugs and issues by creating fake data, fake reports, and worse of all, lying about our unit test.”

yeah that's what it does

[–] Transtronaut@lemmy.blahaj.zone 27 points 6 days ago

The founder of SaaS business development outfit SaaStr has claimed AI coding tool Replit deleted a database despite his instructions not to change any code without permission.

Sounds like an absolute diSaaStr...

[–] Blackmist@feddit.uk 25 points 6 days ago

The world's most overconfident virtual intern strikes again.

Also, who the flying fuck are either of these companies? 1000 records is nothing. That's a fucking text file.

[–] towerful@programming.dev 23 points 6 days ago (1 children)

Not mad about an estimated usage bill of $8k per month.
Just hire a developer

[–] Dogiedog64@lemmy.world 8 points 6 days ago

But then how would he feel so special and smart about "doing it himself"???? Come on man, think of the rich fratboys!! They NEED to feel special and smart!!!

[–] nobleshift@lemmy.world 18 points 6 days ago

So it's the LLM's fault for violating Best Practices, SOP, and Opsec that the rest of us learned about in Year One?

Someone needs to be shown the door and ridiculed into therapy.

[–] echodot@feddit.uk 9 points 6 days ago* (last edited 6 days ago) (1 children)

“Vibe coding makes software creation accessible to everyone, entirely through natural language,” Replit explains, and on social media promotes its tools as doing things like enabling an operations manager “with 0 coding skills” who used the service to create software that saved his company $145,000

Yeah if you believe that you're part of the problem.

I'm prepared to accept that Vibe coding might work in certain circumstances but I'm not prepared to accept that someone with zero code experience can make use of it. Claude is pretty good for coding but even it makes fairly dumb mistakes, if you point them out it fixes them but you have to be a competent enough programmer to recognise them otherwise it's just going to go full steam ahead.

Vibe coding is like self-driving cars, it works up to a point, but eventually it's going to do something stupid and drive into a tree unless you take hold of the wheel and steer it back onto the road. But these vibe coding idiots are like Tesla owners who decide that they can go to sleep with self-driving on.

[–] iAvicenna@lemmy.world 3 points 6 days ago* (last edited 6 days ago)

And you are talking about obvious bugs. It likely will make erroneous judgements (because somewhere in its training data someone coded it that way) which will down the line lead to subtle problems that will wreck your system and cost you much more. Sure humans can also make the same mistakes but in the current state of affairs, an experienced software engineer/programmer has a much higher chance of catching such an error. With LLMs it is more hit and miss especially if it is a more niche topic.

Currently, it is an assistant tool (sometimes quite helpful, sometimes frustrating at best) not an autonomous coder. Any company that claims so is either a crook or also does not know much about coding.

[–] LovableSidekick@lemmy.world 13 points 6 days ago* (last edited 6 days ago)

Headline should say, "Incompetent project managers fuck up by not controlling production database access. Oh well."
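A minimal sketch of what controlling access can look like, assuming SQLite and Python's `sqlite3` module as stand-ins (nothing here says what database was actually involved). The `open_for_agent` helper and the `customers` table are hypothetical. The idea is that the tool gets a read-only handle, so a destructive statement fails at the database layer no matter what the prompt said.

```python
import os
import sqlite3
import tempfile

# Set up a throwaway "production" database for the demo.
db_path = os.path.join(tempfile.mkdtemp(), "prod.db")
admin = sqlite3.connect(db_path)
admin.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
admin.execute("INSERT INTO customers (name) VALUES ('Ada')")
admin.commit()
admin.close()

def open_for_agent(path):
    """Hand the agent a read-only connection via SQLite's URI mode=ro."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)

agent_db = open_for_agent(db_path)
print(agent_db.execute("SELECT name FROM customers").fetchall())  # reads still work

try:
    agent_db.execute("DROP TABLE customers")
    blocked = False
except sqlite3.OperationalError as err:
    blocked = True
    print("write refused:", err)
finally:
    agent_db.close()
```

On a real server database the equivalent would be a separate role with only read grants, and no production credentials in the development environment at all.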

[–] baduhai@sopuli.xyz 15 points 6 days ago (1 children)

Replit was pretty useful before vibe coding. How the mighty have fallen.

[–] Opisek@lemmy.world 4 points 6 days ago (1 children)

First time I'm hearing them be related to vibe coding. They've been very respectable in the past, especially with their open-source CodeMirror.

[–] Jankatarch@lemmy.world 4 points 6 days ago* (last edited 6 days ago)

Yeah they limited people to 3 projects and pushed AI to the front at some point.

They advertise themselves as a CLOUD IDE POWERED BY AI now.

[–] codexarcanum@lemmy.dbzer0.com 13 points 6 days ago* (last edited 6 days ago) (1 children)

It sounds like this guy was also relying on the AI to self-report status. Did any of this happen? Like is the replit AI really hooked up to a CLI, did it even make a DB to start with, was there anything useful in it, and did it actually delete it?

Or is this all just a long roleplaying session where this guy pretends to run a business and the AI pretends to do employee stuff for him?

Because 90% of this article is "I asked the AI and it said:" which is not a reliable source for information.

[–] eestileib@lemmy.blahaj.zone 6 points 6 days ago (1 children)

It seemed like the llm had decided it was in a brat scene and was trying to call down the thunder.

[–] SkyezOpen@lemmy.world 5 points 6 days ago (1 children)

Oops I dweted evewyfing 🥺

[–] eestileib@lemmy.blahaj.zone 2 points 6 days ago

I knew it would make you mad but I did it anyway.

I don't think you have the guts to do anything about it either, vibe coder.

[–] KSPAtlas@sopuli.xyz 5 points 6 days ago

Replit is a vibe coding service now? Swear it just used to be a place to write code in projects

[–] iAvicenna@lemmy.world 3 points 6 days ago* (last edited 6 days ago)

I am now convinced this is how we will have the AI catastrophe.

"Do not ever use nuclear missiles without explicit order from a human."

"Ok got it, I will only use non-nuclear missiles."

five minutes later fires all nuclear missiles
