"Gartner estimates only about 130 of the thousands of agentic AI vendors are real."
This whole industry is so full of hype and scams, the bubble surely has to burst at some point soon.
They've done studies, you know. 30% of the time, it works every time.
I ask AI to write simple little programs. One time in three they actually compile without errors. To the credit of the AI, I can feed it the error and about half the time it will fix it. Then, when it compiles and runs without crashing, about one time in three it will actually do what I wanted. To the credit of AI, I can give it revised instructions and about half the time it can fix the program to work as intended.
So, yeah, a lot like interns.
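Taking the rough odds from the comment above at face value, the overall hit rate can be worked out. This is only a sketch: it assumes exactly one fix attempt per stage and that the two stages are independent, neither of which the comment actually states, and the function name is made up for illustration.

```python
from fractions import Fraction


def success_with_one_retry(first_try: Fraction, fix_rate: Fraction) -> Fraction:
    """Chance of success when a failed first try gets exactly one fix attempt."""
    return first_try + (1 - first_try) * fix_rate


# Stage 1: compiles one time in three, and a fed-back error is fixed half the time.
compiles = success_with_one_retry(Fraction(1, 3), Fraction(1, 2))

# Stage 2: does what was wanted one time in three, and revised instructions
# fix it half the time.
correct = success_with_one_retry(Fraction(1, 3), Fraction(1, 2))

# Assumption: the stages are independent, so the chances multiply.
overall = compiles * correct

print(compiles, correct, overall)  # prints: 2/3 2/3 4/9
```

So under those assumptions, a bit under half of the requests end in a working program after at most one correction per stage.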
Wrong 70% doing what?
I’ve used LLMs as a Stack Overflow / MSDN replacement for over a year, and if they fucked up 7 out of 10 questions I’d stop.
Same with code: any free model can easily generate simple scripts and utilities with maybe a 10% error rate, definitely not 70%.
I tried to order food at a Taco Bell drive-through the other day and they had an AI thing taking orders. I got so frustrated that I couldn't order something that was on the menu that I just drove to the window instead. The guy who worked there was more interested in lecturing me on how I need to order. I just said forget it and drove off.
If you want to use AI, I'm not going to use your services or products unless I'm forced to. Looking at you, Xfinity.
For some reason, agents work better when the prompt says that the accuracy of the work is a matter of life or death. I've made a little script that gives me BibTeX for a folder of PDFs, and this is how I got it to be usable.
I haven't used AI agents yet, but my job is kinda pushing for them. I have used the Google one that creates audio podcasts, just to play around, since my coworkers were using it to "learn" new things. I fed it some of my own writing and created the podcast. It was fun: an audio overview of what I wrote. About 80% was cool analysis, but 20% was straight out of nowhere bullshit (which I know because I wrote the original texts the audio was talking about). I can't believe that people are using this for subjects they have no knowledge of. It is a fun toy for a few minutes (which is not worth the cost to the environment anyway).
70% seems pretty optimistic based on my experience...
"...for multi-step tasks"
It's about agents, which implies multi-step: those are meant to execute a series of tasks, as opposed to studies looking at base LLM performance on single prompts.
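One way to see why multi-step matters, as a back-of-the-envelope sketch: if every step has to succeed and steps fail independently (an assumption, not something the study is quoted as saying), per-step success compounds. The 90% per-step figure below is borrowed from the "maybe 10% error rate" guess earlier in the thread, and the function names are made up for illustration.

```python
import math


def task_success(per_step: float, steps: int) -> float:
    """Chance that all steps succeed, assuming independent steps."""
    return per_step ** steps


def steps_until(per_step: float, target: float) -> int:
    """Smallest number of steps at which overall success drops to target or below."""
    return math.ceil(math.log(target) / math.log(per_step))


# Assumed per-step success of 90%; 30% overall success is the same as
# being wrong 70% of the time.
n = steps_until(0.9, 0.3)
print(n, round(task_success(0.9, n), 3))  # prints: 12 0.282
```

Under those assumptions, a model that is right 90% of the time per step is wrong about 70% of the time on a 12-step task, so the two numbers in this thread are not necessarily in conflict.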