Why I'm Betting Against AI Agents in 2025 (Despite Building Them)
utkarshkanwat.com
Performing procedural tasks with a statistical model of our language will never be reliable. There's a reason we use logical, prescriptive syntax when we want deterministic outcomes.
I expect what we will see are tools where the human manages the high-level implementation, and the agents are used to implement specific pieces of functionality that can be easily tested and verified. I can see something along the lines of a scene graph, where you focus on the flow of the code and farm off the implementation details of each step to a tool. As the article notes, these tools can already reach over 90% accuracy in these scenarios.
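A minimal sketch of that workflow: the human owns the spec and the tests, the agent only proposes a candidate body, and nothing is accepted until verification passes. `ask_agent` here is a hypothetical stand-in for any code-generating model call, hard-coded for the demo; the verification loop is the point.

```python
# Sketch of a "generate, then verify" loop for agent-written code.
# `ask_agent` is a hypothetical stand-in for an LLM call, not a real API.

from typing import Callable

def ask_agent(spec: str) -> str:
    # Hypothetical: in practice this would query a code-generating model.
    # Hard-coded here so the sketch is self-contained.
    return "def slugify(s):\n    return '-'.join(s.lower().split())"

def accept_if_verified(spec: str, tests: list[Callable[[dict], bool]],
                       retries: int = 3) -> dict:
    """Ask the agent for an implementation; keep it only if every test passes."""
    for _ in range(retries):
        source = ask_agent(spec)
        namespace: dict = {}
        exec(source, namespace)           # load the candidate implementation
        if all(test(namespace) for test in tests):
            return namespace              # verified: hand control back to the human
    raise RuntimeError("no candidate passed verification")

# The human writes the spec and the checks; the agent only fills in the body.
impl = accept_if_verified(
    "Write slugify(s): lowercase s and join its words with hyphens.",
    tests=[
        lambda ns: ns["slugify"]("Hello World") == "hello-world",
        lambda ns: ns["slugify"]("  AI Agents ") == "ai-agents",
    ],
)
print(impl["slugify"]("Betting Against AI Agents"))
```

The design choice this illustrates: reliability comes from the deterministic test harness, not from the model, so a statistically unreliable generator can still sit safely inside a verified pipeline.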
I agree that this could be helpful for finally reaching the natural-language programming paradigm people have been hoping for. But something has to be capable of logically implementing the low-level code, and that has to be more than a statistical model of how people write code, trained on a big collection of random repositories. (I'm open to being proved wrong about that!)
The 90% accuracy could simply arise from the benchmarks testing trivial or commonly solved tasks, so that near-exact solutions already exist in the training set. Anything genuinely novel lies outside the training set, and outside the model.
Well said. I understood next to none of it, but well said.