Answer.AI - Thoughts On A Month With Devin

2025-01-17

AI news is a rollercoaster these days. Some days, companies claim that they're going to replace engineers with AI. Then people actually try the AI tools that promise to replace programmers, and this is the result:

Working with Devin showed what autonomous AI development aspires to be. The UX is polished - chatting through Slack, watching it work asynchronously, seeing it set up environments and handle dependencies. When it worked, it was impressive.

But that’s the problem - it rarely worked. Out of 20 tasks we attempted, we saw 14 failures, 3 inconclusive results, and just 3 successes. More concerning was our inability to predict which tasks would succeed. Even tasks similar to our early wins would fail in complex, time-consuming ways. The autonomous nature that seemed promising became a liability - Devin would spend days pursuing impossible solutions rather than recognizing fundamental blockers.

This reflects a pattern we’ve observed repeatedly in AI tooling. Social media excitement and company valuations have minimal relationship to real-world utility. We’ve found the most reliable signal comes from detailed stories of users shipping products and services. For now, we’re sticking with tools that let us drive the development process while providing AI assistance along the way.

Link

In How I program with LLMs, David Crawshaw describes a totally different way to use LLMs for programming that works well for him. The key difference is that David uses LLMs as a tool that is expected to help, not replace, programmers.