Since LLM-powered tools like ChatGPT, Google Gemini, and Bing Chat became publicly available, we’ve seen a lot of chatter about AI eventually replacing humans at work. While this isn’t completely true, the fact is that you can enter a simple query and get anything from an article to a piece of code.
The ability to generate code snippets from a simple query is both a good and a bad thing. While it makes coding accessible to more people, it risks putting software engineers out of a job. Devin, billed by its developer Cognition Labs as the first AI software engineer, has further fueled this fire.
Devin’s capabilities, as claimed by Cognition Labs, are far superior to those of ChatGPT or Gemini. While the latter are generalised models built to handle whatever the user asks, Devin is built to write code from the ground up, which makes it exciting for some and worrying for others.
What is Devin?
As mentioned before, Devin is, according to its creator Cognition Labs, a first-of-its-kind AI software engineer that can work alongside human software engineers or independently. Not much is known about the tool yet, as Cognition has only released an SWE-bench evaluation showing how it performs against other LLMs, along with a general rundown of its capabilities.
Devin comes with its own shell, code editor, and even a browser, all within a sandboxed environment, giving it the same basic tools a human software engineer would use. It can also report its progress in real time, accept feedback, and work with a human as needed.
Some of Devin’s most exciting capabilities include recalling relevant context at every step as it works through a task, fixing mistakes it makes along the way by debugging much as a human would, and learning over time.
According to Cognition Labs, this means that not only can you rely on Devin to create and even deploy apps end-to-end, but it can also go further: training and fine-tuning its own AI models, learning how to use new technologies, autonomously finding and fixing bugs in codebases, addressing bugs and feature requests in open-source repositories, and even doing real jobs from Upwork.
Cognition Labs has yet to release a more detailed technical report, though one is expected soon. In the meantime, it has released a Google Form allowing users to send their engineering work requests to Devin.
How does Devin fare against the competition?
Devin isn’t the first AI tool to generate working code. ChatGPT, Gemini (formerly Bard), GitHub Copilot, and others have been around for a while, generating code snippets and suggestions for developers and engineers to use as they work. However, Devin might just be the first to handle tasks end-to-end.
Cognition Labs has released Devin’s results on SWE-bench, a benchmark that asks AI tools to fix real-world issues found in open-source GitHub repositories. Devin correctly resolved 13.86% of the issues end to end, exceeding the previous best of 1.96% set by Claude 2 + BM25 Retrieval.
It’s important to note that while both Devin and Claude 2 were unassisted, Devin was evaluated on a random 25% subset of the dataset. When assisted, Claude 2’s resolution percentage was 4.80%. The graph below shows how other models fared in comparison to Devin.

In plain terms, these numbers suggest that Devin will be far more capable of identifying and solving problems in code than existing models. This is exciting news because, if Devin ships as promised, it could resolve many headaches and save the many hours software engineers would otherwise spend hunting down bugs in their codebases, in addition to everything else the bot can do.
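To put the benchmark figures quoted above in perspective, here is a minimal Python sketch that works out the gaps between them. The percentages are the ones reported in this article; the variable and function names are made up for illustration. Since Devin was evaluated on a random 25% subset of the dataset, treat the ratios as rough indicators rather than an exact head-to-head comparison.

```python
# SWE-bench resolution rates (percent of issues resolved), as quoted in this article.
DEVIN_UNASSISTED = 13.86
CLAUDE2_UNASSISTED = 1.96
CLAUDE2_ASSISTED = 4.80


def compare_resolution_rates():
    """Return rough comparison figures, each rounded to one decimal place."""
    return {
        # How many times more issues Devin resolved than unassisted Claude 2
        "times_vs_unassisted": round(DEVIN_UNASSISTED / CLAUDE2_UNASSISTED, 1),
        # The same ratio against assisted Claude 2
        "times_vs_assisted": round(DEVIN_UNASSISTED / CLAUDE2_ASSISTED, 1),
        # Absolute gap in percentage points against unassisted Claude 2
        "point_gap": round(DEVIN_UNASSISTED - CLAUDE2_UNASSISTED, 1),
    }


if __name__ == "__main__":
    for name, value in compare_resolution_rates().items():
        print(f"{name}: {value}")
```

By these figures, Devin resolved roughly seven times as many issues as unassisted Claude 2 and a little under three times as many as assisted Claude 2, while still leaving the large majority of issues unresolved.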
Is Devin a hoax or a hack?
We ask this question every time a new AI tool launches: “Will X AI replace Y profession?” The answer is always the same: no, it won’t, at least not anytime soon. That said, these are tools, meaning professionals will use them to produce more work, and that’s a different story altogether for people looking to get started in said profession.
In the case of Devin, the short answer is also no. As good as Devin might be, human software engineering jobs aren’t going anywhere. That said, if Devin arrives with the capabilities Cognition Labs is promising, it has the potential to start a rat race among software engineers and programmers.
Like the tools before it, Devin is simply a more advanced way of writing and debugging code. Developers who adapt early can use it to produce more, and potentially better-quality, code in less time, security and privacy implications aside.

For larger corporations, this means fewer hires, as a handful of senior engineers working with multiple Devins (so to speak) can manage codebases and develop features that previously required big teams, potentially consisting of hundreds of people.
Smaller companies and startups can also benefit from this, as they don’t have to hire expensive engineers to build and maintain codebases and software products. In my opinion, smaller companies will benefit more from Devin, as they can move at a much faster pace without having to spend the time and money to hire the right people and wait for them to get up to speed with their existing infrastructure.
Devin won’t replace humans overnight, but it will surely cut some jobs for entry-level programmers. It’s also important to consider that not all programming jobs are the same, and some will be affected more than others. Fields like cybersecurity or other code-sensitive areas that require pure programming mastery should remain unaffected for the most part.
That said, there is good news for those looking into programming. While Devin won’t automatically give anyone the ability to code anything, it does bring them very close. This lowers the barrier to grasping programming basics, as you can now learn at breakneck speed with Devin at your back. AI brings people looking to get into programming all the benefits it brings to corporations. Ideally, though, a programmer should be able to do everything Devin can; only then can they wield it to its full potential.
However, that’s just learning to write code. The job market may look different from a business perspective, as there may simply not be as much demand for programming roles. We’ll just have to wait and see.