Deepmind’s AI Is Learning About the Art of Coding

In the field of computer science, there is perhaps no more fundamental task than to sort. Bubble, heap, merge—take your pick. The methods for reordering data inside a computer have been theorized to death, served as practice exercises for millions of novices, and been optimized for decades by expert developers. Type a sort() function in any programming language, and it’s code you can rely on. Don’t touch it. It already works great.

But last year, an AI system developed by engineers at Google’s Deepmind improved on great by just enough to matter. The system, which Deepmind calls AlphaDev, was tasked with coming up with a new way to sort short sequences in numbers in C++, the popular coding language. It meant going under the hood and having the AI build new algorithms in assembly code—the instructions that bridge the gap between programming languages like C++ and computer hardware. When a C++ developer tells the computer to “sort,” those commands are converted into machine-readable code that tells a computer’s memory and processor exactly what to do: where to move data, and how to change it. It’s where bits meet the metal.

The experiment worked. Since April of last year, C++ has been running slightly faster, thanks to a new set of AI-concocted sorting algorithms. But according to AlphaDev’s engineers, who described the work today in Nature, that’s just the first step. “We want to optimize the entire computing stack,” says Daniel Mankowitz, a staff research scientist at Deepmind who led the sorting project. Mankowitz says that AlphaDev has already improved algorithms not just for sorting, but also for other basic tasks like hashing.

“I think this work is incredibly exciting,” says Armando Solar-Lezama, an expert in program synthesis at MIT, who wasn’t involved in the research. It’s useful to have AI come up with a new sorting algorithm; it’s a much bigger deal to build an AI that can learn how to write state-of-the-art code across a variety of tasks, he says. That means AlphaDev has started to learn something more fundamental about the art of coding itself.

That comes with significant constraints, of course. “These are tiny, tiny programs,” he adds—totaling no more than a few dozen instructions in assembly code. But those tiny programs often represent big bottlenecks for computer performance, having been optimized as far as people can push them. Overall, AlphaDev’s new C++ sorting algorithms are 1.7 percent more efficient than the prior methods when sorting long sequences of numbers, and up to 70 percent faster for five-item sequences. At scale, these improvements add up, Mankowitz says. Since the AI-written code was submitted to Libc++, a major open-source library for C++, he estimates the algorithms have been used trillions of times a day.

Those improvements are thanks to a technique called reinforcement learning, which is the same approach used to help Deepmind’s AI master games like chess and Go. This type of AI learns by doing. It works by treating a given task—like writing an assembly program—as a game, in which the AI receives rewards for making smart moves that increase the program’s efficiency. Over time, the system works to maximize this reward, resulting in a winning Go strategy or a quicker assembly program. This differs from the sort of AI found in large language models like GPT-4, which rely on huge amounts of data to learn how to write words or code. That’s great for producing writing that mirrors the tone of the internet or producing common segments of code. But it’s not so good at producing novel, state-of-the-art solutions to coding challenges the AI has never seen before.