Classical programming on the verge of extinction

All images in this article are generated by the DALL-E 2 neural network

The end of the era of classical computer science is approaching, and most of us are like dinosaurs waiting for a meteorite to fall.

The period of my youth fell on the 80s. Back then I was programming PCs like the Commodore VIC-20 and Apple IIe at home. While studying at the University of California, where I eventually received my Ph.D., the bulk of my curriculum was devoted to “classical” computer science: programming, algorithms, data structures, systems, and programming languages. In its classical form, the task of computer science is to express an idea as a program written by a person in a language such as Java, C++, or Python. And no matter how complex the idea is, from a database join algorithm to the notoriously intricate Paxos consensus protocol, it can be expressed as a program that a human can read and understand.

In the early 90s, when I was still in university, the field of AI was in a period of deep stagnation, and it too was dominated by classical algorithms. As part of my first research work at Cornell University, I collaborated with Dan Huttenlocker, a leading expert in computer vision (now dean of computing at MIT). In the computer vision course that Dan taught in the mid-90s, we never touched on topics like deep learning or neural networks, only classic algorithms like the Canny edge detector, optical flow, and Hausdorff matching. Deep learning was then in its infancy and was not yet considered a serious AI tool, let alone a serious segment of computer science.

Of course, that was 30 years ago, and a lot has changed since then, but one thing remains the same: computer science is taught as a discipline that introduces students to data structures, algorithms, and programming. I will be very surprised if in another 30, or even 10, years we are still looking at it this way. Truth be told, I believe that computer science is undergoing a major transformation that few of us are truly ready for.

▍ Programming will lose relevance

In my opinion, the generally accepted concept of “writing a program” is doomed to be forgotten. Seriously. In most fields, except for highly specialized ones, software will soon be replaced by AI systems that are trained rather than programmed. In cases where we need a “simple” program (not everything requires a model with hundreds of billions of parameters running on a cluster of graphics accelerators), such a program will be generated by artificial intelligence rather than written by hand.

And it seems to me that this is a very real prospect. No doubt the pioneers of computer science, emerging from the (relatively) primitive cave of electrical engineering, firmly believed that all future programmers would need a deep knowledge of semiconductors, binary arithmetic, and microprocessor architecture to understand and develop software. But given the realities of today, I’m willing to bet a good amount that 99% of people who write software have little understanding of how a processor actually works, let alone the specifics of transistor design. And if we extrapolate this idea, it turns out that the computer scientists of the future will be so far removed from the classical definition of “software” that they will hardly be able to reverse a linked list or implement quicksort. Hell, I’m not even sure I remember how to implement quicksort myself.

AI coding assistants like CoPilot are just the tip of the iceberg I’m talking about. It is absolutely obvious to me that all programs of the future will eventually be written with the help of artificial intelligence, with humans at best relegated to a supervisory role in the process. Anyone who doubts this prediction need only look at the rapid progress in AI-powered content generation, such as image generation. The difference in quality and complexity between the output of DALL-E v1 and DALL-E v2, released only 15 months apart, is striking. Over the past few years of working with AI, I’ve learned one thing for certain: it is very easy to underestimate the potential of increasingly large-scale AI models. What seemed like science fiction a few months ago is already becoming a reality today.

So I’m not just talking about tools like CoPilot replacing programmers. I’m talking about replacing the very concept of writing programs with training models. In the future, computer science students will not need to master such intermediate skills as adding a node to a binary tree or writing code in C++. Learning such things will become about as relevant as learning to use a slide rule.
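For reference, this is roughly the kind of “intermediate skill” I have in mind. Below is a minimal, purely illustrative Python sketch of adding a node to a binary search tree; the class and function names are mine, not from any particular codebase.

```python
# Minimal binary search tree insertion, shown only as an example of the
# kind of hand-written data-structure code discussed above.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert `key` into the BST rooted at `root`; return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # equal keys are ignored
    return root


# Usage: build a small tree from a list of keys.
root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)
```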

Engineers of the future will, with a few keystrokes, launch an instance of a model with four quintillion parameters that encompasses the full extent of human knowledge, ready to perform any task that can be given to a machine. All the intellectual effort of getting the machine to do the right things will come down to coming up with the right examples, choosing the right training data, and picking the right ways to evaluate the learning process.

Models powerful enough to generalize from a handful of examples will only need a few good demonstrations of the task they are meant to perform. In most cases, massive human-curated datasets will no longer be needed, and neither will running gradient descent loops in PyTorch or similar tools. Learning will happen by example; the model will take care of the rest.
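To make this concrete, here is a rough sketch of what “learning by example” can look like in practice: a few-shot prompt for a large language model, written in Python. The `complete` function is a hypothetical placeholder for whatever text-generation API is available, not a real library call.

```python
# A sketch of in-context ("few-shot") learning: instead of writing a
# sentiment classifier, we show the model three labeled examples and let
# it infer the task. `complete` is a hypothetical stand-in for any
# text-completion function (str -> str), not a real library call.

FEW_SHOT_PROMPT = """Classify the sentiment of each review as positive or negative.

Review: "The battery died after two days." -> negative
Review: "Setup took thirty seconds and it just works." -> positive
Review: "Customer support never answered my emails." -> negative
Review: "{review}" ->"""


def classify(review: str, complete) -> str:
    """Label a review using only the three examples embedded in the prompt."""
    prompt = FEW_SHOT_PROMPT.format(review=review)
    return complete(prompt).strip()


# Usage, given any completion function `my_model`:
# label = classify("Arrived broken and the seller ignored me.", complete=my_model)
```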

In this new era of computer science, if we even continue to call it computer science at all, machines will become so powerful and so capable of performing all kinds of tasks that the field will look less like an engineering discipline and more like an educational one. That is, we will mainly think about how best to teach a machine, much as we look for ways to effectively teach children at school. Although, unlike children, these AI systems will control airplanes, power grids, and eventually, perhaps, entire countries. I am willing to bet that a significant part of classical computer science will lose its relevance when we switch from directly programming intelligent machines to training them. This transition will mark the end of programming in its traditional sense.

▍ How does all this change our vision of the field of computer science?

The atomic unit of computation will no longer be a processor, memory, and input-output system implementing a von Neumann machine, but a massive, trained, highly adaptive AI model. This is a seismic shift in our understanding of computing, which will no longer be a predictable, static process governed by instruction sets, type systems, and notions of decidability. AI-based computing has long since crossed the Rubicon of being amenable to static analysis and formal proof. We are fast approaching a world where the fundamental building blocks of computing are mutable, mysterious, and adaptive components.

This transition is underscored by the fact that no one understands exactly how large-scale AI models work. People publish research papers discovering new behaviors of existing models, even though these systems were engineered by humans. Large-scale AI models are able to perform tasks they were not explicitly trained to do, which should scare the hell out of Nick Bostrom and everyone else who (with good reason) fears that AI will suddenly get out of control and start dictating its own rules.

To date, we have no way to determine the limits of current AI systems’ capabilities apart from empirical study. As for future models that will be orders of magnitude larger and more complex than today’s, one can only wish us the best of luck!

The shift in focus from programs to models should be obvious to anyone reading modern machine learning papers. These works rarely mention the code or systems underlying the discoveries they describe; the basic components of AI systems are higher-level abstractions such as attention layers, tokenizers, and datasets. A time traveler arriving from the early 2000s would hardly be able to make sense of even a few sentences of the (75-page-long!) GPT-3 paper describing the software created for this model:

“We use the same model and architecture as GPT-2 [RWC+19], including the modified initialization, pre-normalization, and reversible tokenization described therein, with the exception that we use alternating dense and locally banded sparse attention patterns in the layers of the transformer, similar to the Sparse Transformer [CGRS19]. To study the dependence of ML performance on model size, we train 8 different sizes of model, ranging over three orders of magnitude from 125 million parameters to 175 billion parameters, with the last being the model we call GPT-3. Previous work [KMH+20] suggests that with enough training data, scaling of validation loss should be approximately a smooth power law as a function of size; training models of many different sizes allows us to test this hypothesis both for validation loss and for downstream language tasks.”
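To give a sense of what an “attention layer” from the quote above actually is, here is a deliberately simplified sketch of scaled dot-product attention in plain NumPy: single head, no masking, no learned projections. It illustrates the abstraction rather than reproducing code from GPT-3 or any real model.

```python
# Scaled dot-product attention (single head, no masking, no learned
# projections), simplified for illustration; not code from GPT-3.

import numpy as np


def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d_model); returns (seq_len, d_model)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
    return weights @ V                                    # weighted sum of values


# Usage: self-attention over a toy sequence of 4 tokens with 8-dim embeddings.
x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
```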

This shift in the very understanding of computing opens up enormous opportunities for us, and with them, enormous risks. Still, I personally believe it is time to accept this as the most likely future and adapt our thinking accordingly, rather than just sit around waiting for the meteorite to fall.
