Bret Victor: The Future of Programming

December 14, 2022

This talk goes in the category of (and I stole this quote from the Handmade Network Discord):

These talks sometimes make me feel like I’m in one of those alternate history shows, like Man in the High Castle. There’s a timeline where interesting ideas continued to flourish and we’re not in it.

Video: “The Future of Programming”.

If you’d prefer to read, here is the transcript [1].

Transcript start

Good afternoon. I’m really excited to be talking to so many programmers of automatic computing machines. [laughter] As you know, computing is becoming more and more important in our society. There’s now literally thousands of computers around the world. [laughter] And they’re being used for everything from business and accounting to scientific experiments and, you know, who knows what they’re going to be used for. So, more and more computers, and coming down in price, coming down in size. A computer used to be the size of this entire hall and now they’ve shrunk to really tiny proportions. [laughter] So it’s this time of really rapid change in the field of computing. So I thought it would be interesting to kind of look ahead a little bit to the future of programming [laughter] and think about, given what we know now, what programming might be like, say, forty years from now.

[01:12]

So, what I’m going to talk about today: there’s been some really interesting research that’s been done just in the last ten years or so, and I think it’s going to have a lot of implications for what programming might be like in the future. So there’s four big ideas—kind of four big topics—that I want to talk about that have kind of come out of recent research. But before I get to those four big ideas, I want to talk about the nature of adopting ideas in the first place. So basically what we notice is that technology changes quickly; people’s minds change slowly. So it’s easy to adopt new technologies; it can be hard to adopt new ways of thinking. So, technology-wise, Gordon Moore—he has this company called Intel—and [laughter] he observed about ten years ago that computing capacity was increasing exponentially over time. And he kind of extrapolated this out to about now, and he’s been right on target, and who knows how long this will keep on going, but it seems pretty reasonable. So the lesson to be drawn from this is that we can kind of take this for granted. If we just kind of wait, our computers get faster, they get more capable. We can just wait, and that’s just going to happen. What won’t just happen if we wait is people changing, people adopting new ideas.

[02:32]

So as an example of that: I’m sure you all remember this guy—the old IBM 650. It’s the, you know, IBM’s first kind of general-purpose mass-produced computer. A lot of us cut our teeth programming on this guy. And in the beginning, right, we all programmed in absolute binary. When we coded, it was literally writing in numeric codes for each instruction. And that was what we did. That was programming. And then, after some years of that, Stan Poley came along and he invented this thing that he called an assembler. So that was the Symbolic Optimizing Assembly Program—this language where you could write in words. If you wanted the computer to add something you’d write the word “add.” You could use symbolic variable names instead of hard-coding memory addresses. It’s this much more powerful way of thinking about programming. You were much more productive, made much fewer errors. Assembly was shown to these guys—the guys coding in binary—and they just weren’t interested at all. They just didn’t get it. They didn’t see any value in doing this stuff. So there can be a lot of resistance to new ways of working that require you to, kind of, unlearn what you’ve already learned and think in new ways. And there can even be, like, outright hostility.

[03:50]

So Johnny Von Neumann—the great scientist who invented the Von Neumann computer architecture that we use and so many other things—he said, “I don’t see why anybody would need anything other than machine code.” And so one time, he had a bunch of students, and the students were all kind of coding along in binary, and one time one of his students took a little time out to write his own little assembler so he could write in assembly. And Von Neumann was furious at him—furious that he would waste precious machine time doing the assembly. That was clerical work; that was supposed to be for people, right? And so we saw the same story happen just a little bit later when John Backus and friends came up with this idea that they called Fortran—this so-called high-level language where you could write out your formulas as if you’re writing mathematical notation, and you could write out loops. And this was shown to the assembly programmers and, once again, they just, they weren’t interested. They didn’t see any value in that. They just didn’t get it.

[04:50]

So I want you to keep this in mind as I talk about the four big ideas that I’m going to talk about today: that it’s easy to think that technology’s always getting better because of Moore’s Law, because the computers are always getting more capable, but ideas that require people to unlearn what they’ve learned and think in new ways—there’s often an enormous amount of resistance. People over here—they think they know what they’re doing, they think they know what programming is—this is programming; that’s not programming. And so there’s going to be a lot of resistance to adopting new ideas.

[05:27]

The four ideas that I want to talk about today. This is all coming out of very recent research. The first one is: today we write our programs in code. We write basically a list of instructions for the computer to do. There’s been some really interesting research on direct manipulation of data where you directly manipulate the data structures, and that implicitly builds up a program for the computer to follow.

The second thing I want to talk about. Today we write procedures. Basically, here’s a procedure for the computer to do. Interesting research on programming using goals—telling the computer what you want, not how to do it. The computer itself kind of figures out how to do it.

Number three. Today we program using lines of text and text files. People are doing something really remarkable. Just, like, in the last five or ten years, they’re hooking video displays up to computers. And when you do that, everything changes. You can start thinking about spatial representations of information.

And the fourth thing is we program in a sequential programming model. Basically, here’s a bunch of instructions; the computer does them one after the other. But hardware’s changing. Soon we’re going to see massively parallel hardware, and we’re going to need a sound parallel programming model to program on that hardware.

[06:40]

So the first thing I’m going to talk about is that direct manipulation of data. And I’m going to show this project that Ivan Sutherland did—it’s his PhD thesis about ten years ago—this system called Sketchpad. And Sketchpad was a system that allowed you to draw pictures on a video display. So he took his light pen and put it on the screen. So he drew that line, drew that line, kind of drew some more lines, drew this little top. He’s trying to draw a rivet here. And he’s drawing it really sloppily—it’s kind of tilted off to the side, it’s kind of misshapen. So what he does is he then, he holds down a switch and indicates a couple of these lines to the computer system, indicating that he wants these lines to be mutually perpendicular. So the system runs an iterative solver—kind of wheels the lines around—and figures out how to turn them into something that’s mutually perpendicular, how to turn them into a rectangle. So, basically, the system doesn’t know anything about rectangles. He was able to get it to draw a rectangle by directly applying a set of constraints. And what makes this a program, as opposed to just a picture, is that these constraints are dynamically maintained. So he’s got his rivet that he drew and he resizes this corner of it and kind of resizes some other things—the solver kicks back in and turns it back into a perfect rectangle. So, essentially, he’s created a program that draws a rectangle, but he didn’t do it by writing code. He did it by directly manipulating the data and directly applying a set of constraints to them. And so this is kind of a simple example. And then he went off and did fancier things. Like here’s a bridge simulation. So it actually simulates the physics of a bridge. He drew this by hand. The numbers here represent the tension in those particular spans of the bridge. And he can vary the weight—that’s the load that’s hanging off from the center of the bridge—and it kind of deforms. And the Sketchpad system doesn’t know anything about bridges. He created this bridge simulation program by directly drawing it and by directly applying a certain set of very general constraints.
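
To give a present-day feel for what an iterative constraint solver does, here is a minimal sketch in Python. It is not Sutherland’s algorithm, and the constraints are deliberately simplified (sides are forced horizontal or vertical rather than merely mutually perpendicular), but the shape of the idea is the same: you state constraints on the data, and a relaxation loop nudges the data until the constraints hold.

```python
# A toy relaxation solver (illustrative only; this is not Sketchpad's actual
# algorithm, and the constraints are simplified to axis alignment).
# Four sloppy corners plus simple alignment constraints relax into a rectangle.

corners = [[0.0, 0.1], [4.2, -0.3], [4.0, 2.9], [-0.2, 3.1]]  # roughly rectangular

def relax(corners, iterations=50):
    a, b, c, d = corners
    # Constraints: a-b and d-c share a y value (horizontal sides);
    #              a-d and b-c share an x value (vertical sides).
    constraints = [(a, b, 1), (d, c, 1), (a, d, 0), (b, c, 0)]
    for _ in range(iterations):
        for p, q, axis in constraints:
            mid = (p[axis] + q[axis]) / 2   # nudge both endpoints toward agreement
            p[axis] = q[axis] = mid
    return corners

print([[round(x, 3), round(y, 3)] for x, y in relax(corners)])
# [[-0.1, -0.1], [4.1, -0.1], [4.1, 3.0], [-0.1, 3.0]] -- a clean rectangle
```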

[08:57]

So I definitely see this as something that’s going to be really important 30 or 40 years from now. I can imagine programming by directly manipulating data structures and letting that build up the code. But especially for things that are visual or physical like this. So if, say, in a few decades we get some sort of document format on some sort of web of computers, I guess, [laughter] I’m sure we’re going to be creating all those documents by direct manipulation. There won’t be any, like, markup languages or style sheet languages, right? That would make no sense. Ivan Sutherland showed us how to do it back here in 1962. So it’s all going to be direct manipulation in the future and that’s going to be fantastic.

[09:42]

So the second thing I wanted to talk about is programming using goals. So we saw a little bit of this with Sketchpad’s constraint system. So Ivan Sutherland wanted to draw a rectangle. He didn’t write a procedure to, like, draw each side of the rectangle. He applied a set of constraints and the system itself kind of figured out how to draw that rectangle. So he kind of said what he wanted: “I want things to be mutually perpendicular.” He didn’t say how to do it. The solvers figured out how to do it. So another great example of that—that just came up a few years ago—Carl Hewitt is doing this system called Planner, which is really great. It actually goes in both directions. So it can reason forward procedurally; it can reason backwards from goals. So if you tell Planner that apples are red; then, if you give it an apple, it knows, “Ah ha! It’s red.” But you can also say, “I want something red” and it’ll say, “Oh, let’s try an apple.” So you can express your program in terms of the goals—the results that you want from the program—but you can also provide procedural strategies for meeting certain types of subgoals. Really interesting way of thinking about programming.

[10:52]

And this led to another system a few years later called Prolog which just kept the backwards part of Planner. So in Prolog you can express your program as goals, and the system itself uses search or whatever to try to figure out how to meet those goals. So this is leading to a genre of programming that’s called logic programming, but that’s not really the important part here. What’s important is expressing your program as what you want it to do, not a set of instructions on how to do it—letting the computer itself figure out how to do it.
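
To make the goal-directed idea concrete in something runnable, here is a minimal sketch in modern Python rather than in Prolog. The facts, the “tasty” rule, and the prove/find helpers are invented for illustration; the point is only that the program states facts and goals, and a small backward-chaining search figures out how to satisfy them.

```python
# A toy goal-directed evaluator (illustrative only; real Prolog does much more,
# e.g. unification of variables). Facts are (predicate, thing) pairs.
facts = {("red", "apple"), ("red", "firetruck"), ("fruit", "apple")}

# Rules: the head predicate holds for a thing if every body predicate holds for it.
rules = {"tasty": ["red", "fruit"]}   # x is tasty if x is red and x is a fruit

def prove(pred, thing):
    """Reason backwards from a goal: can (pred, thing) be derived?"""
    if (pred, thing) in facts:
        return True
    body = rules.get(pred)
    return body is not None and all(prove(p, thing) for p in body)

def find(pred):
    """'I want something <pred>': search the known things for any that satisfy the goal."""
    return sorted({t for _, t in facts if prove(pred, t)})

print(prove("red", "apple"))   # True  -- "given an apple, it knows it's red"
print(find("red"))             # ['apple', 'firetruck']  -- "I want something red"
print(find("tasty"))           # ['apple']  -- derived through the rule, not stated directly
```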

[11:24]

Another example of that same sort of concept is pattern matching. So I’m sure you remember—you all remember—SNOBOL [StriNg Oriented and symBOlic Language]. It’s the text manipulation language. If you have a bunch of text you want to chew through, you throw a SNOBOL script or program at it. And SNOBOL had built-in features for pattern matching. So you could express patterns that you want to match against the text. A little bit later Ken Thompson—he’s over at Bell Labs working on this system they call UNIX [laughter]. Um, I know, right, “UNIX”? But he adopted Kleene’s notion of regular expressions to do pattern matching on text. So, when you have pattern matching, if you want to digest a big chunk of text, you don’t go and write a parser that kind of goes procedurally through it, you provide a pattern, this kind of template—this is the sort of thing I’m looking for—and the system itself figures out how to match the text against that pattern.
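
As a present-day analogue of the pattern-matching idea, here is a minimal sketch using Python’s re module; the log line and the pattern are invented for illustration. Instead of writing a parser that walks the text character by character, you declare the shape of what you are looking for and the matcher does the searching.

```python
import re

# A made-up line of text to chew through.
log = "error 404 at 12:30, error 500 at 12:45, ok 200 at 12:50"

# "Here's the sort of thing I'm looking for": an error code followed by a time.
pattern = re.compile(r"error (\d{3}) at (\d{2}:\d{2})")

for code, when in pattern.findall(log):
    print(code, when)   # prints: 404 12:30, then 500 12:45
```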

[12:20]

So all of these examples: Sketchpad’s constraints, Planner, Prolog, pattern matching—again, they’re all examples of giving the computer high-level goals, saying, “Here’s the sort of thing I’m looking for” and letting the computer itself figure out how to do it. And we’re seeing a little bit of that sort of thing in optimizing compilers, but I think it’s going to be really prevalent in a few decades from now. And the reason that this is going to be so important—this goal-directed stuff—has to do with this idea that Licklider’s kicking around. So, as you all know, Licklider is heading up ARPA [Advanced Research Projects Agency], the government funding agency, and he’s been pushing this idea of a global network of computers. Just taking all the computers in the world and hooking them up to each other. And he calls it the intergalactic computer network. [laughter] ‘Cause he knows that engineers always deliver the minimum, so if he asks for a network that spans the galaxy he hopes to at least get one that spans the world. [laughter] And people are calling this the ARPANET now; it’s, you know, turning into some sort of inter-net. I don’t know, it’s kind of a cute idea. It might work. And when you have this kind of global network of computers, you run into what Licklider calls the “communicating with aliens problem.” So he put it here: “The problem is essentially the one discussed by science fiction writers: ‘How do you get communications started among totally uncorrelated “sapient” beings?’” And I’ll explain what he means by that.

[13:54]

So, say you’ve got this network of computers, and you’ve got some program out here that was written by somebody at some time in some language; it speaks some protocol. You’ve got another program over here written by somebody else some other time, speaks a totally different language, written in a totally different language. These two programs know nothing about each other. But at some point, this program will figure out that there’s a service it needs from that program, that they have to talk to each other. So you’ve got these two programs—don’t know anything about each other—written in totally different times, and now they need to be able to communicate. So how are they going to do that? Well, there’s only one real answer to that that scales, that’s actually going to work, which is they have to figure out how to talk to each other. Right? They need to negotiate with each other. They have to probe each other. They have to dynamically figure out a common language so they can exchange information and fulfill the goals that the human programmer gave to them. So that’s why this goal-directed stuff is going to be so important when we have this internet—is because you can’t write a procedure because we won’t know the procedures for talking to these remote programs. These programs themselves have to figure out procedures for talking to each other and fulfill higher-level goals. So if we have this worldwide network, I think that this is the only model that’s going to scale. What won’t work, what would be a total disaster, is—I’m going to make up a term here, API [Application Programming Interface]—this notion that you have a human programmer that writes against a fixed interface that’s exposed by some remote program. First of all, this requires the programs to already know about each other, right? And when you’re writing this program in this one’s language, now they’re tied together so the first program can’t go out and hunt and find other programs that implement the same service. They’re tied together. If this one’s language changes, it breaks this one. It’s really brutal, it doesn’t scale. And, worst of all, you have—it’s basically the machine code problem. You have a human doing low-level details that should be taken care of by the machine. So I’m pretty confident this is never going to happen. We’re not going to have APIs in the future. What we are going to have are programs that know how to figure out how to talk to each other, and that’s going to require programming in goals.

[16:25]

The third big idea that I want to talk about is spatial representation of information. So, today, our programs are basically lots of lines of text—a big file full of lines of text. And that makes sense when your program is on a stack of punch cards, or it’s a paper tape or a magnetic tape—this very linear media—it makes sense to have your program in that kind of linear form. If you’re using a teletype then a teletype is made for spitting out lines of text—that’s what it does. So of course your programs are going to be in lines of text. But, as I mentioned, people are doing something really wild and crazy right now, which is hooking video displays up to computers. And when you have a video display hooked up to a computer, you can start thinking of your computer as kind of this very dynamic sheet of paper where you can represent things spatially.

[17:20]

So Doug Engelbart, over at SRI [Stanford Research Institute], has this system that he calls “online system”—NLS. He gave a big demo five years ago. You might have seen it. And there’s a lot of really remarkable things about the system. One of the most remarkable is this notion of displaying information over a screen—over a video screen. So he has this device called a mouse where you kind of roll it around on the table and it’s kind of hard to explain, but you can use this to point to different parts of the screen and indicate that you want more information about something that you’re pointing to. And it also has this notion of different views of information. So you can see here he has some data that’s in a list, and they can flip that over and look at that same data as this kind of two-dimensional diagram. So really starting to think about how can we represent information—dynamic information—spatially.

[18:16]

Another great system, kind of about the same time, coming out of the RAND Corporation, was called GRAIL. And this is a system for programming using flowcharts on a video display. And the input device here is a stylus on a tablet. And you can draw up these flowcharts. And let me show you how that works. The programmer is drawing this box—and just kind of totally freehanding it—and he draws a box, and the system recognizes that as a box and turns it into a flowchart box, so it assigns semantic meaning to these drawings that he’s doing. He wants to give it a label so he just starts writing letters. This system recognizes his handwriting. It’s 1968. The system recognizes his handwriting, turns it into text. Here he connects up this box to that one with a line. And so on. So it’s all very direct manipulation. If he wants to get rid of this line he just kind of scribbles it out and it goes away. And, so, really thinking about what programming means when you have a video display, when you can express things in two dimensions. But when I’m talking about spatial representation of information I’m not just talking about things like flowcharts.

[19:33]

So Xerox has a little research center in Palo Alto. There’s some kids over there working on something that they call Smalltalk. And in Smalltalk the source code is expressed in text, but there’s no, like, big long text file with a whole bunch of code in it. It’s organized in a spatial fashion. So here’s what they call a browser. So in this list here: here’s all the collection of classes, here’s all the classes in that collection, here’s all the protocols in that class, here’s all the methods in that protocol, and here’s the source code for that particular method. So the method definitions are text, but they’re not one huge line of text. They’re organized spatially so you can get around the system very quickly and see what’s going on.

[20:20]

So between Engelbart’s NLS, GRAIL, and Smalltalk—these very different ways of representing information spatially—I’m totally confident that in forty years we won’t be writing code in text files. Right? We’ve been shown the way.

[20:38]

And, as a side note, all of these systems I just showed you—Engelbart’s system, GRAIL, Smalltalk, this thing that’s going on at University of Illinois called PLATO [Programmed Logic for Automatic Teaching Operations], also a really interesting system—these are part of this new wave of interactive computing where you sit down at the computer and you’re actually interacting with the computer in real time. And these guys know that they’re trying to prove out this new concept, and so they’ve designed the system from the very bottom to have an immediate response. The user interface is always immediately responsive—you interact with anything, you immediately get a response. So it’s kind of simulating a physical object. And so if interactive computing takes off—and I think it will—then I think it’s pretty obvious that in forty years our user interfaces, if you interact with them, you’ll never experience any kind of delay or lag. Right? Because these guys proved how important it is to have an immediately responsive UI. And they were doing this in the 60s, so as our computers get a million times faster, obviously there’s no reason to have any kind of delay or lag in the operating system in the user interface. So that’s going to be really exciting.

[21:26]

The fourth thing that I want to talk about is parallel programming. So today our programs are basically a sequence of instructions: computer, do this, then do that, then do that, then do that. And one of the reasons that we program in the sequential model has to do with the hardware. So we’ve been using this computer architecture called the Von Neumann computer architecture where you have a processor, and then it’s hooked up to a big memory and it’s fetching words from memory. And so the sequential programming model makes sense when you just have one processor. A processor can only do one thing at a time. You give this list of things for the processor to do and it just kind of does each one of those. One characteristic of the Von Neumann architecture, though, is that most of this memory is idle most of the time. So you’ve got this little processor over here, and it’s kind of processing as fast as it can, but only one word of memory is ever being accessed. The rest of the memory is just kind of sitting there waiting. And that works when your processor is made out of vacuum tubes or relays and it’s kind of big and expensive, and your memory is made out of core or a rotating drum. It’s also big and expensive and different. Then you can kind of get away with that. But we’re starting to see an incredible invention coming into the field of computing right now which is, I think, going to change everything. And that is the integrated circuit, semiconductor integrated circuit.

[23:34]

So this is a thing that the company called Intel made. It’s called a microprocessor. And it’s an entire processor on a single piece of silicon. So the entire processor’s just made out of transistors. And a transistor is just a little picture that you etch into silicon. And the entire circuit is just, like, one big complicated picture that you etch into the silicon. So our processors are just made out of transistors and silicon. Our memories as well are also going to be made of transistors on silicon. It’s all the same stuff. So when you look at the Von Neumann architecture from that perspective, you’ve got these transistors over here that are working really hard—they’re processing things—and you’ve got this huge array of transistors over here, most of which are just kind of sitting, waiting. They’re not processing. They’re not doing anything. And so if you want to put those transistors to work, if you actually want to maximize the amount of processing that you’re going to get out of the piece of silicon, you need to start looking at things that are more like this. Right? So what computers want to be on silicon is they want to be lots of little computers, like a huge array of tiny little computers with their own processors, their own little state, kind of doing their own thing, communicating with each other. That’s how you maximize the amount of compute per area of silicon, and it scales, so when the transistor gets smaller—when the silicon die area gets bigger—you have all this extra space, you just fill it up with more processors. Right? Done. Really easy. So this is the kind of architecture that we’re going to be programming on in the future. Unless, you know, unless Intel somehow gets a stranglehold on the [word unintelligible] processing market and kind of pushes this architecture forward for thirty years. But that’s not going to happen. We’re going to be programming on these things.

[25:28]

And when you have this hardware you have to start thinking about: how do we program on that? What’s our programming model for this sort of hardware? And the way that we do programming today is with threads and locks, right? You have a few sequential threads of control, and you kind of pretend that they’re going in parallel by multiplexing them onto a processor, and they try to lock each other out from shared resources, and, like, this is never going to work, right? This does not scale. You can’t reason about hundreds of threads all banging on the same shared memory at the same time. Threads and locks—they’re kind of a dead end, right? So I think if in forty years we’re still using threads and locks, we should just, like, pack up and go home, ‘cause we’ve clearly failed as an engineering field. [laughter]

[26:13]

So if it’s not threads and locks, then what’s going to work? And Carl Hewitt—that’s the same Carl Hewitt that did Planner—came up with this idea that he called the actor model. So the actor model is a model of computation that’s inspired by physics. So in physics you’ve got all these particles, and all the particles are just independently doing their own thing and they have their own little state and they interact with the ones that are around them. And Carl Hewitt was thinking of computer processes in the same way. You’ve got a whole bunch of processes and they’re all kind of asynchronously doing their own little thing, they have their own little state and they’re sending messages to each other. So—really interesting, really new and exciting way of thinking about programming. So it’s kind of heating up right now. Gilles Kahn over in France has some ideas. I think Tony Hoare’s getting into it with something he’s going to be calling communicating sequential processes, and maybe even Robin Milner’s going to join the party. So really exciting stuff happening here.
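
Here is a minimal, single-threaded sketch in modern Python of the flavor of the actor idea. It is not Hewitt’s formal model, and the Counter and Printer actors and the toy scheduler are invented for illustration, but it shows the essential move: each actor keeps its own private state and reacts to messages in its mailbox, and nothing shares memory or takes a lock.

```python
from collections import deque

class Counter:
    """An actor whose only state is a private count."""
    def __init__(self):
        self.count = 0
    def receive(self, message, send):
        if message[0] == "inc":
            self.count += 1
        elif message[0] == "report":
            send(message[1], ("total", self.count))   # reply with a message, not a return value

class Printer:
    """An actor that just prints whatever it's told."""
    def receive(self, message, send):
        print("printer got:", message)

# A toy scheduler: deliver queued messages one at a time until none remain.
actors = {"counter": Counter(), "printer": Printer()}
mailbox = deque()
send = lambda name, msg: mailbox.append((name, msg))

send("counter", ("inc",))
send("counter", ("inc",))
send("counter", ("report", "printer"))

while mailbox:
    name, msg = mailbox.popleft()
    actors[name].receive(msg, send)   # eventually prints: printer got: ('total', 2)
```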

[27:11]

Now for this talk the details of these particular models don’t really matter. I do think it would be kind of cool if the actor model was, like, picked up by the Swedish phone company or something. That would be kind of weird. [laughter] But what matters here is we are going to have massively parallel hardware. We need a sound parallel programming model that fits the hardware, and something like this is going to be what we’re going to be using.

[27:48]

So those are the four things I wanted to talk about.

Direct manipulation of data: something like the Sketchpad where you’re drawing pictures, dynamically adding constraints to those pictures, directly manipulating the data structures instead of writing instructions for the program.

Programming using goals and constraints: things like Sketchpad’s constraints, Planner and Prolog, regular expressions, other types of pattern matching where you’re telling the computer what you want to do and the computer itself has solvers that figure out how to do that.

Spatial representation of information. We’re not going to have text files anymore. We’re going to be representing information spatially ‘cause we have video displays.

And fundamentally parallel ways of thinking: parallel hardware, parallel programming models. No more threads and locks. No more sequential thinking.

[28:39]

So those are the four things I wanted to talk about. And, you know, I’ve tried to make some predictions about the future, and you can’t really predict the future, right? So these are some good ideas. I don’t know what’s going to happen to them. Ideas kind of split and merge and go in and out of fashion, so, you know, anything could happen. But I do think that it would be kind of a shame if in forty years we’re still coding in procedures and text files in a sequential programming model. I think that would suggest we didn’t learn anything from this really fertile period in computer science. So that would kind of be a tragedy. But even more of a tragedy than these ideas not being used would be if these ideas were forgotten. Right? If anybody were ever to be shown this stuff and actually be surprised by it. Right? But even that’s not the biggest tragedy. That’s not the real tragedy. The real tragedy would be if people forgot that you could have new ideas about programming models in the first place. So let me explain what I mean by that.

[29:50]

Here’s what I think the worst case scenario would be: is if the next generation of programmers grows up never being exposed to these ideas. The next generation of programmers grows up only being shown one way of thinking about programming. So they kind of work on that way of programming—they flesh out all the details, they, you know, kind of solve that particular model of programming. They’ve figured it all out. And then they teach that to the next generation. So that second generation then grows up thinking: “Oh, it’s all been figured out. We know what programming is. We know what we’re doing.” They grow up with dogma. And once you grow up with dogma, it’s really hard to break out of it. Do you know the reason why all these ideas and so many other good ideas came about in this particular time period—in the 60s, early 70s? Why did it all happen then? It’s because it was late enough that technology had kind of got to the point where you could actually kind of do things with the computers, but it was still early enough that nobody knew what programming was. Nobody knew what programming was supposed to be. And they knew they didn’t know, so they just, like, tried everything. Their minds were totally free and they just, like, said, “Maybe we could program like this. Maybe we could program like that.” They just, you know, tried anything they could think of.

[31:33]

So the most dangerous thought that you can have as a creative person is to think that you know what you’re doing. Because once you think you know what you’re doing, you stop looking around for other ways of doing things. And you stop being able to see other ways of doing things. You become blind. You become like these guys over here, kind of coding along in binary. Someone shows them assembly language, somebody shows them Fortran, and they can’t even see it. It just goes right over their head. Because they know what they’re doing. They know what programming is. This is programming; that’s not programming. And so they totally miss out on this much more powerful way of thinking.

[32:04]

So the message of this talk, you know, it’s not really this stuff, right? The message of this talk is: if you don’t want to be this guy, if you want to be open or receptive to new ways of thinking, to invent new ways of thinking, I think the first step is you have to say to yourself, “I don’t know what I’m doing. We as a field don’t know what we’re doing.” I think you have to say, “We don’t know what programming is. We don’t know what computing is. We don’t even know what a computer is.” And once you truly understand that—and once you truly believe that—then you’re free. And you can think anything. Thank you.

[1]: I found the transcript here. I don’t take any credit for its creation, and I’m only sharing it here as a mirror to propagate the ideas in the talk. Full credit goes to the Glamour and Discourse blog, or whoever that blog got the transcript from.