It seems to me that the range or the number of problems involved in making data comprehensible to humans is practically infinite. It seems just outrageously laborious, this task of figuring out the ways that people need to have things translated in order to make them understandable.
You have to think about how many kinds of things are there in the world. How many different domains of knowledge are there? For example, somebody like me who deals with data, I can tell you random facts like there are six million streets in the world, there are 30,000 items in a typical grocery store, I could read off tons of these things, but there are a finite number of these kinds of domains. In each domain, there's a large collection of similar things in that domain. And the number of domains, it's in the thousands. In Wolfram Alpha, we've crunched through a few thousand of these domains. Each time we run into a domain, there are new and different issues. Pick a domain—plants, for instance. There are all kinds of issues about plants: what's the species, how tall does it get, how does it get harvested. Plants grow at certain critical temperatures, they start germinating. How does that relate to, we have all these things around climate, how does that all connect up? When you get some new domain like plants, you say, That's quite similar to this and this and this other domain, but it has all its own extra twiddles in it that you have to deal with. Each one of these domains, it's a fair amount of effort, but it's finite, and there's a finite number of these domains. In the Wolfram Alpha project, there were two basic observations that made me decide that this project was not just abjectly impossible.
That must have been a glorious day when you figured that out.
I started the project before I was absolutely sure of the conclusion. One of those was: there's a lot of data in the world, but it's finite. It's not the case that there's just so much out there. It's like the Web, for example. The Web is very big. But how big? Well, there's probably about 10 billion meaningful pages on the Web. It's big, but it's a finite number. You can say, well, how much stuff is there in all the reference books? How big is the biggest reference library? What kind of stuff is in these reference libraries? Well, it's big, but it's something where it's trillions of elements, maybe quadrillions of elements, but it's something where you can name the numbers. We can be quantitative about how big it is; and it's big, but it's not infinitely big. Sometimes people say in order to do anything like this, you just have to scale infinitely. Well, you don't have to scale infinitely. The world is a finite place. Big, but finite.
Most people think that the scale of the numbers we're talking about is so huge that it might as well be infinite. That it seems so daunting.
I've been lucky in these kinds of things that I've developed some sort of absurd confidence that just because it's daunting doesn't mean I can't do it. Like, how many named laws are there in the world, things like Ampere's law or the universal law of gravitation? Between 5,000 and 10,000. What does that mean? Well, maybe it's half a million lines of Mathematica code to implement all of those. It's big, but it's finite.
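As a rough check on the estimate above, the figures quoted (5,000 to 10,000 named laws, about half a million lines of code) imply a budget of roughly 50 to 100 lines per law. The short Python sketch below only restates that arithmetic; the variable names are illustrative and nothing here comes from the actual Wolfram Alpha codebase.

```python
# Back-of-the-envelope arithmetic using only the numbers quoted in the interview.
TOTAL_LINES = 500_000      # "maybe it's half a million lines" of code
LAWS_LOW = 5_000           # lower estimate of named laws
LAWS_HIGH = 10_000         # upper estimate of named laws

# Fewer laws means more lines available per law, and vice versa.
lines_per_law_max = TOTAL_LINES / LAWS_LOW
lines_per_law_min = TOTAL_LINES / LAWS_HIGH

print(f"Roughly {lines_per_law_min:.0f} to {lines_per_law_max:.0f} lines per law")
```

Either way, the per-law effort is small and bounded, which is the sense in which the total is "big, but finite."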
So that's the first insight.
Right. The second one was, I had kind of imagined this notion of machines being able to compute answers to questions, that that was very much an intelligence activity. That was something, you see the science-fiction movies of the '50s and '60s where people are walking up to computers and talking to them, and the computers are giving answers. I had always imagined that what would need to be inside that computer would be a general artificial intelligence. That the only way that one would be able to do a decent job of answering questions is to imitate what humans do when they think and figure out answers to questions. As a result of a lot of basic science that I did, I kind of came to the conclusion—how do sophisticated things happen in the universe? In physics, in other places. How do things that look like they're intelligence look? How do sophisticated things happen in nature? How does the sophistication of what happens in nature compare with the sophistication of what we're able to do with our minds? What I came to realize—it's part of the thing I call the principle of computational equivalence—is that actually there is no sharp distinction between the stuff that happens in nature, or happens in the computational universe—the possible programs and things—and the stuff that we as humans with our minds are able to do. If somebody says, Well, we need this magic idea to make artificial intelligence, and that's some idea that isn't present in ordinary computation, it's some extra idea, what I came to realize is that there really isn't that kind of bright-line distinction between the intelligent and the merely computational. So that made me realize that you didn't need to build a whole AI in order to be able to answer the same kinds of questions that people would expect expert humans to answer. 
It surprises me a bit that an issue that philosophical has a consequence as practical as a site that actually—create 15 million lines of code, run billions of servers, that sort of thing. And I find it kind of charming that a philosophical issue of what it means to be intelligent should actually have such a practical consequence, but at least for me it definitely did in the sense that what I realized is you don't have to invent artificial intelligence in order to be able to succeed at building a system that can do computational knowledge. And more than that, I even realized later on after we got into it, that actually we humans have the exact wrong way of thinking about artificial intelligence, because that gets you into the mode of thinking, "Let's reason about things the way humans reason about things." Let's say you're trying to solve some physics problem. You reason about the physics problem; like you say, This mechanical object pushes this mechanical object, and then it does this and then it does that and so on, and you've got some whole logical chain of reasoning. That turns out to be an incredibly inefficient way to figure out what the system will actually do. A much more efficient way is just to set up the equations that people invented 150 years ago to represent those things and just blast through to the answer using the best modern scientific methods.
So your AI, which isn't strictly AI, should figure out a way to do it on its own that's better than the screwed-up, roundabout way humans would figure it out, or program it to figure it out if they were required to do so?
I think the main point is, Can you compute the answer? Well, what does it mean to figure out how to compute the answer? Anytime there is an algorithm of any degree of sophistication, it's doing a certain amount of figuring out how to, as well as just going and getting the answer. There really isn't a distinction between the figuring out how to and the getting to the answer. Just to change direction a little bit, I think one of the things that you were saying before, which is one of the things I wonder about, is we're in a world now where we can readily compute lots of things about the world, we can figure out a lot of things, we can predict a certain number of things, we can go up and just sort of ask a question about something and often be able to figure out answers, predictions and so on. How should we think about what that will mean for the future of how people do things? I think one of the themes has been: the world goes from a point where people just guess how to do things to a time where people actually compute what they should do in a more precise fashion. We see that happening in more and more places, and sometimes it happens inside the devices we use; they automatically do the computation and just automatically focus the camera or whatever it is. A GPS that figures out where to go. Sometimes they're simply telling us something. I think what will happen in the not-very-distant future is the much more preemptive delivery of knowledge. Right now, a lot of the kind of knowledge we get we have to ask for. You walk up to Wolfram Alpha, you ask it a question. It's not something where you're preemptively being told something that you might find interesting. I think increasingly, things will be set up so that one is preemptively being told something one might find interesting. This relates to a whole other world of data which I think will emerge in the next few years: the personal analytics world of data. Record everything about yourself, and it concludes. 
People like me, because I've been interested in data, I've recorded every keystroke I've typed in the last 20 years. I've recorded tons of other stuff—I don't look at those as often as I might, but I was about to do a big effort to just go and look at all my data for the past decade or so and try and see what I can learn about myself from it. That's a good example of where just having the raw data is amusing, but without being able to compute from it and knowing how to present it, it's not immediately useful. I think increasingly, based on what has been recorded about oneself, there will be an ability to preemptively deliver knowledge that is relevant, to compute knowledge that is useful.
Ultimately, the most useful insights that can come out of this sort of thing are insights about what the right questions are to ask. You'll want to learn things about yourself that you wouldn't even have considered asking in the first place. It will tell you what questions to ask based on what comes out of it.
Which is similar to the issue when you get a result, something like the example we were talking about earlier, 235 inches: what do you want to know based on that? Can you have algorithms and heuristics that can figure out what's likely to be relevant? One of the things that's strange, for me at least, is the extent to which, in Wolfram Alpha, we can make predictions about things based on knowledge, data, whatever, yet in a lot of science that I've done, one of the upshots of that science is that there's a limit to what can actually be predicted in the world. What it comes down to is that there are these processes that are computationally irreducible. In other words, when you look at a system doing what it does, it's going through some series of steps to produce the behavior it produces. The issue is, Can we jump ahead and compute what the system is going to do more efficiently than the system itself does it? One of the great achievements of mathematical-type sciences tends to be: let's just get a formula for the answer. And what does that mean? Well, that means we don't have to follow all the steps the system goes through. We just plug a number into the formula, and immediately we get an answer about what the system will do. Computational irreducibility is what happens in a surprising range of systems when the system doesn't allow you to do that; it's irreducible. In order to figure out what the system will do, you effectively have to go through the same series of computational steps that the system itself goes through. There's no shortcut to getting to the answer. I think in building technology, lots of technologies are specifically set up to avoid having to make things computationally irreducible. A machine has some simple motion where you can readily predict that after three seconds it will have returned to its initial state.
So this whole question of what's predictable in the world based on the data that we have, it's all wound up with this computational reducibility versus computational irreducibility. There's certain kinds of things that we can expect to predict; there's certain types of things where we can't expect to predict them. Often we set up our technology to be stuff that we can predict. Nature doesn't necessarily set itself up the same way, so it ends up with things like the weather, which may be quite hard to predict. Increasingly I suspect that technology will end up being harder and harder to predict, because it's an inevitable feature of technology being able to be more efficient that it runs into this computational irreducibility and gets to be harder to predict. The challenge for us is to use all this data, all this knowledge, to predict what can be predicted and to get as far as we can with things where we just have to compute in order to work out what will happen.
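The contrast between reducible and irreducible computation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anything taken from Wolfram Alpha: the three-state "machine" is a hypothetical stand-in for the device that returns to its initial state after three seconds, and the Rule 30 cellular automaton is chosen here as a standard example of a system for which no general shortcut formula is known. All function names are invented for the sketch.

```python
def simulate_machine(steps: int, period: int = 3) -> int:
    """Follow every step of a machine that cycles through `period` states."""
    state = 0
    for _ in range(steps):
        state = (state + 1) % period
    return state


def closed_form_machine(steps: int, period: int = 3) -> int:
    """Reducible case: a formula jumps straight to the answer."""
    return steps % period


def rule30_step(cells: list[int]) -> list[int]:
    """One update of Rule 30: new cell = left XOR (center OR right)."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]


def rule30_center_column(steps: int) -> list[int]:
    """Center cell over time, obtained here only by running every step."""
    width = 2 * steps + 3          # wide enough that the edges are never reached
    center = width // 2
    cells = [0] * width
    cells[center] = 1              # start from a single black cell
    column = [cells[center]]
    for _ in range(steps):
        cells = rule30_step(cells)
        column.append(cells[center])
    return column


if __name__ == "__main__":
    # The formula and the step-by-step simulation agree for the simple machine.
    print(closed_form_machine(1_000_000), simulate_machine(1_000_000))
    # For Rule 30 the sketch has no formula to call; it just runs the steps.
    print(rule30_center_column(15))
```

For the simple machine, `closed_form_machine` gives the state after a million steps without doing a million updates. For Rule 30, the sketch offers no comparable shortcut, so the only way it can report the center cell at step n is to compute all n rows.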
Can we set it up so that the speed through which we can run it through its courses is significantly faster anyway than it would happen in real life?
Often. Often, but not always. I expect that lots of things that show up in biomedicine will have this computational irreducibility issue when we understand all these protein interactions and all sorts of details about how do we go from the genome to the actual biomedical, clinical kinds of phenomena. There'll be lots of computational irreducibility there, but chances are that we'll be able to compute things in big enough chunks that as a practical matter, by running enough computations, we'll be able to compute that if you apply this drug in this way then these things will happen, and so on. When it comes to a more extreme level, a case that I've thought lots about is the whole universe, and to what extent is computational irreducibility an issue in understanding the whole universe. One question is, how much data do you need to specify our universe? Is it the case that with an algorithm that's a few lines of computer code long, if you just run that for long enough, can you get a whole universe? How big does that underlying seed need to be in order to get our whole universe? And I think we don't know how big that underlying seed needs to be. Let's say we have the underlying data that completely specifies our universe, but then we have to actually go from that underlying data, that underlying algorithm, to the actual behavior of the universe. And one of the points is that this computational-irreducibility phenomenon implies that that's irreducibly difficult to do.
Would you have to run it for 13 billion years?
Well, I mean, OK, so in a first approximation, yes. The good news is that there are inevitably pockets of computational reducibility, and that's our best hope for being able to match up what we've already figured out in physics and so on with what the predictions of a particular model are. The universe, it's sort of obvious that there are pockets of reducibility, because there's lots of order that we can perceive in the universe. It's an inevitable feature. It's just one of these self-referential facts.
Sometimes you just have to try it, and then you will know.