Kleanthes sent me this
link to a recent lecture by Gary Marcus (GM) on the status of current AI
research. It is a somewhat jaundiced review concluding that, once again, the
results have been strongly oversold. This should not be surprising. The rewards to those who deliver strong AI (“the kind of AI that would be as smart as, say a Star Trek computer” (3)) will be without limit, both tangibly (lots and lots of money) and spiritually (lots and lots of fame, immortal kinda fame). And given that hyperbole never cripples its purveyors (“AI boys will be AI boys” (and yes, they are all boys)), it is no
surprise that, as GM notes, we have been 20 years out from solving strong AI
for the last 65 years or so. This is a bit like the many economists who
predicted 15 of the last 6 recessions, but worse. Why worse? Because there have
been 6 recessions but there has been pitifully small progress on strong AI, at
least if GM is to be believed (and I think he is).
Why, despite the hype (necessary to drain dollars from
“smart” VC money), has this problem been so tough to crack? GM mentions a few
reasons.
First, we really have no idea how open-ended competence
works. Let me put this backwards. As GM notes, AI has been successful precisely
in “predefined domains” (6). In other words, where we can limit the set of
objects being considered for identification or the topics up for discussion or
the hypotheses to be tested we can get things to run relatively smoothly. This
has been true since Winograd and his blocks world. Constrain the domain and all
goes okishly. Open the domain up so that intelligence can wander across topics
freely and all hell breaks loose. The problem of AI has always been scaling up,
and it is still a problem. Why? Because we have no idea how intelligence manages to (i) identify relevant information for
any given domain and (ii) use that information in relevant ways for that
domain. In other words, how we in general figure out what counts, and how much it counts once we have figured that out, is a complete and utter mystery. And I mean ‘mystery’ in the sense that Chomsky has identified
(i.e. as opposed to ‘problem’).
Nor is this a problem limited to AI. As FoL has discussed before, linguistic
creativity has two sides. The part that has to do with specifying the kind of
unbounded hierarchical recursion we find in human Gs has been shown to be
tractable. Linguists have been able to say interesting things about the kinds
of Gs we find in human natural languages and the kinds of UG principles that FL
plausibly contains. One of the glories (IMO, the glory) of modern GG lies in its having turned once mysterious
questions into scientific problems. We may not have solved all the problems of
linguistic structure but we have managed to render them scientifically
tractable.
This is in stark contrast to the other side of linguistic creativity: the fact that humans are able to use their linguistic competence in
so many different ways for thought and self-expression. This is what the
Cartesians found so remarkable (see here for some discussion) and what we have not made an iota of progress in understanding. As Chomsky put it in Language & Mind (and this is still a fair summary of where we stand today):
Honesty forces us to admit
that we are as far today as Descartes was three centuries ago from
understanding just what enables a human to speak in a way that is innovative,
free from stimulus control, and also appropriate and coherent. (12-13)[1]
All-things-considered judgments, those that we deploy effortlessly in everyday conversation, elude insight. That we do this is apparent. But how we do this remains mysterious. This
is the nut that strong AI needs to crack given its ambitions. To date, the
record of failure speaks for itself and there is no reason to think that more
modern methods will help out much.
It is precisely this roadblock that limiting
the domain of interest removes. Bound the domain and the problem of open-endedness
disappears.
This should sound familiar. It is the message
in Fodor’s Modularity of Mind. Fodor
observes that modularity makes for tractability. When we move away from modular systems, we fall flat on our faces precisely because we have no idea how minds identify what is relevant in any given situation, how they weight what is relevant, and how they then deploy this information appropriately. We do it all right. We just don’t know how.
The modern hype supposes that we can get around
this problem with big data. GM has a few choice remarks about this. Here’s how
he sees things (my emphasis):
I opened this talk with a prediction from
Andrew Ng: “If a typical person can do a mental task with less than one second
of thought, we can probably automate it using AI either now or in the near
future.” So, here’s my version of it, which I think is more honest and
definitely less pithy: If a typical person can do a mental task with less than
one second of thought and we can gather an enormous amount of directly relevant data, we have a
fighting chance, so long as the test data aren’t too terribly different from the training data and the domain doesn’t change too much over time.
Unfortunately, for real-world problems, that’s rarely the case. (8)
So, if we massage the data so that we get that
which is “directly relevant” and we test our inductive learner on data that is
not “too terribly different” and we make sure that the “domain doesn’t change much”, then big data will deliver “statistical approximations” (5). However,
“statistics is not the same thing as knowledge” (9). Big data can give us
better and better “correlations” if fed with “large amounts of [relevant!, NH] statistical
data”. However, even when these correlational models work, “we don’t
necessarily understand what’s underlying them” (9).[2]
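To make those caveats concrete, here is a toy sketch of my own (nothing from GM’s talk; plain Python assuming only numpy, and all the names are mine): a linear model fit on one narrow slice of a mildly nonlinear world looks fine on test data that resemble its training data and falls apart the moment the domain drifts.

```python
# Toy illustration (mine, not GM's): a "statistical approximation" fit on
# a narrow domain, then tested on data from a shifted domain.
import numpy as np

rng = np.random.default_rng(0)

def world(x):
    # The underlying process is mildly nonlinear, so a linear fit is only
    # a local approximation, not knowledge of the process itself.
    return np.sin(x)

# Training data drawn from a narrow, well-behaved slice of the domain.
x_train = rng.uniform(-0.5, 0.5, 1000)
y_train = world(x_train) + rng.normal(0.0, 0.05, x_train.size)

# Ordinary least squares: fit y ~ w*x + b.
A = np.vstack([x_train, np.ones_like(x_train)]).T
w, b = np.linalg.lstsq(A, y_train, rcond=None)[0]

def mse(x):
    # Error of the fitted line against the true process on inputs x.
    return float(np.mean((w * x + b - world(x)) ** 2))

x_same    = rng.uniform(-0.5, 0.5, 1000)  # test data like the training data
x_shifted = rng.uniform(2.0, 3.0, 1000)   # the domain has drifted

print(f"MSE on similar test data: {mse(x_same):.2e}")     # tiny
print(f"MSE on shifted test data: {mse(x_shifted):.2e}")  # orders of magnitude larger
```

The model has not learned anything about the underlying process; it has memorized a local regression line, which is one way of seeing why statistics is not the same thing as knowledge.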
And one more thing: when things work it’s
because the domain is well behaved. Here’s GM on AlphaGo (my emphasis):
Lately, AlphaGo is probably the most
impressive demonstration of AI. It’s the AI program that plays the board game
Go, and extremely well, but it works because the rules never change, you can gather an infinite amount of data, and you just play it over and over again. It’s not open-ended. You don’t have to worry about the world
changing. But when you move things into the real world, say driving a
vehicle where there’s always a new situation, these techniques just don’t work
as well. (7)
So, if the rules don’t change, you have
unbounded data and time to massage it and the relevant world doesn’t change,
then we can get something that approximately fits what we observe. But fitting
is not explaining and the world required for even this much “success” is not the
world we live in, the world in which our cognitive powers are exercised. So
what does AI’s being able to do this
in artificial worlds tell us about what we do in ours? Absolutely nothing.
Moreover, as GM notes, the problems of interest
to human cognition have exactly the opposite profile. In Big Data scenarios we have boundless data and endless trials with huge numbers of failures
(corrections). The problems we are interested in are characterized by having a
small amount of data and a very small amount of error. What will Big Data
techniques tell us about problems with the latter profile? The obvious answer
is “not very much” and the obvious answer, to date, has proven to be quite
adequate.
Again, this should sound familiar. We do not
know how to model the everyday creativity that goes into common judgments that
humans routinely make and that directly affects how we navigate our open-ended
world. Where we cannot successfully idealize to a modular system (one that is
relatively informationally encapsulated) we are at sea. And no amount of big
data or stats will help.
What GM says has been said repeatedly over the
last 65 years.[3]
AI hype will always be with us. The problem is that strong AI must crack a long-lived
mystery to get anywhere. It must crack the problem of judgment and try to
“mechanize” it. Descartes doubted that we would be able to do this (indeed this
was his main argument for a second substance). The problem with so much work in
AI is not that it has failed to crack this problem, but that it fails to see
that it is a problem at all. What GM observes is that, in this regard, nothing
has really changed and I predict that we will be in more or less the same place
in 20 years.
Postscript:
Since
penning(?) the above I ran across a review of a book on machine intelligence by
Garry Kasparov (here).
The review is interesting (I have not read the book) and is a nice companion to
the Marcus remarks. I particularly liked the history of Shannon’s early thoughts on chess-playing computers and his distinction between the two ways the problem could be solved:
At the dawn of the computer
age, in 1950, the influential Bell Labs engineer Claude Shannon published a
paper in Philosophical Magazine called “Programming a
Computer for Playing Chess.” The creation of a “tolerably good” computerized
chess player, he argued, was not only possible but would also have metaphysical
consequences. It would force the human race “either to admit the possibility of
a mechanized thinking or to further restrict [its] concept of ‘thinking.’” He
went on to offer an insight that would prove essential both to the development
of chess software and to the pursuit of artificial intelligence in general. A chess
program, he wrote, would need to incorporate a search function able to identify
possible moves and rank them according to how they influenced the course of the
game. He laid out two very different approaches to programming the function.
“Type A” would rely on brute force, calculating the relative value of all
possible moves as far ahead in the game as the speed of the computer allowed.
“Type B” would use intelligence rather than raw power, imbuing the computer
with an understanding of the game that would allow it to focus on a small
number of attractive moves while ignoring the rest. In essence, a Type B
computer would demonstrate the intuition of an experienced human player.
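Before moving on, here is a minimal sketch of the Type A / Type B contrast (mine, not Shannon’s; the game interface, i.e. legal_moves, apply_move, evaluate, and heuristic, is purely hypothetical): Type A runs the same minimax search over every legal move, while Type B first prunes to a few heuristically attractive candidates and searches only those.

```python
# Minimal sketch of Shannon's Type A vs. Type B search (hypothetical game
# interface; legal_moves, apply_move, evaluate, and heuristic are
# placeholders, not any real chess library).
from typing import Any, Callable, List, Optional

def minimax(state: Any, depth: int,
            legal_moves: Callable[[Any], List[Any]],
            apply_move: Callable[[Any, Any], Any],
            evaluate: Callable[[Any], float],
            maximizing: bool = True,
            candidate_filter: Optional[Callable[[Any, List[Any]], List[Any]]] = None) -> float:
    """Value of `state` searched to `depth` plies."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)
    if candidate_filter is not None:
        # Type B: an "intuition" stand-in keeps only a few attractive moves.
        moves = candidate_filter(state, moves)
    values = [minimax(apply_move(state, m), depth - 1,
                      legal_moves, apply_move, evaluate,
                      not maximizing, candidate_filter)
              for m in moves]
    return max(values) if maximizing else min(values)

# Type A: brute force over every legal move, as deep as time allows.
#   minimax(start, 4, legal_moves, apply_move, evaluate)
#
# Type B: the same search restricted to the three moves a shallow
# heuristic likes best, ignoring the rest.
#   def top_three(state, moves):
#       return sorted(moves, key=lambda m: heuristic(state, m), reverse=True)[:3]
#   minimax(start, 4, legal_moves, apply_move, evaluate, candidate_filter=top_three)
```

The only difference between the two is the branching factor; the argument about what Type A success does or does not tell us about intelligence is untouched by it.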
As the review goes on to note, Shannon’s mistake was to
think that Type A computers were not going to materialize. They did, with the
result that the promise of AI (that it would tell us something about
intelligence) fizzled as the “artificial” way that machines became
“intelligent” simply abstracted away from intelligence. Or, to put it as
Kasparov is quoted as putting it: “Deep
Blue [the machine that beat Kasparov, NH] was intelligent the way your
programmable alarm clock is intelligent.”
So, the hope that AI would illuminate human
cognition rested on the belief that technology and brute calculation would not
be able to substitute for “intelligence.” This proved wrong, with machine
learning being the latest twist in the same saga, per the review and
Kasparov.
All this fits with GM’s remarks above. What both
do not emphasize enough, IMO, is something that many did not anticipate, namely
that we would revamp our views of intelligence rather than question whether our
programs had it. Part of the resurgence
of Empiricism is tied to the rise of the technologically successful machine.
The hope was that trying to get limited
machines to act like we do might tell us something about how we do things. The
limitations of the machine would require intelligent
design to get it to work, thereby possibly illuminating our kind of intelligence.
What happened is that getting computationally miraculous machines to do things
in ways that we had earlier recognized as dumb and brute force (and so telling
us nothing at all) has transformed into the hypothesis that there is no such thing as real intelligence at all and everything is "really" just brute force.
Thus, the brain is just a data cruncher, just like Deep Blue is. And this shift
in attitude is supported by an Empiricist conception of mind and explanation.
There is no structure to the mind beyond the capacity to mine the inputs for
surfacy generalizations. There is no structure to the world beyond statistical
regularities. On this E-ish view, AI has not failed; rather, the right conclusion
is that there is less to thinking than we thought. This invigorated Empiricism
is quite wrong. But it will have staying power. Nobody should underestimate the
power that a successful (money making) tech device can have on the intellectual
spirit of the age.
This point about the challenges of everyday conversation was echoed recently by Microsoft's Satya Nadella in an interview with Quartz: "Before we even generate language, let us understand turn by turn dialogue. (…) Whenever you have ambiguity and errors, you need to think about how you put the human in the loop. That to me is the art form of an AI product."
While you connect 'error' with big data etc., Nadella's point is more subtle. Dealing with errors in interaction is something AI hasn't cracked yet, and it is where human competence shines. No other species can recover so gracefully from troubles of speaking, hearing and understanding. No other species, as far as we know, has a communication system that allows enough self-referentiality to carry out conversational repair as smoothly and frequently as we do.
The second-most cited paper published in Language is Chomsky's review of Verbal Behavior (Chomsky 1959). It makes some interesting points about conversation, though these are often overlooked. For instance, there is the implicit realisation that even though this is clearly not behaviour under the control of some stimulus as Skinner envisaged, there are nonetheless rules of relevance, coherence and sequence that structure contributions to conversation.
The most cited paper published in Language (with twice the number of citations of Chomsky's review) is a study of the turn-taking system of informal conversation — using raw performance data to discover rules of conversational competence (Sacks et al. 1974). The field of conversation analysis has gone on to explore rules and regularities of sequence organization, repair, and so on — discovering and explaining some of the mechanisms by which people make meaning together in open-ended yet orderly ways. So some headway has surely been made.
Maybe the failure of AI isn't so much its obsession with big data, but that it has forgotten the problem that big data is supposed to solve.
Go back far enough and you discover that, before the early logicists were making particular claims about which kinds of logic could best do the job of generating protocol sentences, you have the idea of a logic as presenting us with a kind of world with its own individual set of implications which we can tease out and examine: logic was about exploring the world revealed by our most basic statements.
Big data doesn't imagine a world. Its concepts are flat points on a field which just don't have the semantic depth of a world. Sure, this is another way of making Shannon's point about Type A vs Type B intelligence (and for that matter your own point about human intuition), but it still seems to me like a more optimistic way of putting it: we can at least imagine a computer possessing a kind of internal world.