In a previous post, I outlined the three kinds of gaps that
fuel PoS arguments. I also noted that one of the gaps (the one that focuses on
utterances as imperfect exemplars of sentences in that they are subject to the
exigencies that afflict (enhance?) all performances) was less serious than the
other two in that it can likely be bridged using generic statistical techniques
that can be applied to “raw” data to regiment it, scrub it, and normalize it. Despite
its relative marginality for PoS claims, this kind of data degeneracy is often
the focus of lots of empirical investigation claiming to refute the supposition
that it is particularly poor, or misleading, or incomplete. Motherese, it is
often retorted, is actually mostly well-formed, smoothly uttered, and even
helpful intonationally. This kind of retort is (sometimes tacitly, sometimes
not) used to support the claim that there is no real PoS problem: the
data is quite clean, so the degeneracy problem is not serious, and
given that the data is good and clean, wholesome even, there is no reason to
doubt that it is sufficient to guide an LAD to its G. What the previous post
argued is that even if this retort is correct
(which, frankly, I doubt, but let’s be concessive[1]),
it is completely irrelevant, for there are two other gaps that have little to do
with the quality of the data, and this is where the real power of PoS arguments
lies.
That’s what the earlier post argued at length so why am I
regurgitating the points again here? Not to reiterate the main claim (though
repetition is the soul of insight (or at least belief fixation)), but to
observe that, in a curious sense, providing a thorough catalogue of problems for E(mpiricist) approaches to G
acquisition serves to obscure the most important part of the argument. How so? It
provides critics of the PoS a weak supposition on which to concentrate their fire,
which, if even partially successful (again, I am skeptical, but…), serves to
move the argumentative focus away from the strong arguments (having to do with
data deficiency, not degeneracy) and allows for a very premature declaration
that there is no PoS problem at all. So, being thorough and exhaustive drops
bread crumbs that Eish Hansel and Gretels eagerly gather thereby allowing them
to get lost in a forest of irrelevancies.[2] And the reason I mention this is that the
same kind of poor argumentative behavior infects many Evo Lang discussions,
which is what I want to concentrate on today by discussing a particularly
obtuse piece that appeared in Aeon,
penned by, you guessed it, Dan Everett, entitled Did Homo Erectus Speak (henceforth DHES (here)).[3]
The goal of DHES is to argue for the Continuity Thesis. This
is roughly the idea that current human linguistic facility is qualitatively
identical to what our ancestors (and indeed other animals) have. So what we
have is just like what they have but more so. Here is DHES (p.2):
…the ‘leap’ to language was
little more than a long series of baby steps, requiring no mutations, nor any
complex grammar. In fact, the language of erectus would
have been every bit as much a ‘real language’ as any modern language.
The main argument for this conclusion is that Homo Erectus (Erectus
(E) to friends and family) already spoke a language largely like ours and thus,
our linguistic capacity has been gradually evolving for millions of years. Here
is DHES (p. 2):
To
discover the answers to these questions, we need to travel back in time at
least 1.9 million years ago to the birth of Homo
erectus, as they emerged from the ancient process of primate evolution. Erectus had nearly double the brain size of
any previous hominin, walked habitually upright, were superb hunters, travelled
the world, and sailed to ocean islands. And somewhere along the way they got
language. Yes, erectus.
Not Neanderthals. Not sapiens.
And if erectus invented
language, this means that Neanderthals, born more than a million years later,
entered a world already linguistic.
Likewise,
our species would have emerged into a world that already had language…
And, consequently, there is little reason to think that
there is anything linguistically special
about us. Our capacities are identical to E’s,
give or take a little (very little). That’s the claim DHES advances, based on
the “fact” (that DHES concedes is not widely accepted in the paleoanthropology
community (p.2))[4]
that E’s artifacts (“their settlements, their art, their symbols, their sailing
ability, and their tools”) all point to something like “an animal that can
communicate via symbols.” Or, as DHES puts it: “a linguistic animal” (p. 6).
So, if E was a
symbol manipulator (witness the artifacts) and was able to “transfer
information by symbols” (5), then E was linguistic, i.e. graced with the same
FL as us and there is no reason to assume that
…humans possess special
cognitive abilities absent from the brains of all other creatures or whether,
more simply, humans have language because they are smarter than other creatures
(whether through higher densities of neurons, or other advantages of brain
organisation). (p.6)
So what distinguishes us from other homos (and even other
animals, for the logic deployed leads here) is a bigger brain that is
qualitatively the same as that of our ancestors, only bigger, and bigger gives you
“language.”
Before examining the argument, you can see why DHES
emphasizes and argues for E having language and the long time period separating
E from us. The secret sauce of gradualist explanations is long expanses of time
in which little changes can add up (see here for the definitive
account). This is why DHES emphasizes the millions
of years theme. The supposition seems to be that the only problem with a
gradualist account of the evolution of the human FL is that it arose relatively
recently (roughly 100kya) and that there is not enough time for natural
selection to work its magic. So, the thinking seems to go, if DHES can show that
this supposition is incorrect it implies that the continuity thesis does apply
to human FL and the fact that humans speak as we do (in particular, have the
kinds of Gs that we do) requires no novel cognitive architecture. It’s
linguistic facility all the way down, with bigger brains adding more of the
same doohickies and thingamabobs we find in smaller ones leading to an
“apparent” qualitative (but in reality, merely a quantitative) change in
capacity.
Now, as you can imagine, this is a very bad argument, though
I concede that people like me (and perhaps Chomsky) have invited this kind of
response. Chomsky has pointed out (following a pretty impressive bunch of
people who think about this topic for a living (e.g. Tattersall)) that the
indirect evidence for language is relatively recent. If one measures things
using cultural artifacts, then the explosion of these around 50kya (rather than
a handful of contentious ones further back) seems to indicate that something
significant happened rather recently (not
millions of years ago). If one takes such cultural artifacts to piggyback on our linguistic faculties and one takes
these to prominently include the capacity to acquire Gs that generate an
unbounded number of hierarchically structured interpretable objects, then one
has indirect evidence for something
like our kinds of Gs (Merge based ones) arising in the (relatively) recent
past. If.
As I’ve noted before, this is a very indirect kind of
argument for Merge, and it is not clear how much cultural artifacts implicate
Merge. It is not unreasonable to suppose that the thing that goosed culture
(maybe given its significance we should spell it Kultur!)[5]
was hooking the system up with externalization thereby facilitating
communication and the gradual build-up of retained and retainable knowledge.
Who knows really. All of this is very speculative (i.e. it’s not as if there is
a transparent logical entailment from elaborate burial rituals or cave
paintings to hierarchical recursion). However, it is also not that important
for even were this false the Evo Lang problem would remain fundamentally
unaltered. Let me explain.
The hard problem for GGers is explaining how unbounded
hierarchical recursion could have arisen (actually, how it actually arose is the problem, but right now I would declare a small
victory if we could redeem the modal). The assumption (and this is
based on empirical evidence) is that there is nothing quite like it anywhere
else in animal cognition. This is not to demean the powers of other animals.
They really are amazing in many ways. But so far as we can tell there is
nothing formally analogous to the kinds of operations and structures we
regularly find in human Gs and there is no reason to believe that any other
animal either uses or acquires systems with these properties. If this is true (and it is, really!),
then one major Evo Lang problem is explaining how systems with this formal property could have
biologically arisen in humans given that nothing like it was there before. The
problem then is not a matter of temporal distance, but of logical distance. If
the above correctly describes the current
state of play and it is true that other animals don’t have cognitive capacities
with these formal properties, then explaining how these properties arose, along with the mental powers needed to deploy
and acquire them, requires not just more
but also different. So what was the
different and how did it happen? That’s the GG Evo Lang question.
Several comments: does this mean
that this is the only EvoLang
question? No, there are others and I will return to some discussed in DHES.
Does this make the temporal question irrelevant? Yes and No.
Yes in the sense that the problem
is more or less the same regardless of the time it took to arise. Why? Because
the problem is how to get from non-recursive structural hierarchy to recursive
structural hierarchy and no amount of non-recursive
hierarchy or recursive flat structure or non-recursive
flat structure will get you there by adding more of it. How hierarchy arose and
how, furthermore, embedded hierarchy within hierarchy ad nauseam arose is not
explained by noting finite examples of hierarchy or unbounded examples of iterated
beads on a string concatenations. Let me be clear: there is nothing wrong in
pointing out that either of these exist in the mental repertoires of other
animals, but this is not enough. There needs to be a story getting you from
these non-recursive hierarchical systems to the qualitatively different recursive hierarchical one we have now. No
story, no proposed additional mechanism, no Evo Lang account.
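A toy sketch in Python may make the formal contrast concrete. Everything here is an illustrative assumption: the names `concatenate`, `merge`, `depth` and `embed` are made up for the example, and the nested-pair representation is a stand-in, not a claim about how Gs or FL actually implement anything. The point is only that lengthening a flat string never adds a level of embedding, whereas an operation whose output can be fed back in as input yields unbounded depth.

```python
def concatenate(words):
    """Flat iteration: beads on a string. Any length, but no embedding."""
    return tuple(words)

def merge(a, b):
    """Toy binary Merge: the result is an object that can itself be merged
    again, which is where the recursion comes from."""
    return (a, b)

def depth(obj):
    """Levels of structure: 0 for an atom, else 1 + the deepest part."""
    if isinstance(obj, str):
        return 0
    return 1 + max((depth(part) for part in obj), default=0)

def embed(clause, times):
    """Wrap a clause inside 'Mary thinks [clause]' the given number of times."""
    for _ in range(times):
        clause = merge("Mary", merge("thinks", clause))
    return clause

flat = concatenate(["the", "dog", "chased", "the", "cat"])
longer_flat = concatenate(["the", "dog", "chased", "the", "cat"] * 100)
nested = merge(merge("the", "dog"), merge("chased", merge("the", "cat")))

print(depth(flat), depth(longer_flat))        # flat stays at depth 1
print(depth(nested), depth(embed(nested, 10)))  # depth grows with embedding
```

Making the flat string a hundred times longer leaves its depth unchanged, while each pass through `embed` adds two more levels. Nothing in `concatenate` can be tuned to produce the second behavior; something of a different kind has to be added.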
And the No? Well, if FL arose
(very) recently, then there would be no temptation to look for a gradualist
account. So were FL of very recent vintage it would serve to block the bad (Eish)
impulse to always look to the shaping effects of the environment for an
explanation of anything.
Think of an analogy in the domain
of language acquisition. Imagine that kids popped out speaking their native
language. Does anyone think that we would be expecting Eish accounts of the
process? Nope. Does anyone look for the shaping effects of the environment in
explaining why kids are (normally) born with two legs and two arms, one heart,
two kidneys etc.? Analogous impulses would dominate were LADs to pop out
speaking their mother’s native tongue (actually I am not sure this is so, but I
would hope it was). Ditto with a short evo time span and gradualist accounts. A
long time span is a pre-requisite for a continuity story to even make sense.
However, when you think about the issue just a little bit, even a very long
time span does not bridge the logical
gap, and that is the one that needs traversing.
Curiously, the problem DHES
presupposes away is one that we have seen before in a slightly different venue.
Piaget was a cognitive gradualist, famously claiming that logical thinking in
children gradually developed in them. Jerry Fodor (in the Royaumont volume,
sadly under-read nowadays) pointed out that gradually developing richer logical competence is impossible.
There is no way of getting from the propositional to the predicate calculus
without presupposing the resources of the latter as a precondition. This is a
logical bridge too far, and it cannot be gradually navigated. The same holds
true with the recursive hierarchy problem. It is not the sort of thing you get
by adding more non-recursive, non-hierarchical systems of representation. You
need to add something else. The Evo Lang question is what.
DHES does not say. It does make
many other points. It points to another important property of natural language,
namely that its atoms are symbols and that this allows for “displacement” (i.e.
not stimulus bound reference). DHES further insists (p.6) that
Symbols, not grammar, are thus
the sine qua non of language. They alone guarantee
communication that is displaced, that is shared by an entire community of
speakers, that can be transmitted between speakers and between generations, and
that can represent either abstract or concrete ideas or things.
Maybe DHES is right. As I’ve noted
before, Chomsky agrees that there is something interestingly different about
the atoms of human language wrt their semantic properties. But even were this
so (which it likely is), it doesn’t answer the GG question of how the kind
of recursive hierarchy we find in human Gs (and that humans with FLs can all
acquire) arose. In other words, even if we agree with DHES concerning the importance
of atomic symbols as one key feature of human language (which, as I’ve noted
many times before, Chomsky has highlighted often in the past) unless DHES shows
us how symbolic terminals lead to recursive hierarchy, we have not progressed
on the GG question.[6]
And though the GG question is not the only
question, it is one important one given that one of the distinctive features of
human language is that it is G based
(see here
for recognition by some of the biologically informed that being G based is
indeed a critical feature of human language).
So the big problem with DHES is
its failure to recognize the logical problem the GG facts present. I say this
because it appears to suppose that one explains how the capacity of interest
arose by showing a chart that tracks its progression. The chart is on p. 7 and
has arrows pointing from one kind of representational format to another. So,
indices begat icons which begat symbols which via duality of patterning begat
compositionality, which begat linearity, which begat hierarchy, which finally
begat recursion. All very impressive, but for the fact that DHES says nothing
about how all this begetting took
place (DHES leaves out all the salacious prurient detail). How exactly does linearity beget hierarchy and hierarchy
beget recursion? All DHES tells us is that it does. Or, more accurately (pp.
9-10; I have quoted the parts where “the miracle happens,” as the old New
Yorker cartoon put it):
Once
you have a set of symbols and a linear order agreed upon by a culture, you have
a language.
That
is really all there is to it, though of course most languages become more
complex over time….
All of the embellishments of
grammar such as hierarchical structures, recursion, relative clauses and other
complex constructions are secondary, based on a slot-filler arrangement of and
composition of symbols, in conjunction with cultural conventions and general
principles of efficient computation…
Thus, once cultures and symbols
appear, grammar is on the way...
So DHES does nothing to advance the GG question, except
avoid saying anything about it while appearing to address it.
Before ending, let me admit that it might be that I am
somewhat unfair to DHES. There are times when it appears that its interest is
not engaging the Evo Lang question as GG poses it but in addressing another
question: does a recursively hierarchical G have more expressive power than one
without such a G? Note that this is not
the GG question and, to my knowledge, this has not been a question that has
occupied my GG community. However, if this is the question that interests DHES,
then it seems either irrelevant to, or problematic for, the standard Evo Lang
question of how our G systems and the capacity to acquire them arose. Say that
the two kinds of Gs are not expressively the same: how does this help answer
the question of how our FL arose? Say
they have the same expressive power, then why did we evolve an FL that could
acquire Gs with unbounded hierarchy even though, by assumption, these add
nothing to the “expressive power” of language? Again, it really does not
advance the Evo Lang question of interest (which, to repeat, does not mean that
this is the only question of
interest).
Let me end here. DHES is a very messy piece and I think I
know why. It really wants to argue that there is no Evo Lang problem of the
kind GG (actually Herr Chomsky) poses because what we see all arose gradually
over a very long period of time in small incremental steps. The problem is that
DHES nowhere suggests how these steps could have been taken and how they could
have added up to what we now have. How does one get from flat systems to
hierarchical ones to recursively hierarchical ones? How does one get from
strongly referential terminals (Chomsky’s observations concerning animal
communication systems) to those that allow for pretty radical displacement?
What mental changes are required to allow this kind of symbol or
representational format to arise? What kind of mental changes are required to
get from linear to unboundedly hierarchical? These are hard questions, and
maybe we will never be able to answer them. But better to fail to answer a real
question than fail to see what the question is.
[1]
I hereby preemptively apologize to Jerry Fodor who wisely counseled against
ever conceding anything even for the
sake of argument. I am sinning here, I know.
[2]
And yes I know that they did not pick up their own crumbs and get themselves
lost, but the metaphor got away from me.
[3]
His work really is the gift that keeps on giving, your one stop shopping venue
for largely irrelevant arguments intended to buttress insupportable arguments.
I am starting to think that DE is doing this all as public service for the
enlightenment of the young. Master the non-sequiturs in the core DE oeuvre and
you’ve seen through all (or at least many of) the non-sequiturs you are likely
to encounter in the vast irrelevant anti-Chomsky literature. Like I said, an
invaluable resource (sorta like the role that Piaget’s work played in early
developmental psych work. Work through the many failures of logic there and you
end up with modern developmental cognition of the Carey-Spelke variety (i.e.
the good stuff)).
[4]
Though I am not suggesting that this should be held against the view. Experts have been known to be wrong before, and
for all I know E had some linguistic skills. That is not the issue, as I show
below. The issue is how similar E’s capacities were to our own.
[5]
The ‘K’ is also in honor of the fact that I’m posting from Germany where I am
scheduled to give a talk that I stupidly agreed to give months ago sure that I
would never have to do this as I would be lucky enough to be hit by a bus but
my luck has turned and here I am in Stuttgart. So Kultur it is!
[6]
DHES might be making a point that I am sympathetic to: that if one is interested in how Kultur
arose, then the fact that Gs are recursively hierarchical is less important than that they deploy
symbols closely semantically tied to the 4Fs. In other words, for communicative
purposes, displacement might be critical, and for Kultur, communication might
be. This does not eliminate the GG question concerning recursive hierarchy, but
it suggests that recursive hierarchy is not the sine qua non of Kultur. I don’t
know if this is right, but it might be for all I know. Like I’ve said before,
a simple ‘N V N’ template with 25,000 Ns and 15,000 Vs allows you to say a
hell of a lot of things, maybe enough to sail the oceans and leave behind fancy
artifacts.
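For a sense of scale, the arithmetic behind footnote 6's ‘N V N’ template can be spelled out. This is a back-of-the-envelope sketch that assumes every noun can fill either N slot and every verb the V slot, with no selectional restrictions (the variable names are just for the example):

```python
nouns = 25_000
verbs = 15_000

# One slot-filler template: N V N, with every combination allowed.
n_v_n_strings = nouns * verbs * nouns

print(f"{n_v_n_strings:,}")  # 9,375,000,000,000
```

That is over nine trillion distinct three-word strings, a very large but still finite set with no embedding at all.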
I find it hard to see the GG story as significantly better than Everett's, due to being unable to take seriously the idea that a facility for doing binary Merge arose by saltation, somehow acquiring the other machinery it needs to actually build grammatical structures (labelling and feature mechanics of some kind) and then hooked itself up to enough interfaces to do anything of sufficient value for it to be preserved by natural selection. Furthermore I think Everett has a point that the achievements of Erectus are beyond what a creature without any kind of language including some kind of syntax could manage.
My suggestion is that a speech generation architecture called Salix, due to Penelope Sibun ('Text generation without trees', 1992) might be a plausible precursor for our form of language. What Salix does is traverse in an organized manner a graph with nodes representing entities (house rooms and people in her implemented models) and arcs representing relations (family and spatial), emitting symbols as it goes. Adding nodes for events and appropriate relationships for them would seem like a plausible extension, which would make the system suitable for giving instructions of various kinds, such as how to get to places where useful resources are present (so you don't have to accompany somebody there to show them where they are).
Importantly, the connections between cognition and externalization are present from the beginning, because they are direct. There are no sentence structures as such: the structure of the graph determines what symbols are produced in what order directly. Salix will also of course need a parser to be a proposed working model for Erectus proto-language.
If something like this is already present in human communities, it doesn't seem so implausible (to me, at any rate) that some kind of syntactic structure could arise as an intervening level between the conceptual (knowledge graph) level and overt performance, perhaps due to conferring advantages such as faster processing. Indeed, noting that social animals must have some way to 'parse' the activities of their peers, some kind of sentence structure might have developed immediately, with the big step forward being some kind of improvements in capacity to process it in a much faster and more flexible manner.
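The graph-traversal idea in the comment above can be sketched in a few lines of Python. To be clear, this is not Sibun's Salix: the function `externalize`, the `house` graph and the depth-first strategy are all hypothetical choices made for illustration. It shows only the general shape of the proposal, in which symbols are emitted directly in the order the graph is walked, with no sentence structure built along the way.

```python
def externalize(graph, start):
    """Walk a relation graph from `start`, emitting one
    (node, relation, neighbour) triple per newly reached node.

    `graph` maps a node to a list of (relation, neighbour) arcs.
    The `visited` set is the memory that keeps the walk from
    saying the same thing twice.
    """
    emitted = []
    visited = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for relation, neighbour in graph.get(node, []):
            if neighbour in visited:
                continue
            visited.add(neighbour)
            emitted.append((node, relation, neighbour))
            stack.append(neighbour)
    return emitted

# A made-up house layout, loosely in the spirit of the rooms-and-relations domain.
house = {
    "kitchen": [("next-to", "hall"), ("contains", "stove")],
    "hall": [("next-to", "kitchen"), ("leads-to", "bedroom")],
    "bedroom": [("contains", "bed")],
}

for triple in externalize(house, "kitchen"):
    print(" ".join(triple))
```

The order of the output is fixed by the graph and the traversal strategy alone. The `visited` set is the only bookkeeping involved.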
The Chomsky account has one thing going for it that the Everett account does not: it recognizes the problem, which is how to get to a system with unbounded hierarchy. Chomsky's solution is Merge, an operation that comes in at once. You don't like this kind of "saltational" account because it is saltational, I assume. Oddly, as I've noted before, this does not appear to bother Dawkins, someone who is otherwise bothered by these things, because he agrees that some chasms cannot be jumped in several steps and that getting to recursion seems to him like one of these. At any rate, I have my own views on Chomsky's proposal, but what is clear is that if he states the problem correctly and has identified the right end state (which I believe he has), then nothing Everett says even approaches a solution. What is proposed in the article is less unsatisfactory than it is irrelevant, and that is a big problem. Whether Chomsky is right is another matter. At least it is on target.
The trouble is that it is not merely saltational; it's a jump that magically incorporates or integrates everything else that is needed for spoken language. ("Dawkins doesn't see that as a problem" is not even close to an argument; no need to address it further.) One might as well conclude that it happened by magic.
The underlying flaw is that Chomsky seems to assume unbounded hierarchy when there's no evidence that that actually exists *in human beings* (it's a different story in a theoretical model divorced from empirically-observable reality--the same sort of place that has frictionless surfaces). The framing of the problem is entirely wrong.
I was, however, pleased to see this:
"I also noted that one of the gaps (the one that focuses on utterances as imperfect exemplars of sentences in that they are subject to the exigencies that afflict (enhance?) all performances) was less serious than the other two in that it can likely be bridged using generic statistical techniques...."
...because it's the first time I've seen a Chomskyan admit anything of the sort (usually the generic statistical techniques are waved aside). Of course, now the problem they address is being dismissed as irrelevant, which increasingly is how I know they're on target. About which....
"I hereby preemptively apologize to Jerry Fodor who wisely counciled against ever conceding anything even for the sake of argument."
What a shame. One of the great and telling weaknesses of the Chomskyan approach is that it follows this precept. Heck, it's practically a Skinnerian stimulus-response paradigm: when someone brings up a counterargument, automatically claim it's not only wrong but it'd be irrelevant even if it were right. When someone quotes Chomsky, automatically claim that it's out of context. I guess it's fun as a rhetorical style, but it doesn't seem to have led to much progress, and it's why Chomsky's work is gradually being relegated to the realm of philosophy rather than science.
[to Norbert] I think people usually get points for perceiving a problem that others ignore, so Chomsky gets some for noticing the unbounded hierarchy problem, but loses points for ignoring the issues of the interfaces and other stuff that Merge needs to do anything remotely useful.
While Everett gets points for focussing on the problem of what Erectus could do, and what kinds of linguistic resources they probably needed to do it, but loses points for not seeing any problem with the emergence of any kind of grammar. I will not attempt to proclaim a winner.
But I will point out that the Salix model suggests that many of the usual terms in which these debates are carried out are not fully appropriate. Even the simplest story-telling graph externalizer cannot be finite state because it has to remember where it has been before so as not to say the same thing twice (assuming that the input graph=storyline/navigation routes being externalized are of unbounded length, as per the usual idealizations).
To get basic constituent structure without 'recursive symbols/subroutines' (Bach 1964, 1976 and standard computer programming terminology) we could write the graph-traversing algorithms in Fortran; to add full X within X recursion we might switch to Algol (but maybe that's not actually such a big leap as people seem to think it is).
One way of interpreting Salix is as an attempt to show that sentence structure doesn't exist; I think this claim is almost certainly false, but Salix does show that sentence structure as we normally think of it does not *have* to exist, so we need to justify it better than I think we have actually done so far.
I agree that perceiving a problem is a big deal. Indeed, without perceiving one it is hard to solve it. That is why I think that Chomsky has made an important contribution to the Evo Lang problem: he has identified one feature that needs addressing. I am less clear on what E's contribution is to the discussion. If I understood his piece then whether or not Erectus had the properties he attributed to it does not solve any identifiable problem, or at least not one that I can see. Did Erectus have these features? I have no idea, but I am happy to concede that Erectus did. I just do not see what follows.
Now there is a second issue: does Chomsky's proposal wrt Merge solve the problem he identified? Well, if the problem is the emergence of Merge and Merge suffices for unbounded recursive hierarchy, then it is a potential solution if Merge could have emerged all at once. This is the saltational view. The claim seems to be that Merge could not have emerged all at once. I do not see why not, and I buttressed this view by noting that others generally skeptical of saltational accounts (Dawkins) appear to think that this kind of "hopeful monster" is not unreasonable. I take this to mean it is possible. You suggest another route to the same end: Salix to Algol. I do not know the details, but if what needs adding to Salix is trivial enough and it is plausible that we had something like it before the addition, then why not? It would be another plausible route to recursive hierarchy. I also have a horse in this game: start with linear beads on a string (via something like concatenation), add labels, and we get unbounded hierarchy. Is this right? Damn if I know. But Chomsky's suggestion, yours and mine at least have the right FORM. What is depressing is that most of the Evo Lang proposals (including E's) fail this simple prerequisite. That's what I am criticizing. It is less a defense of Chomsky than the observation that at least his answer has the right form. Given the lay of the land, this is, sadly, quite an achievement.