The last post (here) prompted three useful comments by Max, Avery and Alex C. Though they appear to make three different points (Max pointing to Fodor’s thoughts on modularity, Avery on indirect negative evidence and Alex C on domain specific nativism) I believe that they all end up orbiting a similar small set of concerns. Let me explain.
Max links to (IMO) one of Fodor’s best ever book reviews (here). The review brings together many themes in discussing a pair of books (one by Pinker, the other by Plotkin). It outlines some links between computationalism, modularity, nativism and Darwinian natural selection (DNS). I’ll skip the discussion on DNS here, though I know that there will be many of you eager to battle his pernicious and misinformed views (not!). Go at it. What I think is interesting given the earlier post is Fodor’s linking together computationalism, modularity and nativism. How do these ideas talk to one another? Let’s start by seeing what they are.
Fodor takes computationalism to be Turing’s “simply terrific idea” about how to mechanize rationality (i.e. thinking). As Fodor puts it (p. 2):
…some inferences are rational in virtue of the syntax of the sentences that enter into them; metaphorically, in virtue of the ‘shapes’ of these sentences.
Turing noted that, wherever an inference is formal in this sense, a machine can be made to execute the inference. This is because…you can make them [i.e. machines NH] quite good at detecting and responding to syntactic relations among sentences.
And what makes syntax so nice? It’s LOCAL. Again as Fodor puts it (p. 3):
…Turing’s account of computation…doesn’t look past the form of sentences to their meanings and it assumes that the role of thoughts in a mental process is determined entirely by their internal (syntactic) structure.
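To make the point about locality concrete, here is a minimal sketch (mine, not Fodor’s or Turing’s) of a machine that executes modus ponens by looking only at the shapes of sentences. The tuple encoding `("if", P, Q)` and the function name `modus_ponens_closure` are assumptions made up for illustration. Note that the procedure never consults what the atoms mean; nonsense tokens work just as well.

```python
def modus_ponens_closure(sentences):
    """Close a set of sentences under modus ponens, using only their form.

    Atomic sentences are strings; conditionals are tuples ("if", P, Q).
    This encoding is a hypothetical convenience, not anyone's official notation.
    """
    derived = set(sentences)
    changed = True
    while changed:
        changed = False
        for s in list(derived):
            # A conditional is recognised purely by its shape.
            if isinstance(s, tuple) and len(s) == 3 and s[0] == "if":
                _, antecedent, consequent = s
                # The inference is local: it inspects this sentence and
                # checks membership of its antecedent, nothing more.
                if antecedent in derived and consequent not in derived:
                    derived.add(consequent)
                    changed = True
    return derived


premises = {
    "it_rains",
    ("if", "it_rains", "streets_wet"),
    ("if", "streets_wet", "slippery"),
}
conclusions = modus_ponens_closure(premises)

# The same machine is indifferent to meaning: nonsense atoms work too.
nonsense = modus_ponens_closure({"glorp", ("if", "glorp", "zib")})
```

What the sketch cannot do is the interesting part: nothing in it ranks conclusions by relevance, simplicity or centrality, which is exactly the kind of global property that the next step of Fodor’s argument turns on.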
Fodor continues to argue that where this kind of locally focused computation is not available, computationalism ceases to be useful. When does this happen? When belief fixation requires the global canvassing and evaluation of disparate kinds of information all of which have variable and very non-linear effects on the process. Philosophers call this ‘inference to the best explanation’ (IBT) and the problem with IBT is that it’s a complete and utter mystery how it gets done. Again as Fodor puts it (p. 3):
[often] your cognitive problem is to find and adopt whatever beliefs are best confirmed on balance. ‘Best confirmed on balance’ means something like: the strongest and simplest relevant beliefs that are consistent with as many of one’s prior epistemic commitments as possible. But as far as anyone knows, relevance, strength, simplicity, centrality and the like are properties, not of single sentences, but of whole belief systems: and there’s no reason at all to suppose that such global properties of belief systems are syntactic.
And this is where modularity comes in; for modular systems limit the range of relevant information for any given computation and limiting what counts as relevant is critical to allowing one to syntactify a problem and allow computationalism to operate. IMO, one of the reasons that GG has been a doable and successful branch of cog sci is that FL is modular(ish) (i.e. that something like the autonomy of syntax is roughly correct). ‘Modular’ means “largely autonomous with respect to the rest of one’s cognition” (p. 3). Modularity is what allows Turing’s trick to operate. Turing’s trick, the mechanization of cognition, relies on the syntactification of inference, which in turn relies on isolating the formal features that computations exploit.
All of which brings us (at last!) to nativism. Modularity just is domain specificity. Computations are modular if they are “more or less autonomous” and “special purpose” and “the information [they] can use to solve [cognitive problems] are proprietary” (p. 3). So construed, if FL is modular, then it will also be domain specific. So if FL is a module (and we have lots of apparent evidence to suggest that it is) then it would not be at all surprising to find that FL is specially tuned to linguistic concerns. And that it exploits and manipulates “proprietary information” and that its computations were specifically “designed” to deal with the specific linguistic information it worries about. So, if FL is a module, then we should expect it to contain lots of domain specific computational operations, principles and primitives.
How do we go about investigating the if-clause immediately above? It helps to go back to the schema we discussed in the previous post. Recall the general schema in (1) that we used to characterize the relevant problem in a given domain, ‘X’ ranging over different domains. (2) is the linguistic case.
(1) PXD -> FX -> GX
(2) PLD -> FL -> GL
Linguists have discovered many properties of FL. Before the Minimalist Program (MP) got going, the theories of FL were very linguistically parochial. The basic primitives, operations and principles did not appear to have much to say about other cognitive domains (e.g. vision, face recognition, causal inference). As such it was reasonable to conclude that the organization of FL was sui generis. And to the degree that this organization had to be taken as innate (which, recall, was based on empirical arguments about what Gs did) then to that degree we had an argument for innate domain specific principles of FL. MP has provided (a few) reasons for thinking that earlier theories overestimated the domain specificity of FL’s organization. However, as a matter of fact, the unification of FL with other domains of cognition (or computation) has been very very very modest. I know what I am hoping for and I try not to confuse what I want to be true with what we have good reason to believe is true. You should too. Ambitions are one thing, results quite another. How might one go about realizing these MP ambitions?
If (1) correctly characterizes the problem, then one way of arguing against a dedicated capacity is to show that for various values of ‘X,’ FX is the same. So, say we look at vision and language, then were FL = FV we would have an argument that the very same kind of information and operations were cognitively at play in both vision and language. I confess that stating things this baldly makes it very implausible that FL does equal FV, but hey, it’s possible. The impressive trick would be to show how to pull this off (as opposed to simply expressing hopes or making windy assertions that this could be done), at least for some domains. And the trick is not an easy one to execute: we know a lot about the properties of natural language Gs. And we want an FL that explains these very properties. We don’t want a unification with other FXs that sacrifices this hard won knowledge to some mushy kind of “unification” (yes, these are scare quotes) which sacrifices the specifics that we have worked so hard to establish (yes Alex, I’m talking to you). An honest appraisal of how far we’ve come in unifying the principles across modules would conclude that, to date, we have very few results suggesting that FL is not domain specific. Don’t get me wrong: there are reasons to search for such unifications and I for one would be delighted if this happens. But hoping is not doing and ambitions are not achievements. So, if FL is not a dedicated capacity, but is merely the reflection of more general cognitive principles then it should be possible to find FL being the same as some FX (if not vision, then something else) and that this unified FX’ (i.e. which encompasses FL and FX) can derive the relevant Gs with all their wonderful properties given the appropriate PLD. There’s a Nobel prize awaiting such a unification, so hop to it.
It is worth noting that there is tons of standard variety psycho evidence that FL really is modular with respect to other cognitive capacities. Susan Curtiss (here and here) reviews the wealth of double dissociations between language and virtually any other capacity you might be interested in. Thus, at least in one perfectly coherent sense, FL is a module and so a dedicated special purpose system. Language competence swings independently of visual acuity, auditory facility, IQ, hair color, height, vocab proficiency, you name it. So if one takes such dissociations as dispositive (and it is the gold standard) then FL is a module with all that this entails.
However, there is a second way of thinking about what unification of the cognitive modules consists in and this may be the source of much (what I take to be) confused discussion. In particular, we need to separate out two questions: ‘Is FL a module?’ and ‘Does FL contain linguistically proprietary parts/circuits?’ One can maintain that FL is a module without also thinking that its parts are entirely different from those in every other module. How so? Well, FL might be composed from the same kinds of parts present in other modules, albeit put together in distinctive ways. Same parts, same computations, different wiring. If this were so, then there would be a sense in which FL is a module (i.e. it has special distinctive proprietary computations etc.), yet when seen at the right grain it shares many (most? all?) of its basic computational features with other domains of cognition. In other words, it is possible that FL’s computations are distinctive and dedicated, and that they are built from the same simple parts found in other modules. Speaking personally, this is how I now understand the Minimalist Bet (i.e. that FL shares many basic computational properties with other systems).
This is a coherent position (which does not imply it is correct). At the cellular level our organs are pretty similar. Nonetheless, a kidney is not a heart, and neither is a liver or a stomach. So too with FL and other cognitive “organs.” This is a possibility (in fact, I have argued in places that this is also plausible and maybe even true). So, seen from the perspective of the basic building blocks, it is possible that FL, though a separate module, is nonetheless “just like” every other kind of cognition. This version of the “modularity” issue asks not whether FL is a domain specific dedicated system (it is!), but whether it employs primitive circuits/operations proprietary to it (i.e. not shared with other cognitive domains). Here ‘domain specific’ means uses basic operations not attested in the other domains of non-linguistic cognition.
Of course, the MP bet is easy to articulate at a general level. What’s hard is to show that it’s true (or even plausible). As I’ve argued before, to collect on this bet requires, first, reducing FL’s internal modularity (which in turn requires showing Binding, movement, control, agreement, etc. are really only apparently different) and, second, showing that this unification rests on cognitively generic basic operations. Believe me when I tell you that this program has been a hard sell.
Moreover, the mainstream Minimalist position is that though this may be largely correct, it is exactly wrong: there are some special purpose linguistic devices and operations (e.g. Merge), which are responsible for Gs’ distinctive recursive property. At any rate, I think the logic is clear so I will not repeat the mantra yet again.
This brings me to the last point I want to make: Avery notes that more often than not positive evidence relevant to fixing a grammatical option is missing from the PLD. In other words, Avery notes that the PLD is in fact even more impoverished than we tend to believe. He rightly notes that this implies that indirect negative evidence (INE) is more important than we tend to think. Now if he is right (and I have no reason to think that he isn’t), then FL must be chock full of domain specific information. Why? Because INE requires a sharp specification of the options under consideration to be operative. Induction that uses INE effectively must be richer than induction exploiting only positive data. INE demands a more articulated hypothesis space, not less. INE can compensate for poor direct evidence but only if FL knows what absences it’s looking for! You can hear the dogs that don’t bark but only if you are listening for barking dogs. If Avery’s cited example is correct (see here), then it seems that FL is attuned to micro variations, and this suggests a very rich system of very linguistically specific micro parameters internal to FL. Thus, if Avery is right, then FL will contain quite a lot of very domain specific information and given that this information is logically necessary to exploit INE it looks like these options must be innately specified and that FL contains lots of innate domain specific information. Of course, Avery may be wrong and those that don’t like this conclusion are free (indeed urged) to reanalyze the relevant cases (i.e. to indulge in some linguistic research and produce some helpful results).
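The logic of the barking dogs can be illustrated with a toy Bayesian learner. This is a sketch under loudly stated assumptions, not Avery’s analysis or anyone’s acquisition model: the two candidate grammars, the forms `"a"`, `"b"`, `"c"`, the uniform prior and the assumption that the PLD is sampled uniformly from what the target grammar generates are all invented for illustration (the last is often called the “size principle”). The learner only ever sees positive data, yet the absence of `"c"` counts against the larger grammar, and it does so only because both options were specified in advance.

```python
from fractions import Fraction


def posterior(hypotheses, data):
    """Posterior over candidate grammars given positive data only.

    hypotheses: dict mapping a grammar name to the set of forms it generates.
    data: list of observed forms (the toy PLD).
    Assumes a uniform prior and uniform sampling from each grammar's forms.
    """
    scores = {}
    for name, forms in hypotheses.items():
        if all(d in forms for d in data):
            # Each datum has probability 1/|forms| under this grammar.
            scores[name] = Fraction(1, len(forms)) ** len(data)
        else:
            # A grammar that cannot generate an observed form is ruled out.
            scores[name] = Fraction(0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no candidate grammar generates the data")
    return {name: s / total for name, s in scores.items()}


# Hypothetical option space: the learner must already know these are the options.
grammars = {"subset": {"a", "b"}, "superset": {"a", "b", "c"}}

after_one = posterior(grammars, ["a"])
after_four = posterior(grammars, ["a", "b", "a", "b"])
```

With one datum the subset grammar is only mildly favored; after four data with no `"c"` it is strongly favored. Remove `"superset"` from the option space and the absence of `"c"` carries no information at all, which is the sense in which INE needs a richly specified set of alternatives to work with.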
This is a good place to stop. There is an intimate connection between modularity, computationalism, and nativism. Computations can only do useful work where information is bounded. Bounded information is what modules provide. More often than not the information that a module exploits is native to it. MP is betting that with respect to FL, there is less language specific basic circuitry than heretofore assumed. However, this does not imply that FL is not a module (i.e. that it is just part of “general intelligence”). Indeed, given the kinds of evidence that Curtiss reviews, it is empirically very likely that FL is a module. And this can be true even if we manage to unify the internal modules of FL and demonstrate that the requisite remaining computations largely exploit domain general computational principles and operations. Avery’s important question remains: how much acquisition is driven by direct evidence and how much by indirect negative evidence? Right now, we don’t really know (at least not to the level of detail that we want). That’s why these are still important research topics. However, the logic is clear, even if the answers are not.
 Incidentally, IBT is one of the phenomena that dualists like Descartes pointed to in favor of a distinct mental substance. Dualism, in other words, is roughly the observation that much of thought cannot be mechanized.
 It’s important to understand where the problem lies. The problem is not giving a story in specific cases in specific contexts. We do this all the time. The problem is providing principles that select out the IBT antecedent to a specification of the contextually relevant variables. The hard problem is specifying what is relevant ex ante.
 Successful unifications almost always win kudos. Think electricity and magnetism, then the latter two with the weak force, terrestrial and celestial mechanics, chemistry and mechanics. These all get their own chapters in the greatest hits of science books. And in each case, it took lots of work to show that the desired unification was possible. There is no reason to think that cognition should be any easier.
 I include generic computational principles here, so-called first factor computational principles.
 In fact, if I understand Gold correctly (which is a toss up), acquiring modestly interesting Gs strictly using induction over positive data is impossible.