I have been thinking again about the relationship between Plato's Problem and Darwin's. The crux of the issue, as I've noted before (see e.g. here), is the tension between the two. Having a rich linguo-centric FL makes explaining the acquisition of certain features of particular Gs easy (why? Because they don't have to be learned; they are given/innate). Examples include the stubborn propensity for movement rules to obey island conditions, for reflexives to resist non-local binding, etc. However, having an FL with rich language-specific architecture makes it more difficult to explain how FL came to be biologically fixed in humans. The problem gets harder still if one buys the claim that human linguistic facility arose in the species only in (roughly) the last 50-100,000 years. If this is true, then the architecture of FL must be more or less continuous with what we find in other domains of cognition, with the addition of a possible tweak or two (language is more or less an app in Jan Koster's sense). In other words, FL can't be that linguo-centric! This is the essential tension. The principal project of contemporary linguistics (in particular that of the Minimalist Program (MP)), I believe, should be to resolve this tension. In other words, to show how you can eat your Platonic cake and have Darwin's too.
How to do this? Well, here's an unkosher way of resolving the tension: deny that Plato's Problem is a real problem. That is not an admissible move in this game. It does not "resolve" the tension; it denies that there is/was one to begin with. Denying Plato's Problem in our current setting includes ignoring all the POS arguments that have been deployed in favor of linguo-centric structure for FL. Examples abound, and I have been talking about these again in recent posts (here, here). Indeed, most of the structure GB postulates, if an accurate description of FL, is innate or stems from innate mental architecture. GB's cousins (H/GPSG, LFG, RG) have their corresponding versions of the GB modules and hence their corresponding linguo-centric innate structures. The interesting MP question is how to combine the fact that FL has the properties GB describes with a plausible story of how these GBish features of FL could have arisen. To repeat: denying that Plato's Problem is real, or denying that FL arose in the species at some time in the relatively recent past, does not solve the MP problem; it denies that there is any problem to solve.
There is one (and so far as I can tell, only one) way of squaring this apparent circle: to derive the properties of GB from simpler assumptions. In other words, to treat GB roughly the way the theory of Subjacency treats islands: to show that the independent principles and modules are all special cases of a simpler, more plausible unified theory.
This project involves two separate steps.
First, we need to show how to unify the disparate modules. A good chunk of my research over the last 15 years has aimed at this (with varying degrees of success). I have argued (though I have persuaded few) that we should try to reduce all non-local dependencies to "movement" relations. Combine this with Chomsky's proposal that movement and phrase building devolve to the same operation ((E/I)-Merge) and one gets the result that all grammatical dependencies are products of a single operation, viz. Merge. Or, to put this in Chomsky's terms: once Merge becomes cognitively available (Merge being the evolutionary miracle, a.k.a. random mutation), the rest of GB does as well, for GB is nothing other than a catalogue of the various kinds of Merge dependencies available in a computationally well-behaved system.
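To make the unification idea a bit more concrete, here is a toy sketch (mine, and deliberately cartoonish, not anyone's official formalization) of Merge as binary set formation, with "movement" rendered as Internal Merge, i.e. re-merging a term the object already contains:

```python
# A minimal sketch of Merge as binary set formation, using frozensets
# as a stand-in encoding of syntactic objects. Illustrative only.

def merge(x, y):
    """Merge(X, Y) = {X, Y}: the single structure-building operation."""
    return frozenset([x, y])

def contains(so, part):
    """True if `part` is a term (subpart) of the syntactic object `so`."""
    if so == part:
        return True
    if isinstance(so, frozenset):
        return any(contains(member, part) for member in so)
    return False

# External Merge: combine two distinct objects (phrase building).
vp = merge("eat", "cake")                     # {eat, cake}

# Internal Merge: re-merge a term already inside the object ("movement").
# The "moved" item occurs twice by containment, not by adding a new token.
moved = merge("cake", merge("will", vp))      # {cake, {will, {eat, cake}}}

assert contains(moved, "cake")
```

The point of the sketch is just that phrase building and movement are literally the same operation applied to different arguments, which is the sense in which GB's dependency types could all "live on" Merge.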
Second, we need to show that once Merge arises, the limitations on the Merge dependencies that GB catalogues (island effects, binding effects, control effects, etc.) follow from general (maybe 'generic' is a better term) principles of cognitive computation. If we can assimilate locality principles like the PIC, Minimality, and binding domains to (plausibly) more cognitively generic principles like Extension (conservativity) or Inclusiveness, then GB dependencies can be understood as what one gets if (i) all operations "live on" Merge and (ii) these operations are subject to non-linguocentric principles of cognitive computation.
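For concreteness, here is an equally toy rendering of Extension and Inclusiveness as checkable predicates over the set-based objects of the previous sketch. The predicate names and the encoding are my own illustration of the logic, not anyone's official formulation:

```python
# Toy statements of two putatively generic constraints on computation.

def merge(x, y):
    return frozenset([x, y])

def terms(so):
    """All terms (subparts) of a syntactic object, including itself."""
    result = {so}
    if isinstance(so, frozenset):
        for member in so:
            result |= terms(member)
    return result

def satisfies_extension(before, after):
    """Extension (conservativity): an operation may only extend the root,
    so the prior object must survive intact as a term of the new one."""
    return before in terms(after)

def satisfies_inclusiveness(lexical_items, after):
    """Inclusiveness: no atoms appear that weren't in the lexical input."""
    atoms = {t for t in terms(after) if not isinstance(t, frozenset)}
    return atoms <= set(lexical_items)

vp = merge("eat", "cake")
tp = merge("will", vp)
assert satisfies_extension(vp, tp)                        # root was extended
assert satisfies_inclusiveness(["eat", "cake", "will"], tp)
```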
Note that if this can be accomplished, then the tension noted at the outset is resolved. Chomsky's hunch, the basic minimalist conjecture, is that this is doable: that it is possible to reduce grammatical dependencies to (at most) one or two specifically linguistic operations which, when combined with other cognitive operations plus generic constraints on cognitive computation (ones not particularly linguo-centric), yield the effects of GB.
There is a second conjecture that Chomsky advances that bears on the program. This second, independent hunch is the Strong Minimalist Thesis (SMT). IMO, it has not been very clear how we are to understand the SMT. The slogan is that FL is the "optimal solution to minimal design specifications." However, I have never found the intent of this slogan particularly clear. Lately, I have proposed (e.g. here) that we understand the SMT in the context of one of the four classic questions in Generative Grammar: How are Gs put to use? In particular, the SMT tells us that grammars are well designed for use by the interfaces.
I want to stress that the SMT is an extra hunch about the structure of FL. Moreover, I believe that this reconstruction of the problematic (thanks Hegel) might not (most likely, does not) coincide with how Chomsky understands MP. The paragraphs above argue that reconciling Darwin and Plato requires showing that most of the principles operative in FL are cognitively generic (viz. that they are operative in other, non-linguistic cognitive domains). This licenses the assumption that they pre-exist the emergence of FL, and so we need not explain why FL recruits them. All that is required is that they "be there" for the taking. The conjecture that FL is computationally optimal (i.e. that it is well designed wrt use by the interfaces) goes beyond the evolutionary assumption required to solve the Plato/Darwin tension. The SMT postulates that these evolutionarily available principles are also well designed. This second conjecture, if true, is very interesting precisely because the first, Darwinian one can be true without the second, optimal-design assumption being true. Moreover, if the SMT is true, this might require explanation. In particular, why should the evolutionarily available mechanisms that FL embodies be well designed for use (especially given that FL is of recent vintage)?
That said, what does "well designed" mean? Well, here's a proposal: the competence constraints that linguists find suffice for efficient parsing and easy learnability. There is actually a lost literature on this conjecture that precedes MP. For example, the work by Marcus and by Berwick & Weinberg on parsing, and by Wexler & Culicover and Berwick on learnability, investigates how the constraints on linguistic representations, when transparently embedded in use systems, can allow for efficient parsing and easy learnability. It is natural to say that grammatical principles that allow for efficient parsing and easy learning are themselves computationally optimal in a biologically/psychologically relevant sense. The SMT can be (and IMO should be) understood as conjecturing that FL produces grammars that are computationally optimal in this sense.
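To give a flavor of what "transparent embedding" of a competence constraint into a use system might look like, here is a deliberately cartoonish sketch in which a subjacency-like bound on dependency length lets a left-to-right pass pair fillers with gaps while rejecting dependencies that span too many clause boundaries. The bound, the token encoding, and the resolution scheme are all invented for illustration; this is not the Marcus or Berwick & Weinberg model:

```python
# Toy filler-gap resolution under a subjacency-like locality bound.
# Because the grammar caps how many clause boundaries a dependency may
# cross, each pending filler stays "live" only briefly.

MAX_DOMAIN = 2  # hypothetical bound on clause boundaries a dependency may cross

def resolve_fillers(tokens):
    """Pair 'WH' fillers with 'GAP' positions in a single left-to-right pass;
    '#' marks a clause boundary. Raises on locality violations."""
    pending = []  # stack of [filler_index, boundaries_crossed]
    pairs = []
    for i, tok in enumerate(tokens):
        if tok == "WH":
            pending.append([i, 0])
        elif tok == "#":
            for entry in pending:
                entry[1] += 1
                if entry[1] > MAX_DOMAIN:
                    raise ValueError(f"filler at {entry[0]} violates the locality bound")
        elif tok == "GAP":
            if not pending:
                raise ValueError(f"gap at {i} has no filler")
            pairs.append((pending.pop()[0], i))
    return pairs

print(resolve_fillers(["WH", "ate", "#", "GAP"]))   # [(0, 3)]
# resolve_fillers(["WH", "#", "#", "#", "GAP"])     # raises: too many boundaries
```

The (hedged) moral: a constraint stated over representations, carried over transparently into the parser, is what keeps the parser's bookkeeping cheap.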
Two thoughts to end:
First, this way of conceiving of MP treats it as a very conservative extension of the general generative program. One of the misconceptions "out there" (CSers and psychologists are particularly prone to this meme) is that Generativists change their minds and theories every 2 months and that this theoretical Brownian motion is an indication that linguists know squat about FL or UG. This is false. Framing MP as necessarily incorporating GB results (with the aim of making them "theorems" in a more general theoretical framework) emphasizes that MP does not abandon GB results but tries to explain them. This is what typically takes place in advancing sciences, and it is no different in linguistics. Indeed, a good Whig history of Generative Grammar would demonstrate that this conservatism has been characteristic of most of the results from LSLT to MP. This is not the place to show this, but I am planning to demonstrate it anon.
Second, MP rests on two different but related Chomskyan hunches ('conjectures' would sound more serious, so I suggest you use this term when talking to the sciency types on the prestigious parts of campus): first, that it is possible to resolve the tension between Plato and Darwin without doing damage to the former; second, that the results will be embeddable in use systems that are computationally efficient. We currently have schematic outlines for how this might be done (though there are many holes to be filled). Chomsky's hunch is that this project can be completed.
IMO, we have made some progress towards showing that this is not a vain hope; in fact, things are better than one might have initially thought (especially if one is a pessimist like me). However, realizing this ambitious program requires a conservative attitude towards past results. In particular, MP does not imply that GB is passé. Going beyond explanatory adequacy does not imply forgetting about explanatory adequacy. Only cheap minimalism forgets what we have found, and as my mother repeatedly and wisely warned me, "cheap is expensive in the long run." So, a bit of advice: think babies and bathwaters next time you are tempted to dump earlier GB results for purportedly minimalist ends.
 It is important to note that this is logically possible. Maybe the MP project rests on a misdescription of the conceptual lay of the land. As you might imagine, I doubt that this is so. However, it is a logical possibility. This is why POS phenomena are so critical to the MP enterprise. One cannot go beyond explanatory adequacy without some candidate theories that (purport to) have it.
 For the record, I am not yet convinced of Chomsky’s way of unifying things via Merge. However, for current purposes, the disagreement is not worth pursuing.
 Let me reiterate that I am not interpreting Chomsky here. I am pretty sure that he would not endorse this reconstruction of the Minimalist Problematic. Minimalists be warned!
 In his book on learning, Berwick notes that it is a truism in AI that "having the right restrictions on a given representation can make learning simple." Ditto for parsing. Note that this does not imply that features of use cause features of representations, i.e. it does not imply that demands for efficient parsability cause grammars to have subjacency-like locality constraints. Rather, for example, grammars that have subjacency-like constraints will allow for simple, transparent embeddings into parsers that compute efficiently and support learning algorithms with properties that make for "easy" learning (see Berwick's book for lots of details).
 Actually, if pressed, I would say that we have made remarkable progress in cashing in Chomsky's two bets. We have managed to outline plausible theories of FL that unify large chunks of the GB modules, and we have begun to find concrete evidence that parsing, production, and language acquisition transparently use the kinds of representations that competence theories have discovered. The project is hardly complete. But, given the ambitious scope of Chomsky's hunches, IMO we have every reason to be sanguine that something like MP is realizable. This, however, is also fodder for another post at another time.