In a previous post (here), I confessed to hubris: I thought that solving PoS1 problems sufficed to finesse PoS2 difficulties. I now think that I was wrong to think this. But hey, I was young and brash and now I am mellow and judicious. Just the exuberance of youth! In this post, I’d like to consider two issues: (i) what models we have for investigating how LADs acquire their particular Gs given a specification of the possible human Gs and (ii) how PoS1 conclusions might be leveraged to address PoS2 concerns.
IMO, GG has gotten further in limning the limits of G-hood than we have gotten in explaining how LADs move through the space of possible Gs to the actual Gs acquired. GG has made serious progress in addressing PoS1 issues. We can tell pretty good stories about G-invariant properties, i.e. why certain kinds of dependencies are unattested in human Gs (e.g. Islands, ECP, CED etc.), and can also sketch out accounts regarding those things that Gs must contain (e.g. anaphors must be “close” to their antecedents in a way we can specify pretty well). This provides us with pretty good accounts of what sorts of Gs are impossible (i.e. what kinds of dependencies Gs will never contain).
In contrast, we do not have particularly good accounts for why speakers acquire the particular Gs that they do (e.g. why does English obey the Fixed Subject Constraint (FSC) but Italian doesn’t? Why isn’t English pro-drop? Why doesn’t English have resumptive pronouns?). We do, though, have accounts aiming in this direction from the acquisition, diachronic and typological literature. The accounts fall into two basic types.
The first kind of account parameterizes a principle, the classic case being the CP/TP parameter for bounding nodes. You remember the story based on work by Rizzi. The Subjacency Principle is an invariant UG principle. However, the principle is defined over bounding nodes, and these can vary across Gs. The differences between English and Italian with regard to extraction from embedded questions reduce to the fact that in English TP is a bounding node while in Italian CP is. This difference suffices to explain both the similarities and differences with regard to island extraction in the two languages. On this story, acquisition amounts to fixing the value of the bounding node parameter in your language.
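To make the logic of the parameterized story concrete, here is a minimal sketch in Python. It assumes the textbook formulation on which a single movement step may cross at most one bounding node, and it follows the text in treating TP as the only relevant bounding node for English and CP for Italian (nominal bounding nodes are left out). The names `BOUNDING_NODES`, `obeys_subjacency` and `WH_ISLAND_STEP` are hypothetical, invented for illustration, and not anyone's actual implementation.

```python
# Toy Subjacency check. Assumption: one movement step may cross
# at most one bounding node; which nodes count is the parameter.
BOUNDING_NODES = {
    "English": {"TP"},  # follows the TP/CP contrast described in the text
    "Italian": {"CP"},
}


def obeys_subjacency(crossed_nodes, language):
    """Return True if a single movement step crosses at most one bounding node."""
    bounding = BOUNDING_NODES[language]
    crossed_bounding = [node for node in crossed_nodes if node in bounding]
    return len(crossed_bounding) <= 1


# Extraction from an embedded question: the embedded Spec CP is already
# filled, so the WH moves in one step across embedded TP, embedded CP
# and matrix TP.
WH_ISLAND_STEP = ["TP", "CP", "TP"]

for language in ("English", "Italian"):
    print(language, obeys_subjacency(WH_ISLAND_STEP, language))
```

The same step crosses two TPs but only one CP, so the one principle yields a violation under the English setting and none under the Italian one. The principle stays constant; only the set of bounding nodes differs.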
In a very interesting paper in progress (hence I cannot link to it, sorry, but be patient), Dustin Chacon, Mike Fetters, Margaret Kandel, Eric Pelzl and Colin Phillips (CFKPP) call this “direct learning.” How is the value fixed? By induction from the PLD (choose your favorite inductive theory), which, it is hoped, provides sufficient amounts of robust data to allow the LAD to directly fix the value of the parameter (see note 5). To my knowledge (and please correct me if there is stuff out there that contradicts what I am about to say), we are still not sure if the actual PLD available in Italian and English suffices to fix the two possible values.
The second kind of account keeps principles fixed (no parametric variation of the principles) but allows for derivations that circumvent the relevant universal condition. This is similar to CFKPP’s conception of “indirect learning.” There are several examples of this. One is Reinhart’s proposal that CP can have more than one “escape hatch,” allowing two WHs to move to an embedded Spec CP position so that one of them can exit while still adhering to the Subjacency Condition. On this view the different data are not traced to a parameter within bounding theory, but to another kind of fact, namely that different Gs allow for different kinds of rules (viz. English Gs only allow CP expansion rules with a single CP specifier, while other Gs (e.g. Romanian/Bulgarian) might allow more than one, thereby leaving a Spec C exit for a second A’-mover). There is potential degree 0 data that could fix this (e.g. sentences like “Who what bought” would support the conclusion that CP can house multiple WHs). However, the only investigations of the PLD that I know of (by Lydia Grebenyova for Russian) suggest that multiple interrogatives are very far from ubiquitous in the PLD (actually there are none). If so, how the rule allowing multiple Spec Cs would be acquired remains a mystery.
Let’s consider another example where a much more satisfying story exists (CFKPP discuss this case at length and explore its subtleties). Take Rizzi’s explanation for why English but not Italian is subject to the Fixed Subject Constraint (FSC), as illustrated in (1a/b):
(1) a. *Who1 do you know that t1 ate a large supper (English)
b. Who1 do you know that t1 ate a large supper (Italian)
The account has the following components:
(2) a. Something like the FSC (non-parameterized) is part of UG
b. Italian has a way of evading the requirements of the FSC, but English doesn’t
c. That Italian can generate structures that evade the FSC is manifest in simple Italian clauses
More concretely, FL/UG contains something like the that-t filter. It stars structures in which C0 governs a trace (e.g. *[CP … C [ t1…]]). Italian (but not English) allows for post verbal subject constructions, in which the subject DP is not in the government purview of C:
(3) a. Had telephoned John (ok-Italian/*-English)
b. [ C [ [had [VP telephoned John]]]]
As a WH moving from the position of John in (3b) will not generate a structure subject to the FSC, sentences like Who do you think that phoned will be fully acceptable in Italian. In other words, Italian does respect the FSC, and the FSC is exactly the same in English and Italian. The difference between them is that Italian allows the effects of the FSC to be evaded by allowing for movement from post-verbal subject position.
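The shape of this indirect account can be sketched in a few lines of Python. This is a toy model only: it assumes the filter stars a trace in preverbal subject position under an overt C, and that a G either does or does not generate post-verbal subjects. The names `GRAMMARS`, `violates_that_trace` and `subject_extraction_ok` are hypothetical labels introduced for illustration.

```python
# Toy model: the filter is identical for every G; Gs differ only in
# whether they generate post-verbal subjects.
GRAMMARS = {
    "English": {"postverbal_subjects": False},
    "Italian": {"postverbal_subjects": True},
}


def violates_that_trace(complementizer_overt, extraction_site):
    """Unparameterized toy filter: star a preverbal subject trace under an overt C."""
    return complementizer_overt and extraction_site == "preverbal_subject"


def subject_extraction_ok(language, complementizer_overt=True):
    """Return True if some derivation extracts the subject without tripping the filter."""
    sites = ["preverbal_subject"]
    if GRAMMARS[language]["postverbal_subjects"]:
        # The G offers a second launching site, outside the purview of C.
        sites.append("postverbal_subject")
    return any(
        not violates_that_trace(complementizer_overt, site) for site in sites
    )


for language in GRAMMARS:
    print(language, subject_extraction_ok(language))
```

Note that `violates_that_trace` takes no language argument. All the cross-linguistic difference lives in what derivations each G makes available, which is the point of the account: the constraint is not parameterized, the rule inventory is.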
Two things to note: first, post-verbal subjects are not rarities in Italian (or Spanish which is similar) so we expect them to arise frequently and robustly in the PLD. This should provide plenty of PLD fodder for whatever rules generate post verbal subject constructions in Italian and Spanish.
Second, having post-verbal subjects suffices to evade the FSC, but it is possible that there exist other ways of doing so. Nonetheless, it appears that this is a very common way of evading the FSC. CFKPP reviews the FSC variation literature, and suggests that there are not all that many ways to skirt the FSC. Before reading CFKPP I was under the impression (based on widely cited work by Sobin) that certain dialects of English provided evidence that one could evade the FSC in other ways (English does not have post verbal subjects). However, CFKPP provides excellent reasons (based in part on work by Cowart) to think that Sobin’s findings are at best inconclusive and most likely incorrect.
CFKPP does something else that is very important: it actually tries to estimate how much data there is in actual PLD bearing on the FSC in both English and Spanish/Italian (effectively the same language for FSC purposes). Bottom line: not very much at all, so were the LAD required to “directly learn” whether the FSC held, it would have a very difficult time doing so. There is just not that much direct data bearing on it. Instead, the child seems to assume that it holds universally. However, this does not imply that every language will appear to respect the FSC for there may be indirect ways of meeting its requirements while still deriving sentences that allow traces abutting Cs. As post verbal subject constructions provide such an out, the differences between English and Italian follow even if the FSC is left unparameterized.
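The kind of bookkeeping involved in such an estimate can be sketched as follows. Everything here is hypothetical: the sample `TOY_PLD` is invented purely to show the computation and does not reproduce CFKPP's counts or any real corpus, and the tag names and the helper `evidence_rate` are likewise made up for illustration.

```python
# Invented sample: each item is the set of hand-assigned tags for one
# child-directed utterance. These are NOT real corpus counts.
TOY_PLD = [
    {"postverbal_subject"},
    set(),
    {"postverbal_subject"},
    set(),
    set(),
    {"postverbal_subject"},
    set(),
    set(),
    set(),
    set(),
]


def evidence_rate(utterances, tag):
    """Return the proportion of utterances carrying the given tag."""
    if not utterances:
        raise ValueError("need at least one utterance")
    return sum(tag in tags for tags in utterances) / len(utterances)


# Direct evidence: subject extraction across an overt complementizer.
direct = evidence_rate(TOY_PLD, "subject_extraction_over_C")
# Indirect evidence: simple clauses with post-verbal subjects.
indirect = evidence_rate(TOY_PLD, "postverbal_subject")
print(f"direct: {direct:.2f}, indirect: {indirect:.2f}")
```

The comparison of interest is between the two rates: if the direct rate is close to zero while the indirect one is robust, then a learner who conditions FSC effects on post-verbal subjects has something to go on where a direct learner does not.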
Note, btw, that this kind of analysis highlights the difference between a Chomsky vs a Greenberg Universal. On this story the FSC regulates Italian Gs just as much as English ones despite its effects being invisible in Italian. In other words, the FSC holds in Italian despite never appearing to hold there. This makes sense on Chomsky’s conception of universals but not Greenberg’s. Chomsky universals are generalizations about structures, Greenberg universals about surface forms. They are very different, though far too often confused (as I have railed about in previous posts; see here and here for a reprise).
Ok, back to the main point and I end. There is lots of G variation, and this means that some properties of Gs are acquired on the basis of actual PLD. When one looks carefully, it appears that for many kinds of variation, there is really not that much PLD to go on, and this raises a PoS2 problem. We have a couple of examples of how to solve such PoS2 problems. However, there has been relatively little attention paid to the specific problems it raises (I also plead guilty here). Regarding these, CFKPP presents a useful classical PoS challenge to people of my ilk:
We challenge theoretical syntacticians working on any phenomenon that varies between languages to consider whether the phenomenon in question lends itself to direct observation or not. If not, it must be conditioned on other observable phenomena. This can serve as a useful heuristic for constructing accounts of phenomena in comparative syntax. (20)
Yes, yes and yes again. Note that in cases where indirect stories are required, looking for them can generate interesting research into the possible variation among Gs. The Rizzi account of FSC above begins by assuming that the FSC is universal and then looks for ways that particular Gs might circumvent it. Such cases of indirect acquisition leverage what we believe to hold given standard PoS1 considerations. So why does Italian appear not to obey the FSC? Not because the that-t filter doesn’t hold in Italian, but because Italian G allows for derivations that circumvent its strictures. How do Italian Gs do this? By allowing for post-verbal subjects which allow licit “subject” A’-movement derivations. Is this fact about Italian Gs learnable? Yes. Post verbal subjects are not rare, and so the LAD has evidence for postulating rules to generate these structures, while the English kid does not. So, PLD driven acquisition plus UG fixed principles can lead to plausible accounts of G variation (i.e. to stories addressing the question how John/Gianni acquired the particular Gs they did). What’s the moral? Don’t parameterize your principles but look for G rules/structures that would allow them to be empirically mute. This sort of strategy suggests taking attested universals very strictly (i.e. as not parameterized) as they serve as boundary conditions on adequate descriptions of particular Gs. Thus, though PoS1 considerations don’t directly solve PoS2 problems, in particular contexts they suggest approaches to G variation that can circumvent PoS2 problems.
Last point: I’ve lamented the fact that we’ve stopped holding syntacticians’ feet to Plato’s Fire. We should constantly be asking of comparative syntax proposals what the acquisition scenario might be. We have, IMO, refrained from doing this of late (and I include myself here). I suspect that the reason for this is that we’ve all been seduced into doing languistics rather than linguistics. We have stopped thinking of syntax as a method for investigating FL and have adopted the view that the ultimate goal of syntax is to explain syntactic patterns, rather than to use syntactic patterns to investigate the fine structure of FL. That’s unfortunate for many reasons, not the least of which is that it serves to Balkanize the discipline. If syntacticians refuse to take responsibility for the cognitive relevance of their results, why should anyone else listen?
It’s not too late to change this. I again suggest that at every variation talk we ask how the proposed variation might be acquired. Syntacticians should be expected to have thought about this problem in developing their proposals. Maybe we should start asking syntacticians to specify what kind of data could account for the presented variation and whether this is plausibly available in the PLD the child might have access to. We now have quite a few CHILDES data sets and maybe we should start asking syntacticians to peek at these in making their proposals. Having a workable solution is too high a bar. Having thought about the problem, considered the possibly relevant PLD, and entertained possible solutions is not. After all, if a proposed account of a given variation is un-acquirable, that is an excellent reason for thinking that the analysis is wrong.
 Note that even here, we do not address the specific LAD question but idealize to a situation where we aggregate Gs and reify them as languages. So nobody studies why/how Norbert acquires his idiosyncratic G but how a typical English speaker acquires GEnglish, an object that strictly speaking does not exist.
 By this I do not mean to imply that there is not good and solid work on this issue. I’ve discussed lots of this before. Berwick, Polinsky, Lidz, Yang, Guasti, Rizzi, Lightfoot, Roberts, Dresher, Fodor, Sakas and many others have addressed this question fruitfully. That said, I think we understand this issue less well than we do PoS1 concerns.
 Amusingly, the parameter theory is suggested in a footnote in Rizzi’s deservedly famous paper. The paper itself presented a different story. The parameter idea really took off with LGB, where Rizzi’s discussion was reworked in a systematic way that gave us the P&P architecture.
 I am reporting the history here. Grimshaw provided what to my mind was pretty compelling evidence that this was the wrong way to describe the data.
 If one assumes that English G is the unmarked case, then the investigation should concentrate on Italian PLD. The data required to fix CP as the value are actually quite recondite, at least if eyeballed informally. Using standard Degree 0+ assumptions, violations of the WH-island constraint could not serve as PLD. So what might? Extraction from subject islands might (e.g. Of which Ferrari did the driver crash into the wall?) but I would bet that such data are few and far between in actual Italian PLD. Thus, the direct evidence for the CP/TP parameter is, I suspect, pretty rare in the PLD and so directly fixing the value of the parameter should be pretty challenging. At present, I have no idea how such a parameter might be fixed.
 Of course there is neither English nor Italian. Even in these cases we idealize and don’t study particular individuals but study abstractions.
 I am using Anglicized Italian so excuse the accent.
 CFKPP discuss the that-t version of the FSC and understand the constraint in terms of adjacency. This may be right, but I doubt it. I suspect that what’s at stake is not adjacency but hierarchical proximity, inverted subjects being lower than Spec T. However, for what follows the details don’t matter much.
 There is a great paper testing Rizzi’s proposal in non-standard Italian dialects by Brandi and Cordin. It’s here. This really is a fun read and if you’ve never looked at it, you are in for a treat. The basic idea is that certain dialects can tell us overtly whether a WH is moving from Spec T or from a lower verbal position. In particular, movement from Spec TP is signaled with an obligatory subject clitic. Only if this clitic is absent is movement of a “subject” permissible. Take a look, it’s very pretty syntax.