Comments on Faculty of Language: What does typology teach us about FL?

Well I think the C/T vs POS debate is mostly settl...

2015-11-19T12:06:53.254-08:00

Well I think the C/T vs POS debate is mostly settled, but let me make one more attempt at clarifying my distinction between fUG and cUG. Here's yet another example (this time starting with the learning algorithm instead of the parser):

A learning algorithm already has a concept space of possible languages baked into its description. So cUG is cognitively real in the sense that --- at the very least --- it is what is prebaked into the cognitively real learning algorithm. But there might not actually be anything like a cognitive carving into subcomponents: all that's encoded in your genome (through some mysterious means) is the full learning algorithm, and that's also what's computed in your brain (in fact, something even more complicated that also has the parser baked in).

That does not mean that cUG has no cognitive reality; it is clearly a property of the system. But as a discrete object with sharp boundaries it only arises at a higher level of abstraction where we deliberately remove all parts of the learning algorithm that do not directly specify the shape of the target class. And now you run into a problem: you cannot uniquely identify A from the fact that A intersected with B yields C. There are infinitely many choices for A. Moreover, it might be methodologically preferable to pick some A' distinct from A that yields the same set but can be described much more succinctly.
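To make the non-uniqueness point concrete, here is a minimal sketch in Python. The sets `universe`, `B`, and `C` are made-up stand-ins, not anything taken from a real grammar formalism, and the finite universe is an assumption made only so the code can run; over an infinite space of languages the number of candidates for A is infinite.

```python
from itertools import combinations

# Hypothetical toy sets: integers stand in for languages.
universe = frozenset(range(6))   # all conceivable languages (assumed finite here)
B = frozenset({0, 1, 2, 3})      # whatever the rest of the system allows
C = frozenset({0, 1})            # the class we actually observe, C = A & B


def candidates(universe, B, C):
    """Return every subset A of `universe` such that A & B == C."""
    items = sorted(universe)
    result = []
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            A = frozenset(combo)
            if A & B == C:
                result.append(A)
    return result


solutions = candidates(universe, B, C)
for A in solutions:
    print(sorted(A))
```

Every candidate has to contain C and avoid the rest of B, but it is free to include or exclude anything outside B, so knowing B and C alone does not pin down A.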

When the whole object is factored into several subcomponents by your theory, there is no guarantee that fUG is cUG. All you can hope to achieve is that fUG baked into a 100% perfect representation of the rest of the cognitive system will yield exactly the same behavior as cUG baked into the system. But since the other components are underspecified for the same reason, you have an awful amount of wiggle room.

Taken together, this leaves as the only achievable goal of cognitive reality a factorized description such that the intersection of all components yields the real object. Since cUG is an abstract part of the real object, its structure is implicitly described in full. But we may have unwittingly distributed its description over several factors, foremost fUG, the parser, and the learning algorithm. So fUG is not necessarily the same as cUG, and I don't see a way of testing their identity.

PS: Personally I prefer the grammar-parser example I gave earlier, but apparently that one isn't particularly elucidating.

@Thomas I'm not sure I understand it either. H...

2015-11-19T07:55:39.207-08:00

@Thomas
I'm not sure I understand it either. However, maybe the following might help. When it comes to theories of FL/UG I am a simple-minded realist. The aim is to describe THAT system that we IN FACT have. Now, this can be a complicated process with all the usual caveats, but that is the aim. Greenbergers do not have this aim, for their view lives in abstraction from the mental seat of linguistic capacity. Their aim is to describe regularities across languages, which they take to be real objects. There is a version of this, call it Greenberg at one remove, that wants to identify the regularities of Gs, what they all have in common. This gets closer to the GG enterprise as I view it, but not all the way. I am interested in the features of FL/UG. If all Gs display some regularity then one reason for this is that they display it because they are all products of FL and FL has left overt fingerprints on these Gs. That's possible and it motivates the kind of C/T work I was discussing. However, it's worth noting that the argument is hardly airtight (which does not mean that we should not do it!). As an argument FORM, it has problems. I noted that POS arguments do not have THAT problem. You note, as have others, that they might have other ones. True. Now to your distinction:

"I think the fUG-cUG distinction is an important one to make as it acknowledges that even though our theories describe a cognitively real object, the way they carve up that object may not directly match the cognitive reality."

Here's my reaction. Given my realism wrt FL/UG I am not sure that I find the distinction relevant. If the carving does not match the reality then there is something wrong with the carving. This does not mean that it might not be useful to nonetheless investigate carvings we know to be false. It might be, and it is done all the time, and it might be very helpful. But if it does not carve right then it is no place to stop. Two nickels are not a dime. They can often do what a dime can (buy .10 worth of something; can anything today be bought for .10?), but they cannot buy other stuff. For example, only a quarter can buy 3 minutes of air at my gas station. Two dimes and a nickel will purchase you no time at all unless you use them to get a quarter. So, make the distinction. Maybe it is important. But in the end it is not the one that I was thinking about (I am actually not sure I get it, btw), for it appears to (happily) fail the realism constraint. I think that there is an FL with UG features. It's this thing we want to describe.

Yeah, I was thinking the same thing as I was typin...

2015-11-18T22:44:14.732-08:00

Yeah, I was thinking the same thing as I was typing my reply. That's one of the nice things about FoL discussions: they bring out differences in our underlying assumptions that we otherwise would be unaware of.

I think the fUG-cUG distinction is an important one to make as it acknowledges that even though our theories describe a cognitively real object, the way they carve up that object may not directly match the cognitive reality.

For example, it is perfectly fine to posit a specific fUG and a specific parsing model, with the intersection of the two carving out some superclass of all possible languages. But it is of course conceivable that cUG is an abstraction of the parser in the sense of Marr, which in turn is an abstraction of something even more complicated. In this case we cannot simply identify fUG with cUG and the parsing model with the human parser, for that would identify two formally distinct objects that carve out incomparable language classes with the same cognitive object.

Maybe the following analogy by Andras Kornai is helpful: a dime is two nickels, but it does not consist of two nickels.

@Thomas: I'm not sure that I entirely understa...

2015-11-18T18:57:56.925-08:00

@Thomas: I'm not sure that I entirely understand your distinction between fUG and cUG, but, to the extent that I can infer what you mean, I'm guessing that this distinction (which I would indeed reject) was the basis for my (and Omer's) disagreement with you in the comment thread on Another Follow-Up on Athens.

Anyway, don't mean to distract anyone here. I'm looking forward to Norbert's response, but I just thought I'd mention it, since I think it might be relevant, at least if I understand your distinction correctly.

I admit to having exaggerated, but the point remai...

2015-11-18T16:19:25.954-08:00

I admit to having exaggerated, but the point remains: ideas derived from POS arguments become much more solidly supported when the typology is also seen to work out. An example would be Kayne's Generalization from the early 80s ... this seemed plausible on the basis of a small number of Romance languages, but was shown to be problematic by Romanian and River Plate Spanish, and it seems to me completely sunk without hope of salvage by Greek, when the Greek generative grammarians began to get going in the 90s (I haven't found a work where any of them attempt to defend it; perhaps somebody else has?).

If GB-Generativists in the early 80s had been more attuned to typology, this embarrassing detour probably wouldn't have happened, in part because there was also Swahili, which also has what is probably clitic doubling without case marking, and whose basic properties were reasonably accessible in the 1970s (Nikki Keach has a 1980 UMass thesis which covers a lot of them) but were for some reason deemed irrelevant.

The basic problem with PoS arguments is that we know very little about what kinds of learning are actually possible, and also about what The Stimulus is like and how much or what kind of stimulus is necessary or sufficient to learn any particular thing, so virtually any PoS argument can in principle be knocked over by the next language. In the case of Kayne's Generalization, it was the next big peninsula to the east in the northern Med.

There is in fact an argument to the effect that the structure dependence of Aux-preposing is learnable from the input, but here the typology shows that this fact is highly irrelevant to the nature of UG.

I must be particularly dense today, but there'...

2015-11-18T12:45:29.308-08:00

I must be particularly dense today, but there are still a few things that are unclear to me, and I think they once again have to do with whether we refer to the same thing by FL/UG.

1) As far as I'm concerned, we do know the bounds of FL/UG. My working assumption is that the MG formalism is mostly on the right track, so every natural language has to be in the class of languages defined by that formalism. I can make this assumption because FL/UG to me refers to part of the factorization (fUG) and thus isn't necessarily the same as the cognitive object UG (cUG). Typological gaps with respect to fUG are perfectly well-defined objects. Gaps are indeed hard to interpret once one considers the full system, i.e. the intersection of fUG, learnability, processing requirements, and so on. That's why typology works well for fUG but does not neatly carry over to cUG. Which takes me to my second point.

2) From the factorization perspective, a POS argument builds on two modules, fUG and the learning algorithm. Depending on the choice of algorithm, the intersection of fUG-definable languages and learnable languages can carve out very different language classes. A more restricted learning algorithm affords you more leeway for fUG, and the other way round; you can shift the workload between the two components to some extent. So POS arguments are more complicated for fUG. For the same reason, it's not clear to me why POS arguments should apply to cUG more cleanly than typological gaps --- the problem of reasoning from fUG to cUG stays the same.

I suppose both points are somewhat off-topic since the linguists entertaining the arguments you're questioning are unlikely to accept my fUG-cUG distinction. But I think there are good reasons to make that distinction, and if one does, the logic of the argument changes quite a bit imho.

"I understand that there are some sociologica...

2015-11-18T09:42:46.400-08:00

"I understand that there are some sociological factors motivating your argument, but from a purely scientific perspective I see little point in pitting POS arguments and typology against each other. Just use whatever gets the job done."

Let's stipulate that in science one uses whatever is usable to get on with it. Let's stipulate that C/T work has proven to be a popular and useful way to investigate linguistic structure. Let's stipulate that Hornstein is wrong to suggest otherwise. Ok, now with that out of the way, let's discuss the relevant issue, or at least the one that I wanted to highlight.

The general view is that C/T work is not one way of investigating UG but the best way. Why is it the best? Because UG is about grammatical universals, and so the most direct way of studying these is by investigating the G structure of as many Gs as possible. Are all swans white? Only one way to tell: look at the swans. That, I believe, is the default view. I am suggesting that the logic is wanting. This does not imply that doing C/T work is worthless or that you should stop. I am suggesting that the logic is hardly airtight and that the presupposition that this is the best way to proceed needs justifying.

Why do I think this? Well, because such C/T investigations cannot deliver on what is promised. They can't explain why Gs must have the structure they in fact have. They cannot do this for the reasons I outlined. POS arguments can deliver this if they are done right, for they do not rely on the logic of surveys but on the necessities associated with induction. So, POS arguments can deliver modal claims whereas surveys, even extensive ones (which we don't actually currently have), cannot. Does this make surveys useless? No. Does it highlight a limitation if one's interest is in FL/UG? I think so.

The problem comes up with gaps in the paradigms. If all Gs have a property, or no Gs have it, that suggests FL's influence. But the gap problem is not trivial given that we don't know at this moment what range of possible Gs FL/UG allows. And so we don't know how to evaluate gaps. This is not so for POS arguments. If done well they give principled accounts of gaps. The problem is doing them well.

Again, this is not to claim that C/T work is pointless. It is meant to highlight a feature of the logic that is often unacknowledged. Why is it unacknowledged? Well, you will know the answer: Greenbergism! And I really do want us to be aware of its reach.

So, if there was a hidden agenda it was to expose the insidious influence of Greenbergism within GG, not to disparage C/T work. The aim was to explore the logic, not make judgments.

There's a lot of minor points I could quibble ...

2015-11-17T12:32:52.199-08:00

There are a lot of minor points I could quibble with, but the real issue here, I believe, is factorization and how one thinks about it.

The case Jeff and Bill make for phonology --- and which does indeed hold for syntax, too --- is that typological gaps may be due to computational properties but can also arise from other factors such as learnability, processing, diachronic change, or just quirks of human nature (the shape of the articulatory apparatus for phonology, our anthropocentric view of the world for animacy effects, and so on). If we were to lump that all together, we would have a hot mess on our hands. Factorization breaks it up into distinct properties each one of which can be given an appealing explanation. But typological data still is something worth explaining --- I've done a fair share of work in computational phonology and morphosyntax, and it has always been guided by typology, not POS.

Also, factorization doesn't mean that the "non-core properties" rooted in, say, processing and learnability are in any sense less about language. This is a methodological distinction we make, but it doesn't necessarily correspond to an ontological one (cf. Marr's levels; and before somebody complains, that does not entail that FL is just a theoretical construct without cognitive reality). So I find the argument that typological gaps need not be FL-gaps and thus aren't interesting rather weak because it depends on one's definition of FL and the status one grants it in factorization.

The fact of the matter is that there are typological gaps that are not predicted by the formalism and unlikely to be due to an insufficient sampling of the language space furnished by FL. In my book, finding explanations for these gaps is an interesting and rewarding program irrespective of whether the explanation draws on the grammar, learnability, processing, or even functional notions such as code optimization for communication.

I interpret your interpreting my use of typology as an attempt to "limn the structure of FL/UG without getting theory laden" as evidence of this fundamental disagreement: I do not want to limn the structure of FL/UG, I want to limn the structure of the whole thing but choose to break it up into smaller parts for methodological reasons. That will sometimes lead to indeterminism --- a specific typological gap can have many explanations and thus may not uniquely inform the FL/UG component of the factorization, but that doesn't make them spurious or unprincipled.

That said, I am not going to argue for the opposite extreme, namely to regard linguistic proposals that aren't rooted in broad typological work as dubious or somehow deficient. I've already given examples of claims that don't need evidence from more than one language --- FL is capable of generating TALs and PMCFLs --- and the same can be done for other factors such as learnability, processing, and so on. And POS arguments can also shed some light on these. I understand that there are some sociological factors motivating your argument, but from a purely scientific perspective I see little point in pitting POS arguments and typology against each other. Just use whatever gets the job done.

Avery wrote: We can claim to know, for example, ...

2015-11-17T11:00:38.431-08:00

Avery wrote: We can claim to *know*, for example, that no language forms questions by moving the first word of some grammatical category into initial position

To put Norbert's point a bit more boldly (now there's an expression you don't hear every day): if your goal is to reach conclusions of the form "no languages do X", then of course particularly direct evidence for those kinds of conclusions is going to come from doing typological work. But those are not the conclusions that Norbert is expressing interest in reaching.

Nice case. We need more of these.

2015-11-17T08:13:08.707-08:00

Nice case. We need more of these.

I would agree but for the "otherwise would be...

2015-11-17T08:12:41.689-08:00

I would agree but for the "otherwise would be conjectures." Conclusions from C/T explorations are no less conjectural. And, the point I wanted to make was that whereas POS arguments carry necessary conclusions as regards what is and isn't possible, this is less so for G surveys, even extensive ones. Gaps in a C/T paradigm suggest universality. However, we really don't know how the Gs we "see" reflect the class of possible Gs that FL/UG makes available. This is not a criticism of C/T investigations. It is simply a fact and one that means that C/T discovered gaps are, IMO, MORE conjectural than POS proposed gaps are. It is the assumption that the opposite is the case that I want to question.

I largely agree but: "The interesting thing i...

2015-11-17T08:08:16.822-08:00

I largely agree but:
"The interesting thing is that not all logical possibilities are attested cross-linguistically"
So here's the question: to what degree are these "gaps" principled? One way of deciding whether a gap in a single G is principled is by seeing if the facts hold in other Gs (hence the utility you note of C/T work). Say we find a gap across Gs. One possibility is that this is principled and that Gs will not tolerate these (or only tolerate them if there is lots of PLD indicating their presence). Another option is that the gap is "accidental." If I understand some of the work by Heinz and Idsardi, they are suggesting that many gaps are phonologically accidental. Is this not also a syntactic option? Now, conclusions of PoS reasoning cannot be accidental in this way. If something is not acquirable then it will not be acquired. Such gaps have principled explanations. That's why I like this kind of explanation.

Second, as a matter of fact, we have not really surveyed that many languages. So, if we think that the universals we have identified are largely correct (and maybe even largely exhaust their number) then as a matter of fact the assumption that we need to survey a large number of languages to ground universals (or, more strongly, that methodologically speaking UG can ONLY be grounded if every language is surveyed) needs rethinking. And if one compares linguistics to other biological enterprises, it is not clear what drives this assumption. That was a main point of my ruminations.

Last point: "the study of individual languages reveals what's possible, but we also need to know what's impossible. Typology does that." Does it? That's the point at issue. And the idea that one can limn the structure of FL/UG without getting theory laden very quickly is, IMO, a myth, and one with baleful consequences.

One important way in which work on diverse languag...

2015-11-17T06:31:54.309-08:00

One important way in which work on diverse languages can advance our understanding of FL/UG is by challenging (and, in some cases, disproving) certain Chomsky Universals.

Norbert is fond of pointing out that the naïve version of this line of argumentation is fallacious ("language X doesn't have internally-headed free relatives, therefore Chomsky is wrong!"), and he is of course correct about that. But there is another, less nonsensical version of this. Chomsky Universals sometimes take the form "no language has a rule with the formal property P" (think of the subj-aux inversion stuff: no language has an inversion rule based on linearly closest rather than structurally closest, except in examples where linearly closest means adjacent). So if we found a language that could only be successfully modeled using a grammar G that resorts to rules that have the (allegedly unattested) formal property P, we would have disproven the particular Chomsky Universal under consideration.
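For readers who want the "linearly closest" versus "structurally closest" contrast spelled out, here is a rough sketch in Python using the textbook English example. The depth annotations are hand-written assumptions standing in for a real parse, the function names are invented for illustration, and treating "structurally closest" as "the auxiliary at depth 0" is a simplification.

```python
# Each token is paired with a hand-annotated embedding depth
# (0 = main clause, 1 = inside the relative clause). This is an
# assumed annotation, not the output of any parser.
SENTENCE = [
    ("the", 0), ("man", 0), ("who", 1), ("is", 1), ("tall", 1),
    ("is", 0), ("happy", 0),
]


def front(tokens, index):
    """Move the word at `index` to the front and return the string."""
    words = [word for word, _ in tokens]
    return " ".join([words[index]] + words[:index] + words[index + 1:])


def linear_rule(tokens, aux="is"):
    """Front the linearly first auxiliary, ignoring structure."""
    index = next(i for i, (w, _) in enumerate(tokens) if w == aux)
    return front(tokens, index)


def structural_rule(tokens, aux="is"):
    """Front the auxiliary of the main clause (depth 0)."""
    index = next(i for i, (w, d) in enumerate(tokens) if w == aux and d == 0)
    return front(tokens, index)


print(linear_rule(SENTENCE))      # the unattested kind of rule
print(structural_rule(SENTENCE))  # the attested kind of rule
```

The linear rule pulls the auxiliary out of the relative clause and yields the ill-formed string; the structural rule yields the actual English question.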

[DISCLAIMER: The following example relies on my own research; if you don't buy the argumentation therein, then it obviously doesn't exemplify what I'm taking it to exemplify.]

One could envision the following Chomsky Universal:

(1) No language has a rule whose application cannot be enforced exclusively via Interface Conditions (i.e., conditions statable at the interface of syntax with semantics or with morphophonology).

I don't think this is a straw-man; many people read Chomsky's (2000) "Strong Minimalist Thesis" (SMT) to entail something like (1). What my 2014 book attempts to show is that agreement in the Kichean languages disproves (1). [And, if you think that (1) is entailed by the SMT, then it also disproves the SMT.] That argument simply could not be mounted using data from English, as far as I can tell. And so this is an example of data from languages that are considerably less studied than English informing our inventory of Chomsky Universals.

"However, I don’t want to downplay the contri...

2015-11-16T14:31:52.475-08:00

"However, I don’t want to downplay the contributions of C/T work here. It has been instrumental in grounding lots of conclusions motivated on pretty indirect theoretical grounds, and direct evidence is always a plus. What I want to emphasize is that more often than not, this additional evidence has buttressed conclusions reached on theoretical (rather than inductive) grounds, rather than challenging them."

In other words, C/T work plays a big role in converting what would otherwise be conjectures into what can reasonably be claimed to be knowledge. We can claim to *know*, for example, that no language forms questions by moving the first word of some grammatical category into initial position, because if this were described in any grammar of any language, some C/T worker would have noticed it and brought it to our attention.

I can only speak for my own line of work, but typo...

2015-11-16T13:46:11.097-08:00

I can only speak for my own line of work, but typology is a lot more important to my current interests than what is going on in individual languages (including their acquisition). I think this point is just a variation of your third argument in defense of C/T, but since it comes from a different subfield with very different methodology it might be of interest nonetheless:

The study of individual languages was very useful in establishing lower bounds on generative capacity (Swiss German for supra-CFL, Yoruba and a few others for supra-MCFL), but it is very unlikely that we will find any more constructions that would push us even higher up the Chomsky hierarchy. And that's not restricted to weak generative capacity; we're also on pretty safe ground with respect to strong generative capacity. The Minimalist grammar framework as it stands right now can handle pretty much anything linguists use in their analyses while keeping expressivity in check and having very interesting formal properties.

To a certain extent, this trivializes the questions that dominate the analysis of individual languages. For example, Korean allows case endings to be dropped on the last DP in fragment answers. The standard view would be that this is a peculiar property that tells us something about how FL operates. From the computational perspective outlined above this phenomenon is entirely unsurprising because the formalism is already capable of generating such languages (any formalism with subcategorization can do this). Assuming free variation, some language is bound to display this behavior, so there's nothing to see here... if that were the end of the story.

The interesting thing is that not all logical possibilities are attested cross-linguistically. I haven't been able to fully map out the space of attested options for case dropping yet, but it's already clear that there are several typological gaps: drop only the penultimate case marker, only the antepenultimate case marker, every case marker that is preceded by an even number of DPs, and so on. Those are also things that the formalism is capable of (once again, any formalism with subcategorization can do that), so their non-existence does require explanation. Those explanations may take the form of certain algebraic properties, showing that the typological gaps are more complex in some sense, or they may derive the gaps from independently motivated substantive universals. But whatever the answer, it is only due to the typology that the construction becomes interesting.
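As a rough illustration of how easy these patterns are to state, here is a small Python sketch. Everything in it is hypothetical: the DPs are placeholder tokens rather than real Korean data, the function names are invented, and the code only spells out the logical options listed above without claiming anything about how a grammar formalism would encode them.

```python
def drop_last(n):
    """Drop the case marker on the final DP (the Korean-style pattern)."""
    return {n - 1} if n >= 1 else set()


def drop_penultimate(n):
    """Drop only the penultimate case marker (reported above as a gap)."""
    return {n - 2} if n >= 2 else set()


def drop_antepenultimate(n):
    """Drop only the antepenultimate case marker (reported above as a gap)."""
    return {n - 3} if n >= 3 else set()


def drop_even_preceded(n):
    """Drop every case marker preceded by an even number of DPs (a gap)."""
    return {i for i in range(n) if i % 2 == 0}


def apply_pattern(dps, pattern):
    """Return surface forms, omitting case on the positions `pattern` picks."""
    dropped = pattern(len(dps))
    return [noun if i in dropped else noun + case
            for i, (noun, case) in enumerate(dps)]


# Placeholder DPs: (noun, case ending) pairs, not real language data.
dps = [("N1", "-a"), ("N2", "-b"), ("N3", "-c"), ("N4", "-d")]

for pattern in (drop_last, drop_penultimate,
                drop_antepenultimate, drop_even_preceded):
    print(pattern.__name__, apply_pattern(dps, pattern))
```

Each pattern is a one-line function of position, which is the sense in which nothing about descriptive simplicity alone separates the attested option from the unattested ones.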

Bottom-line: the study of individual languages reveals what's possible, but we also need to know what's impossible. Typology does that. POS arguments can do it too, but it is far from obvious that both cover the same ground. In addition, POS arguments are much more theory-laden than simply mapping out the space of typological (surface) variation.

For my own research, work emphasizing typology (e.g. Inkelas' Interplay of Morphology and Phonology) has proven a lot more useful than the standard approach (e.g. Kramer's Morphosyntax of Gender; not a bad book, just not what I needed). So for my purposes it doesn't matter whether the C/T community is interested in universals --- as long as their work contains succinct tables contrasting which logically possible options are attested/unattested, I'm happy.
