Friday, July 3, 2015

Pullum on the competence-performance distinction

In an earlier post I discussed a nice little piece by Geoff Pullum (GP) on Aspects after 50 years (here). GP has a second interesting post (here) that discusses a central distinction highlighted in Aspects: the competence-performance distinction (CPD). As he rightly notes, this distinction has caused endless cognitivist dissonance. Like GP, I don't really understand why. The distinction is rampant, for example, in statistical conceptions of cognition: the trivial distinction between the structure of the hypothesis space and the actual distributions over that space is a version of the CPD, and nobody takes it to be controversial.[1] On this analogy, a theory of the hypothesis space (what's its geography?) is a theory of competence, and a theory of how that space gets filled probabilistically is a theory of performance. This common distinction provides a first okish pass at the CPD. It is not perfect, but it gets you a lot of the way there.
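To make the analogy concrete, here is a toy sketch (entirely my own; the grammar labels and probabilities are invented, and nothing like this appears in GP's post or in Aspects) of how the space of possibilities and the distribution over it are two different objects:

```python
# Toy illustration: the structure of a hypothesis space vs. a distribution
# over that space -- the statistical analogue of the CPD discussed above.

from fractions import Fraction

# "Competence" side: the space of possible grammars (hypothetical labels).
hypothesis_space = ["G0", "G1", "G2", "G3"]

# "Performance" side: a probability distribution over that same space,
# e.g. a learner's posterior after some data. All numbers are invented.
posterior = {"G0": Fraction(1, 2), "G1": Fraction(1, 4),
             "G2": Fraction(1, 4), "G3": Fraction(0)}

# The two objects answer different questions:
# - What grammars are possible at all? (the geography of the space)
possible = [g for g in hypothesis_space]
# - Which grammars are actually (likely to be) instantiated?
instantiated = [g for g, p in posterior.items() if p > 0]

# G3 is possible but never realized: a "competence" fact with no
# "performance" reflex, which is why the distribution alone
# underdetermines the space.
assert "G3" in possible and "G3" not in instantiated
```

The point of the sketch is just that the second object presupposes the first: you cannot state the distribution without first specifying the space it ranges over.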

Another useful explication is that a theory of competence aims to limn the limits of the possible. In syntax we aim to describe the possible sentences of a language as the unbounded products of a given G. We aim to describe the possible Gs as the products of FL/UG. We aim to describe the possible FLs… The sentences/Gs/FLs we actually encounter are points in a space of sentences/Gs/FLs that could exist. Theories of what could be are theories of competence.

Ontologically, theories of competence ground theories of performance. This is why the CPD is rife with Empiricist/Rationalist (E/R) baggage, which chapter 1 of Aspects elaborates on. This is also why, IMO, it has proven to be so difficult a concept to get across. Empiricists treat competence as a conceptually derivative notion: first there is performance; competence, on the E view, is smoothed performance, performance with the variance completely squeezed out, non-noisy performance. This, however, is not what the CPD demarcates. It is essentially a Rationalist notion, pointing to hidden structure, which actual linguistic items can be used to reveal.

I mention this because it seems quite different from the way that GP frames matters in his post.[2] For GP, following Edward Lorenz, competence is “what you expect,” and performance is “what you get.” He then elaborates this in terms of what we would find if we squeezed out all the “sporadic and unintended mistakes” of performance. So competence is cleaned-up performance. But it isn't. For example, many, many G products will never be performed, (im)perfectly or otherwise, yet they are explananda of a competence theory of G. Humans may fail to internalize many, many Gs, yet these may be possible Gs, ones that a theory of FL/UG should permit. So what we see, even in statistically cleaned-up form (i.e. with all the noisy variance squeezed out), is not what competence is about.

Of course, we hope that we can get a window into the possible by investigating what is actually deployed. Thus, actual performances are the source of our linguistic data: what people say, how they judge things we present to them, etc. These data are used to plumb the limits of the possible. And some data are perhaps better than others for this purpose. So, if you are interested in what linguistic knowledge consists in (i.e. knowledge of a G), then some data might be better suited than others for probing this. What kind? Well, data that do not run afoul of factors such as limited memory, slips of the tongue, etc. that we think might confuse matters. So, we clean up the relevant data we use to plumb G competence. These gussied-up linguistic objects are not themselves the targets of explanation. Rather, they act as probes into the structure of a G or of FL/UG. In other words, the target of a theory of competence in linguistics (i.e. what we want a theory of competence to explain the properties of) is a G or FL/UG. The intuitions we harvest and the utterances we track are what we use to investigate these structures. The point that Chomsky makes in Aspects concerning the ideal speaker-hearer simply adverts to the fact that some data (e.g. those contaminated by performance factors like memory and attention limitations) are noisier than others and so less useful as probes of the G and FL/UG systems. In other words, Aspects makes the undeniably correct point that some data are plausibly more useful than others if one's aim is to discern the structure of linguistic knowledge.

GP interprets the discussion in Aspects quite differently. He understands Chomsky to have been proposing that the “subject matter” of linguistics was “speaker intuitions about sentence structure.” These intuitions are purer than linguistic performances (e.g. utterances) in that they abstract away from (and here GP quotes Chomsky) “grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of this language in actual performance.” However, even if we grant this (which we should), it does not make these intuitions the subject matter of linguistics. No! These intuitions are simply data, evidence that linguists can use effectively to understand the structure of Gs and FL/UG. Chomsky does claim that such intuitions are rich sources of information about the structure of G knowledge, and he might suggest that they are superior to corpus data (i.e. records of actual utterances). Moreover, he clearly thinks that we should dump the prejudice against speakers' judgments that behaviorism saddled us with. However, Chomsky does not argue in Aspects (so far as I can tell) that intuitions are epistemologically or ontologically privileged data, just that they don't suffer from some of the problems we might think would mislead investigation.

This is almost certainly correct, and it is not the same as suggesting that these data are the subject matter of linguistics. The subject matter (aka the aim of inquiry) is, as Chomsky puts it, to describe “the mental reality underlying actual behavior.”

I should add that the idea that intuitions are privileged in some way leads to all sorts of misconceptions that psychologists then spend so much time lecturing linguists about. Such judgments are themselves complex performances, with all the problems that performances entail. The most that can be said for intuitions is that they have proven to be excellent probes and that they are very stable, robust, and easy to gather. These are important virtues, but they don't argue for intuitions as such being privileged, as would be the case were intuitions the subject matter of linguistics.

In sum, if I got GP right here (but see note 2), then I think he gets the CPD wrong. What he presents is the Eish version of the CPD. Aspects takes an irreducibly Rish conception of the aim of inquiry, and the CPD is intended to highlight this approach. If this is right, then the main problem with getting others to understand the competence/performance distinction lies in getting them to see how closely it is related to an Rish conception of mind (and of science, actually (see here)). But Rationalism is largely anathema to many practicing neuro-cog types, and so the distinction is hard for them to understand (and accept). This is to be expected. On an Eish conception, the most one can make of the notion lies in the quality of data. It marks a distinction between types of data: performance data are “messy,” “competence” data are not. The latter are privileged for that reason. However, this is not the point that the CPD is intended to highlight. It highlights the difference between the products of an underlying mechanism and the mechanism itself. In other words, it highlights the claim that the subject matter of linguistics is Gs and FL. The CPD thus carries within itself the project of modern Generative Grammar, and that's why understanding it is so very important.

[1] See the Amy Perfors quote here.
[2] I hedge here because it is possible to read GP as making the same point as I am making here. If so, great. However, there is another reading on which he sees competence as non-noisy performance. But I may be misreading him, and if so my sincere apologies to GP. That said, the two ways of interpreting the CPD are both important, so I will continue putting my own construal on GP's elaboration.


  1. I disagree with your diagnosis of the widespread squeamishness about the CPD. The problem is not so much that folks misunderstand it as that they find its deployment frustrating. Chomsky's use of the terms has been consistently clear about what he takes Competence to be, and consistently ambiguous about what Performance refers to, and hence what the CPD is. That is visible already in Ch 1 of Aspects, and it is inherited by most users since that time. The usage encompasses both the notion that you have in mind, and the one that Geoff Pullum has in mind, and more.

    You will find lots of people who are happy to say that they study linguistic competence. But you'll find almost nobody who says that they study performance. Instead, they would use a more specific description for what they study. So, 'competence' is mostly used by one group to assert what they are concerned with, while 'performance' is almost exclusively used by that same group in the context of saying "don't bother me about X, Y, Z". And it also leads many in that group to assume that X, Y, and Z have something in common. (The term "gentile" has a similar flavor.)

    My own main frustration with the CPD is that it invites confusion by conflating a variety of distinctions, some of which are necessary, and some of which are interesting empirical hypotheses. So, while I spend most of my life dealing with issues in the general ballpark of the CPD, I try to avoid the terminology, because it rarely clarifies anything. There is the distinction between a system and what it generates (necessary). There are distinctions between descriptions of a system at varying grains of analysis (indisputable; it's always a question of which grain size yields the most insight). There is the distinction between a task-neutral cognitive system and systems that are designed to deploy the task-neutral knowledge in specific tasks like speaking and understanding (interesting empirical hypothesis). And so on.

    If all who are squeamish about the CPD are labeled as empiricists, that probably serves only to exacerbate their frustration. The Cognitive Revolution happened. I think that the dissatisfaction comes from the fact that the CPD is most commonly used to close down discussion.

    Side note 1: but yes, I agree with you that intuitions are not the object of study.
    Side note 2: and it's true that there are some who think that corpus data are privileged and intuitions are to be treated with skepticism. But they're a small subset of those who are nervous about CPD.

    1. As with many things Colin says to me both in public and in private, I am unsure how to respond. Side notes 1 and 2 seem to indicate complete agreement with two of the main points I tried to make. In my reading of Pullum he took intuitions to be the object of study and then asked which ones were the province of the theory of competence; the cleaned up ones. Colin agrees that this is not what a theory of competence studies, so we agree. We also agree about the distinction between a system and what it generates (and that the system is the object of study) and that one can distinguish various features of the system one is interested in studying (say the knowledge rather than how it is put to use in various cases). So, from where I sit, we largely agree. So what's the issue?

      Well, Colin, you seem to think that the distinction is not as clear as it could be and that some people who are confused about it (you being one?) wish that it could be made clearer. Hear, hear! The clearer the better. And in many particular cases I have heard it cleaned up. As my old friend Sid Morgenbesser used to observe about distinctions: DON'T CONFUSE THEM WITH DUALISMS. The distinction is useful in local contexts and can be clarified well enough there. I doubt that there is a general definition that distinguishes one from the other in all relevant cases. What I do think is that Pullum's version, or the one I attributed to him, is not the right one AND that it is the way one would draw it were one moved by Eish concerns. I'm not sure if you buy this. You seem to deny that Eish views are widespread, and so that this is not the real problem. Maybe, but I doubt you are right here. In fact, I just recently had a discussion with a mutual colleague about some work that revolved around confusing just the point I was arguing that Pullum had misunderstood. So maybe you are right, and all your friends are clear-headed about this, and Eish tendencies are very rare. Great to hear it. But there are parts of the world where this misunderstanding reigns, and the views that I think Pullum expressed are very much with us.

      The upshot? Well, as you and I agree on most of the substantive issues and agree (do we?) that an Eish perspective would confuse things the way I described, the only questions are the degree to which this is serious AND the degree to which the CP distinction is useful. And here we may disagree (but I am not sure about this): you seem to want a general distinction, whereas I think that the distinction will require lots of context sensitivity to be useful. We also both agree that intuitions are not the object of inquiry and that acceptability data are not privileged as such. Well, this is more than we usually agree, so I call it a small victory for consensus.

    2. You end your post by saying how important it is to understand the CPD. And you seem to argue that a lot of non-linguists are unhappy with it (or misunderstand it) because they are empiricists. My disagreement is with both of these points. I think that there are many important issues in this general arena, but that the CPD has demonstrably done little to help to clarify them. I think that clarity is most likely to come from avoiding the CPD and talking instead about the important individual issues that it conflates. And I think that there are many reasons aside from empiricism that lead non-linguists to be dissatisfied with the CPD.

      The contrast between a mechanism and its products is indeed the distinction that is highlighted on p.4 of Aspects. But by p.9 the same terms are being used to refer to a different distinction, one involving either different mechanisms or different degrees of abstraction, or both.

    3. I think that it is important that people appreciate the distinction between Gs and their products and between FL/UG and its products. The objects of inquiry are these Gs and FL/UG. I think it is important that people understand that intuitions are the data that linguists have found useful in exploring the structure of Gs and FL/UG. I think that it is important to distinguish Gs and UG from how Gs and UG are put to use in using and acquiring a G. I think that these distinctions are important, for if they are run together, the chance for confusion rapidly increases. The CPD is useful in that it encapsulates these distinctions and makes clear that generative capacity cannot be identified with our intuitions, our utterances, our parsings, etc. So, to me the CPD is just the claim that Gs matter (as does UG) and that they cannot be identified with their uses. From what I can tell you agree. But I sense that you also think that the CPD in some sense denigrates/confuses/casts aspersions on other kinds of research, research that wants to understand how Gs or UG are actually used.

      I have never seen this myself. Chomsky's point, and here I agree, is that studying these things will already presuppose some commitments to the structure of Gs and UG for they are PART of any theory of use. Put tersely: you can have G/UG without use of these but not vice versa. And this really is an important point, one that I thought you agreed with.

      There is a little more, but this is not part of the basic logic of CPD but often assumed; namely that what is specifically linguistic is G and UG, all other aspects of use being cognitively general. Again, I thought that you and yours buy this: there is no special linguistic memory, no special linguistic attention, no special linguistic executive control. Rather these general things PLUS G and UG are what performance amounts to. If this is so, then again, the CPD is useful for it often is used to tacitly make this cut.

      So why are "non-linguists…dissatisfied"? You say that it is not just Eish attitudes (conceding that it is at least Eish attitudes?). Maybe not. So what is it? In my experience, the non-linguists couldn't care less about G or UG. They even deny that FL exists. So maybe you might like to explain what you see the problem with the CPD to be. I would be delighted for you to post on the issue at length should you wish more than a couple of paragraphs. I personally would like to know what the problems are, as, truth be told, I have never found the distinction tough to grasp (though like all distinctions it can be hard to apply in practice). I would hope that if you decided to do this it would not be just Chomsky exegesis, fun as this is.

    4. Yes, a (slightly) longer piece is in the works. Shevaun Lewis and I have been working on this sporadically for a while. We'll let you know once it's done. As part of this exercise, we did some digging through the literature on what people had said about CPD over the years.

  2. The thing I found most puzzling in Pullum's exposition of the CPD was the claim that Chomsky is interested in the intuitions of ideal speaker-hearers. On my understanding, the idealization to an ideal speaker-hearer comes into play when we consider the problem of language acquisition. That is, we begin by trying to solve the “easy” version of the problem before we try to figure out how kids deal with non-homogeneous speech communities, memory limitations, speech errors, etc. Chomsky is not telling us to interpret the intuitions of real speaker-hearers as approximations to the intuitions of imaginary ideal speaker-hearers.

  3. I'm baffled by the exchange between Colin and Norbert. Nobody says they study "competence" any more than they say they study "performance" because on both ends, there are a gazillion topics. People who work on "competence" study stuff like binding, control, bounding, agreement, etc. People who work on "performance" study stuff like planning, prediction, attention, encoding for memory, maintenance in memory, retrieval from memory, etc.

    There is a sense in which those who live on the C-side of the CPD don't care about what happens on the P-side, and sometimes this is a perilous move (because maybe some fact that you thought was about the grammar was really about memory, or whatever), and sometimes it is completely innocuous because the intuitions about acceptability are so robust that they suffice for theory construction. If you work on the P-side of the CPD, then it is always perilous to make claims without paying attention to (at least some of) the contents of the competence theory, because that theory at the very least specifies the data structures (or constraints on what the data structures have to encode) that the performance theory engages. This much, I am quite certain, everyone in this discussion agrees with.

    So what's left? Explaining why some people are grumpy about the CPD. This, I think, has two parts. The first part is the sociological point that Colin is making: people who are interested in psycholinguistics maybe think that the hard-core grammatical theorists look down their noses at them because competence is important and performance is just noise. Certainly back when we were in graduate school and syntax was still king of the hill, there may have been some truth to that (and for all I know maybe there still is), but this is just noise that gets in the way of thinking. Those who feel slighted by the CPD should just get over themselves.

    The second part is what Norbert is talking about, and it is certainly right. Many people trained in psychology (and linguistics) think that psychology is the study of behavior (indeed, many psychology textbooks even begin with a sentence making that assertion), which is essentially an empiricist idea. Those people are confused about the CPD because they think the outputs of the system are the system itself, that there is no hammer without hammering. Those psychologists who think that psychology is the study of the mental structures that give rise to behavior have no problem with the CPD (at least in my experience). Those people learned the lessons of the cognitive revolution. I'm sure Colin agrees with the second part and that Norbert would assent to the sociological point. So, I don't see what you guys are grumbling about.

    1. @Jeff: Agreed. I don't get it either. However, if what Colin is getting at is the sociology, then I do agree that the attitude of syntacticians is not only misplaced but counter-productive. I think I've even been known to say this publicly in syntax venues (e.g. Athens recently). That said, we need to get the logic right, and for this the CPD is very useful, for it accurately describes the lay of the logical land. There are always subtleties in applying the CPD to actual cases, but this is true of any general distinction/principle. I don't see the CPD as more problematic than most. So, I agree with your puzzlement and I second both points you make.

    2. I don't think that this is about rationalism vs empiricism, at least
      not as I understand those terms. Clearly, there is a sense in which
      linguistics /is/ `about data'; if there weren't, then linguistics
      couldn't possibly be an empirical science. One way to approach
      complex phenomena (like the behaviour which provides the data of
      linguistics) is to attempt to describe them as the interaction of
      multiple causal powers. Generative linguistics is trying to describe
      one of the powers which conspire to give rise to actual language
      behaviour. One obvious way to do this would be to provisionally treat
      the effects of the other factors as a normally distributed random
      variable, and to see whether different linguistic hypotheses improve
      the model. That would keep a straightforward connection between our
      theories and data.
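      For concreteness, the strategy just described might be sketched like this (a toy simulation of my own; the two hypotheses, their scores, and the noise level are all invented):

```python
# Toy version of "treat the other factors as normally distributed noise and
# see whether different linguistic hypotheses improve the model".

import math
import random

random.seed(0)

# Two hypothetical grammatical hypotheses assign "grammaticality scores"
# to four sentence types; everything non-grammatical is lumped into noise.
H1 = {"a": 1.0, "b": 1.0, "c": 0.0, "d": 0.0}   # categorical hypothesis
H2 = {"a": 1.0, "b": 0.5, "c": 0.5, "d": 0.0}   # gradient alternative

# Fake "observed" acceptability ratings generated from H1 plus Gaussian noise.
data = [(s, H1[s] + random.gauss(0, 0.2)) for s in "abcd" for _ in range(50)]

def log_likelihood(hypothesis, data, sigma=0.2):
    """Log-likelihood of the ratings if residuals are N(0, sigma^2)."""
    ll = 0.0
    for sent, rating in data:
        resid = rating - hypothesis[sent]
        ll += -0.5 * math.log(2 * math.pi * sigma**2) - resid**2 / (2 * sigma**2)
    return ll

# The hypothesis that generated the data fits better, so it would be
# preferred -- a straightforward connection between theory and data.
assert log_likelihood(H1, data) > log_likelihood(H2, data)
```

      The design choice here is exactly the one at issue: the grammatical hypothesis does real work only through the likelihood it assigns to behavioural data, with everything else absorbed into the noise term.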

      That's not what we do.

      Instead, we have an extremely non-trivial relation between our
      theories and data. Like Chomsky, and Marr, and Pylyshyn, and many
      others have recommended, we are conducting our investigation into our
      postulated causal power at multiple levels. Generative linguistics
      has typically been working at higher levels, which do not connect
      directly to data. Following Chomsky, who admonished us to use very
      clear data (and to let the theory decide the unclear data), we might
      identify a contrast, and build our model to provide a distinction
      between the contrasting conditions. We appeal thereby to an often
      implicit linking hypothesis. (Maybe, for each subjacency violation we
      expect a linear (?) decrease in acceptability.) And we hardly ever
      relate our high-level theory of our postulated causal power to the
      actual language data. On the rare occasions when we do, there is this
      monstrous edifice working behind the scenes to deliver results that
      don't need it.
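      The implicit linking hypothesis mentioned above might be made explicit with a toy sketch like this (entirely my invention; the penalty value and the linear form are precisely the hypothetical assumptions at issue):

```python
# Toy linking hypothesis: each subjacency violation costs a fixed decrement
# in predicted acceptability on a 1-7 scale. Both numbers are invented.

PENALTY_PER_VIOLATION = 1.5   # hypothetical free parameter
BASELINE = 7.0                # hypothetical ceiling of the rating scale

def predicted_acceptability(n_violations: int) -> float:
    """Linear linking function from a grammatical count to a rating,
    floored at the bottom of the scale."""
    return max(1.0, BASELINE - PENALTY_PER_VIOLATION * n_violations)

# The grammar supplies only n_violations; the linking hypothesis does the
# rest. Swap the linear function for a logistic one and the very same
# grammar makes different behavioural predictions -- which is why the
# (usually implicit) linking step matters.
print([predicted_acceptability(n) for n in range(4)])  # [7.0, 5.5, 4.0, 2.5]
```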

      I think that *this* is what bothers people, and I think that they are
      right to be bothered. I don't think this has anything to do with the
      cognitive revolution leaving some behind, or what not. From my
      extremely post-cog-rev perspective, the cognitive revolution gave us
      permission to postulate very abstract entities, not to completely
      disengage from data.

      But of course, I think that this is an important way to investigate
      cognitive phenomena. The theories which are complete overkill for any
      particular experiment are intended to be significant components of
      accounts of wildly different experiments on wildly different data.
      And I think that it's completely reasonable to build models of simple
      (idealized) data in order to get a handle on the real deal, which is,
      afaict, what we are in the business of doing.

    3. I was hoping you would chime in, Greg. If I recall correctly, you once described yourself as very much an instrumentalist scientifically. That said, I guess I don't see the problems linking theory and data that you like to harp on. I don't see that our data are particularly bad or that we have trouble linking theoretical predictions to data. In fact, we have an all too easy time showing that a particular hypothesis is false. Are there workarounds? Sure, but this is true in *ANY* field (e.g. you may have noticed how hard it has been to pin down the energy range of the Higgs). It really is hard to eliminate all variants of a hypothesis. So, I really don't buy your skepticism regarding results in linguistics wrt how they fit the data we consider. It's not physics, but then what is? But it's not bad.

      So, from where I sit, your skepticism is not about the CPD at all, but about whether we have any data that fit our theories. You seem to think that the answer is no and that the right thing is then to model the data, rather than the mechanism. This will eliminate the gap between the two, but in a very uninteresting sort of way. So, here we disagree, and I would describe your views as very, very Eish. This should not bother an instrumentalist, so you might agree.

    4. @Norbert: I had intended my last paragraph to address some of these points. I think that you are using the word data in a different way than was I. My parenthetical `(idealized) data' is I think what you are calling data in your first paragraph. I agree that our theories are all too easy to falsify using idealized data, but I hope you will agree that our theories say next to nothing about the data collected in (say) Colin's latest experiment. I think that idealized data is very important in constructing a simple theory that we can understand (which is a goal that I take seriously). But, from where I sit, the theories we construct are only as good as their utility as part of an account of real data.

      Don't get me wrong. I don't think that our theories must be vetted against the real data at every step. I'm happy to use idealized data over and over to try to get a handle on something. But when we describe idealized data, we're not describing the world (as determined objectively by experiments etc), but rather our conceptualization of it (ie. a pattern we think we see). In so far as our theories should be about the world, they must accord with it, rather than our projection of structure into it.

      I don't really expect that you would disagree with this (other than in how coarsely formulated it is). (Do you?) What I'm trying to do is to identify where rational people might look at linguistics and be unhappy. I think that a rational person could be unhappy with the fact that we are very heavily into idealized data, and that we are not yet seriously attempting as a field to make contact with objective data.

    5. Thx for the clarification. The answer is in two parts due to space issues.

      I am having a problem with what you call "idealized data." This is just data from where I sit, no different from other data. There is nothing idealized about it. What does it consist in? Well, the general data linguists use are acceptability-under-an-interpretation judgments. In some cases, when there is no interpretation that could make the sentences acceptable, this defaults to simple acceptability. This is our main probe.

      Is this idealized? No, we ask real flesh-and-blood speakers. Is this good data? Yes, very reliable, as Sprouse, Almeida, and Schütze, among others, have shown us. And very easy to use. So, this is the main kind of data we use.

      Why this? Well, as noted, it is easy and reliable and cheap. It is easier than using corpus data, for example, and can be aimed pretty accurately at what you want to look at. Second, it is not an online measure, so we can rationally hope it abstracts away from things that we know can distort judgments: things like attention and memory issues. Does this mean that it is unimpeachable? No. Does it mean that it only targets G and UG properties? No. There are still lots of ways of being unacceptable. Sprouse, for example, showed that filled gap effects degrade acceptability.

      Colin, I know (this being the syntax member of many a psycho committee), uses such data all the time, cleaned up with experimental syntax methods given his audience. So, this is some of Colin's data.

      Continued next entry.

    6. Part 2:

      Other data Colin uses include eye tracking measures, TVJT, ERP and MEG data, self-paced reading measures, and some others that I cannot recall. Why does our work not engage these data as easily as the acceptability data? Because they aim for online measures and so intentionally do not abstract away from the memory, attention, and executive control concerns that the acceptability data try to remove from the picture. Interestingly, some of this data confirms the acceptability data (e.g. see his stuff on islands and filled gap effects using self-paced reading measures). However, much of his work asks related but DIFFERENT questions than you or I tend to ask: e.g. How is the G used in real time? When is a principle of G invoked in processing? When do kids master principles A, B, and C? Note that these all presuppose Gish knowledge and ask more about it. These, of course, require extra assumptions in addition to Gish ones. Our theories as they stand do not address these, for they make no obvious commitments to what these extra assumptions are, being compatible with many variants. However, from what I can tell, these too are just data, neither more nor less idealized, just tapping into different cognitive resources and so being more or less direct windows onto G, UG, and the performance systems that deploy G and UG.

      So what would a rational person do? Well, he might once have argued that the data linguists use are not reliable. But we know that this is false. Or she might think that the fit between Gs and UGs and performance systems might require us to change the Gish and UGish representations we think are right just basing ourselves on acceptability data. This resolves to the question of how transparent Gs and UGs are to the operations and primitives of the performance systems. This question has been asked, and, IMO, to date it looks like the fit between the two is pretty close, though this is clearly something that people will have to keep investigating (think covering grammars here). However, TO DATE, there is little reason from most of what I have seen to think that the conclusions we have drawn on acceptability judgment data are much challenged by data using these other methods.

      In fact, I would go further: standard practice is to start with the theories we have developed using our data and then combine THEM with assumptions about memory structure (content addressable? RAM? Stack?) and attention and see where this combined theory goes. There are a few cases where we are confident enough in our theories of memory, attention, and executive control to start holding these constant to probe G structure. I personally hope we will see more of this in the future, but, IMO, this is still not that compelling. Note, however, that this uses data that are neither better nor worse than ours, just different.
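      A minimal sketch (my construal, not anyone's actual model) of what "combining a theory with assumptions about memory structure" can amount to: the same grammatical instruction, "retrieve the wh-filler at the gap site," resolved under two different hypothetical memory architectures.

```python
# Toy contrast between stack-based and content-addressable retrieval.
# The items, features, and cue are all invented for illustration.

def retrieve_stack(memory):
    """Stack assumption: only the most recently stored item is accessible."""
    return memory[-1] if memory else None

def retrieve_by_cue(memory, cue):
    """Content-addressable assumption: any item matching the retrieval cue
    is accessible, regardless of recency (so similarity-based interference
    becomes possible when several items match)."""
    matches = [item for item in memory if cue(item)]
    return matches[0] if matches else None

# Encoded so far: "which book did the editor ..." (gap site coming up).
memory = [{"word": "which book", "wh": True},
          {"word": "the editor", "wh": False}]

# The grammar says what to retrieve (the wh-filler); the memory theory says
# how. The two architectures come apart on the very same input:
assert retrieve_stack(memory)["word"] == "the editor"          # wrong item
assert retrieve_by_cue(memory, lambda x: x["wh"])["word"] == "which book"
```

      The point is that the behavioural predictions come from the combined theory, so holding the memory assumptions fixed is what lets the data bear on G structure, and vice versa.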

      That's how I see it. It's not a question of data quality or of ontological differences in the data (idealized vs. objective). I wish in some sense that such distinctions made sense, for then we could all concentrate on the right kind. But they don't. It's just objective data all the way down! Most of the data is massaged using standard methods, but there is no other real difference except the number of assumptions you are or are not abstracting away from. So, that's how I see things.

    7. "There is nothing idealized about it. What does it consist in? Well, the general data linguists use are acceptability-under-an-interpretation judgments."

      Acceptability data is quite noisy and gradient. The discussion about the status of islands in Scandinavian here (in the comments) is basically a discussion about what a reasonable level of idealization is with regard to acceptability judgments.

    8. I'm loath to contribute to the slide of this thread into yet another discussion of judgment data (it's like a cosmic force), but a couple of clarifications are needed.

      1. Yes, when people are given a gradient scale, they use it. The causes of the gradience are partly understood, partly not understood.

      2. Gradience is quite different from inconsistency, of course. For the Scandinavian phenomena that Alex mentions, there's certainly variability across items, contexts, etc., but I'm not sure that we have evidence that the reactions are uncommonly inconsistent. For the English judgments in Sprouse's LI corpus, he has shown that the gradient judgments are rather consistent.

      3. Although it's true that judgments typically involve acceptability-under-an-interpretation ratings, that's explicitly not what Sprouse et al. did in their large scale studies, for practical reasons. They restricted attention to the ~50% of example sentences where interpretation, prosody, context, etc. were not really at issue. E.g., no judgments involving anaphora.

      4. Though Sprouse has tested the consistency of the relatively 'interpretation free' judgments in his sample, we know less about the consistency of the phenomena that SA&S excluded. It's likely that there's more variability once interpretation, context, prosody become relevant. But that could be for relatively uninteresting reasons, e.g., linguists are trained at imagining scenarios to support the interpretation/prosody at issue; regular folks are not.

      5. The dependent measures that we get in acceptability rating tasks are way more consistent than what we get in studies with reading times, eye-movements, ERPs or what-have-you.

      ... but please let's not have a CPD discussion devolve into another scrap about judgment data.

    9. I also don't want to have a scrap about judgment data; I think
      judgments are perfectly fine data, and that informal elicited
      judgments have been shown to be pretty robustly related to the results
      of more controlled judgment tasks. (Although my recollection of the
      Sprouse et al results is that they only tested pairwise judgments
      (better-worse), and didn't test for consistency across the sample
      (that stars are always worse than no stars).)

      I want to respond to Norbert's claim that there is no idealized
      data (or at least that it is not relevant to what linguists do).
      First off, it is absolutely right that when we elicit (however
      informally) speaker judgments about anything, that is just as much
      `real data' as anything else. So I agree with this 100%.

      Let me give a simple and concrete example of what I meant by idealized
      data. Consider a center-embedded structure with 4 levels of embedding
      (no pronouns, etc). This is judged to be very unacceptable, by
      everybody. We all believe that the right account of this is to treat
      it as grammatical, and to have the fact that it is unacceptable be the
      consequence of memory limitations, or some such. I think this is a
      perfectly legitimate move in general, and I think it is even the right
      one in this case. However we are no longer accounting for the real
      data with our theory of grammar. We could; we could penalize
      ourselves for each divergence between acceptability and predicted
      acceptability (i.e. grammaticality), we could incorporate various
      theories of memory limitations, and compare how much they reduce the
      divergence against how complicated the resulting theory is. But we
      don't. And this is ok, it is a perfectly reasonable way to do things,
      but we can't evaluate whether it was `right' until we reengage with
      the `real data' in the manner mentioned just above.

      This is a straight-up competence performance deal, but I introduced
      the term `idealized data' because I wanted to include some other stuff
      as well. When doing fieldwork, you don't report the actual responses
      you got (usually), but rather some informal statistic thereof. You
      never see `speaker 1 hemmed and hawed before accepting it, speaker 2
      said that maybe I could say that, but he wouldn't, speaker 3 offered a
      reformulation instead of a judgment, speaker 4 said it was fine, but he says everything is fine', rather you see that the sentence
      in question is unacceptable. This would be a straightforward and
      unproblematic instance of dimensionality reduction, except that the
      procedure is informal.

      These are two kinds of instances of doing things the relation of which
      to the actual collected data is not clear. For that reason, I called
      it `idealized data'. My intent was not to impugn or malign it, but
      rather to point out that people acting rationally might have qualms
      about the empirical relevance of our work at these points.

    10. @Greg

      Maybe I am missing things, but data does not come marked as competence data or performance data or any other kind of data. It's just data. Ahead of explanation there is no partition of the data. All we have are acceptability judgments. These are used to construct various kinds of theories. One such theory is a competence theory. We claim that some difference in acceptability can/should be traced to some difference in their G status. Some unacceptabilities are ungrammaticalities. Other acceptability data might be explained in terms of memory limits (center embedding). Again, the data used is acceptability data (at least in part). We all agree that this is kosher and standard practice. But you seem to suggest that there is something untoward about this, as we are not engaging real data. But we are. What we are not doing is pre-specifying which data is relevant to evaluating competence as opposed to anything else. That's right. We don't and cannot. But then again, this is true in every domain of inquiry. Ahead of theorizing, we don't know what data is relevant for what, ever. This is no different in linguistics. So if one is suspicious about this, then the suspicions have nothing to do with linguistic practice. They are more global, and frankly, I don't care to argue these in this blog. Go talk to your philosophy friends.

      The second point you make is that data is cleaned up, with various irrelevancies pruned. Again, this is not that different from what happens in general everywhere. Did the baby really look left? Was the angle of incidence really 27 degrees? Maybe it was 26.879? Our data is idealized just like this data is, with the BIG exception that nobody in linguistics (or the mental sciences more generally) has an absolute measure of any quantity. We have no grammatometer to measure grammaticality or even absolute acceptability (e.g. this is 4.6 Chomskys more acceptable than that). But nor does anyone else (or most everyone else; there are some psychophysics cardinal measures of things). It would be great to find some real values we could measure, but sadly, we don't have these right now. That said, Colin is right to note that what we do have is pretty good by psych standards.

      So, I must be obtuse (ok, I know I am) but I just don't get what you are fussing about. By psych standards our data rocks. By psych standards our theories are great. By most standards our methodological and measuring problems are pretty anodyne. This does not mean that every claim regarding competence based on acceptability data is right (think agreement attraction errors). There is no guarantee. But who would have thought otherwise?

    11. @Norbert: Suppose someone runs an experiment, plots the data, and
      notices that, if it weren't for these pesky points, it'd fit
      reasonably well to this curve. We might regard it as somewhat
      sketchy if this person built their theory ignoring those points. We
      might want to know if there were some principled way to ignore points,
      other than that they don't fit in with the theory. If that person
      were to admonish us to not let those ignored points distract us from
      how good their theory was, we might respond that, if they ignore
      points that don't fit their theory, then by definition their
      theory will fit the points that they don't ignore pretty well.
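
      The worry can be made concrete with a toy least-squares example (all numbers invented): if you trim points post hoc by their residuals from the fitted curve, the re-fit on the survivors is good by construction.

```python
# Sketch of the circularity worry: dropping the points that don't fit
# guarantees that the remaining points fit well. Data and the residual
# threshold are made up for illustration.

def fit_line(pts):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def r_squared(pts, a, b):
    """Coefficient of determination for the line y = a*x + b."""
    mean_y = sum(y for _, y in pts) / len(pts)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in pts)
    ss_tot = sum((y - mean_y) ** 2 for _, y in pts)
    return 1 - ss_res / ss_tot

# Mostly-linear data with two "pesky" outliers at x = 4 and x = 6.
data = [(0, 0.1), (1, 1.0), (2, 2.1), (3, 2.9),
        (4, 9.0), (5, 5.1), (6, -2.0), (7, 7.0)]

a, b = fit_line(data)
full_r2 = r_squared(data, a, b)   # poor fit: the outliers dominate

# Post-hoc trimming: drop points whose residual from the fit exceeds a
# threshold chosen AFTER looking at the fit -- the move being criticized.
trimmed = [(x, y) for x, y in data if abs(y - (a * x + b)) < 2.0]
a2, b2 = fit_line(trimmed)
trimmed_r2 = r_squared(trimmed, a2, b2)  # near-perfect, by construction
```

The fit on the trimmed data is excellent precisely because the exclusion criterion was "disagrees with the fit", which is Greg's point about needing an independent, principled reason to exclude.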

      I am suggesting that this is the position some of our detractors find
      themselves in. I read your last comment as agreeing with me (?), but
      then arguing that this is par for the scientific course. (?!)

      As for your remark about not having an absolute measure of any mental
      quantity, sure; instead we have absolute measures of behavioural
      quantities. (I think I must be misunderstanding you here.)

    12. @ Greg:

      "Suppose someone runs an experiment, plots the data, and
      notices that, if it weren't for these pesky points, it'd fit
      reasonably well to this curve. We might regard it as somewhat
      sketchy if this person built their theory ignoring those points."

      Really? Did you ever sit in on a psych lab meeting? Did you ever notice that there is ALWAYS trimming, in the sense of certain data points being excluded (the baby fussed out, the subject wasn't paying attention, as can be seen by missing the question at the end, the subject was a vast outlier, etc.)? That's what is commonly done. Sometimes this is legit, sometimes not. There is NO general rule to tell ahead of time which is which. One needs to argue on a case-by-case basis. This is how things work.

      Amazingly, this is a LESS common problem in our part of the world. Why? Well because the data is generally very robust and for many sorts of cases the judgments are pretty crisp. Island and ECP effects fall into this by and large. So do many binding condition effects. There is some language variation, but there is also a lot of consensus over many data points.

      Given this, I don't think that our detractors find themselves in the position you note. I at least don't know of any. Maybe a reference would be useful. But say they did make your point. I would respond that you argue the cases one by one, as you do in any scientific domain. So, say I think center embedding is a GRAMMATICAL fact about legit representations rather than a fact about memory, and you think the opposite. What does one do? Well, what one always does: argue the details. But from my experience the skeptics NEVER want to argue the details. In fact, they refuse to consider the facts that the two of us agree are pertinent. I have friends who would love it if the other side would seriously engage. They almost NEVER do. That's the problem, not the CPD. That's at least how I see it. They may sometimes complain about the CPD (though again I know of no examples), but IMO this is cover. They just don't want to seriously engage.

      As for absolute measures: we have evidence that this is better than that (so a 6 rather than a 4 on a rating scale). But we have no quantity that we are measuring (the analogue of mass, or charge, or force). We have no analogue unit like the Newton or measurable constant like the gravitational constant. So there is no Chomsky which is the unit of grammaticality that we can measure. This is too bad. For if there were we would be able to make predictions about the exact Chomsky value of a certain unacceptable sentence, i.e. how much is due to ungrammaticality and how much to other factors. We cannot do this right now, so the discussion is necessarily more imprecise. That's what I meant.

    13. Thanks for the clarification, Norbert.
      As for references, I read many of Postal's complaints as involving excluding data points. Also, I read Edelman & Christiansen's review of Lasnik as complaining that syntactic theory is not re-engaging with (what I think they would call) less idealized data. I take these not to be isolated instances, but to reflect an underlying current of dissatisfaction with the relationship of minimalist syntax to data. I find the E&C review interesting for a number of reasons; they seem to be trying to be very diplomatic, and they suggest a way to engage with them, that is different (I think) from how we have been trying to engage with detractors.

      I am familiar with some of the literature on statistical inference. I'll remind you that part of pre-registration involves defining in advance your trimming criteria. There are schools of thought (for example, Minimum Description Length) according to which no trimming is allowed. Essentially, when you trim, you are hypothesizing that the data you got was the interaction of multiple factors (and you are guessing that the stuff you didn't trim is more reflective of the factor you care about). It is a hypothesis, and must ultimately be tested.
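
      For contrast with the post-hoc case, a pre-registered trimming rule can be sketched as follows: the criterion is fixed before the data are seen and applied mechanically, so exclusion is not driven by agreement with any theory. The cutoff and ratings below are invented for illustration.

```python
# Sketch of a pre-registered trimming rule: the criterion (|z| > 2.5)
# is declared in advance and applied mechanically to whatever comes in.
from statistics import mean, stdev

def trim_preregistered(ratings, z_cut=2.5):
    """Keep only ratings whose z-score magnitude is within the pre-set cutoff."""
    m, s = mean(ratings), stdev(ratings)
    return [r for r in ratings if abs((r - m) / s) <= z_cut]

# Nine raters cluster around 5-7; one is far off the rest.
ratings = [5, 6, 5, 7, 6, 5, 6, 1, 6, 5]
kept = trim_preregistered(ratings)  # the rating of 1 is excluded
```

As the comment above notes, even this is a hypothesis: trimming amounts to guessing that the excluded observations reflect factors other than the one under study, and that guess itself has to be tested eventually.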

    14. @Greg: "I find the E&C review interesting for a number of reasons; they seem to be trying to be very diplomatic, [...]"

      You're being sarcastic, right?

    15. @Omer: Have you read it? My impression is that they don't think too highly of Lasnik's book, or of the transformational generative tradition in general. (They nowhere say this.) Still, their critiques are fair (i.e. about issues), and they close with a positive attempt at a rapprochement. How much more diplomatic could it get?

    16. This comment has been removed by the author.

    17. @Greg: I think if you've described the target of your review (or the subject matter that the target discusses; I couldn't quite tell) as "more like a taxi ride gone bad than a free tour," you've pretty much taken 'diplomatic' off the table.

    18. @Omer: Alright, you've got me there. That opener is fairly colorful...

    19. @Greg: It’s hard to take E&C’s closing call for engagement very seriously, since in their view there is simply nothing for them to engage with. Generative syntax has yielded no “testable hypotheses” and no significant results. If that is really true, then we have nothing useful to offer and we should all just quit. I think that is what E&C would really like to say (which is perhaps why you described the review as diplomatic).

  4. Typical. Norbert and I try to stage an interesting fight, and then Lidz comes along and goes all conciliatory on us. Party pooper.

    But seriously. I'm a bit confused too, as I wasn't looking to pick a fight (yawn) and I'm not feeling bent out of shape or trying to make a sociological point. I was simply making the suggestion that the CPD hasn't proven to be a great source of clarity. It's possible that in the context of the early 60s it did just what was needed, in helping to delimit a new research program. But I'm not sure that it has helped a lot since then. As this thread illustrates, it seems to be one of those things that people are confident that they understand, and are surprised to learn that others think of it differently.

    Some differences that get swept together under the heading of the CPD.

    1. Mechanism vs. its products. Easy.

    2. Degrees of abstraction in characterizing a neurocognitive system (roughly equivalent to Marr levels, except that the notion that there are 3 distinct levels masks the many choices that you make when choosing where you'll find the most insight).

    3. Differences in how the same neurocognitive system performs in different task settings, when information is available or withheld, e.g., in comprehension the meaning is withheld.

    4. Differences between distinct neurocognitive systems with specific functions, e.g., "grammar" vs. "producer".