Tuesday, January 29, 2013

Competence and Performance Redescribed

One morning, on the way to getting my cup of tea (this is my favorite time to waylay unsuspecting colleagues for linguist-in-the-street interviews), Jeff Lidz and I fell into a discussion of how to explain the C(ompetence)/P(erformance) (C/P) distinction. One thing I have often found useful is finding alternative descriptions of the same thing. Here’s a version I’ve found helpful: let’s cast the C/P distinction in terms of Data Structures (DaSts) and the operations, or algorithms (algs), that operate on them. What’s the difference between these notions, and how might this terminology illuminate the C/P distinction?

Chomsky has standardly distinguished what we know (competence) from how we put what we know to use (performance). He further proposes describing the former in terms of computational states: the initial state describes the capacity for a certain kind of competence, and the steady state describes the competence actually attained. Thus, UG describes the initial state of FL, and the G attained describes the steady state of the language acquisition device (i.e. yours and your linguistic conspecifics’). How are we to understand these states? The discussion in Gallistel & King (G&K) of DaSts and the operations/algs that work on them provides a useful handle.

G&K (149) describe a DaSt as “a complex symbol” whose “constituents are themselves symbols” and whose “referent…is determined by the referents of its constituents and the syntactic…relation between them.” Thus, “[c]omplex data structures encode the sorts of things that are asserted in what philosophers and logicians call propositions” (150), and what linguists call (sets of) phrase markers. Actually, the linguistic notion might be closer, as the operations manipulate DaSts exclusively in virtue of their formal syntactic properties. How these are interpreted plays no role in what operations on DaSts can do. This makes them effectively syntactic objects, like phrase markers, rather than semantic objects, like propositions.

Using this terminology, a theory of competence aims to describe the linguistic DaSts that the mind/brain has. The initial state (UG) consists of certain kinds of DaSts, the final state (Gs) of particular instances of such DaSts. A particular G is a recursive specification of the permissible DaSts in a given I-language. UG specifies the kinds of DaSts (i.e. Gs) that FL allows by describing the range of permissible generative procedures (rules for recursively specifying Gs/I-language-specific DaSts). The theory of UG aims to outline what kinds of features linguistic DaSts can have, what kinds of syntactic relations are permissible, what kinds of constituents are possible, etc. The theory of a particular G (e.g. GE, the grammar of English) will specify which of the DaSt options that UG provides GE realizes. Both these accounts specify what the system “knows,” i.e. the kinds of representations it allows, aka the kinds of DaSts it tolerates. In other words, a theory of linguistic DaSts is a competence theory.
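The idea of a G as a recursive specification of permissible DaSts can be sketched in code. Below is a toy illustration (the lexicon, rules, and function names are all invented for the example; this is not any published formalism): a phrase marker is a complex symbol built from constituent symbols, and a recursive check specifies which such structures are licensed, while saying nothing about how they are built or used in real time.

```python
# A toy competence theory: a recursive specification of permissible DaSts.
# All names and rules here are invented for illustration.

# A DaSt is either a lexical item (a string) or a complex symbol:
# a (label, [constituent DaSts]) pair.

LEXICON = {"the": "D", "dog": "N", "barks": "V"}

# Toy rules: which constituent sequences each label may dominate.
RULES = {
    "NP": ["D", "N"],
    "VP": ["V"],
    "S":  ["NP", "VP"],
}

def category(dast):
    """The label of a DaSt, lexical or complex."""
    return LEXICON[dast] if isinstance(dast, str) else dast[0]

def permissible(dast):
    """Recursively check whether a DaSt is licensed by the rules.

    This specifies WHAT structures the system allows; it says nothing
    about how fast they are built or put to use (that is the algs' job)."""
    if isinstance(dast, str):
        return dast in LEXICON
    label, parts = dast
    return (RULES.get(label) == [category(p) for p in parts]
            and all(permissible(p) for p in parts))

s = ("S", [("NP", ["the", "dog"]), ("VP", ["barks"])])
print(permissible(s))                        # True: a licensed phrase marker
print(permissible(("NP", ["dog", "the"])))   # False: wrong constituent order
```

Note that `permissible` is a specification of the allowed structures, not a parser or a producer: it answers only the competence question of which DaSts the toy grammar tolerates.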

Algs are procedures that use these data structures in various ways. For example, an alg can use DaSts to figure out what someone said, or what to say, or what rhymes with what, or what follows from what; e.g. an alg could operate on the phonological DaSts to determine the set of rhyming pairs. Algorithms can be fast or slow, computationally intensive or not, executable in linear time, etc. None of these predicates apply to DaSts. Different algs for different tasks can use the same DaSts. Algs are ways of specifying the different things one can do with DaSts. Algs are parts of any performance theory.
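To make the DaSt/alg split concrete, here is a minimal sketch (the phonological encoding is an invented toy, not a real phonological representation): two quite different algorithms operating on the very same DaSts, one for rhyme detection, one for a different task entirely.

```python
# Two different algs operating on the SAME phonological DaSts.
# Words are encoded as lists of phone strings -- a toy encoding,
# invented purely for illustration.

WORDS = {
    "cat": ["k", "ae", "t"],
    "hat": ["h", "ae", "t"],
    "dog": ["d", "ao", "g"],
}

def rhymes(w1, w2, phones=WORDS):
    """One alg: two words rhyme (crudely) if their final two phones match."""
    return phones[w1][-2:] == phones[w2][-2:]

def phone_count(w, phones=WORDS):
    """Another alg over the very same DaSts, put to a different use."""
    return len(phones[w])

print(rhymes("cat", "hat"))   # True
print(rhymes("cat", "dog"))   # False
print(phone_count("dog"))     # 3
```

Predicates like "fast" or "linear-time" apply to `rhymes` and `phone_count`, not to the entries in `WORDS`; the dictionary just sits there, tolerating whatever algorithms consult it.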

G&K emphasize the close connection between DaSts and Algs. As they note:

There is an intimate and unbreakable relation between how information is arranged [viz. the DaSts, N]…and the computational routines that operate on the information [viz. algs, N]…(164).

Consequently, it is quite possible that the computational context can reveal a lot about the nature of the DaSts, and vice versa. I have suggested as much here, when I mooted the possibility that systems with bounded content-addressable memories might guard against the problems such systems are prone to by enforcing something like relativized minimality on the way that DaSts are organized. In other words, the properties of the users of DaSts can tell us something about the properties of the DaSts.

Another way of triangulating on the properties of a DaSt D1 is by considering the properties of the other DaSts (D2, D3, ...) that D1 regularly interacts with. A well-designed D1 should be sensitive to how D1 interacts with D2 and D3. How well do they fit together? What would be an optimal fit? As the government learns weekly when it tries to integrate its various databases, some DaSts play together more nicely than others do.

So, the theory of competence can be viewed as the theory of the linguistic DaSts: what are their primitive features? How are they assembled? What kinds of relations do they encode? The theory of performance can be viewed as asking about the algs that operate on these DaSts and the general cognitive architectural environment (e.g. memory) within which these algs operate. Both DaSts and algs contribute to linguistic performance. In fact, one of minimalism’s bets is that one can learn a lot about linguistic DaSts by carefully considering what a well-designed linguistic DaSt might be like given its interactions with other DaSts and how it will be used. So, performance considerations can (at least in principle) significantly inform us about the properties of linguistic DaSts. We want to know both about the properties of linguistic representations and about the operations that use these representations for various ends. However, though the two are closely related, the aim of theory is to disentangle the contributions of each. And that’s why the distinction is important.


  1. "So, the theory of competence can be viewed as the theory of the linguistic DaSts: what are their primitive features? How are they assembled? What kinds of relations do they encode?"

    Could you say a little bit more on why you think that "How are they [= the DaSts?] assembled" is a question that a competence theory needs to answer?
    (Probably I'm misunderstanding what you take this question to be, but to me any answer to this seems to require specification of an algorithm to actually build/assemble data structures. At the very least, this "how" question strikes me as rather different from the other two "what"-questions.)

  2. I'm not sure I really get the analogy. Partly this is because the line between data and program is not as clear as people want to say it is. Church encodings make this point fairly clearly: what is a pair in the lambda calculus? Well, you could implement it in some particular way in your computer, as, say, a memory location with two parts. Then it definitely looks "data"-y. But that's merely one choice. The pure LC doesn't have this option (there's no memory to speak of). So what did Church do? Well, he realized that what makes a pair a pair is something more like an algebraic specification using three operations, which I'll call "pair" (for making pairs), "fst" (for getting the first element), and "snd" (for getting the second):

    fst (pair a b) = a
    snd (pair a b) = b

    It doesn't matter how "pair", "fst", and "snd" are implemented so long as these equations hold. So Church came up with the following definitions (using \ for lambda):

    pair = \x. \y. \f. f x y
    fst = \p. p (\x. \y. x)
    snd = \p. p (\x. \y. y)

    And we can now check that the equations hold, and they do. But where's the data? As Gerald Sussman said about this in his SICP lectures, we have "data" that's made of nothing but air.

    So what really _is_ data, and how can we analogize UG to this? I think perhaps the better analogy is to the specification itself. UG isn't the data structures themselves, but the specification that the implementations must satisfy. Some implementations might satisfy the specification quite well, while others might do it poorly. But the specification is well defined independently of the implementations.
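The specification-versus-implementation point can itself be run as code. Below is an illustrative sketch (function names invented for the example): one algebraic specification of pairs, checked against two implementations discussed above, the "memory location with two parts" one and the Church-style one made of nothing but air. Both satisfy the same spec.

```python
# One specification, two implementations. The spec is just the two
# equations from the comment: fst(pair a b) = a, snd(pair a b) = b.

def satisfies_pair_spec(pair, fst, snd):
    """Check the algebraic specification on sample arguments."""
    return fst(pair("a", "b")) == "a" and snd(pair("a", "b")) == "b"

# Implementation 1: the "data"-y one -- a memory location with two parts.
impl1 = (lambda a, b: (a, b),
         lambda p: p[0],
         lambda p: p[1])

# Implementation 2: Church-style -- pairs made of nothing but functions.
impl2 = (lambda a, b: lambda f: f(a, b),
         lambda p: p(lambda a, b: a),
         lambda p: p(lambda a, b: b))

print(satisfies_pair_spec(*impl1))  # True
print(satisfies_pair_spec(*impl2))  # True
```

On the analogy mooted here, UG would play the role of `satisfies_pair_spec`: well defined on its own, and indifferent to which implementation realizes it.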

  3. I buy Darryl's amendment. It is a specification of data structures.

    1. I think an interesting alternative possibility is that UG might instead be a collection of tools that come out of the box as separate units that can be put together in whatever way (provided the input/output properties of each are met). As a rough analogy, think of a programming language, where you have some well defined list of primitives and means of combination, but you put them together and get an infinite number of different programs. I don't know if this line of research has really been pursued, but it seems like the kind of thing that could give rise to the large scale effects that cut across particular phenomena, e.g. locality. Lots of things seem to employ nearness constraints of one sort or another, but maybe not in exactly the same way. Maybe it's because there's some common computational core that gets plugged into different parts of the system. It's that underlying device which has nearness effects, but the details of the effect in different cases depend on what that device is used for.

      I think this is pretty distinct from the specification view, but probably has more plausibility in terms of what could actually be coded on the genome.
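The "common computational core" idea in the comment above can be sketched as follows (an invented toy, purely illustrative): a single generic nearness routine, plugged into two different "modules", yields nearness effects whose surface details depend on what the routine is plugged into.

```python
# A hypothetical sketch of one shared "nearness" device reused by
# different modules. All data and criteria are invented for the example.

def closest(items, anchor, matches):
    """Shared core: scan leftward from position `anchor` and return the
    first item satisfying `matches` -- a generic minimality device."""
    for item in reversed(items[:anchor]):
        if matches(item):
            return item
    return None

# Module 1: the closest preceding capitalized word (a toy "antecedent" test).
tokens = ["John", "said", "Bill", "left"]
print(closest(tokens, 3, lambda t: t[0].isupper()))  # Bill

# Module 2: the same core, used for a different task --
# the closest preceding vowel in a toy phone string.
phones = ["k", "ae", "t", "s"]
print(closest(phones, 3, lambda p: p in {"ae", "ao"}))  # ae
```

Both modules show a "closest wins" effect, but the details differ with use, since the nearness behavior lives in the shared `closest` device rather than in either module.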