One morning, on the way to getting my cup of tea (my favorite time to waylay unsuspecting colleagues for linguist-in-the-street interviews), Jeff Lidz and I fell into a discussion of how to explain the C(ompetence)/P(erformance) (C/P) distinction. One thing I have often found useful is to
find alternative descriptions of the same thing. Here’s a version I’ve found
helpful. Let’s cast the C/P distinction in terms of Data Structures (DaSt) and
the operations (or algorithms (algs)) that operate on them. What’s the
difference between these notions and how might this terminology illuminate the
C/P distinction?
Chomsky has standardly distinguished what we know
(competence) from how we put what we know to use (performance). He further proposes describing the former in terms of computational states: the initial state describes the capacity for a certain kind of competence, and the steady state describes the competence actually attained. Thus, UG describes the initial state of FL, and the G attained describes the steady state of the language acquisition device (i.e. yours and your linguistic conspecifics’). How are we to understand these states? The discussion in Gallistel & King (G&K)
on DaSts and the operations/algs that work on them provides a useful handle.
G&K (149) describe a DaSt as “a complex symbol” whose
“constituents are themselves symbols” and whose “referent …is determined by the
referents of its constituents and the syntactic…relation between them.” Thus, “[c]omplex data structures encode the
sorts of things that are asserted in what philosophers and logicians call
propositions” (150) and what linguists call (sets of) phrase markers. Actually, the linguistic notion might be the closer fit, as the operations manipulate DaSts exclusively in virtue of their formal syntactic properties. How these are
interpreted plays no role in what kinds of things operations on DaSts can
do. This makes them effectively
syntactic objects like phrase markers rather than semantic objects like
propositions.
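To make this concrete, here is a minimal sketch in Haskell (the category labels and the example sentence are invented for illustration) of a linguistic DaSt in this sense: a phrase marker as a complex symbol whose constituents are themselves symbols, arranged by constituency.

-- A toy phrase marker: a complex symbol whose constituents are themselves symbols.
data Cat = D | N | V | DP | VP | S
  deriving (Show, Eq)

data PhraseMarker
  = Leaf Cat String          -- a lexical symbol, e.g. Leaf N "cat"
  | Node Cat [PhraseMarker]  -- a constituent built from smaller symbols
  deriving (Show, Eq)

-- "The cat slept" as a DaSt. Any operation on this object sees only the formal
-- arrangement of the symbols; what the symbols refer to plays no role.
theCatSlept :: PhraseMarker
theCatSlept =
  Node S
    [ Node DP [Leaf D "the", Leaf N "cat"]
    , Node VP [Leaf V "slept"]
    ]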
Using this terminology, a theory of competence aims to
describe the linguistic DaSts that the mind/brain has. The initial state (UG)
consists of certain kinds of DaSts, the final state (Gs) of particular instances of such DaSts. A particular G is a recursive specification of the permissible DaSts in a given I-language. UG specifies the kinds of DaSts (i.e.
Gs) that FL allows by describing the range of permissible generative procedures
(rules for recursively specifying Gs/I-language specific DaSts). The theory of
UG aims to outline what kinds of features linguistic DaSts can have, what kind
of syntactic relations are permissible, what kinds of constituents are possible
etc. The theory of a particular G (e.g. GE, the grammar of English) will specify which of the DaSt options that UG provides GE
realizes. Both these accounts specify what the system “knows,” i.e. the kinds
of representations it allows, aka the kinds of DaSts it tolerates. In other words, a theory of linguistic DaSts
is a competence theory.
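A hedged way to picture “a G as a recursive specification of the permissible DaSts”: a predicate that admits a structure just in case every local configuration instantiates one of the schemata this particular grammar realizes. The toy types are repeated so the sketch stands alone, and the licensed configurations are invented, not a claim about English.

-- Toy types as in the earlier sketch.
data Cat = D | N | V | DP | VP | S deriving (Show, Eq)
data Tree = Leaf Cat String | Node Cat [Tree] deriving (Show, Eq)

-- The local configurations this hypothetical grammar realizes
-- (mother category paired with its daughters' categories).
licensedConfigs :: [(Cat, [Cat])]
licensedConfigs = [(S, [DP, VP]), (DP, [D, N]), (VP, [V]), (VP, [V, DP])]

-- A recursive specification of the permissible DaSts: a structure is
-- admitted iff every node instantiates a licensed configuration.
admits :: Tree -> Bool
admits (Leaf _ _)    = True
admits (Node c kids) =
  (c, map catOf kids) `elem` licensedConfigs && all admits kids
  where
    catOf (Leaf k _) = k
    catOf (Node k _) = k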
Algs are procedures that use these data structures in
various ways. For example, an alg can use DaSts to figure out what someone said, or what to say, or what rhymes with what, or what follows from what. An alg could, for instance, operate on the phonological DaSts to determine the set of rhyming pairs. Algorithms can
be fast or slow, computationally intensive or not, executable in linear time
etc. None of these predicates apply to
DaSts. Different algs for different tasks can use the same DaSts. Algs are ways of specifying the different
things one can do with DaSts. Algs are part of any performance theory.
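As a toy illustration of this division of labor (the representation of the rime below is a crude stand-in, not a claim about real phonological structure), here is one algorithm among the many that could consume the same phonological DaSts:

import Data.List (tails)

-- A toy phonological DaSt: a word paired with (a crude stand-in for) its rime.
type PhonWord = (String, String)

lexicon :: [PhonWord]
lexicon = [("cat", "aet"), ("hat", "aet"), ("dog", "og"), ("log", "og")]

-- One algorithm that uses these DaSts: collect the rhyming pairs.
-- Parsing, production, or syllable counting could consume the very same
-- structures; being fast, slow, or linear-time is a property of the
-- algorithm, not of the DaSt.
rhymingPairs :: [PhonWord] -> [(String, String)]
rhymingPairs ws =
  [ (w1, w2) | (w1, r1) : rest <- tails ws, (w2, r2) <- rest, r1 == r2 ]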
G&K emphasize the close connection between DaSts and
Algs. As they note:
There is an intimate and
unbreakable relation between how information is arranged [viz. the DaSts,
N]…and the computational routines that operate on the information [viz. algs,
N]…(164).
Consequently, it is quite possible that the computational
context can reveal a lot about the nature of the DaSts and vice versa. I have suggested as much here, when I mooted
the possibility that systems that have bounded content-addressable memories
might guard against the problems such systems are prone to by enforcing
something like relativized minimality on the way that DaSts are organized. In other words, the properties of the users
of DaSts can tell us something about properties of the DaSts.
Another way of triangulating on the properties of a DaSt D1 is by considering the properties of the other DaSts (D2, D3, etc.) that D1 regularly interacts with. A well-designed D1 should be sensitive to how it interacts with D2 and D3. How well do they fit together? What would be an optimal fit? As the
government learns weekly when it tries to integrate its various databases, some
DaSts play together more nicely than others do.
So, the theory of competence can be viewed as the theory of
the linguistic DaSts; what are their primitive features? How are they assembled?
What kinds of relations do they encode? The theory of performance can be viewed
as asking about the algs that operate on these DaSts and the general cognitive
architectural environment (e.g. memory) within which these algs operate. Both DaSts and algs contribute to linguistic
performance. In fact, one of minimalism’s bets is that one can learn a lot
about linguistic DaSts by carefully considering what a well-designed linguistic
DaSt might be like given its interactions with other DaSts and how it will be
used. So, performance considerations can (at
least in principle) significantly inform us about the properties of linguistic
DaSts. We want to know both about the properties of linguistic representations
and about the operations that use these representations for various ends. However, though the two are closely related, the aim of the theory is to disentangle the contributions of each. And that's why the distinction is important.
"So, the theory of competence can be viewed as the theory of the linguistic DaSts; what are their primitive features? How are the[y] assembled? What kinds of relations do they encode?"
Could you say a little bit more on why you think that "How are they [= the DaSts?] assembled" is a question that a competence theory needs to answer?
(Probably I'm misunderstanding what you take this question to be, but to me any answer to this seems to require specification of an algorithm to actually build/assemble data structures. At the very least, this "how" question strikes me as rather different from the other two "what"-questions.)
I'm not sure I really get the analogy. Partly this is because the line between data and program is not as clear as people want to say it is. Church encodings make this point fairly clearly: what is a pair in the lambda calculus? Well, you could implement it in some particular way in your computer, as say a memory location with two parts. Then it definitely looks "data"-y. But that's merely one choice. The pure LC doesn't have this option (there's no memory to speak of). So what did Church do? Well, he realized that what makes a pair a pair is something more like an algebraic specification using three operations which I'll call "pair" (for making pairs), "fst" (for getting the first element), and "snd" (for getting the second):
fst (pair a b) = a
snd (pair a b) = b
It doesn't matter how "pair", "fst", and "snd" are implemented so long as these equations hold. So Church came up with the following definitions (using \ for lambda):
pair = \x. \y. \f. f x y
fst = \p. p (\x. \y. x)
snd = \p. p (\x. \y. y)
And we can now check that the equations hold, and they do. But where's the data? As Jerry Sussman said about this in his SICP lectures, we have "data" that's made of nothing but air.
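For anyone who wants to run the check, here is a direct transcription of the definitions above into Haskell (fst and snd are renamed fstC and sndC only to avoid clashing with the Prelude's versions; the type signatures are added for readability):

pair :: a -> b -> ((a -> b -> c) -> c)
pair x y = \f -> f x y

fstC :: ((a -> b -> a) -> r) -> r
fstC p = p (\x _ -> x)

sndC :: ((a -> b -> b) -> r) -> r
sndC p = p (\_ y -> y)

-- Checking the two equations on a concrete pair.
main :: IO ()
main = do
  print (fstC (pair (1 :: Int) "two"))  -- 1
  print (sndC (pair (1 :: Int) "two"))  -- "two"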
So what really _is_ data, and how can we analogize UG to this? I think perhaps a better analogy really is to the specifications there. UG isn't the data structures themselves, but the specifications that the implementations must satisfy. Some implementations might satisfy the specification quite well, while others might do it poorly. But the specification is well defined independent of the implementations.
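One hedged way to render that specification/implementation split in code: an interface states what any implementation of pairs must provide (the two equations are the spec), and quite different implementations can satisfy it. All names here are invented for illustration.

{-# LANGUAGE RankNTypes #-}

-- The specification: whatever implements PairLike must satisfy
--   fstP (mkPair a b) == a   and   sndP (mkPair a b) == b.
class PairLike p where
  mkPair :: a -> b -> p a b
  fstP   :: p a b -> a
  sndP   :: p a b -> b

-- A "memory cell with two parts" style implementation.
data Boxed a b = Boxed a b
instance PairLike Boxed where
  mkPair           = Boxed
  fstP (Boxed a _) = a
  sndP (Boxed _ b) = b

-- A Church-style implementation: the pair is nothing but a function.
newtype Churchy a b = Churchy (forall c. (a -> b -> c) -> c)
instance PairLike Churchy where
  mkPair a b       = Churchy (\f -> f a b)
  fstP (Churchy p) = p (\a _ -> a)
  sndP (Churchy p) = p (\_ b -> b)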
I buy Darryl's amendment. It is a specification of data structures.
I think an interesting alternative possibility is that UG might instead be a collection of tools that come out of the box as separate units that can be put together in whatever way (provided the input/output properties of each are met). As a rough analogy, think of a programming language, where you have some well defined list of primitives and means of combination, but you put them together and get an infinite number of different programs. I don't know if this line of research has really been pursued, but it seems like the kind of thing that could give rise to the large scale effects that cut across particular phenomena, e.g. locality. Lots of things seem to employ nearness constraints of one sort or another, but maybe not in exactly the same way. Maybe it's because there's some common computational core that gets plugged into different parts of the system. It's that underlying device which has nearness effects, but the details of the effect in different cases depend on what that device is used for.
I think this is pretty distinct from the specification view, but probably has more plausibility in terms of what could actually be coded on the genome.
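For concreteness only, here is one toy shape the "common computational core" idea could take: a single generic nearness routine that different subsystems plug different matching criteria into, so each exhibits a locality effect whose surface details differ. Everything here (names, cases, representations) is invented.

import Data.List (find)

-- The shared core: scan from the closest element outward and return the
-- first one satisfying a property -- a generic minimality-style search.
closest :: (a -> Bool) -> [a] -> Maybe a
closest p = find p . reverse   -- "closest" here = nearest to the right edge

-- Plugged into one subsystem: the closest potential pronoun antecedent.
closestAntecedent :: [String] -> Maybe String
closestAntecedent = closest (`elem` ["she", "he", "they"])

-- Plugged into another: the closest element bearing some feature.
closestWithFeature :: String -> [(String, String)] -> Maybe (String, String)
closestWithFeature feat = closest ((== feat) . snd)

-- Both show a nearness effect, but the details differ because the property
-- fed to the shared core differs.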