I have long believed that physics envy is an excellent foundation
for linguistic inquiry (see here). Why?
Because physics is the paradigmatic
science. Hence, if it is ok to do something there, it’s ok to do it anywhere else in the sciences (including the cog-neuro (CN) sciences, and linguistics with them), and if a suggested methodological precept fails for physics, then others
(including CNers) have every right to treat it with disdain. Here’s a useful
prophylactic against methodological sadists: Try your methodological dicta out
on physics before you encumber the rest of us with them. Down with
methodological dualism!
However, my envy goes further: I have often looked to (popular)
discussions about hot topics in physical theory to fuel my own speculations.
And recently I ran across a stimulating, suggestive piece about how some are
trying to rebuild quantum theory from the ground up using simple physical
principles (QTFSPP) (here).
The discussion is interesting for me in that it leads to a plausible suggestion
for how to enrich minimalist practice. Let me elaborate.
The consensus opinion among physicists is that nobody really
understands quantum mechanics (QM). Feynman is alleged to have said that anyone
who claims to understand it, doesn’t. And though he appears not to have said exactly this (see here
section 9), it's a widely shared sentiment. Nonetheless, QM (or the Standard Theory)
is, apparently, the most empirically successful theory ever devised. So, we
have a theory that works yet we have no real clarity as to why it works. Some
(IMO, rightly) find this a challenge. In response they have decided to
reconstitute QM on new foundations. Interestingly, what is described are
efforts to recapture the main effects of QM within theories with more natural
starting points/axioms. The aim, in other words, is reminiscent of the
Minimalist Program (MP): construct theories that have the characteristic
signature properties of QM but are grounded in more interpretable axioms.
What’s this mean? First let’s take a peek at a couple of examples from the
article and then return to MP.
A prominent contrast within physics is between QM and
Relativity. The latter (the piece mentions special relativity) is based on two
fundamental principles that are easy to understand and from which all the weird
and wonderful effects of relativity follow. The two principles are: (1) the
speed of light is constant and (2) the laws of physics are the same for two
observers moving at constant speed relative to one another (or, no frame of
reference is privileged when it comes to doing physics). Grant these two
principles and the rest follows. As QTFSPP puts it: “Not only are the axioms
simple, but we can see at once what they
mean in physical terms” (my emphasis, NH) (5).
Standard theories of QM fail to be physically perspicuous, and the aim of reconstructionists is to remedy this by finding principles to ground QM that are as natural and physically transparent as those that Einstein found for
special relativity. The proposals are
fascinating. Here are a couple:
One theorist, Lucien Hardy, proposed focusing on “the
probabilities that relate the possible states of a system with the chance of
observing each state in a measurement” (6). The proposal consists of a set of
probabilistic rules about “how systems can carry information and how they can
be combined and interconverted” (7). The claim was that “the simplest possible
theory to describe such systems is quantum mechanics, with all its
characteristic phenomena such as wavelike interference and entanglement…” (8).
Can any MPer fail to reverberate to the phrase “the simplest possible theory”?
At any rate, on this approach, QM is fundamentally probabilistic, and how probabilities mediate the conversion between states of the system is taken as the basis of the theory. I cannot say that I understand what this entails, but I think I get the general idea and how, if this were to work, it would serve to explain why QM has some of the odd properties it does.
Another reconstruction takes three basic principles to
generate a theory of QM. Here’s QTFSPP quoting a physicist named Jacques
Pienaar: “Loosely speaking, their principles state that information should be localized
in space and time, that systems should be able to encode information about each
other, and that every process should be in principle reversible, so that
information is conserved.” Apparently, these assumptions, suitably formalized, lead to theories with “all the familiar quantum behaviors, such as superposition and entanglement.” Pienaar identifies what makes these axioms
reasonable/interpretable: “They all pertain directly to the elements of human
experience, namely what real experimenters ought to be able to do with systems
in their laboratories…” So, specifying conditions on what experimenters can do
in their labs leads to systems of data that look QMish. Again, the principles,
if correct, rationalize the standard QM effects that we see. Good.
QTFSPP goes over other attempts to ground QM in
interpretable axioms. Frankly, I can only follow this, if at all,
impressionistically, as the details are all quite above my capacities. However,
I like the idea. I like the idea of looking for basic axioms that are interpretable (i.e. whose (physical)
meaning we can immediately grasp) not merely compact. I want my starting points
to make sense too. I want axioms that make sense computationally, whose meaning
I can immediately grasp in computational terms. Why? Because, I think that our
best theories have what Steven Weinberg described as a kind of inevitability and
they have this in virtue of having interpretable foundations. Here’s a quote
(see here
and links provided there):
…there are explanations and explanations. We should not be satisfied with a theory that explains the Standard Model in terms of something complicated and arbitrary…To qualify as an explanation, a fundamental theory has to be simple - not necessarily a few short equations, but equations that are based on a simple physical principle…And the theory has to be compelling - it has to give us the feeling that it could scarcely be different from what it is.
Sensible interpretable axioms are the source of this
compulsion. We want first principles that meet the Wheeler T-shirt criteria (after John Wheeler): they make sense and are simple enough to be stated “in one simple sentence that the non-sophisticate could understand” (or, more likely, a few simple sentences). So, with this in mind, what about fundamental starting points for MP accounts? What might these look like?
Well, first, they will not
look like the principles of GB. IMO, these principles (more or less) “work,”
but they are just too complicated and complex to be fundamental. That’s why GB
lacks Weinberg’s inevitability. In fact, it takes little effort to imagine how GB could “be different.” The central problem with GB principles is that
they are ad hoc and have the shape
they do precisely because the data happens to have the shape it does. Put
differently, were the facts different, we could rejigger the principles so that they would come to mirror those facts and not be any the worse off for that. In this regard, GB shares the
problem QTFSPP identifies with current QM: “It’s a complex framework, but it’s
also an ad hoc patchwork, lacking any obvious physical interpretation or
justification” (5).
So, GB can’t be fundamental because it is too much of a
hodgepodge. But, as I noted, it works pretty well (IMO, very well actually,
though no doubt others would disagree). This is precisely what makes the MP
project to develop a simple natural theory with a specified kind of output
(viz. a theory with the properties that GB describes) worthwhile.
Ok, given this kind of GB reconstruction project, what kinds
of starting points would fit? I am about
to go out on a limb here (fortunately, the fall, when it happens, will not be
from a great height!) and suggest a few that I find congenial.
First, the fundamental principle of grammar (FPG)[1]:
There is no grammatical action at a distance. What this means is that for two
expressions A and B to grammatically interact, they must form a unit. You can
see where this is going, I bet: for A and B to G interact, they must Merge.[2]
Second, Merge is the simplest possible operation that
unitizes expressions. One way of thinking of this is that all Merge does is make A and B, which were hitherto separate, into a unit. Negatively, this implies that it in no way changes A and B in making them a unit, and that it does nothing more than make them a unit (e.g. it imposes no order on A and B, as this would be doing more than unitizing them).
One can represent this formally as saying that Merge takes A and B and forms the set {A,B}, but this is not because Merge is a set-forming operation, but because sets are the kinds of objects that do nothing more than unitize the objects that form the set. They don’t order the elements or change them in any way. Treating Merge(A,B) as creating leaves of a Calder mobile would have the same effect, and so we can say that Merge forms C-mobiles just as well as we can say that it forms sets. At any
rate, it is plausible that Merge so conceived is indeed as simple a unitizing
operation as can be imagined.
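For the concretely minded, here is a toy rendering (mine, in Python, with made-up names like merge) of what “nothing but unitization” is meant to convey; the frozenset is just a convenient stand-in, exactly as the Calder mobile remark above suggests.

```python
# A toy sketch of Merge as pure unitization: it neither alters its inputs nor
# orders them; it just makes two objects into a single unit. The name `merge`
# and the use of frozensets are illustrative choices, nothing more.

def merge(a, b):
    """Combine two syntactic objects into an unordered unit."""
    return frozenset({a, b})

# Lexical atoms can be plain strings for the purposes of the sketch.
the, dog, barked = "the", "dog", "barked"

dp = merge(the, dog)       # {the, dog}
tp = merge(dp, barked)     # {{the, dog}, barked}

# No order is imposed: merge(a, b) and merge(b, a) are the same unit.
assert merge(the, dog) == merge(dog, the)

# The inputs are untouched: the atoms are still the same atoms inside the unit.
assert the in dp and dog in dp
```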
Third, Merge is closed in the domain of its application
(i.e. its domain and range are the same). Note that this implies that the
outputs of Merge must be analogous to lexical atoms in some sense given the ineluctable
assumption that all Merges begin with lexical atoms. The problem is that unitized lexical atoms (the “set”-like outputs of Merge) are not themselves lexical atoms, and so unless we say something more, Merge is not closed. So, how to close it? By mapping the Merged unit back to one of the elements Merged in composing it. So if we map {A,B} back to A or to B, we will have closed the operation in the domain of the primitive atoms. Note that by doing this, we will, in effect, have formed an equivalence class of expressions with the modulus being the lexical atoms. Note that this, in effect, gives us labels (oh nooooo!), or labeled units (aka constituents), and
endorses an endocentric view of labels. Indeed, closing Merge via labeling in
effect creates equivalence classes of expressions centered on the lexical atoms
(and more abstract classes if the atoms themselves form higher order classes).
Interestingly (at least to me), so closing Merge allows for labeled objects of
unbounded hierarchical complexity.[3]
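Again for concreteness, here is a toy sketch (my own, with invented names) of what closing Merge via labeling amounts to: every derived unit is mapped back to one of its parts, so outputs fall into equivalence classes keyed to lexical atoms and can feed further Merges just as atoms do.

```python
# A toy sketch of closing Merge under labeling: the output {A,B} is mapped back
# to the label of one of A, B, so every derived unit belongs to an equivalence
# class whose "modulus" is a lexical atom. All names here are invented.

from dataclasses import dataclass

@dataclass(frozen=True)
class LabeledSO:
    label: str            # the lexical atom that types the unit
    parts: frozenset      # the unordered unit {A, B}

def atom(x: str) -> LabeledSO:
    return LabeledSO(label=x, parts=frozenset({x}))

def merge(a: LabeledSO, b: LabeledSO, head: LabeledSO) -> LabeledSO:
    """Form {A, B} and label it with the label of one of A, B (the head)."""
    assert head is a or head is b, "the label must come from one of the merged items"
    return LabeledSO(label=head.label, parts=frozenset({a, b}))

the, dog, barked = atom("the"), atom("dog"), atom("barked")

dp = merge(the, dog, head=the)        # labeled "the": same class as that atom
tp = merge(dp, barked, head=barked)   # labeled "barked"

# Closure: outputs have the same shape as inputs, so embedding can continue
# without bound, yielding labeled hierarchical structure.
bigger = merge(tp, atom("allegedly"), head=tp)
print(bigger.label)                   # "barked"
```

The one design choice worth flagging is that the label of a complex unit is always a lexical atom, which is one way of cashing out the idea of equivalence classes with the lexical atoms as modulus; nothing hangs on the particular data structure.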
These three principles seem computationally natural. The
first imposes a kind of strict locality condition on G interactions. E- and I-Merge adhere to it (and do so strictly, given labels). Merge is a simple, very simple,
combination operation and closure is a nice natural property for formal systems
of (arbitrarily complex) “equations” to have. That they combine to yield
unbounded hierarchically structured objects of the right kind (I’ve discussed
this before, see here
and here)
is good as this is what we have been aiming for. Are the principles natural and
simple? I think so (at least from a kind of natural-computation point of view), but I would, wouldn’t I? At any rate, here’s a stab at what interpretable axioms might look like. I doubt that they are unique, but I don’t really care if they aren’t. The goal is to add interpretability to the demands we make on theory,
not to insist that there is only one way to understand things.
Nor do we have to stop here. Other simple computational
principles include things like the following: (i) shorter dependencies are
preferred to longer dependencies (minimality?), (ii) bounded computation is
preferred to unbounded computation (phases?), (iii) All features are created
equal (the way you discharge/check one is the way you discharge/check all). The
idea is then to see how much you get starting from these simple and transparent
and computationally natural first principles. If one could derive GBish FLs
from this then it would, IMO, go some way towards providing a sense that the
way FL is constructed and its myriad apparent complexities are not complexities
at all but the unfolding of a simple system adhering to natural computational
strictures (snowflakes anyone?). That, at least, is the dream.
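To fix intuitions, here is a toy rendering of principle (i), the preference for shorter dependencies. It is a sketch only: the names are invented and depth of embedding stands in crudely for real structural closeness.

```python
# A toy rendering of principle (i): when several elements could enter a given
# dependency, the one yielding the shorter dependency wins (a minimality-like
# intervention effect). "Depth" is a crude stand-in for structural distance.

def shortest_dependency(requirement, candidates):
    """candidates: (depth, features) pairs; return the closest match, if any."""
    matching = [c for c in candidates if requirement in c[1]]
    return min(matching, key=lambda c: c[0], default=None)

# Two wh-elements compete for a single wh-dependency; the less deeply embedded
# one is chosen, so the longer dependency is blocked.
candidates = [(2, {"wh"}), (5, {"wh"}), (3, {"neg"})]
print(shortest_dependency("wh", candidates))   # (2, {'wh'})
```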
I will end here. I am still in the middle of pleasant
reverie, having mesmerized myself by this picture. I doubt that others will be
as enthralled, but that is not the real point. I think that looking for general
interpretable principles on which to found grammatical theory makes sense and that
it should be part of any theoretical project. I think that trying to derive the
“laws” of GB is the right kind of empirical target. Physics envy prompts this
kind of search. Another good reason, IMO, to cultivate it.
[1]
I could have said “the central dogma of syntax,” but refrained. I have used FPG
in talks to great (and hilarious) effect.
[2]
Note that this has the pleasant effect of making AGREE (and probe-goal
architectures in general) illicit G operations. Good!
[3]
This is not the place to go into this, but the analogy to clock arithmetic is useful. Here too, via the notion of equivalence classes, it is possible to extend operations defined for some finite base of expressions (1-12) to any number. I
would love to be able to say that this is the only feasible way of closing a finite domain, but I doubt that this
is so. The other suspects however are clearly linguistically untenable (e.g.
mapping any unit to a constant, mapping any unit randomly to some other atom).
Maybe there is a nice principle (statable in one simple sentence) that would
rule these out.
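For concreteness, a tiny illustration (mine) of the clock analogy: mapping every number back to one of the twelve residues closes addition over the finite base, much as labeling closes Merge over the lexical atoms.

```python
# Clock arithmetic as closure via equivalence classes: any integer is mapped
# back to one of finitely many residues, so addition never leaves the base.
# Purely an illustration of the analogy drawn in this footnote.

def clock(n, modulus=12):
    """Map any integer to its equivalence-class representative in 1..modulus."""
    return ((n - 1) % modulus) + 1

# 10 o'clock plus 5 hours is 3 o'clock: the result is again a clock value.
print(clock(10 + 5))   # 3
```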