Wednesday, April 11, 2018

Bale and Reiss formalism

Bill Idsardi & Eric Raimy

[Note: in this post ordered pairs and tuples will be enclosed in parentheses, (x,y), instead of with angle brackets, <x,y>. The Blogger platform tends to eat the angle brackets, which it interprets as malformed HTML. Yes, we could do it with HTML character entities, but that's painful to edit.]

Warning: this post is also not light bedtime reading.

We think that it will be instructive now to examine the Bale & Reiss (forthcoming; BR) formalism for phonology. Although their book is “just” an introductory text it is their laudable intent and very impressive achievement to be rigorous and didactic in building up their formalism. They start out with basic set theory, and they make it all very accessible to beginners, building it up piece by piece. Consequently the book is extremely clear on many matters that (all) other intro texts are vague or silent about. And we feel that the book is very successful in this regard. But (you guessed it) we have some qualms about their treatment of precedence.

In their formalism, BR have:
  1. Values, drawn from W = {+, -}
  2. Features, drawn from F = {high, low, … } -- a finite set
  3. Feature-value pairs, elements of S = W x F -- also finite
  4. Segments, which are consistent sets of feature-value pairs (p 377); see below -- also finite
  5. Forms, which are tuples of segments (p 36, pp 101-103) -- this is intended to be an infinite set
Notice that there is no mention of time or precedence here yet, so we’ll have to wait to see how that’s constructed (hint: it’s in 5, sort of).
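To make the layering in (1-5) concrete, here is a rough sketch in Python. The encoding is ours, not BR's: the feature inventory is a small sample we chose, frozensets stand in for their sets, and Python tuples for their tuples.

```python
# A rough sketch of BR's layers 1-5 (our encoding, not BR's own notation).
VALUES = {"+", "-"}                          # 1. W
FEATURES = {"high", "low", "back", "round"}  # 2. F (a small sample; BR's F is larger but finite)
PAIRS = {(v, f) for v in VALUES for f in FEATURES}  # 3. S = W x F -- also finite

# 4. A segment is a set of feature-value pairs; frozenset lets it be a
#    member of other structures (like the tuples below).
seg = frozenset({("+", "high"), ("-", "low"), ("-", "back")})

# 5. A form is a tuple of segments; tuples of every length are allowed,
#    so the set of forms is infinite even though S itself is finite.
form = (seg, seg)
```

Note that nothing in this sketch mentions time or precedence; that only enters (implicitly) with the tuples in (5).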

Here’s the definition of Consistency (BR 445): a set of feature-value pairs is consistent just in case there is no feature f such that both (+,f) and (-,f) are in the set.

So consistency is a well-formedness condition on segments: it goes above and beyond the set requirement by itself, since {(+,high), (-,high)} is a perfectly good set of feature-value pairs. Like our discussion of EFP features a few posts ago, this is a NAND condition, +high NAND -high. It’s just a little harder to state now without a basic notion of events, since it has to be stated as a condition on certain sets of features (absent a clear notion of time) rather than as a conjunction of properties of an event. That is, since segments are constructed as sets of features, the condition is stated as a condition on sets. The way it’s formulated, it quantifies over sets (A set of features …) and also over features inside those sets (no feature …), and because it quantifies over sets it is a second-order statement. However, since the set S is finite, the work done by this definition could instead be done by exhaustively listing the licit combinations (the consistent elements of the power set pow(S)) without using any quantifiers.
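Since S is finite, the exhaustive-listing alternative can actually be carried out. Here is a sketch with a toy two-feature S of our own choosing: the NAND condition filters pow(S) down to the licit segments, and with two features each feature is independently absent, +, or -, giving 3 × 3 = 9 consistent segments out of 2⁴ = 16 subsets.

```python
from itertools import chain, combinations

S = [(v, f) for v in ("+", "-") for f in ("high", "low")]  # toy S: 4 pairs

def consistent(seg):
    # The +f NAND -f condition: no feature f with both (+,f) and (-,f) in seg.
    return not any(("+", f) in seg and ("-", f) in seg for (_, f) in seg)

# Enumerate pow(S) and keep only the consistent subsets.
powerset = chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))
licit = [frozenset(seg) for seg in powerset if consistent(frozenset(seg))]
```

For the real (larger but still finite) S the same finite list exists in principle; no quantifiers over sets are needed to specify it.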

Without an available formal notion of time at this point, it’s a little difficult to know what to make of the segment datatype. The latent idea, so far unexpressed in the formalism, is that the feature-value pairs in a segment occur “at the same time, more or less” or "overlapping in time, more or less" or something like that. But that’s not formally expressed yet, except by allusion in the name of the construct, “segment”. Presumably some statements in the transducers handle the relationships between the elements of a segment and their motor and auditory correlates. Therefore, without knowing what's in the transducers, it's hard to know if "being together in a segment" is a substantive notion or not. If there is some property such as approximate temporal overlap that's veridical with "in the same segment" then the notion would seem to qualify as substantive (at least as we understand it, i.e. veridical and useful). But Chomsky 1964's arguments regarding linearity are very powerful here, and strongly suggest that there is no obvious veridical notion in the overall mapping from UR to phonetics. So in that case, segments could be a purely formal part of the theory, with their in-the-same-set-edness not mapping to any consistent property in the motor or perceptual systems. That is, the transductions for segmenthood would be "interesting"; we think this is Veno's view at least.

But this will also depend on what the rule system does in the phonology, and therefore we shouldn’t be too hasty to think that the linearity arguments directly establish this point for SFP. One of Chomsky's linearity arguments was the comparison between “writer” and “rider”. With rules of flapping and vowel lengthening, the LTM (UR) distinction between /rayt+r/ and /rayd+r/ is mapped to a surface difference in the length of the preceding vowel [rayDr] and [ra:yDr] (these are Chomsky's transcriptions). So the derivation as a whole, as well as the LTM representation do not respect a condition of linearity with "phonetic" representations. But what about the output forms of the phonology, [rayDr] and [ra:yDr]? How do they fare with respect to linearity with the motor and auditory forms at the interface? Better, certainly, but are they "phonetic enough" to find a veridical relationship with specified aspects of the motor and perceptual systems, especially at the point of interface? This seems like a hard, and important question.

Moreover, if features are substantive, the +high NAND -high condition has at least a potential lawful relationship to the co-domain motor and perceptual conditions, i.e. mot(+high) NAND mot(-high) and also aud(+high) NAND aud(-high), where mot() and aud() are the transductions with the motor and auditory systems respectively. If these NAND statements are true motor and perceptual statements, then the consistency requirement is recapitulating motor and perceptual conditions within the model (as we -- er and wji -- think it probably should). But this isn’t entirely substance-free then. What would be a (purely) formal (and non-substantive) universal is if we could show that it is NOT the case that mot(+high) NAND mot(-high) and also not the case that aud(+high) NAND aud(-high). Then having +high NAND -high in the phonology would be a phonological truth without any motor or perceptual connection or motivation. And depending on the actual content of mot() and aud() that could perhaps be the case, but we need some actual proposals for mot() and aud() in order to evaluate that. If so, then +high NAND -high could be an example of a pure phonological delusion, of the type that Charles suggested last week.

Now to the forms. For much of their presentation, BR just call them strings without saying what that means formally. But we do find out on p 36 that they intend them to be what they call “ordered sets”, which they then tell us are tuples, see also BR chapter 18. We wish they hadn’t used the term “ordered set” because that term already has an established meaning in mathematics as a structure (S, R) where S is a set and R is a relation of order over the set, e.g. Schröder 2003 Ordered Sets. The more usual treatment would be to define strings inductively using concatenation (e.g. Harrison 1978).

OK, so what’s a tuple? Using tuples for this purpose brings up some very interesting issues. Counting up by size, there is only one 0-tuple, (). (Bale and Reiss use angle brackets, <>.) The 1-tuples are from S, notated (x), the 2-tuples are from S x S, notated (x,y) -- think points on a plane -- the 3-tuples from S x S x S, notated (x,y,z) -- think points in 3D space -- and so on. A problem here is that unless there is a fixed upper limit on the size of the tuples, then this is not finitely axiomatizable in first-order logic as it requires an infinite number of statements. This issue has a famous history in the case of arithmetic, Ryll-Nardzewski 1952, Mostowski 1952, Montague 1964. (I [wji] already commented on the blog about problems like this in regard to < and successor. You need transitive closure of successor to get <, and that’s not first-order finitely axiomatizable.) There are alternative definitions for tuples using nested tuples, but that doesn’t get us out of the problem here, which is ultimately one of inductive (recursive) definitions. This seemingly innocuous move trips up many, many people (see Keller 2004, Some Remarks on the Definability of Transitive Closure in First-order Logic and Datalog).

Also unhelpfully, the types for tuples are all different from each other, as (x,y) has nothing to do with (x,y,z). (Hutton 2016:26 is helpful here as are the discussions in formal semantics of things like transitive (e, (e,t)) and intransitive (e,t) verbs.)

So how do BR get precedence? The way they do this is to invoke a convention to index the components of the tuples (BR pp 101-103); their discussion refers to them primarily as strings.
"These strings have an implied left-to-right linear order and are equivalent in structure to ordered sets written with angled brackets, as we discussed above. For example, the mental representation of the word man will be mæn which is equivalent to <mæn>." (p 101)
"We use various numeral subscripts, or indexes, not only to distinguish between the variables but also to indicate their relative position in a string. Thus, if x₁x₂x₃ = mæn, then x₁ = m (the first member of the string), x₂ = æ (the second member of the string), and x₃ = n (the third member of the string)." (p 102)
As could be predicted, we're not keen on implied representations for precedence or order and would much prefer an explicit notation for it instead. We're also not sure what "left-to-right" means other than something about typography. Is this a statement about how phonological representations map to temporal relations in the motor or auditory system?

But in addition, there are a couple of other issues with doing things this way. First, we don’t have any numbers yet because (1-5) didn’t provide any. So we have to give ourselves an infinite ordered set (in the usual sense, e.g. Schröder 2003), presumably the natural numbers N, which, as we said, are also not first-order finitely axiomatizable. When we have the numbers we can get the definitions that BR give, once we actually do the tuple indexing. It’s obvious how to do it, but since we are being formal, we still need to say it. So here it is:
  • For all forms (x,y), x has index 1, y has index 2
  • For all forms (x,y,z), x has index 1, y has index 2, z has index 3
  • For all forms (w,x,y,z), w has index 1, x has index 2, y has index 3, z has index 4
  • ...
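Spelled out as a sketch (with plain strings standing in for the segment sets), the indexing convention just pairs each component of a form with a natural number:

```python
# The indexing convention made explicit: pair each component of a form
# (a tuple of segments) with a natural number, starting from 1.
def index_form(form):
    return {i: seg for i, seg in enumerate(form, start=1)}

# Strings as stand-ins for the segment sets of /mæn/:
indexed = index_form(("m", "æ", "n"))
# indexed[1] is "m", indexed[2] is "æ", indexed[3] is "n"
```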
But now we’ve got the whole set of natural numbers in phonology. Do we really want them in there? Now we can talk about things like the 25th segment -- do we want to be able to do that? There is another way out, without using any numbers or indexes. We can instead define the precedence relations directly on the components of the tuples, as follows:
  • For all forms (x,y), x^y
  • For all forms (x,y,z), x^y and y^z
  • For all forms (w,x,y,z), w^x and x^y and y^z
  • ...
You get the picture. (By the way, we’re quantifying over tuples of sets of feature-value pairs here.) Now we don’t need any numbers, and we can’t talk about segments by their position indexes because there aren’t any. And, as you could guess by now, this isn’t finitely axiomatizable either, because there would be an infinite set of these statements. This is why programming languages like Haskell have datatypes like lists, and strings are then lists of symbols. Then, at least, we can write an inductive definition on the size of the list (which we could mimic here using nested tuples, which are just cons cells by another name). That’s still not a first-order finite characterization, but it seems pretty clear that we’re not going to get one by proceeding this way, going up through sets and tuples, ending up with an infinite set of types.

In summary, precedence isn’t a primitive in the BR treatment, instead they build it out of feature-value pairs, sets, tuples and indices. Doing it this way is not first-order finitely axiomatizable. But it is finitely axiomatizable if we do it with events, properties and precedence as primitives.

So what’s the upshot here? Do these arcane points about finitary vs infinitary logic really matter? Are first-order and finiteness too much to ask? Probably. We would be content with monadic second-order (MSO) definable theories (which are first order plus quantification over monadic properties, like (*10) from the Boring details post, even though we rejected (*10)). Why are we ok with MSO? For one, this seems consistent with theories of semantics that we like (Pietroski) and MSO over strings is one characterization of the set of regular (= finite-state) languages, making a connection to the sub-regular hierarchy. But if we can bring everything down to first-order, then so much the better.


  1. One remark on MSO and the connection to the sub-regular hierarchy: I believe that's mostly orthogonal to your concern in this post. I take the latter to be concerned with how to axiomatize the class of intended models, whereas all subregular work already presumes that the class of models is suitably restricted. Otherwise no set of finite strings could be FO-definable because finiteness is not an FO-definable property. Similarly, a context-free grammar as a tree generator would not have the same weak generative capacity as a context-free grammar as a string generator because the former is a fragment of propositional logic over tree propositions and the latter corresponds to a logic that vastly exceeds the power of MSO. The same split arises in finite model theory, where one takes finiteness as already given and then asks how logics differ over this class of intended models.

    So the thing that is unclear to me at this point is why we should care that the Bale & Reiss system is not truly FO if we take into account the cost of axiomatizing the class of intended models. Do we believe that linguistic cognition is set up to entertain non-standard models (e.g. Z-chains for a FO-theory of natural numbers) and thus we need axioms to rule out those non-standard models?

    1. Hi Thomas! Thanks very much for the comment. Yes, I agree with you here, MSO is kind of a curveball at the end. As you say in the second paragraph, I'm not at all sure that we should care whether it's pure FO or not. This post (maybe more than most) is a lot of noodling around, trying to get a handle on things, certainly grasping at some straws that happen to be on the floor. I'm not at all sure what to think about non-standard models for arithmetic vis-à-vis cognition, and even less sure how that question relates to our concerns in phonology and syntax.

  2. Sub Sub-point: the issue doesn't have to be about axiomatizing the class of intended models. One can ask what kind of "language of thought" children bring to the table for purposes of acquiring languages that connect pronunciations with meanings. Suppose there were arguments that on the meaning side, we needed to posit a mental language that allows for second-order quantification, but only finitely many non-monadic predicates (all of which are atomic). Then one might throw Bill's curveball, and speculate that with regard to acquiring phonology, children don't employ a language of thought that is different in kind.

  3. Thanks Paul. As I'm sure you noticed, we did limit ourselves to just one or two non-monadic relations (precedence, and the comments on a separate spatial relation for ASL). So I didn't fall asleep during that lesson ;-)

    1. Noticed indeed. If you're right about phonology, this invites a minimalist speculation that we've talked about before: a spare autonomous syntax could provide a semantics-phonology interface that approximates what we actually see, given a language of thought that is second-order but fundamentally monadic. In which case, given independent evidence for at least a spare autonomous syntax, we shouldn't assume a more powerful language of thought with regard to meaning or pronunciation.

  4. Alan Bale should really comment on this stuff, since he is primarily responsible for our rule semantics, but here is a quick point that I think he will agree with w.r.t. "But now we’ve got the whole set of natural numbers in phonology." No -- we have used the natural numbers to explain how the phonology works, but the numbers are not in the phonological grammar. We could have used other formalisms but the one with numbers is pedagogically convenient.
    My limited understanding of the matter is that the formalism used to, say, study the semantics of a programming language, is not going to be identical to the formalism of the language under study. I think that is what is going on here.
    (Of course, we also have to keep in mind that we are not delusional enough to think that our so-called SPE rule syntax is anywhere close to sufficient as a model of the phonological faculty.)
    I guess my comment will show that I don't understand the relationship between characterizing the complexity of the phono rule SYNTAX and the complexity of its semantics. Is the nature of this relationship obvious?

    1. Looking back at our semantics, I saw that we did say:
      "We use x and y as metavariables that range over mental representations. They are “metavariables” in that they are not part of the phonological system but rather are used (by us linguists) to specify an interpretation of the system. Thus, in the interpretation rules, x can serve as a stand-in for the mental representation of k (kM), or b (bM), or D (DM), and so on."
      The use of numeral subscripts is introduced just below this passage and was intended to be interpreted in the same way--the numerals are not part of the phonological system, as I suggested the other day.

      In last minute edits, we also try to clarify our writing to express the view that the consistency property of segments is to be understood, not as a constraint or well-formedness condition, but just as a description of segments as natural objects. Segments are NOT built by grammars in the way that sentences are, so there is no need to generate segments and subject the generation to constraints. We DO use consistency as a condition in unification rules later.