Wednesday, April 11, 2018

Hot Cross Buns

Bill Idsardi & Eric Raimy

In a comment Fred asked why we think the EFP structure is a directed multigraph, rather than just a simple digraph. The answer is because we allow self-loops and multiple parallel edges. We'll give linguistic examples later, but in the meantime, here's a fun "blast from the past".

In the book Rules, constraints, and phonological phenomena edited by Bert Vaux and Andrew Nevins, we had a chapter where we discussed various aspects of reduplication. One non-linguistic case that we found quite striking was Jeanne Bamberger's discussion of children's tune-building with Montessori bells in chapter 7 of The Mind Behind the Musical Ear.

Here's the tune for "Hot Cross Buns" in standard musical notation: 

And here's our reinterpretation of the child's encoding of the tune using events, features and precedence, following Bamberger's discussion quite closely:

Bamberger's chapter makes for very interesting reading because the Montessori bells have particular physical characteristics that she is able to  creatively employ to interpret the children's tune-building in very insightful ways. Unlike a piano, there are multiple bells with the same pitch. And the bells themselves are all the same size, so you cannot tell the relative pitch of the bells by visual inspection, you must strike them and hear the note. The bells were deliberately designed in this way, and are part of the Montessori sensorial materials. This allowed Bamberger to make interesting observations about which notes the children felt were the "same" or "different". For instance, the two separate "C" events in our diagram were executed on separate bells, and the same is true for the two "D" events.


  1. I just wrote a long thing about how you need to formally distinguish the edges that can only be traversed once versus those that can be traversed more than once, when I remembered the linearization algorithm from Eric's stuff, that requires all edges to be traversed at least once, and all edges to be traversed as few times as possible (although I'm curiously to see how that gets formalized).

    Also, my brain broke when I read in your reply to my other comment that your precedence relation isn't transitive. I'll keep following closely to see why/how that is.

    1. The non-transitive part is like child() rather than descendant(). If child(a,b) and child(b,c) it doesn't follow that child(a,c). But we allow self-loops, so this analogy falls apart. It's closer to a successor relation where items are allowed to be their own successors.

      But if you want something more mathematical to peruse, these things are similar to Moore machines with the edges as the state transitions (but without a real input alphabet). State transitions are also not transitive. If you can go directly from a to b, and directly from b to c, it does not follow that you can go *directly* from a to c. It's the directly part that makes it non-transitive.

      The Moore machines stand in contrast to the Mealey machines used by xfst/foma, or the general state machines used in UML diagrams.

      We'll give an algorithm for graph walking (path finding) between # and % which employs some interesting decision criteria when it encounters a choice between edges. One of these criteria is that the graph walking prefers untraversed edges to previously traversed ones, raising some interesting questions about working memory. And we need to say a few more things, like preferring going to an unvisited "regular" node instead of going to %.

      But we would be delighted for you to write your own algorithm. There are now maybe about a half dozen different ones, with somewhat different assumptions about what information about edges or nodes is manipulated in the traversal. I think we should explore a lot of possibilities on this question.

      And we're very open to monster discovery (and potentially monster barring). So dream up some crazy cases.

  2. Hi Fred, thanks for the comments and sorry for my delay in interacting on this. Bill and I have had a lot of stories on linearization/serialization but we've never had a 'ALL edges MUST be used at least once'. From the very beginning we had graphs that had conflicts in edges that could not be reconciled (think infixation). We have had a heuristic version of the intent of the statement where we want to most links used as possible.... its complicated and I'm a glutton for punishment... I've been thinking about this for about 20 years now... not sure what that means.