Sunday, November 10, 2013

Computational Linguistics: Too Computational for Linguistics?

Even though I have recently moved on from the bed of nails that is the current job market to a cushy tenure track job, I still find myself reading the job announcements on LinguistList on a daily basis. There's of course all kinds of professional reasons for doing so, but the actual driving force behind this minor obsession of mine is more twisted. For you see, I have an existentialist streak that allows me to derive perverse amounts of joy from things that should cause me grief, worry, pain, and outbreaks of homicidal rage. And job searches for computational linguists got all of that aplenty.

If you also frequent that other blog about linguistics, then you are probably familiar with their annual number crunching of the linguistics job market. Computational linguistics usually does extremely well in those, with the number of job searches vastly outstripping the number of freshly minted PhDs. But my beef isn't so much with the number of available jobs, it's with the kind of jobs advertised. Not at all with the industry jobs, those are straightforward NLP searches and the skills they require are exactly those you will acquire in an MA/MS or PhD program in computational linguistics that's geared towards NLP (startups tend to list more requirements than big companies like Google and Nuance, but anecdotal evidence tells me that they are also more willing to settle for less). But not every computational linguist wants to do NLP, so he or she will look at the computational linguistics searches run by linguistics departments. And that's where the dramedy starts.

Why Linguists Love Numbers

Having seen many, many job announcements over the course of the last five years, I have come to the conclusion --- and you have to read the rest of this sentence in deadpan monotone, there is no outrage on my part here --- that linguistics departments have no interest in computational linguistics, what they want is linguistics with numbers.

Given a choice between somebody who studies OT from the perspective of formal language theory and somebody who couples Stochastic OT with some MaxEnt algorithm to model gradience effects, they'll pick the latter. Given a choice between somebody who works with Minimalist grammars or Tree Adjoining Grammars and somebody who does corpus-based sociolinguistics, they'll pick the latter. Given a choice between somebody who studies the learnability of formal language classes and somebody who presents a Bayesian learner for word segmentation, they'll pick the latter. (Just to be clear, all these examples are made up.) Of course there are exceptions --- I got a job, after all --- but they are rare. And a job search for a computational linguist that explicitly favors the formal type over all others is unheard of.

Even if one considers quantitative research the greatest thing since sliced bread, this is an unfortunate state of affairs. In my case, it makes it way more difficult to convince my students that they have a good chance of landing a job if they join the glorious enterprise of computational linguistics. More importantly, though, it means that linguistics departments, and by extension the field as a whole, are missing out on tons of interesting work. This is the part where the rules of blogosphere click-baiting dictate that I turn the rant-ometer to 11 and diss the narrow-mindedness of linguists and their aversion to rigorous work. But I prefer to only do things that I do well, and my rants tend to be less like precision surgery and more like beating a paraplegic to death with their own severed leg. Not a beautiful sight to behold, so let's instead try to be rational here and answer the question why linguists do not care about theoretical computational linguistics.

The reason is very simple: it's about as approachable as a porcupine in heat. That's due to at least three factors that mutually reinforce each other to create an environment where the average linguist cannot be expected to have even the hint of a shadow of a clue what computational linguistics is all about.

Reason 1: Abstractness/Few Empirical Results

The theoretical work in computational linguistics is mostly "big picture linguistics". The major issues are the generative capacity of language and linguistic formalisms, the memory structures they require, their worst-case parsing performance, and so on. This is appealing because it carves out broad properties of language that are mostly devoid of theory-specific assumptions and pinpoints in what respects formalisms differ from each other if one abstracts away from substantive universals. Eventually this does result in empirical claims, such as what kind of string patterns should not occur in natural language or in which contexts Principle B is expected to break down. But the formal pipeline leading to the empirical applications is long, and it is often hard to see how the mathematical theorems inform the empirical claims. At the same time, many computational linguists are not linguists by training and hence are resistant to moving out of their formal comfort zone. On the Bayesian and probabilistic side, however, we find tons of linguists who want to explain a specific empirical phenomenon and simply use these tools because they get the job done (or at least it looks like they do). When asked why their work matters, they can give a 5 second summary and point to a never-ending list of publications for further background. A theoretical computational linguist needs at least 5 minutes, and they can't give you any papers because they are all too hard.

Reason 2: Difficulty/Lack of Prior Knowledge/Lack of Good Intros

There is no denying that computational linguistics isn't exactly noob-friendly. Before you can even get started you need to be able to read mathematical notation and understand general proof strategies, and that's just the baseline for learning the foundational math and CS stuff you have to know before you can get started on computational linguistics proper. Only once all of that is in place can you really start to think about the implications for linguistics. To add insult to injury, none of the math is anything like what you know from high school or the calculus courses you took as an undergrad: formal language theory, complexity theory, abstract algebra, mathematical logic, finite model theory, type theory, learnability, just what the heck is all this stuff? Do you really want to wrestle with all this arcane lore for at least two years in the vague hope that eventually it might tell you something really profound about language, or would you rather hop on the Bayesian gravy train where your first baby steps only require addition and multiplication?

What's more, the overwhelming majority of math you will have to learn on your own because no linguistics department teaches courses on these topics and the corresponding courses in the math and CS departments focus on things you do not need. There's also a dearth of good textbooks. The best option is to compile your own reader from various textbook chapters and survey papers --- the kind of fun activity that even Sir Buzz Killington wouldn't put on his weekend schedule. On the other hand, you could just drop this whole idea of learning about computational linguistics, take your department's statistics and experimental methodology courses, and spend the weekend marathoning Orange is the New Black. Tough choice.

Reason 3: Lack of Visibility/Lack of Practitioners/Lack of Publications

Computational linguistics as a whole is a minority enterprise within linguistics, but its theoretical spin is downright exotic. The odds of coming across a formal paper in a mainstream journal are rather low. LI sometimes features formal work in its Remarks and Replies section, and some full-blown technical papers got published in Lingua.[1] Formal talks are also far from common at the major conferences such as NELS and WCCFL (CLS is a positive exception in this respect, and the LSA is downright abysmal). But this is not enough to reach critical mass, the level where linguists start to develop a general awareness of this kind of work, its goals, its merits. This of course means fewer hires, which means fewer publications and fewer students, which means reduced visibility, which means fewer hires. It's the textbook definition of a Catch 22, assuming that there are any textbooks that define Catch 22.

We're Doomed, Doomed! But With Every Year we're Slightly Less Doomed

The above isn't exactly great PR for computational linguistics: a steep learning curve with few immediate rewards, permanent outsider status in the linguistic community, and colleagues that waste their Sunday afternoons ululating on some obscure blog. But things aren't quite that bad. People do get jobs (good ones to boot), but the road leading there takes a few more unexpected turns and twists. If you're a syntactician or a phonologist, you have a clear career trajectory, you know how many papers to publish in which journals, what conferences to attend, and what courses you'll wind up teaching. Most importantly, you are part of a big community, with all the boons that entails. The same is also true for quantitative computational linguists. As a theoretical computational linguist who wishes to be hired in a linguistics department, you have to find your niche, your own research program, and you have to figure out how to market it in a field where almost everybody you talk to lacks even the most basic prerequisites.

I imagine that things must have been similar for generativists back in the 50s and 60s when linguistics departments were dominated by descriptivists that knew little about Transformational grammar, didn't care about it at best and actively loathed it at worst. But Transformational grammar succeeded despite a higher degree of necessary rigor because it offered a new and highly insightful perspective. The same is true for theoretical computational linguistics. Scientifically, we are already on our merry way towards a bright future, which is evidenced by breakthroughs in learnability, the unifying force of a transductive perspective of grammar formalisms, and many more (all of which deserve their own posts). On a sociological level, there's still PR issues to overcome, but things are improving in this respect, too.

The three problems listed above can all be solved by publishing more work that solves empirical problems with computational tools that are easy to understand on an intuitive level. And thanks to the efforts of people like Robert Berwick, Aravind Joshi, Paul Smolensky, Ed Stabler, Mark Steedman, and all their students, this has become a lot more common since the early oughts. Jeff Heinz and Jim Rogers have done some remarkable work with their students that factors phonology in appealing ways yet is pretty easy to follow --- if you know how to play dominoes, you can get started right away.[2] Tim Hunter uses MGs in his unification of freezing effects, adjunct islands, and extraposition constraints.[3] The Stabler parser for MGs is already used to model and predict processing difficulties.[4] I, too, have moved from constraints and feature coding to empirical applications of this result for binding and island constraints.[5] All of these topics are simple enough that they can be discussed in an advanced grad student course without having to presuppose years of formal training. As a matter of fact, that's exactly what I am doing in my computational seminar this semester, in an attempt to get the students hooked and eager to try some of the harder stuff --- yes, it's deliberately designed as a gateway drug.
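To give a rough flavor of the domino analogy mentioned above, here is a minimal sketch, assuming the dominoes stand for the adjacent symbol pairs of a strictly 2-local grammar. The toy CV grammar and all names (`EDGE`, `factors`, `is_well_formed`, `ALLOWED`) are made up for illustration and are not taken from the cited papers.

```python
# Hypothetical sketch: a strictly 2-local grammar as a set of allowed "dominoes".
# The grammar and all names are illustrative assumptions, not from the cited work.

EDGE = "#"  # marks the left and right word boundary


def factors(word, k=2):
    """Return the set of length-k substrings of the boundary-padded word."""
    padded = EDGE * (k - 1) + word + EDGE * (k - 1)
    return {padded[i:i + k] for i in range(len(padded) - k + 1)}


def is_well_formed(word, allowed, k=2):
    """A word is well-formed iff every one of its k-factors is an allowed domino."""
    return factors(word, k) <= allowed


# Toy grammar: strict CV alternation, starting with C and ending with V.
ALLOWED = {"#C", "CV", "VC", "V#"}

print(is_well_formed("CVCV", ALLOWED))  # True
print(is_well_formed("CVVC", ALLOWED))  # False: "VV" and "C#" are not allowed
```

Checking a word amounts to laying its adjacent pairs side by side and making sure each one is in the box of permitted dominoes; nothing beyond a fixed-size window is ever inspected.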

Of course it would be nice to also have some textbooks for students at departments without a computational linguist, maybe even an online course (yes Norbert, small fields actually stand to profit a lot from MOOCs, although the enrollment probably wouldn't be all that massive; OOCs rather than MOOCs). Well, I'm working on some of that, but this takes more time than I have right now --- patience, my young Padawan, patience. In the meantime, I'll just keep you guys briefed on all the interesting computational work that's going on now. Who knows, maybe you'll think of it when your department is looking for a computational linguist.

  1. Idsardi, William (2006): A Simple Proof that Optimality Theory is Computationally Intractable. Linguistic Inquiry 37, 271-275.
    Heinz, Jeffrey, Gregory M. Kobele, and Jason Riggle (2009): Evaluating the Complexity of Optimality Theory. Linguistic Inquiry 40, 277-288.
    Bane, Max, Jason Riggle, and Morgan Sonderegger (2010): The VC Dimension of Constraint-Based Grammars. Lingua 120, 1194-1208.
  2. Heinz, Jeffrey (2010): Learning Long-Distance Phonotactics. Linguistic Inquiry 41, 623-661.
    Chandlee, Jane, Angeliki Athanasopoulou, and Jeffrey Heinz (2011): Evidence for Classifying Metathesis Patterns as Subsequential. Proceedings of WCCFL 2011, 303-309.
  3. Hunter, Tim: Deconstructing Merge and Move to Make Room for Adjunction. To appear in Syntax.
    Hunter, Tim and Robert Frank: Eliminating Rightward Movement: Extraposition as Flexible Linearization of Adjuncts. To appear in Linguistic Inquiry.
  4. Kobele, Gregory M., Sabrina Gerth, and John T. Hale (2013): Memory Resource Allocation in Top-Down Minimalist Parsing. Proceedings of FG 2012/2013, LNCS 8036, 32-51.
  5. Graf, Thomas and Natasha Abner (2012): Is Syntactic Binding Rational? Proceedings of TAG+11, 189-197.
    Graf, Thomas (2013): The Syntactic Algebra of Adjuncts. Ms., Stony Brook University.


  1. Completely agree about the lack of good textbooks or introductory articles. There aren't even any good ones for modern language theory -- i.e. the MCS hierarchy.

    1. And advanced textbooks like Gecseg&Steinby84 have been out of print for years now. As a matter of fact, even the Handbook of Formal Languages has been hard to obtain for quite a while (Springer only lists the first two volumes, and the missing third one is --- naturally --- the most important one for our line of work; makes me wonder how I got my PDF of the complete handbook).

      There seems to be no viable market for these things, which makes it hard to publish a handbook. Writing a good textbook without any financial reimbursement requires more altruism than most people can muster up. And it's not exactly the kind of topic where you can crowd-source expertise through a wiki. So, yeah, not sure what to do about the lack of good intro material.

  2. I completely agree with this statement: "The three problems listed above can all be solved by publishing more work that solves empirical problems with computational tools that are easy to understand on an intuitive level." The works that you're citing in that paragraph are all excellent examples. One thing that I'm confused about is your definition of computational linguistics, which seems to exclude probabilistic methods. Why is that a useful dichotomy? Where does the work you mention on modeling reading times using probabilistic minimalist grammars fall on this spectrum?

    1. It's a useful dichotomy only in the sense that if your work is predominantly probabilistic, you have an easier time talking to linguists because they know what probabilities are and what they can be used for. That's only a rough guideline, of course, if you work on weighted tree transducers over probability semirings you'll still have a hard time, probabilities notwithstanding.

      In case it came across like that: I'm not saying that probabilistic work is less computational, but that non-probabilistic work in computational linguistics has a harder time reaching and engaging linguists. In a universe where high schools taught algebra and formal language theory and probability theory was an esoteric oddity of higher mathematics with few practical applications, things would probably be the other way round.

    2. Yeah, I don't know about that. There's a lot more going on in most probabilistic work than just basic probability theory. I don't think the bottleneck in either the probabilistic or the formal language theory case is lack of mathematical sophistication on the part of linguists; it's the computational linguists' job to make the case for why their work matters. Just to mention one example from the Bayesian gravy train, not that many linguists are familiar with, say, the Pitman-Yor Process or MCMC sampling, but they can still get the potential theoretical significance of something like the Perfors et al Cognition paper that comes up on this blog occasionally.

    3. "it's the computational linguists' job to make the case for why their work matters."
      No major disagreement there. I still think it's a harder job, but well, life ain't fair. We'll just have to work even harder to get the message across.

    4. Regarding "I don't think the bottleneck in either the probabilistic or the formal language theory case is lack of mathematical sophistication on the part of linguists":
      Not saying that's what you're implying, but better safe than sorry: Despite the occasional snark I do not blame linguists for not knowing enough math to appreciate the intricacies of my research. The way things are right now is not some great cosmic injustice, you can see how we wound up in this situation for very pragmatic reasons (including the ones I listed in the post).

  3. This comment has been removed by the author.

  4. Well, "computational linguistics" is a bit of a catch-all. In addition to NLP, "linguistics with numbers" and shall we say "linguistics with proofs", one could easily distinguish "linguistics plus computer programs", "descriptive work plus computer programs", "proofs tangentially related to linguistics", "psycholinguistics with computer programs", "psycholinguistics with statistical models", "psycholinguistics with fancy statistical analysis" etc etc etc etc. All these are very very different things and I personally am just glad that in recent years linguistics departments have ventured out and helped make it at least a bit kosher to want "computational X" in whatever sense.

    Anyhow I'm tempted to make double sure to spin what you're saying in the direction of "be aware, here's what non-computational linguists seem to be primed to hear" -- so that we can work with that as a starting point in this uphill battle. I think that is where you were going with the deadpan thing. There's a tendency, on the other hand, to say "linguists are unfairly shutting formal people out because They Just Don't Get It." I think that's exactly NOT what you were going for, and it's the same old other-blaming mope you hear from theoretical linguists - "I can't talk to Those psychol{og/ingu}ists because They Just Don't Get It." But it's no coincidence that high performing 20th century physicists doing weird 20th century physics like Feynman and Einstein seemed to have nice, low-dimensional explanation-to-a-five-year-old versions of their work ready at all times. People who think big are much better at what they do when they have thought so deeply about complicated things that they are simple.

    As much as I have encountered the problem myself and continue to do so I am still not convinced that the absence of a 5-second version is a necessary problem of the territory. I think the special problem formal work brings is precisely that it is NOT computational - at least not in the sense of "this is something I did on a computer." The entities one needs to refer to are not computers but really quite ethereal abstract objects. But, hey, the problems better darn well still lie squarely in the common ground or else I'm not sure why we want to be talking to linguists anyway. So, as they say in Quebec, "don't drop the potato" - keep on trying to find new and better ways of communicating with "normal" linguists and I think we will all be doing better science for it. Their willingness to accept computational anything into their lives is a foot in the door, even if they don't KNOW they want "that" kind of computational something - yet.

    1. Full ack. While things are still far from ideal, from what I've heard they are a lot better than even 15 years ago. But this also means that now we have to do everything we can to get linguists interested before this window of opportunity closes again.

  5. I agree with your major point that hiring committees are very interested in "linguistics with numbers". My own sobering opinion is that many such committees are not even fully capable of evaluating whether a linguist uses numbers well or poorly, for good or for evil. Much like the parable of the drunk and the lamppost, they evaluate "impact" instead.

    Speaking of impact, though, you haven't really sold me on the impact of proof-based computational linguistics such as the work you cite (modulo some of the work you've explained on this blog). Take Bane et al. for instance. They show that the VC dimension of HG/OT is finite. From where I sit, and with all due respect to the authors, the impact of this finding is quite minimal. First, this tells us next to nothing about learnability. I believe (as do most here) that children learn something like a minimalist grammar, yet MCSGs we know about have an infinite VC dimension, so as Galilean biolinguists, we should at least suspect that finitude of VC dimension is irrelevant to learnability. Secondly, if I understand correctly, the finiteness result assumes that CON is finite, which is a matter of some contention (I believe Bill Idsardi has written on this). Third, OT had been the dominant paradigm in phonology for almost 2 decades before Bane et al; does anyone really believe that a negative result would have changed that one bit?

    So, I guess what I'm asking is whether you could explain why hiring committees should prefer a proof-writer candidate over a Bayesian word segmentation candidate. I am not trying to be critical, just asking out of my own ignorance.

    1. Bane&Riggle I cited mainly because it is one of the few cases where an unabashedly technical paper was published in a mainstream journal. I do not think it is actually a good example for selling computational linguistics, precisely because it is so technical. That being said, imho the interesting contribution of the paper isn't that OT has finite VC dimension, but that Harmonic Grammar does, too. One reason that OTistas reject HG is their belief that it makes the grammar more complicated, but at least with respect to PAC-learnability that isn't really the case. This holds as long as every grammar uses only finitely many constraints --- the set of constraints furnished by UG may nonetheless be infinite, though. So it applies even if you do not think that CON is fixed across grammars.

      Regarding your second question, there's two factors to take into consideration: scientific merit and socio-political advantages. The latter vary a lot between departments. If you're a small department, for instance, having a proof-writer will allow you to carve out a niche for yourself, while being yet another department with a Bayesianist does very little to put you on the map (unless this person quickly becomes a rock star, but then you'll also quickly lose them to a more prestigious department).

      Anyways, the more interesting factor to discuss is scientific merit. I can't go into much detail here (more posts on the way). Let me just quickly touch on two points. First, the papers cited in fn 2--5 are good examples of how formal considerations can inform empirical research in syntax, phonology and psycholinguistics. It's always good to look at a problem from as many angles as possible, and the computational perspective complements the generative one really well.

      Second, I would argue that a computational perspective is indispensable given the mentalist commitment of generative grammar. I think most generativists would agree that the cognitive aspirations haven't worked all that well so far, at least in syntax. For example, it is unclear how specific syntactic assumptions affect processing or learnability. Are there parsing reasons to avoid remnant movement? Are island constraints learnable? If so, always, or only for some grammars?

      Somebody who's not a proof-writer will tackle these issues by designing models or doing case studies. Provided that this actually yields a useful answer, then you're immediately faced by the question which parameters the result depends on. And this is very hard to determine if all you do is building models. In the worst case, you're stuck with trial-and-error.

      The abstractness of the proof-writer approach has the advantage that it quantifies over grammars, and the proofs make fully explicit which assumptions your theorems depend on. So we can give very general answers that hold for a variety of frameworks under various conditions. That's not just a matter of having a wide coverage and keeping your results from being outdated as soon as Chomsky writes a new paper, it's also about efficiently isolating the constituting parameters of a complex problem. In other words, moving from simulation to genuine understanding.

      Of course this sounds rather self-important without any concrete examples to back it up, but for now I can only refer you to the papers above or ask you to wait until the next post.

    2. Argh, unfortunate typo: "This holds as long as every grammar uses only finitely many constraints" should be "this holds as long as every grammar uses only a fixed finite number of constraints". So there is an upper bound on the maximum number of constraints in a grammar, but I don't think you have to stipulate that they all use the same constraints. I don't remember if that is shown in the paper, though, it might just be a conjecture on my part that I somehow remembered as fact.

    3. "I think most generativists would agree that the cognitive aspirations haven't worked all that well so far, at least in syntax."

      Really? I beg to differ. Indeed I suspect my whole department would disagree. Indeed, though I am sympathetic to your kind of work I don't think that much bearing on these issues has come from it, whereas quite a bit has come from the abstract musings of syntacticians. For example, are islands learnable? Not from the data provided. Given what we take to be the PLD how are island effects to be acquired? No data, no learning. You and Alex C have regaled us with problems, but I have yet to convince myself these are real. Yes, we need some kind of priors in addition to constrained hypothesis spaces, but I guess I have not been terribly convinced that the sky is falling. I'll await the detailed insights you provide. Wrt parsing, we have made some insights about how it takes place and how to model it. I am still a big fan of versions of Berwick and Weinberg as well as the stuff by Rick Lewis. No theorems, but insightful formal work. I guess I don't know anything else that advances the real parsing issues, how sentences are mapped to meanings from sounds in real time, that comes from the very formal world you allude to. But I am sure I'll get some good examples soon. I'm waiting with bated breath. But for the record, no, I think we have learned a lot about syntax and cognition. Maybe formal stuff will add to what we know, but the proof is in the eating.

    4. not all that well != not at all

      I'm not denying that progress has been made, quite a lot actually. But I would locate it mostly on a descriptive level. For example, I don't see any coherent proposals for parsing models that could be evaluated independent of the substantive universals they assume. If I look at the processing literature, I see ideas like serial vs parallel parsing, top-down and bottom-up, reanalysis, memory decay, but few of them form a cohesive whole that could be called a parser. So any kind of claim that a certain model predicts or cannot predict a certain processing effect has to be taken with more than just a grain of salt. There's also claims that Minimalist syntax is unsuitable for parsing because of its bottom-up nature, which confuses the specification of a grammar with how it is processed (CFGs generate top-down but can be parsed in either direction). So there's many claims being thrown around that seem to be based solely on "I can't see how it could be done" rather than actually proving that it is impossible.

      To reiterate, I'm not saying that everything done in psycholinguistics is hogwash and all problems would already be solved if only we could have chained all those experimentalists to their desks to prove some theorems rather than have them bombard innocent undergrads with garden path sentences. But I do not see any worked out linking theory between theoretical syntax and psycholinguistics, the assumptions that are made to tie the two together strike me as rather ad hoc and also have a tendency to differ wildly among researchers. That's where a computational approach could clarify things quite a bit, methinks.

  6. Random thought from the peanut gallery: Thomas G: "This holds as long as every grammar uses only finitely many constraints --- the set of constraints furnished by UG may nonetheless be infinite, though. So it applies even if you do not think that CON is fixed across grammars." It occurred to me a few days ago that you might be able to get learnability in the absence of any absolute limit on grammar complexity (e.g. infinite VC dimension) by imposing an upper bound on grammar complexity as a function of age. So for each age, there would be a finite VC dimension of grammars learnable by that age.

  7. Thomas: The problem is more general. Those working on formal models--not just the formal/statistical variety you discuss--bear a responsibility of making their work relevant to other researchers. I think a major reason for the success of GB/MP, which is an abstract model based on the study of very familiar languages, was because it was very useful in the study of very unfamiliar languages. The range of complex specific examples in LGB, including languages that appeared very unfamiliar back then, was amazing. I wasn't around at the time, obviously, but I can easily imagine one running to his or her favorite language since the theoretical devices are presented clearly--LGB style--with worked out case studies.

    Let me offer an example from personal experience. In a term paper for Ken Wexler's class, I essentially completed the variational model of parameter setting that became part of my thesis (it's not that complicated, after all). It worked better than triggering and I showed it to some people, thinking my job was done. The reaction was largely negative (for many reasons and that's for another day), but two very senior colleagues told me that in order for the work to have any bite, I needed to find empirical evidence from child language to show the reality of the variational model. (They don't even work on child language.) *That* took a lot of work--at a time I had no interest in child language whatsoever--and I am still learning. But the work became a lot stronger and more relevant, thanks to their advice.

    I think the (reasonable) empirical worker's skepticism toward abstract models is well justified: In what way does it help us understand child language development, how does it affect the action of the parser, is CED reducible to ECP/Subjacency/Barriers (to cite one of my, and I'm sure your, favorite examples)? What's in it for me? The works you mentioned are all good progress in that direction, and we need to do more, by taking initiative ourselves. Some "brands" of work may be perceived to be more relevant to empirical matters, as you lament: time will tell.

    1. Charles, sorry for the late reply, I had completely missed your comment and if it weren't for Jeff Heinz (thanks Jeff!) I would still be unaware of it. I'll have to figure out how to set up email notifications >.>

      Anyways, I agree with everything you say. Yes, the problem is better expressed as a split between model builders and proof writers, phrasing it in terms of probabilistic vs discrete is misleading (which Tal also tried to hammer through my thick skull). And of course mathematical proof in the absence of any link to empirical issues is highly unsatisfying. Not just for linguists, but also for proof-writers. Or at least it should be, we're mathematical/computational linguists, not mathematicians.

      That being said, I think that there's actually a lot of computational work that has linguistic implications even if they are not readily apparent. The examples above are kind of "in your face" about what it means for language, to the extent where it is impossible to miss the point. But a lot of work has implications that are harder to pick up on because they are big picture issues or rather subtle. In my experience it is very hard to get those across because they require a certain way of thinking about language and computation, which isn't something a reader can acquire from just a single paper. So I'm hoping that as we get more exposure through the applied work, people will become at least slightly curious about these abstract issues, too.

  8. This comment has been removed by the author.

  10. Oh, I was kind of occupied when this blog post was posted, but I would have had so much to say about this from very personal experience. This thread is not fresh any more but I will say that trying to walk the tightrope of being a computer scientist who does language and a linguist who is computational leads to a bit of a "neither fish nor fowl" problem, career-wise. It turns out that one isn't both/and, but neither/nor, at least some of the time.