Faculty of Language: Thomas Can't Into Cognitive Modelling

Saturday, June 28, 2014

Thomas Can't Into Cognitive Modelling

This week I got to present at CMCL 2014, a workshop on computational models of language-related cognition, i.e. processing, acquisition, discourse representation, and so on. My talk was about the connection between Stabler's top-down parser for Minimalist grammars and the processing of relative clauses, something I've been working on for a while now with Bradley Marcinek, a student of mine. Thanks to Greg Kobele, John Hale and Sabrina Gerth, we already know that the predictions of this parser depend on one's syntactic analysis in interesting ways, so we wanted to extend their line of work to some other well-known phenomena. Long story short, our results are rather messy and it will be a while until we can get this idea truly off the ground.

That is why I won't blog about this research quite yet (except for the shameless self-promotion above) and instead focus on the talks I heard, rather than the one I gave. Don't get me wrong, many of them were very interesting to me on a technical level; some of them even pierced my 90s habitus of acerbic cynicism and got me a bit excited. Quite generally, a fun time was had by all. But the talks made me aware of a gapping hole in my understanding of the field, a hole that one of you (I believe we have some readers with serious modelling chops) may be able to plug for me: Just what is the point of cognitive modelling?

Advantages of Modelling

Don't get me wrong, I understand why modelling can be useful, and human cognition is one of the most interesting things to study. But from where I'm standing, the two simply do not fit together. Or rather, I think what people are trying to do cannot be done by modelling and instead requires an approach grounded in mathematics --- theorems and proofs.

Let's first outline some clear advantages of developing computational models:

Proof of concept
Your idea might sound batshit crazy, but if it can be turned into a working model that works on a wide range of problem instances, that demonstrates a certain level of sophistication. So maybe we shouldn't dismiss it right away.
Getting results
Models are great from a utalitarian point of view: you have a problem, and your model solves it for you. You want to know if tomorrow's picnic will be a pleasant sunshine siesta or a rainy rancor trigger? Let's feed the data into our weather model and see what it has to say.
Testing for problems
A model can test a much bigger set of data than any group of humans can, so they're a great way of hunting for holes in your theory.

There's another point that I surmise many would like to add to this list, namely explicitness. Since a computational model needs to be coded up in some programming language, there's a lot of things that need to be decided on a technical level. The usual abstracting and hand-waving that is common in many disciplines just doesn't cut it. But keep in mind that is not necessarily a virtue. It is conceivable that some of the decisions you made during the implementation process could have a major effect on the performance of the model, so the model might have little to say about the original theory that inspired it. The likelihood of such a scenario depends a lot on what you are modelling, but since I don't have the faintest idea how much of an issue this is in real life (has anybody looked at this?), I'm gonna count it neither as a positive nor as a negative.

Why do we Study Cognition?

Just because modelling has advantages doesn't mean that its advantages are of much use for a given area: a wine cooler is a nifty thing to keep in your kitchen, but it's pretty worthless at a rehab clinic. In the case of cognitive modelling, it really depends on what you are trying to achieve. For computational linguists, cognition might be just another pesky quirk of humans that makes language needlessly complicated for computers. They just need an efficient method for constructing discourse representations, making semantic associations, and whatever else you need to simulate human-like understanding and usage of language. Given such a tool, they also need to verify that it works for a wide variety of industrial applications. Both issues are covered by advantages 2 and 3 above, so modelling does indeed fit the bill.

But I, for one, do not care that much about cognition as a problem for non-sentient machines. I am interested in how human cognition works and, most importantly, why it doesn't work in different ways. More boldly: what makes human cognition human?

If that's your main interest, it's not enough to show that some model works for a given problem. The important questions are:

Is it guaranteed to succeed on for every problem for a given problem space? In formal terms: is it sound and complete?
Why does its solution to the problem actually work --- what does that tell us about the problem?
How is the workload distributed across the assumptions and techniques your model incorporates?
Are there different ways of doing it? Can we translate between different models in an automatic fashion?

As far as I am concerned, asserting the success of a model based on simulations is an instance of the logical fallacy of existential instantiation. Even in probabilistic settings it is unjustified --- just image what PAC-learning would look like if people simply tested their algorithms a million times against various data sets and then called it a day.

But even if you're not ready to take that extreme stance, a model by itself is still a very boring thing and provides little insight. What matters is its relation to other models --- those that succeed as well as those that fail. Now since that is an infinite class, the standard strategy of designing and testing models via simulations won't be able to answer any of these issues. If you need to understand the structure of an infinite object, you are firmly within the realm of theorems and proofs. And I don't see any of that in the cognitive modelling community.

The Argument Against Theorems and Proofs

I suppose one reply to this little rant of mine would be that in an ideal world a proof-based approach would indeed be preferable, but the problem is simply too complex to be studied in this fashion. Just like you can't prove many things about the behavior of leaves blowing in the wind, the system is too complex to be studied in this fashion. So rather than fruitlessly toiling away for hundreds of years, we accept the limitations of the approach and sacrifice a little bit of rigor for a huge increase in data coverage.
To this I have two replies, one personal and one more objective. On a personal level, one of my guiding credos is that a question that cannot be answered in a satisfying manner is not worth asking. So if the problems the cognitive modellers are trying to solve are indeed too complicated to be studied in an insightful manner (according to my high standards of what counts as insightful), then they simply aren't worth studying scientifically (the engineering angle is still perfectly viable, though). Pretty black and white, but a simple(minded) view of things is comforting once in a while.
More generally, though, my hunch is that the reply itself relies on a false equivocation. The problems themselves may indeed be complicated, but your models are mathematical objects. So simplify them, figure out the math for the simple cases, and keep expanding your results until you reach the level of the original models again. In many cases we are dealing with the construction of special cases of hypergraphs that are evaluated over a probabilistic semiring. That's not exactly the epitome of mathematical inscrutability. Why, then, don't we see any work along those lines? Or is this actually a big research area, and the only thing to blame is my ignorance of the field?

A Reconciliatory Closing Note

As should be clear by now, my problem isn't so much that there are people who are happy to develop models of certain aspects of human cognition. What's got me flummoxed is that this is all that's being done apprently, when it should only be the first step. Just like I think that field work is very valuable to linguistics, but cannot understand people who think that it is the primary task of our field. But I'm also acutely aware that I'm pretty much talking out of my metabolism waste dispenser --- except for syntactic processing, I follow most of the cognitive modelling literature very casually at best. So I might have easily missed something important, in which case I hope that at least one of you will set my pompous self-righteous head straight.

31 comments:

UnknownJune 29, 2014 at 8:21 AM
I'm not sure I understand what's at issue here. What would be an insightful cognitive modeling result in your opinion?
ReplyDelete
Replies
Noah MotionJune 29, 2014 at 9:54 AM
And I don't see any of that in the cognitive modelling community.

You don't see any theorems and proofs in the cognitive modeling community? Then you're not looking particularly hard. The subfield of mathematical psychology has plenty of theorems and proofs (as well as simulations, computation models, and so on).
ReplyDelete
Replies
Asad SayeedJune 29, 2014 at 8:10 PM
Hmm, I wrote a long post here but Blooger seems to have eaten it. (Or moderated?)

Anyway, the tl;dr was that: (a) it was nice to meet you at CMCL and (b) I think that your provided objection to your real objection (ie, that the world is too complicated for theorem-proving to provide good explanation) is not the one that cognitive modellers, such as those that appear at CMCL, would actually make. I think they would go "whole hog" and argue that the right way to study language is to construct simulations that increasingly match the behavioural output as well as match whatever it is we know about neurobiology, and that theorem-proving is partly a red-herring, and that the infinitude of the object is not biologically relevant.
ReplyDelete
Replies
Greg KobeleJune 30, 2014 at 5:31 AM
I think an instructive parallel can be drawn between `cognitive modeling' and `writing grammar fragments'. While there are some who disparage the latter (arguing that that is something a learner should do for us in a principled way), I suspect that you are more positively inclined to these than to those. What do you think the difference is between these two activities? Here are some questions to get you started:
1) is there a difference (for you) between writing a fragment in HPSG vs TAG?
2) between writing a small vs a large fragment?

I personally think (agreeing with Stefan Müller) that large grammar fragments are necessary (but not sufficient) in seeing whether or not these ideas that we as linguists have actually do what we think they do.
ReplyDelete
Replies
ewanJuly 1, 2014 at 6:04 AM
I too wrote a long post that got eaten, I think there must have been a timed out session involved.

The short version is that I think "proof of concept" doesn't do modeling enough justice. Modeling is exploratory work, but exploratory work is not something you go do once just to hash out or try out a few things before going back to the "real" work. Exploratory work and theory driven work are equal partners in science, addressing different parts of the creative cycle.

To make the analogy, modeling is like "data collection" about the space of possible models. Proofs are like "theorizing" about that same space. There needs to be an interplay between the two. Sometimes theorizing will be "data"-driven, merely an effort to explain why certain models work and others don't. Benjamin alluded to the urge to do this among Bayesian modeling people already, I'd say I'd hope the pressure from reviewers to explain catches up soon (which may only happen after we take some distance from the compulsively theory-hollow world of NLP). But, to take it a bit further, this would be butterfly-collecting if theory didn't also have a life of its own, logical deduction about what works and what doesn't and why needs to be free to run in parallel, the way it does in pure machine learning and theoretical statistics. The analogy doesn't quite work, because there's little chance that we'll be surprised by the new "data" we collect, i.e., collections of modeling results, if we've done our proofs right. But those proofs will only address narrow abstractions, and so there's always more to learn.

Modeling and proofs are as inherently complementary as wake/sleep. Just like everyone should avoid the trap of doing exploratory work without at least a vague idea about what's interesting in mind - ideas that will be driven, I think fundamentally, by a bit of deduction about what "must be so," i.e., theory - we should also avoid the trap of doing models without stopping to deduce why we see the results we do, and letting the theory speak and the deductions flow without too doing much new modeling work (perhaps using it only as "proof of concept"). And, conversely, we always need to avoid the trap of doing theory without constant inspiration and nudging from the real world, and so no intelligent person should think of doing theory without a constant flow of new ideas about which no proofs exist.
ReplyDelete
Replies

Add comment

Faculty of Language

Comments