UMD has a cognitive science lecture series with colloquia
delivered every other Thursday afternoon to a pretty diverse audience of
philosophers, linguists, psychologists, and the occasional computer scientist,
neuroscientist and mathematician. The
papers are presented by leading lights (i.e. those with reputations). The last
dignitary to speak to us was Fei Xu from Berkeley psych, and her talk was on how
rational constructivism, a new take on old debates, will allow us to get
beyond the Empiricism/Rationalism (E/R) dualism of yore. Here’s the abstract:
The study of cognitive development has often been framed
in terms of the nativist/empiricist debate. Here I present a new approach
to cognitive development – rational constructivism. I will argue that learners
take into account both prior knowledge and biases (learned or unlearned) as
well as statistical information in the input; prior knowledge and statistical
information are combined in a rational manner (as is often captured in Bayesian
models of cognition). Furthermore, there may be a set of domain-general
learning mechanisms that give rise to domain-specific knowledge. I will
present evidence supporting the idea that early learning is rational,
statistical, and inferential, and infants and young children are rational,
constructivist learners.
I had
to leave about five minutes before the end of the presentation and missed the
question period. However, the talk did get me thinking about the issues Xu mentioned
in her abstract. I cannot say that her remarks found fertile ground in my
imagination, but they did kick start a train of thought about whether the
debate between Es and Rs is worth resolving (or maybe we should instead
preserve and sharpen the points of
disagreement) and what it would mean to resolve it. Here’s what I’ve
been thinking.
When I started in this business in the 1970s, the E/R debate
was billed as the “innateness controversy.”
The idea was that Rs believe that there are innate mental structures
whereas Es don’t. Of course, this is silly (as was quickly observed and
acknowledged), for completely unstructured minds cannot do anything, let alone
think. Or, more correctly, both Es and Rs recognize that minds generalize, the
problem being to specify the nature of these generalizations and the mechanisms
that guide them (as mush doesn’t do much generalizing). If so, the E/R question
is not whether minds have biologically
provided structure that supports generalization but the nature of this
structure.
This question, in turn, resolves itself into two further questions:
(i) the dimensions along which mental generalizations run (i.e. the primitive
features minds come with) and (ii) the procedures that specify how inputs interact
with these features to fix our actual concepts. Taking these two questions as
basic allows us to recast the E/R debate along the following two dimensions:
(a) how “specific” are the innately specified features and (b) how “rational”
are the acquisition procedures. Let me address these two issues in turn.
Given that features are needed, what constitutes an
admissible one? Es have been more than
happy to acknowledge sensory/perceptual features (spf), but have often been
reluctant to admit much else. In particular, Es have been averse to cognitive
modularity as this would invite admitting domain specific innate features of cognitive
computation. Spfs are fine. Maybe domain general features are ok (e.g. “edge,”
“animate,” etc.). But domain specific features like “island” or “binder” are not.
A necessary concomitant of restricting the mental feature
inventory in this Eish way is a method for building complexes of features from
the simple givens (i.e. a combinatorics). Es generally recognize that
mental contents go beyond (or appear to go beyond) the descriptive resources of
the primitives. The answer: what’s not
primitive is a construct from these primitives. Thus, in this sense,
constructivism (in some form) is a necessary part of any Eish account of mind.
A second part of any Eish theory is an account of how the
innately given features interact with sensory input to give rise to a cognitive
output. The standard assumption has been that this method of combination is
rational. ‘Rational’ here means that
input is evaluated in a roughly “scientific” manner, albeit unconsciously, viz.
(i) the data is carefully sifted, organized and counted and (ii) all cognitive
alternatives (more or less) are evaluated with respect to how well they fit with
this regimented data. The best alternative (the one that best fits the data)
wins. In other words, on this conception, acquisition/development is an
inductive procedure not unlike what we find more overtly in scientific practice[1]
(albeit tacit) and it rests on an inductive logic with the principles of
probability and statistics forming the backbone of the procedure.[2]
Many currently fashionable Bayesian models (BM) embody this
E vision. See, for example, the Xu abstract above. BMs assume a given hypothesis space possibly
articulated, i.e. seeded with given “prior knowledge and biases,”[3]
plus Bayes Rule and maybe
“a set of domain-general learning mechanisms.” These combine
“statistical information” culled from the environmental input together with the
“prior knowledge and biases” in a “rational manner” (viz. Bayesian manner) to
derive “domain-specific knowledge” from “domain-general learning mechanisms.”
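For concreteness, the “rational manner” at issue is usually taken to be Bayes Rule. Here is a hedged sketch of it; the symbols are my own labels rather than anything in the Xu abstract: $h$ ranges over a hypothesis space $H$, and $d$ stands for the observed data.

```latex
P(h \mid d) \;=\; \frac{P(d \mid h)\,P(h)}{\sum_{h' \in H} P(d \mid h')\,P(h')}
```

On this reading, $P(h)$ carries the “prior knowledge and biases,” $P(d \mid h)$ carries the “statistical information” in the input, and the hypothesis space $H$ is whatever the theory takes as given.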
There are several ways to argue against this E conception
and Rs have deployed them all.
First, one can challenge the assumption that features be
restricted to spfs or the domain general ones. Linguists of the generative
stripe should all be familiar with this kind of argument. Virtually all
generative theories tell us that linguistic competence (i.e. knowledge of
language) is replete with very domain specific notions (e.g. specified subject,
tensed clause, c-commanding antecedent (aka binder), island, PRO, etc.) without
which a speaker’s attested linguistic capacity cannot be adequately described.
Second, one might argue against the idea that there is a lot
of feature combinatorics going on. Recall that if one restricts oneself to a
small set of domain general features then one will need to analyze concepts
that (apparently) fall outside this domain as combinations of these given
features. Jerry Fodor’s argument for the innateness of virtually all of our
lexical concepts is an instance of this kind of argument. It has two parts.
First, it observes that if concept
acquisition is statistical (see below) then the hypothesis space must have some
version of the acquired concept as a value in that space (i.e. it relies on the
following truism: if acquiring concept C involves statistically tracking Cish
patterns then the tracker (i.e. a mind) must come pre-specified with Cish
possibilities). Second, it argues that there is no way of defining (most of)
our lexical concepts from a small set of lexical primitives. Thus, the Cish
possibilities must be coded in the hypothesis space as Cish as such and not congeries of non-Cish
primitives in combination. Put the two assumptions together and one gets that ‘carburetor’
must be an innate concept (specified as such in the hypothesis space).[4]
This form of argument can be deployed anywhere and where it succeeds it serves
to challenge the important Eish conception of a relatively small/sparse
sensory/perceptual and/or domain general set of primitive features underlying
our cognitive capacities.
Third, one can challenge the idea that
acquisition/development is “rational.”[5][6]
This is perhaps the most ambitious anti-E argument, for it targets an assumption long
shared by Rs as well.[7]
One can see the roots of this kind of criticism in the distinction between
triggering stimuli and formative stimuli. Rs have long argued that the relation
between environmental input and cognitive output is less a matter of induction
than a form of triggering (more akin to transduction
than induction). Understanding
‘trigger’ as ‘hair trigger’ provides
a way of challenging the idea that acquisition is a matter of induction at all.
A paradigm example of triggering is one trial learning (OTL).[8]
To the degree that OTL exists, it argues against the idea
that acquisition is “rational” in any reasonable sense. Minds do not carefully organize and smooth the
incoming data and minds do not
incrementally evaluate all possible hypotheses against the data so organized in
a deliberate manner. If OTL is the predominant way that basic cognitive
competence arises, it would argue for re-conceptualizing acquisition as
“growth” rather than “learning,” as Chomsky has often suggested. Growth is no
less responsive to environmental inputs than learning is, but the
responsiveness is not “rational,” just brute causal. It is an empirical question,
albeit a very subtle one, whether acquisition/development is best modeled as a
rational or a brute causal process. IMO, we (indeed I!) have been too quick to
assume that induction (aka: learning) is really the only game in town.[9]
Let me end: there is a tendency in academic/intellectual
life to split the difference between opposing views and to find compromise
positions where we conclude that both sides were right to some degree. You can
see this sentiment at work in the Xu abstract above. In the interest of honesty, I
confess to having been seduced by this sort of gentle compromise myself
(though many of you might find this hard to believe). This live and let live
policy enhances collegiality and makes everyone feel that their work is valued
and hence valuable (in virtue of being at least somewhat right). I think that this
is a mistake. This attitude serves to blur valuable conceptual distinctions,
ones that have far-reaching intellectual implications. Rather than bleaching them of difference, we
should enhance the opposing E/R conceptions precisely so that we can better use
them to investigate mental phenomena. Though there is nothing wrong with being
wrong (and work that is deeply wrong can be very valuable), there is a lot
wrong with being namby-pamby. The E/R opposition presents two very different
conceptions of how minds/brains work. Maybe the right story will involve taking
a little from column E and a little from column R. But right now, I think that
enhancing the E/R distinctions and investigating the pure cases is far more
productive. At the very least it serves to flush out empirically substantive
assumptions that are presupposed rather than asserted and defended. So, from
now on, no more Mr. Nice Guy! [10]
[1]
Actually, whether we find this in scientific practice is a topic of pretty
extensive debate.
[2]
There is a decision procedure required as well (e.g. maximize expected
utility), but I leave this aside.
[3]
Let me repeat something that I have reiterated before: there is nothing in
Bayes per se that eschews pretty
abstract and domain specific information in the hypothesis space, though as a
matter of fact such has been avoided (in this respect, it’s a repeat of the old
connectionism discussions). One can
be a R-Bayesian, though this does not seem to be a favored combo. Such a
creature would have Rish features in the hypothesis space and/or domain
specific weightings of features. So as far as features go, Bayesians need not
be Es. However, there may still be an R objection to framing the questions
Bayes-wise, as I discuss below.
[4]
I discuss Jerry’s argument more fully here
and here.
Note that it is important to remember that this is an argument about the
innately given hypothesis space, not
about belief fixation. Indeed, Jerry’s original argument assumed that belief
fixation was inductive. So, the concept CARBURETOR may be innate but fixing the
lexical tag ‘carburetor’ onto the concept CARBURETOR was taken to be inductive.
[5]
Fodor’s original argument did not do
this. However, he did consider this in his later work, especially in LOT2.
[6]
I am restricting discussion here to acquisition/development. None of what I say
below need extend to how acquired knowledge is deployed on line in real time,
in, e.g. parsing. Thus, for example, it is possible that humans are Bayesian
parsers without their being Bayesian learners. Of course, I am not saying that
they are, just that this is a possible position.
[7]
E.g. the old concept of children as “little linguists” falls into this fold.
[9]
There are other ways of arguing against the rationality assumption. Thus, one
can argue that full rationality is impossible to achieve as it is
computationally unattainable (viz. the relevant computation is intractable).
This is a standard critique of Bayesian models, for example. The fall back
position is some version of “bounded” rationality. Of course, the tighter the bounds the less
“rational” the process. Indeed, in
discussions of bounded rationality all the interesting action comes in
specifying the bounds, for if these are very narrow, the explanatory load shifts
from the rational procedures to the non-rational bounds. Economists are
currently fighting this out under the rubric of “rational expectations.” One
can try to render the irrational rational by going Darwinian; the bounds
themselves “make sense” in a larger optimizing context. Here too brute causal
forces (e.g. Evo-Devo, natural physical constraints) can be opposed to
maximizing selection procedures. Suffice it to say, the E/R debate runs deep
and wide. Good! That’s what makes it interesting.
[10]
For the record, lots of the thoughts outlined above have been prompted by
discussions with Paul Pietroski. I am not sure if he wishes to be associated
with what I have written here. I suspect that my discussion may seem to him too
compromising and mild.
"To the degree that OTL exists, it argues against the idea that acquisition is “rational” in any reasonable sense."
I guess the classic example in language acquisition of OTL is fast mapping -- and that has Bayesian explanations like, say, Fei Xu's own work, that are both "rational" and "probabilistic". I am having trouble seeing how word learning could be other than rational.
"Here [in evolution] too brute causal forces (e.g. Evo-Devo, natural physical constraints) can be opposed to maximizing selection procedures."
You'll need to clarify (especially for this evolutionary naif) what the role of evo devo is in this context. But "natural physical constraints": isn't the whole point that selective forces do and indeed must select organisms with traits that are "better" in some sense, _up to_ physical limitations? You might have to forgive some naif's coarseness there but I couldn't be too too far off, mm? And isn't the point of the rational cognition program to explain cognitive processes as "as close to optimal as we can get under the circumstances"? Either on an organism level or an evolutionary level, different cases may be different (bug me for examples). In other words the "brute causal forces" you refer to must warrant some explanation for being one way and not another. It seems to me the only useful dichotomy is the phenotype-genotype level dichotomy; in one way or another things must be the way they are for a reason. Unless you reject this, I claim that you are merely elaborating the details of the rational cognition program.
Ewan and Alex rightly press me on what I mean by rational cognition. Here's a shot: I take the program to be part of the continuation of a long tradition that sees induction as the key to understanding cognition. Hume is the poster pinup here. The aim of theories of induction was to model how scientific deliberation worked; how data collection and evaluation leads to truth. The idea was (and is) that how data is sampled, how it is organized and how it is used to evaluate alternatives plays a major part in explaining why some investigations lead to truth and some don't. In this context, for example, induction is contrasted with abduction and learning is contrasted with growth and gradual learning with one trial learning. Abduction, one trial learning, growth are NOT rational processes in the normal sense of the term. So, one way of taking my point is that I want to know how the Bayesian Rational Cognition program fits with these ideas. One answer is that it is at right angles to them. Another is that it abstracts away from them. Another is that it embodies them. I have been assuming that it sees itself as part of this tradition (hearing Xu e.g. start with a contrast of the rationalist and Empiricist traditions and claiming that we can eat our two cakes etc suggested that part of the sell of this approach is the traditional one). However, I could be wrong. Others treat this "approach" as effectively a suggestion for a normal form notation without making ANY empirical claims. Is this indeed the whole point? Let's talk Bayes because it's neutral wrt any questions we will be interested in? Or is it that Bayes has content and it is the right tool for the job of explaining development/acquisition? In which case I want to know what it is about Bayes that makes it such. What does IT bring to the table empirically? So when I take it that induction is not the right way of thinking about a problem, I mean it as seen against the great tradition. It is not merely a technical question.
Let me make this point another way: is 'rational' in 'rational cognition' like 'significant' in 'statistically significant'? As we all know statistically significant results can be trivial.
You are trying to squish together too many different distinctions that are mostly independent.
I understand (mostly) the brute causal/rational distinction.
But that seems completely different from the gradual/OTL distinction.
Growth for example is normally slow -- so if the analogy for language acquisition is something like pubescence or the growth of a kidney that is a gradual process.
Yes, they are different dimensions along which E/R approaches contrast. The main interest in OTL for Rs is that it suggests that classical "learning" is not the real issue: one trial learning suggests a great deal of mental baggage. And yes this is different from the first distinction, though not less interesting.
I think the relevance of Bayesian methods comes down to whether you think there is uncertainty in the process of language acquisition. Advocates of "deterministic" triggering-based algorithms, which never make a learning mistake (and so are really "inerrant" rather than simply "deterministic") are really proposing that there is no uncertainty in the acquisition process: if we know ahead of time certain relationships between words and categories, the grammar can be straightforwardly decoded from the input.
However, if we accept that there is uncertainty involved in some aspect of language acquisition, then probabilistic models are useful, because they measure uncertainty about the values of hypothesized variables (and Bayesian models are just probabilistic models that represent uncertainty about the model parameters). For example, if we think that children use an inerrant triggering algorithm for learning syntax once they are confident in the part of speech for each word, but that there is uncertainty about the mapping between parts of speech and words, we could use a Bayesian model to measure when there is good evidence for particular sequences of parts of speech.
I like this way of putting matters. Would you agree that the issue is just how inerrant it is? In other words, I take OTL to be a limit case that IF correct suggests that inductive methods are really not where the action is. However, the way I would like to think about it is that the narrower the range of alternatives the less work there is for inductive methods to do.
@ John: You say: "if we think that children use an inerrant triggering algorithm for learning syntax once they are confident in the part of speech for each word," This sounds fascinating. Do we have any idea how such a mechanism might have evolved?
Delete@Norbert: I'm deliberately avoiding making proposals about the actual learning procedure children use. If they are faced with uncertainty, then probabilistic models can measure what counts as good evidence when, under explicit assumptions about what linguistic structures look like. In other words, probabilistic models can tell us about the shape of the data under different assumptions about linguistic structures, without committing to any one inference procedure on the part of the child. Indeed, the field of machine learning has shown that algorithms with very different behavioral properties are capable of approximating inference in the same underlying model.
If OTL occurs in the face of uncertainty, then probabilistic models can allow us to measure the utility of different cues the learning strategy, whatever it is, might be attending to. Those same probabilistic methods may also suggest inference procedures that are capable of sudden changes of state, but the effectiveness of a probabilistic model for measuring uncertainty is different from its effectiveness for replicating the sequence of steps a child follows. Personally, I expect that language acquisition is full of uncertainty, and so the strategies children follow should have probabilistic justifications, but they don't necessarily need to be the kind of clean and natural relaxations of optimal procedures that come up in machine learning.
@Christina Behme: I don't know how such a mechanism would have evolved. I had in mind this paper by Sakas and Fodor:
http://www.colag.cs.hunter.cuny.edu/pub/Sakas_Fodor_Disambiguating_prepub.pdf
Maybe I should mention that I'm not advocating for this proposal myself. I merely brought it up to illustrate how probabilistic models are useful for measuring uncertainty in hypothesized variables, regardless of the broader theoretical framework, since measuring uncertainty is just what probabilistic models do.
Let's consider a concrete example. Trigger learning algorithms proceed by maintaining a single hypothesis about the grammar, keeping that hypothesis if it is capable of parsing observed sequences, and moving to a different hypothesis if the grammar cannot handle some observed sequence. This algorithm in broad outline is essentially the “win-stay/lose-shift” algorithm (WSLS). WSLS approximates Bayesian inference when the likelihood function is deterministic, and can be relaxed to non-deterministic likelihood functions (e.g. http://cocosci.berkeley.edu/Liz/BonawitzetalCogSci11.pdf). So the sudden changes that occur with trigger-learning algorithms are possible with algorithms that perform inference in a Bayesian model.
One major difference between trigger learning algorithms and WSLS, or other particle filter-based algorithms, is that trigger learning algorithms typically assume that a Parameter cannot be changed once it has been set. Technically, this means that the Markov chains of such trigger learning algorithms are not ergodic, and they can get “stuck” in bad hypotheses. This is why the “subset” problem arises for trigger learning algorithms but not for most statistical learners: a hypothesis that generates a language that is too large will pay a statistical price. The Sakas and Fodor (2012) paper I mentioned in my previous comment addresses the non-ergodicity by arguing that there is never uncertainty about the goodness of some transitions in the Markov chain. If children in fact follow a strategy like the one they outline, then the procedural similarities between trigger learning algorithms and particle filters are just superficial coincidences: the success of the child is not due to proper handling of uncertainty because there was never any uncertainty.
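The win-stay/lose-shift outline described above can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the algorithm of any paper cited in this thread: the three "grammars" are hypothetical sets of word-order strings, parsing is reduced to set membership (a deterministic likelihood), and on a failure the learner shifts uniformly at random to a grammar that handles the current sentence.

```python
import random


def win_stay_lose_shift(grammars, sentences, rng):
    """Toy win-stay/lose-shift learner.

    grammars:  dict mapping a hypothetical grammar name to the set of
               sentence types it can parse (a deterministic likelihood).
    sentences: observed input, one sentence type at a time.
    rng:       random.Random instance used for the first guess and shifts.
    Returns the final hypothesis and the hypothesis held after each input.
    """
    names = sorted(grammars)
    current = rng.choice(names)  # arbitrary starting hypothesis
    history = []
    for sentence in sentences:
        if sentence not in grammars[current]:
            # Lose: shift to some grammar that can parse this sentence.
            # Choosing uniformly among them is an assumption of this sketch.
            candidates = [n for n in names if sentence in grammars[n]]
            if candidates:
                current = rng.choice(candidates)
        # Win: otherwise the current hypothesis is kept unchanged.
        history.append(current)
    return current, history


# Hypothetical three-grammar space; sentence types are word-order strings.
GRAMMARS = {
    "SVO": {"s v", "s v o"},
    "SOV": {"s v", "s o v"},
    "VSO": {"v s", "v s o"},
}
INPUT = ["s v", "s o v", "s v", "s o v"]

final, history = win_stay_lose_shift(GRAMMARS, INPUT, random.Random(0))
print(final, history)
```

With this input the learner ends on "SOV" whatever its starting guess, because "s o v" is parsable only by that grammar. Note that this toy learner can always move again after a failure; it does not model the assumption that a Parameter, once set, stays set.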
Hi John,
Take an even simpler learner:
Say you have a general Bayesian classifier, two classes A and B,
and maybe in this particular case the supports of the distributions p(x|A) and p(x|B) are disjoint -- i.e. always one of the two is zero. Then when you see one point with say p(x|B) = 0, then you know that p(A|x) = 1.
I.e. this is OTL.
So I would say this is a rational Bayesian learner which does OTL. Are you arguing that this is not Bayesian because there is no uncertainty?
What would Norbert say for this example?
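The two-class example above can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the class-conditional densities are taken to be uniform on two disjoint intervals, the prior is taken to be flat (0.5 each), and the function names are made up.

```python
def uniform_density(low, high):
    """Density of a uniform distribution on [low, high); zero elsewhere."""
    def density(x):
        return 1.0 / (high - low) if low <= x < high else 0.0
    return density


def posterior(x, likelihoods, priors):
    """Apply Bayes' rule: p(class | x) is proportional to p(x | class) * p(class)."""
    joint = {label: priors[label] * likelihoods[label](x) for label in priors}
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("x has zero probability under every class")
    return {label: value / evidence for label, value in joint.items()}


# Assumed setup: supports [0, 1) and [2, 3) are disjoint; the prior is flat.
LIKELIHOODS = {"A": uniform_density(0.0, 1.0), "B": uniform_density(2.0, 3.0)}
PRIORS = {"A": 0.5, "B": 0.5}

print(posterior(0.4, LIKELIHOODS, PRIORS))  # {'A': 1.0, 'B': 0.0}
```

A single observation moves the posterior from 0.5 to 1, which is the sense in which this learner does OTL while remaining an ordinary application of Bayes' rule.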
No, I'm happy to say that that learner is Bayesian, but Bayesian learning in this situation is probably overkill. You could explain the same pattern by appealing only to set membership, and Bayesian models also appeal to sets. The extra moving parts of Bayesian models are justified when you want to address uncertainty.
In some cases there certainly is a certain degree of uncertainty, of that we can be .. quite sure (as Blackadder puts it), so we need a mechanism to deal with that -- say some general Bayesian reasoner. The cases of certainty and OTL are just special cases where the posterior distribution is 0 or 1, and so we don't need a separate mechanism, as the Bayesian reasoner will work just fine (overkill as you put it); so the more parsimonious explanation is that certainty is just a special case of uncertainty.
Yes, it is possible that children have a general purpose Bayesian sampler that they can plug in to any inference problem, and that it's just easier to use the general purpose Bayesian sampler even for deterministic problems than to devise a special-purpose theorem prover. However, biology is messy, and I'm not sure we should assume that the brain follows software engineering best practices. My point was just that, regardless of how the inference toolkit is laid out, if it works by handling uncertainty, Bayesian methods are useful for measuring uncertainty under different ways of analyzing linguistic structures into parts.
@ John: You say:
"However, biology is messy, and I'm not sure we should assume that the brain follows software engineering best practices."
True as this may be, I do not see how it addresses Alex's point that having 1 mechanism is simpler than having 2. Besides being 'messy', brains also have to 'economize' [they consume the most 'fuel' of all biological organs as it is] - so why maintain one [metabolically costly] mechanism that rarely gets used [even if such should have evolved - I have no clue how this could have happened in the first place but that's another issue] when one can do almost as well with only the other?
To repeat a point I made earlier on this blog: unless you actually do biology [e.g. brain-research] and have some "hard" evidence for specialized mechanisms, it seems idle to speculate. If people like Alex can show that a Bayesian sampler CAN do the job there seems no a priori reason to rule out that brains can also make do with one mechanism for many cognitive tasks. Whether they actually do is an entirely different question and I have not given up hope entirely that one day Norbert will convince one of the biologists of the biolinguistic enterprise to explain to us how UG is implemented in the brain...
"If people like Alex can show that a Bayesian sampler CAN do the job there seems no a priori reason to rule out that brains can also make do with one mechanism for many cognitive tasks."
No pun intended, but that depends on your prior.
I am mostly interested in the question of whether there are any arguments that the existence of OTL implies that empiricist or rational or probabilistic learning is wrong. But it seems that there aren't.
@ Alex Clark: If the suggestion is that "certainty and OTL are just special cases where the posterior distribution is 0 or 1", then effectively one is saying the priors need to be 1/0, right? If so, aren't you committing to some non-empiricist knowledge (priors = 1/0)? Note, the OTL data suggests that kids are also able to revise their hypotheses that they learn through OTL; therefore, modelling OTL through 1/0 priors wouldn't allow for this. In fact, there would be no "learning" left.
No, I was thinking of two hypotheses with a flat prior -- 0.5 each -- whose likelihood functions (prob of data given hypothesis) have disjoint support.
@Benjamin: Thank you so much for this ingenious pun. Since Alex solved the prior problem you raised prior to me having a chance to respond, maybe you could be so kind and educate us on biological implementation?
I don't see how what Alex wrote has any bearing on whether or not there are good "a priori" reasons for or against preferring single domain-general mechanism explanations over alternatives.
As for biological implementation, biological implementation of what? The intuiting faculty that allows us to stand in a knowledge-of-relation with abstract entities such as "languages"? The (one and only?) domain general learning mechanism that is the answer to all of our problems? Or of a language faculty of the kind you seem to have serious misgivings about?
I'm happy to admit that I can neither give nor know of any detailed account of how any of these are biologically implemented. But as far as I can see, we are all in the same boat here, "empiricists" or "rationalists". So what?
May I ask how you KNOW that 'we're all in the same boat'? Are you intimately familiar with recent work in neurophysiology/developmental psychology? That you even have to ask 'biological implementation of what?' suggests otherwise - someone taking the 'bio' in biolinguistics seriously would not ask such a question. He would talk about what he [or his colleagues] has [have] discovered [little as this might be] and compare it to what those working in other frameworks have discovered [little as that might be]. If, as you say, everyone would be equally ignorant, why would your ignorance be any better than the ignorance of a person you deride as 'empiricist'?
I really fail to see where I was describing certain kinds of ignorance as better than others, deriding anyone as 'empiricist' or even claiming to KNOW that we're all in the same boat.
DeleteI'd be delighted to be pointed to any recent work in neurophysiology that tackles the most fundamental implementation problem for any cognitive theory, i.e. how any kind of structured representations could be represented in the brain.
This is out of my area of expertise but there is a lot of work on population coding of various types of representations, especially in the visual cortex (for obvious methodological reasons).
I don't follow this work but put "population coding" into google scholar and
you will get several hundred recent papers.
(or there are some videos here http://haxbylab.dartmouth.edu/meetings/ncworkshop11.html#speakers)
It is an interesting question about whether one can join this sort of work up with the concerns that we have on this blog. I am unconvinced at the moment that it has much relevance as the gap seems too large; but I am entirely open-minded about this.
Maybe someone that knows this literature better could point us to some more pertinent recent work that bears on this.
Thanks for this. From a quick glance this reminded me of Paul Smolensky's work on coding tree-representations in artificial neural networks. I share what I take to be your scepticism, however, as to whether any of the current work really would allow us to cash in our concepts in (artificial) neuro-vocabulary, much less "biological" notions. In any case, I fail to see why that would be problematic, and how this ought to be different for (Chomskyan) Generative Linguistics than for Cognitive Science in general.
Yes agreed. I think Christina's objection is to what she perceives as a rhetorical overreach by biolinguists; and while that is a perfectly reasonable point, it's not really relevant to the current issue.
There are two cases where people say we should move beyond the difference between X and Y. In one, X versus Y is a false dichotomy. In the other, the question of X versus Y is simply ill-posed. This is a case of the second sort. It is often useful to go to great lengths to try and precisify one's intuitions and those of others, but, frankly, not when there are whole mountains of precise theory that make genuinely meaningful distinctions about learning...