Monday, April 10, 2017

Is the examined academic life worth it?

Every year grad departments select an incoming class, funding agencies decide who is going to get how much, and journals decide who is going to get published. Each of these decisions involves a scarce resource issue: a (usually small) finite number of positions, a (usually much too small) finite number of dollars and a (usually small) finite number of pages.[1] The decisions amount to how to allocate these scarce resources. The reason the resources are scarce is that there are often more applicants/submissions than there are places to allocate. But the real reason resources are scarce is that quality alone, at least evident quality, does not suffice to perfectly fit the applicants/submissions to the allocations available. In other words, there are many ways to allocate the money to high quality applicants/submissions, at least prima facie. And that’s the problem.

How do academics solve this problem? In general we do this by applying ever more refined (i.e. recondite?) markers of quality to winnow out the really truly deserving from the merely truly deserving from the merely deserving. It is well known that even blunt methods suffice to lop off the clearly undeserving from the rest. In other words, academics embrace the conceit that if we try hard enough we can find the very very very…very best, and can thus optimize our choices, the decision procedure being a simple, easily defensible one: select the most deserving and allocate to them the scarce resources. So, every year departments (including mine, of course) select the very very…very best of the applicant pool, funding agencies fund the very very very…very best proposals and journals publish the very very very…very best papers. Those readers who find that the last sentence rings true, please follow me, as I have a bridge for sale for you to look at.

Ok, everyone knows that the above story is more than a tad tendentious. We all recognize that our powers of discernment, even if applied with the utmost seriousness (which is seldom the case), are loaded with piles of false positives and negatives. We know this, but as an institution we also believe that overall looking for the best and rewarding it is the superior strategy. But is it? Here is a recent piece that questions this in interesting ways and suggests an alternative.

Before getting into the nub of the proposal, here are some points that the piece makes that (largely, though not perfectly) fit with my experience (see some comments below). The author, Shahar Avin (SA), discusses only research funding, but I would add that the same holds for journal pages and grad slots.

1.     Currently, “the monetary cut-off point still tends to be way above the quality cut-off point.”
2.     “…expert reviewers spend a lot of time allocating grant money by trying to identify the best work. But the truth is that they’re not very good at it, and that the process is a huge waste of time.”
3.     Peer review, rather than adding information, often adds “another layer of irrationality” to the decision process.
4.     The application process asks you to be more sure of where you are going and how you will get there than it is rational to expect someone who is aiming at original research to be.
5.     Our capacity to identify those novel ideas most likely to succeed is much more limited than we believe.
6.     The belief that we can make this kind of rational decision “demand[s] excessive amounts of information from applicants, and waste[s] a colossal amount of their time.”
7.     For some areas, we can even quantify how time consuming this is. SA cites a study in Nature which calculates that “[i]n Australia, during a recent annual funding round for medical research, scientists spent the equivalent of 400 years writing applications that were eventually rejected.” That’s a lot of effort.
8.     “…‘expert reviewers’ are not fungible commodities. One reviewer is not the same as another, and their judgements tend to be highly personal. Of the nearly 3,000 medical research proposals submitted for public funding in Australia in 2009, nearly half would have received the opposite decision if the review panel had been different…”
9.     “…the process isn’t just ineffective – it’s systematically biased. There’s evidence that women and minorities have lower chances of securing grants than people who are male or white, respectively.”
10.  Unorthodox proposals are at a disadvantage in the current funding structures.

As I said, most of these observations seem right to me. Points 1 and 2 perfectly fit my little world. In my experience the typical applicant pool, after the first triage, leaves many excellent candidates to fill too few spots. I would also add that choosing the best among the crop of very good is not something that I have found academics are good at. Certainly much of what comes to be recognized as the best fails to find much support much of the time. And this is especially so if the work is original or the applicant has an unusual pedigree.

Let me expand on this for a moment in the context of grad admissions. In assessing applicants we are going, reasonably enough, on track record (what else is there?). But I have found that what makes you a good undergrad is not necessarily what makes you a good grad student, nor is what makes you a good grad student necessarily what makes you a fecund researcher. Undergrads are, rightly, much more coddled. They are assessed by how well they solve problems that have answers in the back of the book (or, now, online). Grad students are largely assessed on how well they solve problems without available answers, as are researchers. But, in addition, the latter are judged by how well they find new problems and generate novel research. The best are those whose work creates enlightenment that generates puzzlement and so underwrites many years of further work. In my experience, the qualities that allow success in any one of these endeavors do not reliably signal success in any of the others. So, even if we are conscientious in reviewing the available material, it is not clear what we are looking for. And, I should add, an interview of an hour or two does not, in my experience, add much useful information. Luckily, how well we decide does not really matter, as in a large number of cases any selection will be ok. We don’t need to find the best of the bunch because even the “worst” of the “best” is very good.

Another reaction: I was surprised at how wasteful the process can be! See point 7 above. 400 years is a long time, even if there are collateral benefits of writing a grant that does not get funded or applying to schools that don’t accept you or writing papers that never see the light of day. And I would question how large these collateral benefits actually are. The claims that there are benefits of doing the work even if it is unrewarded often strike me as self-serving (and are often touted by those at the top of the food chain). I can see that thinking through what one is working on is a valuable exercise. What I am less convinced of is that doing this in a grant application format or a grad student admissions form is a good way of thinking things through. Maybe it is. But it is not obvious to me that it is clearly a superior way of doing this useful activity. And if it is not very useful, then it is questionable, given that it clearly is very time consuming.

I have made noises echoing point 4 on FoL many times. I think that this is especially true for “theoretical” work, where laying out the timeline of research is a silly exercise. If a problem is really good, then how you plan to solve it is not terribly apparent. The best you can do is motivate a conjecture, and funding agencies (or at least the NSF) find this very insufficient. I suspect that this holds beyond theory. In my experience, grants seldom fund the research proposed. Rather, a grant submission often presents finished research which, if funded, is ready to go “public,” and the resulting grant is used to fund novel research that is as yet indeterminate (and only lightly described in the proposal). Ok, maybe I am being too cynical here. But I don’t think I am being way too cynical. At any rate, SA seems to agree.

Last point wrt 8/9 and then the “solution.” I also agree that the current process is unstable, in that changing the reviewers can change the outcome dramatically. This suggests that if there really are objective bases of quality that could be used to rank alternatives, then they are epistemologically elusive, at least over a large domain. Moreover, this elusiveness allows systematic bias to creep in, in part because the process is believed to squeeze the bias out by focusing on “quality” alone. Sadly, we all know how this works. The point SA makes is that it might be working more effectively in a context in which we are aiming to choose the best.[2]

Ok, the solution:

“Fortunately, there’s a simple solution to many of these problems. We should tell the experts to stop trying to pick the best research. Instead, they should focus on filtering out the worst ideas, and admit the rest to a lottery. That way, we can make do with shorter proposals, because the decision to accept or reject a ticket to a random draw requires less information – and highly specific proposals are unrealistic anyway. So instead of asking reviewers to make unreasonable predictions, they can turn their minds to weeding out cranks and frauds. Bias will still occur in the filtering stage, of course, but many more proposals will make it through to a lottery, which is inherently unbiased.”

I confess to finding this idea appealing. I believe that weeding out the clearly undeserving is a lot easier than identifying the best. So, I believe, a lottery among the triaged could save lots of time and effort. It might also, as SA notes, eliminate some biases of various sorts, including bias against unconventional ways of thinking, and let in some things outside the prevailing fashion (though not too far out, I suspect).
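For concreteness, the filter-then-lottery procedure SA describes might be sketched roughly as follows. This is only a toy illustration: the `Proposal` class, the `passes_triage` flag and the `allocate_by_lottery` function are hypothetical names, and the idea that triage boils down to a simple yes/no verdict is an assumption made here, not something SA specifies.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    title: str
    # Hypothetical reviewer verdict: False only for the clearly unfit
    # (the "cranks and frauds"), True for everything else.
    passes_triage: bool


def allocate_by_lottery(proposals, n_awards, rng):
    """Drop proposals that fail triage, then draw winners uniformly at random."""
    eligible = [p for p in proposals if p.passes_triage]
    # If fewer proposals survive triage than there are awards, fund them all.
    k = min(n_awards, len(eligible))
    return rng.sample(eligible, k)


pool = [
    Proposal("A", True),
    Proposal("B", True),
    Proposal("C", False),  # weeded out at the filtering stage
    Proposal("D", True),
]
winners = allocate_by_lottery(pool, 2, random.Random(2017))
```

Note that the reviewers' judgment enters only through the triage flag; among the survivors, nothing distinguishes one proposal from another, which is exactly the point.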

But I think that there is a possible second very salutary effect of adopting a lottery system. It might force academics to acquire a bit more modesty. Academics are natural meritocrats. We reward the rewarded and denigrate the failures. We tend to downplay how large a role luck plays in the process of “success.” One nice feature of a lottery system is that it will weaken this fantasy by making crystal clear that luck plays a non-negligible role. It does this by institutionalizing luck as an overt feature of the process. This might also have the side benefit of forcing administrators to consider more refined ways of measuring academic quality than just counting entries on a CV (weighted by venue quality). No more just listing publications in “major” journals or “large” grants funded. Maybe some thought will need to go into the process. Ok, I admit that there are problems with this too. But right now the mechanization of the whole academic evaluation process has, I believe, gotten out of hand and some push back is required. This could help by, as SA says, making it official that the whole process is (in large part) a lottery anyhow. So recognizing this and institutionalizing it might not only make “the whole process… cheaper, fairer and more efficient”; it might also help to make it more honest, with the attendant benefits that honesty often promotes.

[1] In the case of journal space, web publishing offers a plausible solution to the “too few pages” problem. However, one benefit of page limits is that they help manage information overload for the reader. If page numbers expand then the selection for “quality”/“relevance” will be offloaded to the reader. Right now many rely on editors and journals to cull the good stuff. It is unclear that this works well, but managing the tidal wave of material out there is an important problem.
[2] I don’t know if the review process in linguistics leads to obvious bias, but I would not be surprised if it did. I just don’t know. Anyone with info to share is invited to do so.
