Probably since forever, philosophers and mathematicians have
dreamed of mechanizing thought, of removing judgment from thinking. The newest
aspirant in this millennial quest is Big Data, and not surprisingly there is an
eponymous book (excerpted here) with the following provided summary:
This revelatory exploration of big
data, which refers to our newfound ability to crunch vast amounts of
information, analyze it instantly and draw profound and surprising conclusions
from it, discusses how it will change our lives and what we can do to protect
ourselves from its hazards.
Big Data (BD) is the new New Thing, the method by which diligence
can substitute for thought. The idea actually has a certain charm as it
reverberates with our sense of justice. Collecting data is hard work, but it is
generally the kind of work for which effort is rewarded. Work hard and you will
do well. Put in the hours and the data will pile up. It’s an activity that rewards virtue.
In this it is entirely unlike
coming up with a plausible analysis, aka thinking. This activity is totally
unfair. Lazy people can have excellent
ideas. Sloth is no bar to insight and profligacy no guarantee of intellectual
stagnation. Here even the wicked,
sloppy, and lazy can prosper. How unfair.
In a just world, virtue would be rewarded. In a just world
hard work would guarantee enlightenment. We don’t live in a just world. Big
Data is the unfounded belief that this can be remedied: that the hard work of data
gathering can substitute for the caprice of thought. It cannot, and,
unfortunately, believing it can is likely to deform scientific practice. To see this, consider the following quote:
The era of big data challenges the
way we live and interact with the world. Most strikingly, society will need to
shed some of its obsession for causality in exchange for simple correlations:
not knowing why but only what. This overturns centuries of
established practices and challenges our most basic understanding of how to
make decisions and comprehend reality (10).
And that is precisely the problem. Big Data is part of an enterprise aimed at
reforming scientific practice. Dump why;
aim for what. However, contrary to the prevailing
conception, without a model/theory it is not clear what it means to just “look
for correlations.” Data do not speak
for themselves. So gathering lots of data will not result in eloquent models
that understand the whats that
matter. Big data sets cannot pull themselves up by the bootstraps (nothing can pull itself up by the
bootstraps!) thereby yielding useful models. So, without explicit thoughtful models
that guide the enterprise, we will be saddled with implicit models that obscure
(and trivialize) what we are doing (as noted here, without good models it is
even difficult to separate good data from bad).
None of this would be worth mentioning were it not for the
mesmerizing powers of Big Data. We have seen this before (here, and here for
example). Big Data is the modern avatar
to classical empiricist methodology. It’s appeal is its promises to provide
insight without intellectual sweat. This time, however, Empiricism has found a slogan
attached to a technology, Google being the all-powerful mantra. Not
surprisingly, money-making slogans can be very enticing and Google
intellectuals (e.g. Peter Norvig) can gain powerful platforms. And though I am
quite sure that like all other (empiricist) attempts to circumvent thought,
this too will ultimately fail, its demise may not come soon enough to prevent
serious damage. So when you hear the siren calls of Big Data I suggest the
following prophylactic procedure: repeat Kant’s dictum to yourself, viz. data
without theory is blind, data without theory is blind, data without theory is
blind…and hope it soon goes away.
Hear hear, but cleverness won't hold its own over Big Data unless the people who think they're smart can find the right places to be clever. I suspect that you're right about the link between Big Data-ism and the currently rising tide of sanctimoniousness and process-worship in the 'Anglosphere'.
I think I am missing something important here, so instead of jumping to conclusions I ask for clarification first. It had been a while since I read my Kant so I looked it up to confirm that my memory [suggesting Norbert had only quoted half of the dictum] was correct:
Thoughts without [intensional] content (Inhalt) are empty (leer), intuitions without concepts are blind (blind). It is, therefore, just as necessary to make the mind's concepts sensible — that is, to add an object to them in intuition — as to make our intuitions understandable — that is, to bring them under concepts. These two powers, or capacities, cannot exchange their functions. The understanding can intuit nothing, the senses can think nothing. Only from their unification can cognition arise. (A50-51/B74-76)
So I imagine Norbert does not want to persuade anyone to abandon the search for data entirely, but he seems to accuse 'empiricists' or 'Google intellectuals like Norvig' of collecting data just for the sake of collecting data in the absence of ANY theory - is this the charge?
How relevant 'Big Data' (i.e. 500 billion words) is to structural linguistics is questionable, but we certainly need to take an interest in 'Middle-Sized Data', the 30-100 million word corpora that language learners between the ages of 3 and 10 are looking at (depending partly on age and SES).
I, for example, feel rather embarrassed that I can't tell my students how many times they need to see an object NP preceding a PP in a corpus but not vice versa before they can conclude with some confidence that there is an NP<PP rule/principle in operation. I've just requested my uni library to order a book called 'Frequency Effects in Language Acquisition', which seems like it might be highly relevant to this kind of query.
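As a back-of-the-envelope illustration of the question above (not a claim about how acquisition actually works), here is a toy calculation. It assumes a 50/50 null hypothesis under which NP<PP and PP<NP orders are equally likely, and asks how many consecutive NP<PP sightings, with zero counterexamples, drive the one-sided probability under that null below a chosen threshold. The function name, the null hypothesis, and the thresholds are all my own assumptions, added purely for illustration:

```python
from math import ceil, log

def sightings_needed(alpha=0.05, p_null=0.5):
    """Smallest n such that p_null**n < alpha.

    Toy model: if both orders were equally likely (p_null = 0.5),
    seeing NP<PP n times in a row with no counterexamples has
    probability 0.5**n under the null. We solve for the smallest
    integer n that pushes this below alpha.
    """
    return ceil(log(alpha) / log(p_null))

print(sightings_needed())       # alpha = 0.05 -> 5 sightings
print(sightings_needed(0.01))   # alpha = 0.01 -> 7 sightings
```

On this (deliberately crude) model, a handful of consistent sightings already suffices; the real difficulty, as the comment suggests, is that learners do not face a clean 50/50 null but noisy data and competing generalizations.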
Some patterns don't start showing up before the 100 million word mark. For example, tense distribution for really low frequency constructions.
So (noticing this reply after a rather long delay, but it is apropos of topics that are often discussed here), if the relevant facts can be found in the intuitions or performance of 7 year olds, they are presumably projected (in the sense of Peters' discussion of the projection problem) from more 'basic' data found in smaller corpora, whereas if they can only be found reliably in the intuitions or performance of college students, they might be acquired as features of the language with some degree of independence from other features.