In the latest issue of Wired magazine, Editor-in-Chief Chris Anderson expounds upon what the magazine is calling “The Petabyte Age.” With computers crunching tons of data, Anderson maintains that scientists will no longer need to come up with hypotheses and models to test their theories. They’ll simply have to look at aggregated numbers and they’ll have their answers.
While numbers can be fascinating for what they show us, I think Anderson overreaches with his notion (dare I say it’s a hypothesis?) that mounds of computer-analyzed data will save humanity. The one paragraph that sticks in my craw from the article is this one:
“This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.”
Oh, really ... I think Anderson is being cheeky here, trying to get a rise out of us. Well, then, let me rise to the occasion. Numbers can be easily manipulated. That’s why we often hear, “Numbers can say whatever you want them to say.” It all depends on what you want to focus on with a given set of numbers. If numbers truly spoke for themselves, how could people get away with cooking their books or swaying others with warped statistics?
As for throwing out all the fields of study that Anderson thinks we’ll no longer need, what a load of hogwash. Besides, his list is so, to borrow a word from my daughter, RANDOM. Taxonomy is the naming and classification of living things. While The Petabyte Age may allow us to count things in all of their permutations, we still need names in order to tell those things apart. Linguistics is the study of language. Unless we’re planning to let the computers do all the talking about these numbers, I suppose we’re still going to need to speak and understand the structure of language.
As for sociology, psychology, and ontology (the study of the nature of being), all of these fields deal with the whys of life. Why are we here? Why do we do what we do? Why are things the way they are? With all due respect to Anderson, a heap of numbers without the whys is just a heap of numbers, and frankly, a human being who doesn’t ask why isn’t much of a human being at all. “Why?” is one of the first questions out of a toddler’s mouth. It’s the first question out of my mouth about pretty much everything. (Writers thrive on the whys and what ifs of life. Either that, or I’m still a toddler.)
This brings me to the NINabyte. The band Nine Inch Nails (NIN) is like a kid in a pasta shop. The band is having fun throwing massive amounts of spaghetti at a wall to see what will stick, trying out various online applications and experimenting with different methods of music distribution. It has accounts on Flickr, YouTube, Facebook, and MySpace, along with its official website. Its newest offering is a Google Earth application that shows how many times its freely available album “The Slip” has been downloaded by geographic location. NIN, meet The Petabyte Age. We now have the NINabyte.
Upon examining the downloads in Minnesota, my first question was, “Why aren’t there as many downloads in Minnesota as in other geographic locations?” Obvious question, ain’t it? But there it was. Could it be that NIN hasn’t done as much touring or marketing in Minnesota? Could it be that the music of NIN isn’t necessarily music to the ears of Minnesotans? If the latter is the case, why don’t Minnesotans like NIN’s music? Not that they don’t. I’m simply hypothesizing here. The data, even though there is a lot of it, actually raises more questions than it answers. Any researcher, scientific or otherwise, will tell you that’s the typical effect of data.
When I discussed this topic with Hubby (a sociology major), he came up with a few questions of his own concerning Anderson’s hypothesis. Where do the numbers come from? Sure, we have lots of data sets hovering around in computers waiting to be analyzed. Does this mean that we don’t ever have to collect another iota of data? By what mechanism do we verify that the data these computers are regurgitating is legitimate or valid? Any monkey (no offense to the monkeys out there) can enter data into a computer, but if you put garbage in, you’re going to get garbage out. (Remember GIGO?) (Hubby reminded me that Stephen Colbert once told viewers to go to Wikipedia and change the number of remaining African elephants.)
Given my analysis of Anderson’s topic, I’d have to say that scientists and sociologists and psychologists and the rest are fairly safe in their jobs. After all, we’re going to need them for their expertise in analyzing those petabytes and NINabytes.
Here’s an article from Ars Technica that also questions Anderson’s argument.