I recently ran across a thought-provoking post on the USC Annenberg Innovation Lab blog – “Why Do We Need Data Science when We’ve Had Statistics for Centuries?” With all the recent debate surrounding the relatively new term “data science,” I’ve been thinking a lot about this question, so I thought I’d explore the notion here on insideBIGDATA by picking apart the article. I’d love to hear your take on this, so feel free to leave a note.
Here are some excerpts from the article along with my commentary:
“Use of the term data science is increasingly common, as is big data … but what does it mean? Is there something unique about it? What skills do data scientists need to be productive in a world deluged by data? What are the implications for scientific inquiry?”
This is the big question: how does data science differ from statistics and computer science? I think the answer is related to big data, but not exclusively so. Big data requires a very different technology stack than the one previously used for statistical analysis, and Hadoop represents a paradigm shift to address these needs. A statistician from 20 years ago would not be equipped to analyze huge data sets on the time-scale that today’s business applications often require.
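To make the paradigm shift concrete: Hadoop popularized the MapReduce model, where a computation is expressed as independent map and reduce steps so it can be spread across many machines instead of run on one statistician’s workstation. Here’s a minimal, single-machine sketch of that idea (the word-count example is illustrative; real Hadoop jobs run the same three phases over distributed data blocks):

```python
from collections import defaultdict
from itertools import chain

# Map phase: each "node" turns its chunk of input into (key, value) pairs.
def map_chunk(chunk):
    return [(word, 1) for word in chunk.split()]

# Shuffle phase: group values by key across all mapper outputs.
def shuffle(mapped):
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped):
        groups[key].append(value)
    return groups

# Reduce phase: aggregate each key's values independently -- this is
# what lets the work be parallelized across a cluster.
def reduce_counts(groups):
    return {key: sum(values) for key, values in groups.items()}

# Two chunks stand in for data blocks spread across machines.
chunks = ["big data big", "data science"]
counts = reduce_counts(shuffle(map(map_chunk, chunks)))
print(counts)  # {'big': 2, 'data': 2, 'science': 1}
```

The point isn’t the word count itself but the constraint: once you phrase analysis as map and reduce steps over partitioned data, the framework can scale it to volumes no classical statistical package was built to touch.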