Big Data: get a little bit pregnant


I understand big things. Some of my favorite songs exceed 20 minutes. Many of my favorite books push 500 pages. Heat is one of my favorite movies and it clocks in at nearly three hours. Sometimes big ideas just can't be compressed into small packages.

So it might seem strange that my fifth book, Too Big to Ignore, is (at least for me) relatively small. Ringing in at 256 pages, this is my shortest book to date. Perhaps paradoxically, though, the book could easily have exceeded 600 pages. Trust me: I had no shortage of ideas and there's an enormous amount happening in the Big Data world right now.

At the same time, though, I had to constantly remind myself of the book's target length. I honestly could have kept writing indefinitely, describing more and more companies doing fascinating things with Big Data. I could have become the equivalent of Michael Douglas's character in the excellent movie Wonder Boys. (For those of you who haven't seen the movie, the professor cannot finish the manuscript to his second novel.)

Fortunately, I have many other outlets for expressing my thoughts on Big Data, one of which is this blog. One interesting company that didn't make the cut for the book is RavenPack, "a provider of real-time news analysis services. Financial professionals rely on RavenPack for its speed and accuracy in analyzing large amounts of unstructured content." From the company's site:

Diverse types of organizations are incorporating automated text analysis into their decision-making process, from financial trading to risk management firms. Knowing that the vast majority of content is trapped within unstructured data, RavenPack unlocks actionable content from news for instantaneous delivery, whether to a group of analysts at a top global bank or directly into algorithmic trading systems.

Financial institutions realize that a great deal of information lies in unstructured formats, whether it's in tweets, blog posts, news articles, or other "non-traditional" sources of data. But, as I write in my book, just because these data sources and formats don't meet your father's definition of data doesn't mean that they're not data. Nor are they without (significant) value.

Think about two investment houses, A and B. Both have access to traditional, structured data: stock prices, number of trades, opening and closing prices, and the like. However, B adjusts its algorithm to include new data sources, perhaps using RavenPack or one of its competitors. All else being equal, I would bet that the returns of Firm B will exceed those of Firm A.

Simon Says

Most organizations haven't embraced Big Data. That is, they have yet to get on board the Hadoop train or hire a team of data scientists. That's understandable. But perhaps transforming unstructured data into a more structured, SQL-friendly format will convince CXOs of the power and value of Big Data.
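To make that idea a bit more concrete, here is a minimal sketch in Python of turning free-text headlines into rows that a plain SQL query can summarize. Everything in it is an assumption for illustration: the headlines and tickers are made up, the word lists are a toy stand-in for real text analysis, and none of it reflects how RavenPack or any other vendor actually works.

```python
import re
import sqlite3

# Hypothetical word lists; a real service would use far richer models.
POSITIVE = {"beats", "surges", "record", "upgrade"}
NEGATIVE = {"misses", "falls", "recall", "downgrade"}

# Made-up headlines, used only to exercise the sketch.
SAMPLE_HEADLINES = [
    "Acme (ACME) beats estimates, stock surges",
    "Globex (GLBX) misses targets after product recall",
    "Acme (ACME) hit with analyst downgrade",
]


def to_row(headline):
    """Turn one free-text headline into a (ticker, score, headline) tuple."""
    # Assumes the ticker appears in parentheses, e.g. "(ACME)".
    match = re.search(r"\(([A-Z]{1,5})\)", headline)
    ticker = match.group(1) if match else None
    words = set(re.findall(r"[a-z]+", headline.lower()))
    # Crude score: distinct positive words minus distinct negative words.
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return ticker, score, headline


def load_headlines(headlines):
    """Store the structured rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE news (ticker TEXT, score INTEGER, headline TEXT)")
    conn.executemany(
        "INSERT INTO news VALUES (?, ?, ?)", [to_row(h) for h in headlines]
    )
    conn.commit()
    return conn


def sentiment_by_ticker(conn):
    """Plain SQL over what started life as unstructured text."""
    cursor = conn.execute(
        "SELECT ticker, SUM(score) FROM news "
        "WHERE ticker IS NOT NULL GROUP BY ticker ORDER BY ticker"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    print(sentiment_by_ticker(load_headlines(SAMPLE_HEADLINES)))
```

With the three sample headlines, the query returns a net score of 1 for ACME and -2 for GLBX. The point is not the scoring, which is deliberately naive, but that once text lands in a table, the relational tools an organization already owns can work with it.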

Maybe organizations that want to do more with Big Data (but can't with their relational databases) need a little convincing. Go ahead. Get a little bit pregnant with Big Data.


What say you?


About Author

Phil Simon

Author, Speaker, and Professor

Phil Simon is a keynote speaker and recognized technology expert. He is the award-winning author of eight management books, most recently Analytics: The Agile Way. His ninth will be Slack For Dummies (April 2020, Wiley). He advises organizations on matters related to strategy, data, analytics, and technology. His contributions have appeared in The Harvard Business Review, CNN, Wired, The New York Times, and many other sites. He teaches information systems and analytics at Arizona State University's W. P. Carey School of Business.
