My previous post was inspired by what Andrew McAfee sees as the biggest challenge facing big data: convincing people to trust data-driven algorithms over their expertise-driven intuition. In his recent VentureBeat blog post, Zavain Dar explained that the real promise of big data is that it will change the way we solve problems.
Dar first sorted the problems we are trying to solve into the two categories of truth devised by the 18th-century philosopher Immanuel Kant, who distinguished an analytic truth, one that can be derived from a logical argument alone, from a synthetic truth, one that can only be established with empirical evidence or external data.
Apples and oranges
Analytic truths are based on proven models providing rules we can use for logical deduction. For example, given the rules of arithmetic, we can deduce 2 + 2 = 4 without needing to empirically prove it, such as by putting two apples next to two oranges to verify that it adds up to four pieces of fruit.
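To see how little empirical input an analytic truth requires, here is a minimal sketch in Lean (my illustration, not from Dar's post): the proof that 2 + 2 = 4 is closed by pure computation from the definition of the natural numbers, no fruit required.

```lean
-- Analytic truth: 2 + 2 = 4 follows from the definition of
-- addition on the natural numbers. `rfl` (reflexivity) closes
-- the goal by computation alone, with no empirical data needed.
example : 2 + 2 = 4 := rfl
```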
Analytic truths, in other words, are problems solvable with data we already have. After all, even intuition is data-driven: it draws on the data our brains have accumulated through academic learning, personal experience, and professional expertise.
Synthetic truths, on the other hand, are problems that can only be solved with data we do not yet have. Without empirical data, for example, we cannot prove that adding four inbound links from popular food-critic websites to our fruit orchard’s website will increase its daily number of unique visitors by 22 percent. Nor can we prove that the extra traffic will sell more of our apples and oranges without collecting data that correlates web traffic with fruit sales.
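Establishing that kind of synthetic truth is an empirical exercise. As a minimal sketch (the fruit-orchard numbers below are hypothetical, invented purely for illustration), here is how we might test the second claim once the data has been collected:

```python
from statistics import correlation  # available in Python 3.10+

# Hypothetical daily observations collected after adding the
# four inbound links: unique visitors and pieces of fruit sold.
visitors = [310, 295, 402, 388, 421, 367, 450]
fruit_sold = [42, 39, 55, 51, 58, 49, 61]

# Pearson correlation: a value near 1.0 would suggest that web
# traffic and fruit sales rise and fall together, a synthetic
# truth we could only establish by collecting the data.
r = correlation(visitors, fruit_sold)
print(f"traffic/sales correlation: r = {r:.2f}")
```

Correlation alone would not prove the 22 percent claim, but it is exactly the kind of evidence that only collected data, never deduction, can supply.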
Google and Amazon
Google, for example, doesn’t try to logically deduce an analytic truth on which to base its search engine. Instead, it collects and synthesizes previous click streams and link data to predict what future users will want to see in search results. Likewise, Amazon doesn’t try to deduce analytic truths of e-commerce that govern who buys what and how consumers behave. Instead, it collects and synthesizes previous sales transactions and shopping-cart data to predict what future customers will want to buy and how much they will be willing to pay.
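Neither company’s production systems are public, but the shape of the synthetic approach is easy to sketch. The toy recommender below (hypothetical carts and function names, not Amazon’s actual method) predicts what a customer might buy next purely from co-occurrences in previous shopping carts, with no deduced model of consumer behavior:

```python
from collections import Counter
from itertools import combinations

# Hypothetical past shopping carts: the "previous events".
past_carts = [
    {"apples", "oranges", "cider"},
    {"apples", "cider"},
    {"oranges", "juicer"},
    {"apples", "oranges"},
]

# Count how often each pair of items was bought together.
co_occurrence = Counter()
for cart in past_carts:
    for pair in combinations(sorted(cart), 2):
        co_occurrence[pair] += 1

def recommend(item, top_n=2):
    """Suggest the items most often bought alongside `item`."""
    scores = Counter()
    for (a, b), count in co_occurrence.items():
        if item in (a, b):
            scores[b if a == item else a] += count
    return [other for other, _ in scores.most_common(top_n)]

print(recommend("apples"))  # e.g. ['cider', 'oranges']
```

Nothing in this sketch posits a structure explaining why shoppers behave as they do; past events alone drive the prediction, which is precisely the shift Dar describes.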
These are examples of what can happen, Dar explained, when “we remove ourselves from the intellectual and perhaps philosophical burden of fundamentally unearthing and understanding a structure (or even positing the existence of such a structure) and use data from previous events to optimize for future events. Google and Amazon serve as early examples of the shift from analytic to synthetic problem solving because their products exist on top of data that exists in a digital medium. Everything from the creation of data, to the storage of data, and finally to the interfaces used to interact with data are digitized and automated.”
The evolution of problem solving
While human expertise will forever remain an invaluable resource, more fields of human endeavor now have access to data and computational resources capable of producing data-driven synthetic truths beyond the reach of the intuition-driven analytic reasoning of human experts.
“The rise of big data,” Dar explained, has “shifted the manner in which we solve problems. Fundamentally, we’ve gone from creating novel analytic models and deducing new findings, to creating the infrastructure and capabilities to solve problems through synthetic means.” This is why it makes more sense, Dar argued, “to view big data not in terms of data size or database type, but rather as a necessary infrastructural evolution as we shift from analytic to synthetic problem solving.”