Thought leaders and pundits like me espouse the virtues of big data. Although you'll get no argument from me on the potential benefits of this essential trend, it's important to remember that there is still tremendous value from using basic customer information. Driving home from a networking event on the
In his pithy style, Seth Godin’s recent blog post Analytics without action said more in 32 words than most posts say in 320 words or most white papers say in 3200 words. (For those counting along, my opening sentence alone used 32 words.) Godin’s blog post, in its entirety, stated: “Don’t measure
In my last set of posts I started to look at some of the challenges associated with enterprise management of reference data domains, especially as the scope of use for the same conceptual reference domains expands across databases, systems, and functional areas within the organization. Recognizing the value of capturing
How many projects have you worked on that forgot to test size and volume, or to conduct load balancing, in a newly converted environment? I have worked on a few of those types of projects. I know in a data warehousing effort, we always check any servers and databases, based on load,
A lot of data quality projects kick off in the quest for root-cause discovery. Sometimes they’ll get lucky and find a coding error or some data entry ‘finger flubs’ that are the culprit. Of course, data quality tools can help a great deal in speeding up this process by automating
Big data? What about the small stuff? In preparing for an upcoming business trip, I decided to rent a car on Enterprise.com. I could have sworn that I had registered on the site at some point, but I couldn't find my user name and password. Call it a senior moment.
![SAS high-performance capabilities with Hadoop YARN Architecture diagram of how SAS high-performance technology works with Hadoop YARN](https://blogs.sas.com/content/datamanagement/files/2014/08/hadoop-yarn.png)
For Hadoop to be successful as part of the modern data architecture, it needs to integrate with existing tools. This integration allows you to reuse existing resources (licenses and personnel) and is typically 60% of the evaluation criteria for integration of Hadoop into the data center. One of the most
![Share your cluster – How Apache Hadoop YARN helps SAS Architecture diagram on how SAS works with Hadoop YARN](https://blogs.sas.com/content/datamanagement/files/2014/08/Hadoop.jpg.png)
Even though it sounds like something you hear on a Montessori school playground, this theme “Share your cluster” echoes across many modern Apache Hadoop deployments. Data architects are plotting to assemble all their big data in one system – something that is now achievable thanks to the economics of modern
My previous post explained how confirmation bias can prevent you from behaving like the natural data scientist you like to imagine you are by driving your decision making toward data that confirms your existing beliefs. This post tells the story of another cognitive bias that works against data science. Consider the following scenario: Company-wide
The first step in establishing governance for reference data is assessing the existing reference data landscape: understanding what reference data sets are used, who is using them, and how they are being employed to support business processes. That suggests a three-pronged approach to identifying organizational business process and application dependencies