Blend, cleanse and prepare data for analytics, reporting or data modernization efforts

.@philsimon asks, Rather than trying to tackle a new form of governance, wouldn't your organization do better to shore up its existing data-governance practices?
Start with the end in mind -- wise words that apply to everything, and in the world of big data it means we have to change the way we look at managing the data we have. There was a time when we managed data quality, and the main goal was
We've witnessed a significant rise in data governance adoption in recent years. Careers, technology, education, frameworks, practitioners – there's growth in all aspects of the discipline. Regulatory compliance across many sectors is a typical driver for data governance. But I also believe one of the main reasons is the realisation by
Just in time for the Strata + Hadoop World Conference, SAS became the first software vendor to achieve ODPi Interoperability with our Base SAS® and SAS/ACCESS® Interface to Hadoop products. Now, that's a lot to digest – so let me back up a second and give some background as to what this
What if you could predict with near-perfect accuracy what you’re going to sell and when your customer is going to buy? Right supply, right time is the goal German manufacturers have set themselves, without reducing the configuration options customers expect. Having almost completed stage 1 of their plan – changing
In my last post, we explored the operational facet of data governance and data stewardship. We focused on the challenges of providing a scalable way to assess incoming data sources, identify data quality rules and define enforceable data quality policies. As the number of acquired data sources increases, it becomes
As I've previously written, data analytics historically analyzed data after it stopped moving and was stored, often in a data warehouse. But in the era of big data, data needs to be continuously analyzed while it’s still in motion – that is, while it’s streaming. This allows for capturing the real-time value of data
.@philsimon on the need to adopt agile methodologies for data prep and analytics.
In Part 1 of this two-part series, I defined data preparation and data wrangling, then raised some questions about requirements gathering in a governed environment (i.e., ODS and/or data warehouse). Now – all of us very-managed people are looking at the horizon, and we see the data lake. How do
Data governance can encompass a wide spectrum of practices, many of which are focused on the development, documentation, approval and deployment of policies associated with data management and utilization. I distinguish the facet of “operational” data governance from the fully encompassed practice to specifically focus on the operational tasks for
Lately I've been binge-watching a lot of police procedural television shows. The standard format for almost every episode is the same. It starts with the commission or discovery of a crime, followed by forensic investigation of the crime scene, analysis of the collected evidence, and interviews or interrogations with potential suspects. It ends
.@philsimon chimes in on new data-gathering methods and what they mean for analytics.
I'm a very fortunate woman. I have the privilege of working with some of the brightest people in the industry. But when it comes to data, everyone takes sides. Do you “govern” the use of all data, or do you let the analysts do what they want with the data to
Since the idea of an “IoT analytical lifecycle” may be understood in many different ways, let’s start with a definition. Performing analytics at the data center and the cloud is a well-established practice, and is still quite relevant. With growing numbers of connected devices and availability of computing capabilities at
.@philsimon on the downside of the Band-Aid approach.
Critical business applications depend on the enterprise creating and maintaining high-quality data. So, whenever new data is received – especially from a new source – it’s great when that source can provide data without defects or other data quality issues. The recent rise in self-service data preparation options has definitely improved the quality of
It’s nearly impossible to avoid the debate. From politicians and pundit commentary, to dinner table discussions across the United States, the hot topic for the last several years has been the rising cost of health care. Consider that health care expenditures in the US were $3 trillion in 2014 and are
Have you ever had problems matching data that has typographical errors in it? Because of the nature of arbitrary typos and incorrectly spelled words, a specific matching technique is required to tackle those cases. SAS Data Quality, with its traditional, deterministic matching approach, is by nature not best
Hadoop has driven an enormous amount of data analytics activity lately. And this poses a problem for many practitioners coming from the traditional relational database management system (RDBMS) world. Hadoop is well known for having lots of variety in the structure of data it stores and processes. But it's fair to
Some organizations I visit don’t seem to have changed their analytics technology environment much since the early days of IT. I often encounter companies with 70s-era base statistical packages running on mainframes or large servers, data warehouses (originated in the 80s), and lots of reporting applications. These tools usually continue
Two years ago, I found myself the proud, first-time owner of a garage. My wife and I quickly started to add new items to the garage – a battery-powered lawn mower, two beach cruisers and four Tommy Bahama beach chairs. They were stored with ease. What a fantastic world I'd been missing out on. But it wasn't long before we outstripped our
.@philsimon continues his series on data prep and analytics.
In my last post, I talked about how data still needs to be cleaned up – and data strategy still needs to be re-evaluated – as we start to work with nontraditional databases and other new technologies. There are lots of ways to use these new platforms (like Hadoop). For example, many
I'm hard-pressed to think of a trendier yet more amorphous term today than analytics. It seems that every organization wants to take advantage of analytics, but few really are doing that – at least to the extent possible. This topic interests me quite a bit, and I hope to explore
"Tap into all your demand signals. Organize. Visualize. Analyze. Predict. Orchestrate. Optimize." The availability and collection of data are compelling companies to invest in demand signal management solutions to take advantage of the vast amount of information to support their planning processes. However, many have not gotten the return on
What does it really mean when we talk about the concept of a data asset? For the purposes of this discussion, let's say that a data asset is a manifestation of information that can be monetized. In my last post we explored how bringing many data artifacts together in a
If your enterprise is working with Hadoop, MongoDB or other nontraditional databases, then you need to evaluate your data strategy. A data strategy must adapt to current data trends based on business requirements. So am I still the clean-up woman? The answer is YES! I still work on the quality of the data.
The demand for data preparation solutions is at an all-time high, and it's primarily driven by the demand for self-service analytics. Ten years ago, if you were a business leader who wanted to get more in-depth information on a particular KPI, you would typically issue a reporting request to IT
Digital disruption is creating unforeseen events, such as new competitors, products and services that threaten the performance and positioning of established players. Big data and analytics prove themselves, through successful use cases, as the answer to intercept the demand, prevent churn, draw an integrated view of the customer, manage