Blend, cleanse and prepare data for analytics, reporting or data modernization efforts
Data Management
Bigger doesn’t always mean better. And that’s often the case with big data. Your data quality (DQ) problem – no denial, please – often only magnifies when you get bigger data sets. Having more unstructured data adds another level of complexity. The need for data quality on Hadoop is shown by user
In the oil and gas industry, analytics are used to improve both upstream and downstream operations, from optimizing exploration and forecasting production to reducing commodity trading risk and understanding customers' energy needs. If you plan to derive value from the digital oil field, big data, and analytics, one of the first things
We’ve been talking about data recently at the Analytic Hospitality Executive. I’ve advocated using whatever data you have, big or small, to get started today on analytic initiatives that will help you avoid big data paralysis. In this blog, I’m going to get a bit more technical than usual
I’ve had a lot of discussions with business leaders around the discrepancy between big data investment fears and successful use cases. Most of them say that "the quest for the golden use case" takes too much time and is usually not successful in the end. Ultimately, this quest can lead to
Operationalizing data governance means putting processes and tools in place for defining, enforcing and reporting on compliance with data quality and validation standards. There is a life cycle associated with a data policy, which is typically motivated by an externally mandated business policy or expectation, such as regulatory compliance.
“Those who do not know their past are condemned to repeat it.” Retrospection is a slow process. Just as behaviors that don’t work can persist over time in human beings, broken information processes endure in organizations and can cause major crises. How can this be avoided? In the
Guess what? Data governance can be considered a bottleneck and a bothersome activity at some organizations. So let’s discuss how NOT TO BE the BOTTLENECK. Defining what the data governance initiative will entail is very important here.
.@philsimon on whether companies should apply some radical tactics to DG.
Yes. But since this post needs to be more than a one-word answer to its title, allow me to elaborate. Data governance (DG) enters into the discussion of all enterprise information initiatives. Whether or not DG should be the opening salvo of these discussions is akin to asking whether the
What does the future of analytics look like in your organization's enterprise architecture? Does it include thinking about a two-speed approach to analytics that includes both: an agile, rapidly changing analytics platform for innovation (a lab), separated from operations and broad enterprise audience usage; and a slowly moving, systematic enterprise analytics platform (a factory)
If your organization is large enough, it probably has multiple data-related initiatives going on at any given time. Perhaps a new data warehouse is planned, an ERP upgrade is imminent or a data quality project is underway. Whatever the initiative, it may raise questions around data governance – closely followed by discussions about the
I’ve spent some time over the past couple of months learning more about anonymization. This began with an interest in the technical methods used to protect sensitive personally-identifiable information in a SAS data warehouse and analytics platform we delivered for a customer. But I learned that anonymization has two rather different meanings; one in the
In recent years, we practitioners in the data management world have been pretty quick to conflate “data governance” with “data quality” and “metadata.” Many tools marketed under "data governance" have emerged – yet when you inspect their capabilities, you see that in many ways these tools largely encompass data validation and data standardization. Unfortunately, we
After doing some recent research with IDC®, I got to thinking again about the reasons that organizations of all sizes in all industries are so slow at adopting analytics as part of their ‘business as usual’ operations. While I have no hard statistics on who is and who isn’t adopting
Big data. Streaming data. Complex data. We’ve all heard the reasons why organizations feel like they’re facing an insurmountable data challenge. Now, it’s time to do something about it. For the past few years, SAS has helped some of the world’s leading companies make sense of an avalanche of data.
.@philsimon on the new challenges of data governance.
Jim Harris says event stream processing determines if big data is eventful and relevant enough to process and store.
The metaphors we choose to describe our data are important, for they can either open up the potential for understanding and insight, or they can limit our ability to effectively extract all the value our data may hold. Consisting as it does of nothing but electric potentials, or variations in
Determining the life cycle of event stream data requires us to first understand our business and how fast it changes. If event data is analyzed, it makes sense that the results of that analysis would feed another process. For example, a customer relationship management (CRM) system or campaign management system like
As consumers, the quality of our day is all too often governed by the outcome of computed events. My recent online shopping experience was a great example of how computed events can conspire to make (or break) a relaxing event. We had ordered grocery delivery with a new service provider. Our existing provider
I believe most people become overwhelmed when considering the data that can be created during event processing. Number one, it is A LOT of data – and number two, the data needs real-time analysis. For the past few years, most of us have been analyzing data after we collected it,
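The shift described here, from analyzing data after it has all been collected to analyzing each event as it arrives, can be illustrated with a small sketch. Below is a minimal, hypothetical example (not from any of the posts above) of a sliding-window aggregation, a common streaming pattern: each event updates a rolling statistic immediately, so you never need to store or rescan the full history.

```python
from collections import deque

def windowed_mean(events, window_seconds=60):
    """Yield (timestamp, rolling mean) for each event as it arrives.

    `events` is an iterable of (timestamp, value) pairs in time order.
    Only events within the last `window_seconds` contribute to the mean,
    so memory stays bounded no matter how long the stream runs.
    """
    window = deque()  # holds (timestamp, value) pairs inside the window
    total = 0.0
    for ts, value in events:
        window.append((ts, value))
        total += value
        # Evict events that have aged out of the window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)

# A toy stream: three sensor readings at t=0s, t=30s, t=90s.
stream = [(0, 10.0), (30, 20.0), (90, 30.0)]
for ts, mean in windowed_mean(stream):
    print(ts, mean)
```

The third reading arrives after the first two have aged out of the 60-second window, so the rolling mean reflects only recent data, which is exactly the "analyze as it happens" behavior that distinguishes stream processing from batch analysis.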
You've probably heard many times about the fantastic untapped potential of combining online and offline customer data. But relax, I’m going to cut out the fluff and address this matter in a way that makes the idea plausible and its objectives achievable. The reality is that while much has been
In my last two posts, I introduced some opportunities that arise from integrating event stream processing (ESP) within the nodes of a distributed network. We considered one type of deployment that includes the emergent Internet of Things (IoT) model in which there are numerous end nodes that monitor a set of sensors,
In my previous post, I discussed the similarities, differences and overlap between event stream processing (ESP) and real-time processing (RTP). In this post, I want to highlight three things that need to get real. In other words, three things that should be enhanced with real-time capabilities, whether it’s ESP, RTP or
You might have lots of data on lots of customers, but imagine if you could suddenly add in a huge dollop of new, highly informative data that you weren’t able to access before. You could then use analytics to extract some really important insights about these customers, allowing you to
In my last post, we examined the growing importance of event stream processing to predictive and prescriptive analytics. In the example we discussed, we looked at how all the event streams from point-of-sale systems from multiple retail locations are absorbed at a centralized point for analysis. Yet the beneficiaries of those
As we enter the era of “everything connected,” we cannot forget that gathering data is not enough. We need to process that data to gain new knowledge and build our competitive advantage. The Internet of Things is not just a consumer thing – it also makes our businesses more intelligent. Whenever
.@philsimon says that you shouldn't bring a knife to a gun fight.
(Otherwise known as Truncate – Load – Analyze – Repeat!) After you’ve prepared data for analysis and then analyzed it, how do you complete this process again? And again? And again? Most analytical applications are created to truncate the prior data, load new data for analysis, analyze it and repeat
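The truncate–load–analyze–repeat cycle described above can be sketched in a few lines. This is a minimal illustration using an in-memory SQLite table with a hypothetical `sales` schema (table and column names are my own, not from the post): each pass wipes the prior batch, loads the new extract, and recomputes the analysis from scratch.

```python
import sqlite3

def truncate_load_analyze(conn, new_rows):
    """Run one pass of the truncate-load-analyze cycle."""
    cur = conn.cursor()
    # Truncate: discard the prior batch entirely.
    cur.execute("DELETE FROM sales")
    # Load: insert the fresh extract.
    cur.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", new_rows)
    conn.commit()
    # Analyze: recompute the aggregate over only the current batch.
    cur.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Repeat: each new batch replaces the last, so results never accumulate history.
print(truncate_load_analyze(conn, [("east", 100.0), ("west", 50.0)]))
print(truncate_load_analyze(conn, [("east", 20.0), ("east", 30.0)]))
```

Note the trade-off the pattern implies: because each pass truncates before loading, the analysis only ever sees the latest batch, which is precisely why repeating the process for history or trend questions requires a different design.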