Blend, cleanse and prepare data for analytics, reporting or data modernization efforts
.@philsimon lists the gravest data-quality errors.
I've been doing some investigation into Apache Spark, and I'm particularly intrigued by the concept of the resilient distributed dataset, or RDD. According to the Apache Spark website, an RDD is “a fault-tolerant collection of elements that can be operated on in parallel.” Two aspects of the RDD are particularly
How can companies properly manage their large volume of information? Is your organization's data ready to become the key to achieving its business objectives? How do you make the best decisions from analytics? There is no doubt that in an increasingly connected world, Big
Data quality has always been relative and variable, meaning data quality is relative to a particular business use and can vary by user. Data of sufficient quality for one business use may be insufficient for other business uses, and data considered good by one user may be considered bad by others.
I recently presented a webinar (via the IAIDQ) on the topic of 7 Habits of Effective Data Quality Leaders. To prepare, I looked back at the many interviews of leading data quality practitioners I had undertaken over the years. A common trait among all these interviews stood out – they
Medicare payment changes are coming. The Centers for Medicare and Medicaid Services (CMS) has announced the intention of increasing the proportion of payments to providers based on outcomes and changes in health status, as opposed to delivery of services. At the January 11th, 2016 J.P. Morgan Annual Health Care Conference,
We are currently talking with many banks about the management of their data. Historically, the provision of information seemed adequately solved: IT departments supplied various data marts and analysis tools. Period. Bank data management: more appearance than substance?
As I explained in Part 1 of this series, creating a strategy for the data in an organization is not a straightforward task. Two of the most important issues you'll want to address in your data strategy are data quality and big data. Data quality: There can be no data that is
REpresentational State Transfer (REST) is an architectural style for designing web services that access a system's resources using HTTP methods. With the release of DataFlux Data Management 2.7 earlier this year, three new REST Application Programming Interfaces (APIs) are now available: Data Management Server Batch Jobs, Data Management Server Real-time
Back before storage became so affordable, cost was the primary factor in determining what data an IT department would store. As George Dyson (author and historian of technology) says, “Big data is what happened when the cost of storing information became less than the cost of making the decision to
A recent survey by Capgemini found that 78% of insurance executives interviewed cited big data analytics as the disruptive force that will have the biggest impact on the insurance industry. That’s the good news. The bad news is that traditional data management strategies do not scale to effectively govern
This is my first blog post, and the first of a long series on data governance. The first thing I want to discuss is the ability to share DataFlux Data Quality profiling metrics in SAS Visual Analytics. This post will illustrate how to extract profiling data from the DataFlux repository
Creating a strategy for the data in an organization is not a straightforward task. Not only does our business change – our software solutions also change before we can ever get done with a data strategy. So, I choose to understand that a strategy has a vision, and my vision may change
Last year at Mobile World Congress (MWC), we saw the reality of the IoT come to fruition. We saw leaders like AT&T showing their prototypes of connected cars, containers, agricultural sites, homes, etc. And it's got me wondering what we'll see at the 2016 MWC. In 2015, communications and media
In my previous post, I discussed the characteristics of a strong data strategy, the first of which was that a formal, well-defined strategy exists within your organization. This post discusses how often (and why) your organization’s data strategy needs to be updated. While strategy encompasses and sets the overall direction for
In my two prior posts, I discussed the process of developing a business justification for a data strategy and for assessing an organization's level of maturity with key data management processes and operational procedures. The business justification phase can be used to speculate about the future state of data management required
Like most boys my age at that time, I wanted to be an astronaut. Fate, however, intervened, in the form of nearsightedness, so I had to find an alternative occupation. Coming to my rescue for the launch of Apollo 11 was my father, who presented me with a huge booklet that broke
Data virtualization simplifies increasingly complex data architectures. Every few months, another vendor claims one environment will replace all others. We know better. What usually happens is an elongated state of coexistence between traditional technology and the newer, sometimes disruptive one. Eventually, one technology sinks into obsolescence, but it usually takes much longer than we expect. Think of
Love includes a range of strong and positive emotional and mental states, from the highest virtue to the simplest pleasure. An example of such a wide range of meanings is the fact that the love of a mother is different from the love of a spouse, which, in turn, is
In my last post, we touched on the importance of data migration in an overall data strategy. I wanted to do this because so many organizations see the migration of data as a technical challenge that can be outsourced and largely ignored by their internal teams. I contend
In my last post, I discussed some practical steps you can take to collect the right information for justifying why your business should design and implement a data strategy. Having identified weaknesses in your environment that could impede business success, your next step is to drill down deeper to determine where there may be
With data now impacting nearly every business activity, there should no longer be any doubt that data needs to be managed as a strategic corporate asset. This post examines the top five characteristics of a strong data strategy. Existence As I previously blogged, in today’s fast-moving business world now often takes priority
While setting up meetings with business consumers developing a data warehouse environment, I was involved in some very interesting conversations. Following are some of the assumptions that were made during these conversations, as well as a few observations. To get a well-rounded view of this topic, read my earlier post that focuses on the IT perspective.
.@philsimon on the convergence between tools such as Hadoop and strategy.
People often seek out our company for guidance related to master data management, data governance and data quality. But I see a frequent pattern, where the customer presumes that they need a particular data management solution – even if there is no specific data management problem. This approach is often triggered in reaction
The other day I was in a meeting with a client and there was an argument about who owns the data. Those arguing were IT people. In this scenario, the assumption was that data from source systems would flow into and integrate with a data warehouse. I found the discussion very interesting. Here are some of the
.@philsimon provides insights on whether a data strategy can result in competitive advantage.
In Part 1 of this series, Cheryl Doninger described how SAS Grid Manager can extend your investment in the Hadoop infrastructure. In this post, we’ll take a look at how Cloudera Manager helps Hadoop administrators meet competing service level agreements (SLAs). Cloudera Manager lets Hadoop admins set up queues to
"I skate to where the puck is going to be, not where it has been." - Wayne Gretzky I love this quote from Wayne Gretzky. It sums up how most organizations approach data strategy. Data strategy typically starts with a strategic plan laid down by the board. The CEO will