Data integration teams often find themselves in the middle of discussions where the quality of their data outputs is called into question. Without proper governance procedures in place, though, it's hard to address these accusations in a reasonable way. Here's why.
Data governance has been the topic of many of the recent posts here on the Data Roundtable. And rightfully so, since data governance plays such an integral role in the success of many enterprise information initiatives – such as data quality, master data management and analytics. These posts can help you prepare for discussing
Welcome to the first practical step for tackling auto insurance fraud with analytics. It is obvious why our first stop relates to data: the idiom “the devil is in the details” can easily be adapted for the insurance fraud sector as “the devil is in the data”. This article analyses
.@philsimon on the need to adopt agile methodologies for data prep and analytics.
Lately I've been binge-watching a lot of police procedural television shows. The standard format for almost every episode is the same. It starts with the commission or discovery of a crime, followed by forensic investigation of the crime scene, analysis of the collected evidence, and interviews or interrogations with potential suspects. It ends
Critical business applications depend on the enterprise creating and maintaining high-quality data. So, whenever new data is received – especially from a new source – it’s great when that source can provide data without defects or other data quality issues. The recent rise in self-service data preparation options has definitely improved the quality of
Hadoop has driven an enormous amount of data analytics activity lately. And this poses a problem for many practitioners coming from the traditional relational database management system (RDBMS) world. Hadoop is well known for having lots of variety in the structure of data it stores and processes. But it's fair to
In my last post, I talked about how data still needs to be cleaned up – and data strategy still needs to be re-evaluated – as we start to work with nontraditional databases and other new technologies. There are lots of ways to use these new platforms (like Hadoop). For example, many
If your enterprise is working with Hadoop, MongoDB or other nontraditional databases, then you need to evaluate your data strategy. A data strategy must adapt to current data trends based on business requirements. So am I still the clean-up woman? The answer is YES! I still work on the quality of the data.
The demand for data preparation solutions is at an all-time high, and it's primarily driven by the demand for self-service analytics. Ten years ago, if you were a business leader who wanted more in-depth information on a particular KPI, you would typically issue a reporting request to IT
In DataFlux Data Management Studio, the predominant component of the SAS Data Quality bundle, the data quality nodes in a data job use definitions from something called the SAS Quality Knowledge Base (QKB). The QKB supports over 25 languages and provides a set of pre-built rules, definitions and reference data
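To make the idea of reusable definitions concrete, here is a toy Python sketch of a "knowledge base" of standardization rules that data processing steps can look up by name. This is an illustrative analogue only – the rule names and mappings are hypothetical, and this is not the DataFlux or QKB API.

```python
# Toy "knowledge base" of named standardization definitions, loosely
# mimicking how data quality nodes reference pre-built rules. All rule
# names and mappings here are hypothetical examples.

STANDARDIZATION_RULES = {
    "Organization": {"inc.": "Incorporated", "corp.": "Corporation", "co.": "Company"},
    "State (US)": {"calif.": "CA", "n.y.": "NY", "tex.": "TX"},
}

def standardize(value: str, definition: str) -> str:
    """Apply a named standardization definition to a raw value."""
    rules = STANDARDIZATION_RULES[definition]
    tokens = [rules.get(t.lower(), t) for t in value.split()]
    return " ".join(tokens)

print(standardize("Acme inc.", "Organization"))  # Acme Incorporated
print(standardize("Calif.", "State (US)"))       # CA
```

The point of centralizing definitions this way is that every job applying "Organization" standardization behaves identically, rather than each job reinventing its own cleanup logic.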
At this stage, our organization has defined the business objectives for its Data Governance programme and manages and shares the business term definitions it uses. This logical area of data and information management has been supplemented with a bridge to technical metadata – in the previous step we established a single place that combines the
Auditability and data quality are two of the most important demands on a data warehouse. Why? Because reliable data processes ensure the accuracy of your analytical applications and statistical reports. Using a standard data model enhances auditability and data quality of your data warehouse implementation for business analytics.
At some point, your business or IT leaders will decide – enough is enough; we can't live with the performance, functionality or cost of the current application landscape. Perhaps your financial services employer wants to offer mobile services, but building modern apps via the old mainframe architecture is impractical and a replacement
The “right to be forgotten” has nothing to do with Alzheimer's disease, and it doesn't only affect people of advanced age. The “right to be forgotten” and “portability” refer to rights under the new European data protection regulation, which, after four years of work, replaces the now 20-year-old regulation, and which
The “big” part of big data is about enabling insights that were previously indiscernible. It's about uncovering small differences that make a big difference in domains as widespread as health care, public health, marketing and business process optimization, law enforcement and cybersecurity – and even the detection of new subatomic particles.
.@philsimon on whether organizations need MDM to gather valuable insights about their customers.
Master data management (MDM) is distinct from other data management disciplines due to its primary focus on giving the enterprise a single view of the master data that represents key business entities, such as parties, products, locations and assets. MDM achieves this by standardizing, matching and consolidating common data elements across traditional and big
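As a rough illustration of that standardize-match-consolidate pattern, here is a minimal Python sketch. The field names, the exact-match key on email, and the "most recently updated wins" survivorship rule are all assumptions for the example; real MDM hubs use far richer matching (phonetic keys, probabilistic scoring) than this.

```python
# Minimal sketch of standardize -> match -> consolidate for a single
# entity type (party). Field names and survivorship rule are hypothetical.

from collections import defaultdict

records = [
    {"source": "CRM",     "name": "Jon Smith ", "email": "jon@example.com", "updated": 2},
    {"source": "Billing", "name": "JON SMITH",  "email": "jon@example.com", "updated": 5},
]

def standardize(rec):
    rec = dict(rec)
    rec["name"] = " ".join(rec["name"].split()).title()
    rec["email"] = rec["email"].strip().lower()
    return rec

# Match: cluster records that share a standardized email address.
clusters = defaultdict(list)
for rec in map(standardize, records):
    clusters[rec["email"]].append(rec)

# Consolidate: a simple survivorship rule keeps the most recently updated record.
golden = [max(group, key=lambda r: r["updated"]) for group in clusters.values()]
print(golden)  # one "golden record" per matched cluster
```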
Single view of customer. It's a noble goal, not unlike the search for the Holy Grail – fraught with peril as you progress down the path of your data journey. If you're a hotelier, it can improve your customer's experience by providing the information from the casinos and the spa at check-in to better meet your
At this stage, our organization has business objectives defined for its Data Governance programme and manages and shares the business term definitions it uses. This logical area of data and information management has been supplemented with a bridge to technical metadata – in the previous step we obtained a single place combining information about the technical flow of data in the organization
The bigness of your data is probably not its most important characteristic. In fact, it may not even rank among the relevant aspects you should be concerned about. Quality, integrating silos, and handling and extracting value from unstructured data remain
We all know that having clean water is a necessary condition for survival. Without it, you can stay alive for about three days. So what happens when the source is polluted? Unless you filter the water with particular care, the consequences for the body will certainly be negative.
In my previous post I discussed the practice of putting data quality processes as close to data sources as possible. Historically this meant data quality happened during data integration in preparation for loading quality data into an enterprise data warehouse (EDW) or a master data management (MDM) hub. Nowadays, however, there’s a lot of
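To illustrate what "close to the source" can look like in practice, here is a minimal Python sketch of validating records at the point of ingestion, before anything reaches an EDW or MDM hub. The sample rows, column names and rules are hypothetical stand-ins for an incoming feed.

```python
# Minimal sketch: run data quality checks where data arrives, quarantining
# bad rows with a reason instead of loading them downstream. All field
# names and rules here are assumptions for the example.

import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

rows = [  # stand-in for records read from an incoming feed
    {"customer_id": "C-1001", "email": "ana@example.com"},
    {"customer_id": "",       "email": "bob@example.com"},
    {"customer_id": "C-1003", "email": "not-an-email"},
]

def validate(row):
    """Return a rejection reason, or None if the row passes."""
    if not row["customer_id"]:
        return "missing customer_id"
    if not EMAIL_RE.match(row["email"]):
        return "malformed email"
    return None

clean, quarantine = [], []
for row in rows:
    reason = validate(row)
    if reason:
        quarantine.append((row, reason))  # routed back to the source owner
    else:
        clean.append(row)                 # continues downstream

print(len(clean), "clean,", len(quarantine), "quarantined")
```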
We had just completed a four-week data quality assessment of an inside plant installation. It wasn't looking good. There were huge gaps in the data, particularly when we cross-referenced systems against each other. In theory, each system was meant to hold identical information about the plant equipment. But when we consolidated the
First and foremost, the term data quality is associated with customer and address information. Alongside duplicate detection and the cleansing of address data, however, the quality of product master data is also extremely important for improving automated process flows or, for example, increasing the hit rate of search queries in an online shop.
In my last post we started to look at two different Internet of Things (IoT) paradigms. The first only involved streaming automatically generated data from machines (such as sensor data). The second combined human-generated and machine-generated data, such as social media updates that are automatically augmented with geo-tag data by
The concept of the internet of things (IoT) is used broadly to cover any organization of communication devices and methods, messages streaming from the device pool, data collected at a centralized point, and analysis used to exploit the combined data for business value. But this description hides the richness of
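Before getting into that richness, it helps to see the basic pipeline shape end to end. Here is a minimal Python sketch of the flow described above – device messages streaming in, collection at a central point, and a simple analysis step; the payload fields are hypothetical and not tied to any specific IoT platform.

```python
# Minimal sketch of the IoT pipeline shape: device messages -> central
# collection -> analysis. Device IDs and payload fields are made up.

import json
import statistics

stream = [  # stand-in for messages streaming from the device pool
    json.dumps({"device_id": "pump-07", "temp_c": 61.2}),
    json.dumps({"device_id": "pump-07", "temp_c": 63.8}),
    json.dumps({"device_id": "pump-12", "temp_c": 44.1}),
]

# Central collection point: parse messages and group readings per device.
readings = {}
for msg in stream:
    event = json.loads(msg)
    readings.setdefault(event["device_id"], []).append(event["temp_c"])

# Analysis: a per-device aggregate that downstream logic could act on.
for device, temps in readings.items():
    print(device, round(statistics.mean(temps), 1))
```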