We all know that clean water is a necessary condition for survival. Without it, a person can stay alive for roughly three days. So what happens when the source is polluted? Unless the water is filtered with particular care, the consequences for the body will certainly be negative.
.@philsimon on the collision between the two.
I've spent much of my career managing the quality of data after it was moved from its sources to a central location, such as an enterprise data warehouse. Nowadays we not only have a lot more data, but a lot of it is in motion. One of the
In my previous post I discussed the practice of putting data quality processes as close to data sources as possible. Historically this meant data quality happened during data integration in preparation for loading quality data into an enterprise data warehouse (EDW) or a master data management (MDM) hub. Nowadays, however, there’s a lot of
We had just completed a four-week data quality assessment of an inside plant installation. It wasn't looking good. There were huge gaps in the data, particularly when we cross-referenced systems against one another. In theory, each system was meant to hold identical information about the plant equipment. But when we consolidated the
The term data quality is associated first and foremost with customer and address information. Beyond duplicate detection and the cleansing of address data, however, the quality of product master data is just as important, whether for improving automated processes or, for example, increasing the hit rate of search queries in an online shop.
In my last post we started to look at two different Internet of Things (IoT) paradigms. The first only involved streaming automatically generated data from machines (such as sensor data). The second combined human-generated and machine-generated data, such as social media updates that are automatically augmented with geo-tag data by
The concept of the internet of things (IoT) is used broadly to cover any organization of communication devices and methods, messages streaming from the device pool, data collected at a centralized point, and analysis used to exploit the combined data for business value. But this description hides the richness of
Users in risk or controlling departments generally have no deeper knowledge of how to query databases. Excel is the world in which they are at home and feel comfortable. Complex database queries, for instance when relationships between database tables need to be identified, are carried out by the IT department, which then provides the results
What distinguishes a successful company? The decisive indicator of success is revenue and the resulting profit for the current fiscal year. What is hard-earned on one side, however, is often carelessly lost on the other. According to analyst studies, many companies forfeit around eight percent of their profit
Throughout my long career of building and implementing data quality processes, I've consistently been told that data quality could not be implemented within data sources, because doing so would disrupt production systems. Therefore, source data was often copied to a central location – a staging area – where it was cleansed, transformed, deduplicated, restructured
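As a rough illustration of that staging-area pattern (a hypothetical sketch, not the tooling the post describes), the following Python/pandas snippet copies source records into a staging frame and then cleanses, standardizes and deduplicates them; the field names and rows are invented.

```python
# Hypothetical staging-area cleansing sketch; field names and rows are invented.
import pandas as pd

# Pretend these rows were copied from a production source into a staging area.
staging = pd.DataFrame({
    "customer_id": [101, 102, 102, 103],
    "name": [" Ada Lovelace", "Grace Hopper", "grace hopper", None],
    "email": ["ada@example.com", "grace@example.com", "grace@example.com", "alan@example.com"],
})

# Cleanse: trim whitespace, standardize case, flag missing values.
staging["name"] = staging["name"].str.strip().str.title().fillna("UNKNOWN")

# Deduplicate: treat rows with the same id and email as the same customer record.
cleansed = staging.drop_duplicates(subset=["customer_id", "email"], keep="first")

print(cleansed)
```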
In my first blog article I explained that many insurance companies have implemented a standard data model as the basis for their business analytics data warehouse (DWH) solutions. But why should a standard data model be more appropriate than an individual one designed specifically for a particular insurance company?
A soccer fairy tale: Imagine it's Soccer Saturday. You've got 10 kids and 10 loads of laundry – along with buried soccer jerseys – that you need to clean before the games begin. Oh, and you have two hours to do this. Fear not! You are a member of an advanced HOA
While it’s obvious that chickens hatch from eggs that were laid by other chickens, what’s less obvious is which came first – the chicken or the egg? This classic conundrum has long puzzled non-scientists and scientists alike. There are almost as many people on Team Chicken as there are on Team
.@philsimon on the specific risks to data quality posed by cloud computing.
Does it upset you when you log onto your healthcare insurance portal and find that your name is spelled wrong, your dependents are listed incorrectly or your address is not correct? Well, it's definitely not a warm fuzzy feeling for me! After working for many years in the healthcare, pharmaceutical and
I'm frequently asked: "What causes poor data quality?" There are, of course, many culprits: lack of a data culture, poor management attitude, insufficient training and an incorrect reward structure. But there is one cause common to all organizations: poor data architecture.
Many data quality issues are a result of the distance separating data from the real-world object or entity it attempts to describe. This is the case with master data, which describes parties, products, locations and assets. Customer (one of the roles within party) master data quality issues are rife with examples, especially
@philsimon on what we can learn about data quality from Jeff Bezos's behemoth.
At a recent TDWI conference, I was strolling the exhibition floor when I noticed an interesting phenomenon. A surprising percentage of the exhibiting vendors fell into one of two product categories. One group was selling cloud-based or hosted data warehousing and/or analytics services. The other group was selling data integration products. Of
When you spend long enough writing and working in any industry, you inevitably see trends emerge and reach varying levels of maturity. Data governance is one such trend, as you can see from the following Google Trends chart:
.@philsimon lists the gravest data-quality errors.
I've been doing some investigation into Apache Spark, and I'm particularly intrigued by the concept of the resilient distributed dataset, or RDD. According to the Apache Spark website, an RDD is “a fault-tolerant collection of elements that can be operated on in parallel.” Two aspects of the RDD are particularly
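For readers who have not tried Spark, here is a minimal PySpark sketch of what "a fault-tolerant collection of elements that can be operated on in parallel" looks like in practice. It assumes a local Spark installation with PySpark available; the sensor names and readings are invented for illustration.

```python
# A minimal RDD sketch (illustrative only); assumes a local Spark install with PySpark.
from pyspark import SparkContext

sc = SparkContext("local[*]", "rdd-sketch")

# parallelize() turns a local collection into an RDD partitioned across workers.
readings = sc.parallelize([("sensor-1", 21.5), ("sensor-2", 19.8), ("sensor-1", 22.1)])

# Transformations run in parallel and are recorded as a lineage; if a partition
# is lost, Spark recomputes it from that lineage, which is what makes the RDD
# "fault-tolerant".
warm = readings.filter(lambda kv: kv[1] > 20.0)
avg_by_sensor = (
    warm.mapValues(lambda v: (v, 1))
        .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
        .mapValues(lambda s: s[0] / s[1])
)

print(avg_by_sensor.collect())  # e.g. [('sensor-1', 21.8)]
sc.stop()
```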
Data quality has always been relative and variable, meaning data quality is relative to a particular business use and can vary by user. Data of sufficient quality for one business use may be insufficient for other business uses, and data considered good by one user may be considered bad by others.
I recently presented a webinar (via the IAIDQ) on the topic of 7 Habits of Effective Data Quality Leaders. To prepare, I looked back at the many interviews with leading data quality practitioners I had conducted over the years. One common trait stood out across all these interviews – they
As I explained in Part 1 of this series, creating a data strategy for an organization is not a straightforward task. Two of the most important issues you'll want to address in your data strategy are data quality and big data. Data quality: There can be no data that is
"I skate to where the puck is going to be, not where it has been." - Wayne Gretzky I love this quote from Wayne Gretzky. It sums up how most organizations approach data strategy. Data strategy typically starts with a strategic plan laid down by the board. The CEO will
When my band first started and was in need of a sound system, we bought a pair of cheap yet indestructible Peavey speakers, some Radio Shack microphones and a power mixer. The result? We sounded awful and often split our eardrums with high-pitched feedback and raw, untrained vocals. It took us years
In this two-part series, published as the calendar turns to a new year, I revisit the top data management topics of 2015 (Part 1) and then try to predict a few of the data management trends of 2016 (Part 2). Data management in 2016: The Internet of Things (IoT) made significant