One of the significant problems data quality leaders face is changing people's perception of data quality. For example, one common misconception is that data quality represents just another data processing activity. If you have a data warehouse, you will almost certainly have some form of data processing in the form
The Internet of Things is going to be driven by innovative business models as much as by innovative technology. In order to ground the following discussion, I found it helpful to create this visual depiction of the IoT that defines and distinguishes the key elements that enter into these business models.
Have you ever thought about retiring in another country, where your money might go further? Well, here's some quantitative data to help you make an informed decision! ... First, to get you in the mood, here's a picture of my friend Erik checking out the prices at a pedal-powered food
Recently, I had the opportunity to talk with James Haight of Blue Hill Research about the Internet of Things and how it is affecting, and will continue to affect, manufacturers. We also dipped our toes into other topics, including text analytics and the interesting combination of all these trends. The conversation was captured
By now, we have all heard about the Internet of Things (IoT), or the Industrial Internet. Across industries, organizations are attempting to instrument and measure all critical business systems and assets in an effort to drive improved production and service delivery. It is estimated that by 2020, companies will spend
You’ve heard about the smart grid, but what is it that makes the grid smart? I’ve been working on a project with Duke Energy and NC State University doing time-series analysis on data from Phasor Measurement Units (PMUs) that illustrates the intelligence in the grid as well as an interesting
In my last post, I pointed out that we data quality practitioners want to apply data quality assertions to data instances to validate data in process, but the dynamic nature of data must be contrasted with our assumptions about how quality measures are applied to static records. In practice, the
So which is it: big data or smart data? The answer: both, because each depends on the other. The art lies in extracting the essential part from a huge volume of data. The energy transition (Energiewende) rests, on the one hand, on the increased use of renewable energy; on the other hand, the energy produced is to be used more effectively. This
Back in the day when the prison system forced inmates to perform "hard labor", folks would say (of someone in prison): "He's busy making little ones out of big ones." This evokes the cliché image of inmates who are chained together, forced to swing a chisel to break large rocks
In the SAS DATA step, all variables are scalar quantities. Consequently, an IF-THEN/ELSE statement that evaluates a logical expression is unambiguous. For example, the following DATA step statements print "c=5 is TRUE" to the log if the variable c is equal to 5: if c=5 then put "c=5 is TRUE";
Why do people steal ATMs? Because that's where the money is!!! While the old "smash-n-grab" remains a favorite modus operandi of would-be ATM thieves, the biggest brains on the planet typically aren't engaged in such endeavors (see Thieves Steal Empty ATM, Chain Breaks Dragging Stolen ATM, An A for Effort). And of
Since the launch of Communities on SAS, hundreds of SAS employees have been among you. Some SAS employees made themselves known by selecting a telling user name (such as Cynthia@SAS), but others remained camouflaged or incognito, keeping their secret identities like the SAS superheroes they are. That's about to change.
Utilizing big data analytics is currently one of the most promising strategies for businesses to gain competitive advantage and ensure future growth. But as we saw with “small data analytics,” the success of “big data analytics” relies heavily on the quality of its source data. In fact, when combining “small” and “big” data
Have you ever wondered whether the area where you live is a good location for producing solar power? Let's create a SAS map to help find out! To get you in the right frame of mind, here is an awesome picture of some Arizona sunshine that my good friend Eva
At the beginning of my book Statistical Programming with SAS/IML Software I give the following programming tip (p. 25): Do not confuse an empty matrix with a matrix that contains missing values or with a zero matrix. An empty matrix has no rows and no columns. A matrix that contains
In my previous post, I talked about how the Internet of Things promises new ways to use sensor and machine data by creating a highly efficient world that demands constant analysis and evaluation of the state of events across everything that surrounds us. I have also explained why it is
According to analyst firms, consulting companies and various other research sources, customer experience is the top priority for insurance companies. But is customer experience overrated? Let’s start by considering the primary interactions between an insurance company and its customers: new business, billing, renewals and claims. Ask any insurance executive, especially property
@philsimon on the need to recognize DQ differences.
Data simulation is a fundamental technique in statistical programming and research. My book Simulating Data with SAS is an accessible how-to book that describes the most useful algorithms and the best programming techniques for efficient data simulation in SAS. Here are five lessons you can learn by reading it: Learn strategies
It’s common at the start of a new year to create a long list of resolutions that we hope to achieve. The reality, of course, is by February those resolutions will likely be a distant memory. The key to making any resolution stick is to start small. Create one small
We now live in the era of ‘big data’, where data and its analysis have become crucial to the modern economy. In fact, "big data is the new 'corporate gold'," according to Mark Wilkinson, managing director of SAS UK & Ireland. A recent study by Cebr found that companies in
A common task in SAS/IML programming is finding elements of a SAS/IML matrix that satisfy a logical expression. For example, you might need to know which matrix elements are missing, are negative, or are divisible by 2. In the DATA step, you can use the WHERE clause to subset data.
It’s that time of year again when we look back and consider how accurately and extensively the SAS story was covered in the media over the past year. It’s not always a simple or predictable story, but it is usually interesting. Consider some of these threads: How did companies or
To get into the mood for this blog post, you should first listen to the music video of The Who singing My Generation... I guess everybody has 'their generation' and here in the U.S. the most famous generation has been the Baby Boomers. Many companies have tried to design products they
While perusing the SAS 9.4 DS2 documentation, I ran across the section on the HTTP package. This intrigued me because, as DS2 has no text file handling statements, I had assumed all hope of leveraging Internet-based APIs was lost. But even a Jedi is wrong now and then! And what better
In December the Institute of Business Forecasting published the first of a new blog series on Forecast Value Added. Each month I will be interviewing an industry forecasting practitioner (or consultant/vendor) about their use of FVA analysis. The December interview featured Jonathon Karelse, co-founder of NorthFind Partners. Among his key
Outside, the Cary, NC sky is gray and winds are blowing freezing rain, but a group of statisticians at SAS are channeling warm green hills and the soft, gold light of a California evening. Team conversations alternate between distributed processing, PROC IMSTAT and how many pairs of shorts to pack.
After working in the data quality industry for a number of years, I have realized that most practitioners tend to have a rather rigid perception of the assertions about the quality of data. Either a data set conforms to the set of data quality criteria and is deemed to be acceptable
As a blogger, I often wonder whether my blog posts are 'successful' - and being a graph guy, I like to visually analyze the data, to try to answer that question. The most common measure of a blog post is probably the number of times it was viewed, so I guess
This isn't Kansas anymore. Oz has become a sprawling, smart metropolis filled with sensor data. How do we make sense of, clean, govern and glean value from this big data so we can get Dorothy home? The answer is SAS Data Management. With the latest portfolio updates, customers will be