What's the Fifth Law of Data Quality? Jim Harris explains.
Search Results: data cleansing (127)
Both the fiscal crisis and the swine flu pandemic have caused a great deal of worry and panic. Worry is sometimes a good defense, but panic only leads to more panic and that only leads to more trouble. At the same time, both crises have brought to light the importance
SAS' Leonid Batkhan explains the data cleansing task of removing unwanted repeated characters in SAS character variables.
Whether working as a business analyst, data scientist or machine learning engineer, one thing remains the same – making an impact with data and AI is what really matters. Pre-processing and exploring data, building and deploying models and turning those scoring values into an actionable insight can be overwhelming. A
Outliers provide much-needed insights into the actual relationships that influence the demand for products in the marketplace. They are particularly useful when modeling consumer behavior where abnormalities are common occurrences or unforeseen disruptions that impact consumer demand. But why do demand planners cleanse out outliers, when many are not really
Corpus analysis is a technique widely used by data scientists because it provides an understanding of a document collection and provides insights into the text.
A team of SAS employees recently participated in a data-for-good project focusing on forest fires in the Amazon. In conjunction with the Amazon Conservation Association (ACA), the team explored options to collect and analyze publicly available imagery and fire data to better understand the drivers for forest fires as well
SAS' Leonid Batkhan shows you how to remove ANY leading characters (not just blanks) from text strings to tidy up your data.
SAS' Leonid Batkhan shows you how to delete a substring from a character string - one of the common character data manipulation tasks.
SAS' Leonid Batkhan demonstrates a common character data manipulation task of inserting a substring into a character string.
At SAS, we believe analytics is the force that drives change across organizations. Today, as change has been further accelerated, digital transformation is happening faster than anyone planned. Amid these advances, the use of analytics has become even more crucial, especially in mission-critical applications. In 2020, even
Interview with Professor Markus Döhring, chair of Data Science at the Darmstadt University of Applied Sciences.
Analytics is playing an increasingly strategic role in the ongoing digital transformation of organizations today. However, to succeed and scale your digital transformation efforts, it is critical to enable analytics skills at all tiers of your organization. In a recent blog post covering 4 principles of analytics you cannot ignore,
I'm a big fan of the Import Data task in SAS Enterprise Guide, especially for its support of text-based files (CSV, tab delimited, fixed width, and more). There's no faster method for generating SAS code that reads your data exactly the way you need it. I use the tool so
“The future is already here — it's just not very evenly distributed.” (William Gibson, author) The same can be said for climate change – global warming is here, in a big way, but its effects are still an arm's length away for many of us. How is climate change
IT managers see the potential for cost-cutting from transitioning application development to open source software (OSS). Today, companies can hire recent college graduates with skills in open source development and avail themselves of the free software. But is all that glitters really gold? User groups and more formal workgroups are
How do you explain flat-line forecasts to senior management? Or, do you just make manual overrides to adjust the forecast? When there is no detectable trend or seasonality associated with your demand history, or something has disrupted the trend and/or seasonality, simple time series methods (i.e. naïve and simple
Regular expressions are a powerful method for finding specific patterns in text. The syntax of regular expressions is intimidating, but once you've solved a few pattern-recognition problems with regex, you'll never go back to your old methods.
To succeed in any data-focused hackathon, you need a robust set of tools and skills – as well as a can-do attitude. Here's what you can expect from any hackathon: Messy data. It might come from a variety of sources, and won't necessarily be organized for analytics or reporting. That's
Clark Bradley explains how SAS can make Hadoop approachable and accessible.
From national security agencies, law enforcement organizations looking into terrorism and criminal activities, internal security, audit and compliance departments, to hospitals and public health organizations guarding against disease outbreaks, there are many common needs and constant challenges, e.g.: Detect an event of interest in the early stages. Investigate suspicious events
We often talk about full customer data visibility and the need for a “golden record” that provides a 360-degree view of the customer to enhance our customer-facing processes. The rationale is that by accumulating all the data about a customer (or, for that matter, any entity of interest) from multiple sources, you
In recent healthcare blogs I’ve looked at the need to drive more value from the UK’s National Health Service (NHS) and how this relies upon the ability to make decisions based on robust, data-driven insights. But what value will these decisions have if they're not founded on a mature data
This is the second of seven parts in the blog post series “A practical guide to tackle auto insurance fraud”. While Data Management and Data Quality are the basis for every analytical journey, and this becomes even more true for fraud detection analytics, the domain knowledge and business expertise played
Machine learning is taking a significant role in many big data initiatives today. Large retailers and consumer packaged goods (CPG) companies are using machine learning combined with predictive analytics to help them enhance consumer engagement and create more accurate demand forecasts as they expand into new sales channels like the
I am more than glad to invite you to join me in a series of posts related to a practical guide for tackling auto insurance fraud in the new era of data science and advanced analytics. Insurers are used to facing a constant threat, a powerful enemy that never rests.
The SAS Quality Knowledge Base (QKB) is a collection of files which store data and logic that define data cleansing operations such as parsing, standardization, and generating match codes to facilitate fuzzy matching. Various SAS software products reference the QKB when performing data quality operations on your data. One of
In our previous post, Econometric and statistical methods for spatial data analysis, we discussed the importance of spatial data. For most people, understanding that importance is relatively easy because spatial data are often found in our daily lives and we are all accustomed to analyzing them. We can all relate to
When your job involves making decisions that affect thousands of college students, making the right decisions can have a large impact on the future. Giving college administrators easy access to reliable analytics can help improve enrollment and graduation rates – and find answers to complex questions that cut across many
While holed up inside, like many others on the East Coast of the United States, suffering from record-breaking rainfall and watching the path of Hurricane Joaquin, I found a perfect metaphor for handling a problem in explaining analytics. Many executives bemoan the fact that it seems to take forever for