All Posts
We’re all about numbers here at SAS. So when the Global Certification program hit its 75,000th credential, we had to make it a big deal. We tracked down the 75,000th credential holder, Susan Langan, a research analyst in Maryland, and what’s even more special than Langan holding the
Once in a while, people run into an issue with the data that doesn't really need to be fixed right away to ensure the success of a specific project. So, the data issues are put into production and forgotten. Everyone always says, “We will go back and correct this later.” But that
The SAS DATA step supports multidimensional arrays. However, matrices in SAS/IML are like mathematical matrices: they are always two dimensional. In simulation studies you might need to generate and store thousands of matrices for a later statistical analysis of their properties. How can you accomplish that unless you can create
Suppose someone needs a kidney transplant and a family member is willing to donate one. If the donor and recipient are incompatible (because of blood types, tissue mismatch, and so on), the transplant cannot happen. Now suppose two donor-recipient pairs A and B are in this situation, but donor A
A 2012 report estimated that over the next five years the US Internal Revenue Service (IRS) will issue more than $20 billion in potentially fraudulent tax refunds. Figures like this do little to boost taxpayers’ confidence in our nation’s tax system. And tax fraud is not
I'm ramping up my visualization skills in preparation for the next big election, and I invite you to do the same! Let's start by plotting some county-level election data on a map... To get you into the spirit of elections, here's a picture of my friend Sara's dad, when he was
Because finding analytical talent continues to be a challenge for most, here I offer tips 5, 6, and 7 of my ten tips for finding data scientists, based on best practices at SAS and illustrated with some of our own “unicorns.” You can read my first blog post for why they
Regulatory compliance is a principal driver for data quality and data governance initiatives in many organisations right now, particularly in the banking sector. It is interesting to observe how many financial institutions immediately demand longer timeframes to help get their 'house in order' in preparation for each directive. To the
Part 1 of this topic presented a simple Sudoku solver. By treating Sudoku as an exact cover problem, the algorithm efficiently found solutions to simple Sudoku problems using basic logic. Unfortunately, the simple solver fails when presented with more difficult Sudoku problems. The puzzle on the right was obtained from
In many ways it’s open season for open data; open data is one of those phrases we hear a lot but it’s not always appreciated as having value. The fact that it’s openly available is seen by some as proof that there’s no value in the data – unlike, for
Have you heard the old saying, "your eyes are bigger than your stomach"? It is another way of saying that your appetite can lead you to fill your plate with more food than you can actually eat. Right now, that is what has happened with the combination of
It’s February, so love is in the air (or at least hearts, chocolate, and roses are lining the aisles at the grocery store) in the weeks before Valentine’s Day. For the singles in the house, don’t stop here! The stats are in, and according to http://www.pursuit-of-happiness.org/, people who have
In this blog series, I am exploring whether it’s wise to crowdsource data improvement, and whether the power of the crowd can enable organizations to incorporate better enterprise data quality practices. In Part 1, I provided a high-level definition of crowdsourcing and explained that while it can be applied to a wide range of projects
In SAS, the order of variables in a data set is usually unimportant. However, occasionally SAS programmers need to reorder the variables in order to make a special graph or to simplify a computation. Reordering variables in the DATA step is slightly tricky. There are Knowledge Base articles about how
Staying competitive in a big data world means working fast and making decisions even faster. You need to assess conditions, approve access, stop transactions and reroute activities quickly so you can seize opportunities or prevent problems. With increasing data volumes from the Internet of Things (Cisco predicts that fifty billion