SAS programmers sometimes ask, "How do I create a design matrix in SAS?" A design matrix is a numerical matrix that represents the explanatory variables in regression models. In simple models, the design matrix contains one column for each continuous variable and multiple columns (called dummy variables) for each classification
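As a rough illustration of the idea in this excerpt, here is a minimal Python sketch (not SAS, and not the method from the post) that builds a small design matrix by hand. The function name `design_matrix` and the sample data are hypothetical. The sketch assumes one continuous variable and one classification variable, keeps one indicator column per level, and adds no intercept; real software may use a different parameterization.

```python
def design_matrix(continuous, categories):
    """Build rows of [continuous value, one 0/1 dummy column per level]."""
    # Sort the distinct levels so the column order is reproducible.
    levels = sorted(set(categories))
    matrix = []
    for x, c in zip(continuous, categories):
        # Exactly one dummy column is 1 for each observation.
        dummies = [1 if c == level else 0 for level in levels]
        matrix.append([x] + dummies)
    return levels, matrix


# Hypothetical data: three observations, one classification variable with levels F and M.
levels, X = design_matrix([1.5, 2.0, 3.2], ["F", "M", "F"])
for row in X:
    print(row)
```

Each row holds the continuous value followed by the indicator columns, so the matrix has one column for the continuous variable plus one column per level of the classification variable.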
I had the opportunity to interview an award-winning, fast-moving consumer packaged goods (CPG) company in the early 2000s. They were recognized as one of the best supply chain companies in the United States by all of the major retailers and their CPG peers. Indeed, it seemed every time a new
I recently presented a webinar (via the IAIDQ) on the topic of 7 Habits of Effective Data Quality Leaders. To prepare, I looked back at the many interviews of leading data quality practitioners I had undertaken over the years. A common trait among all these interviews stood out – they
If you're a worrier, you know there's a chance you could get bitten by a shark, or hit by a piece of falling satellite debris - these events are both possible, but not probable. Getting injured by a lawn mower, on the other hand, is something that could easily happen. With
There's a lot of chatter about analytics in the information security space. That’s actually a massive understatement. Analytics is a common buzzword, and everyone's talking about it, but how do you cut through all the noise? Who is doing what when it comes to analytics? It can be difficult to tell. As part of
A dummy variable (also known as an indicator variable) is a numeric variable that indicates the presence or absence of some level of a categorical variable. The word "dummy" does not imply that these variables are not smart. Rather, dummy variables serve as a substitute or a proxy for a categorical
Asylum seekers not causing headaches for the UK’s Home Office may be hard to imagine. There are regular scare stories about asylum seekers and budget cuts, amid rising concerns over immigration and the strain it's putting on the national infrastructure. But it needn’t mean headaches for policy makers and officials.
Last week I described how to use PROC IOMOPERATE to list the active SAS sessions that have been spawned in your SAS environment. I promised that I would share a custom task that simplifies the technique. Today I'm sharing that task with you. How to get the SAS Spawned Processes
Don’t get me wrong. I have no doubt in the capabilities of our SAS products and SAS solutions! But I wanted to get a firsthand experience of our new solution for text analytics, SAS Contextual Analysis 14.1. And the result is very convincing! But let’s start from the beginning. Functions
With many names, it's difficult to know whether the person is male or female. Let's use the power of analytics to determine which names are the most unisex, based on the number of boys and girls with those names. But, before we get started, here's a picture of my friend
Medicare payment changes are coming. The Centers for Medicare and Medicaid Services (CMS) has announced the intention of increasing the proportion of payments to providers based on outcomes and changes in health status, as opposed to delivery of services. At the January 11th, 2016 J.P. Morgan Annual Health Care Conference,
In a previous blog I suggested that many readers in many applied areas are reading statistics texts under duress for a course or project, and are in truth somewhere between disinterested and terrified. In my new SAS Press book Business Statistics Made Easy in SAS® I make use of various
As I explained in Part 1 of this series, creating a strategy for the data in an organization is not a straightforward task. Two of the most important issues you'll want to address in your data strategy are data quality and big data. Data quality There can be no data that is
Our cardiovascular systems are "complex arrangements of hydraulic, yet living, components" (Swain, 2000). Just as water must constantly flow at varied rates through a water system in a city, blood must also circulate rhythmically throughout our bodies to keep us vibrant and healthy. The heart drives the entire system. We get more efficient at
When the Toyota Prius first came out, the gas mileage claims astonished everyone. But now that almost every manufacturer offers their own hybrid, is the Prius mpg really all that great? Let's analyze the data ... But before we get to the analytics, let me tell you a little about
What are the challenges facing the financial sector in Latin America? How can fraud in banks and insurance companies be detected and prevented once and for all? Can we reduce the risk of losses related to this fraudulent conduct? How can the damage and the impact on organizational reputation be minimized? The fight
Last week Sanjay Matange wrote about a new SAS 9.4m3 option that enables you to show all categories in a graph legend, even when the data do not contain all the categories. Sanjay's example was a chart that showed medical conditions classified according to the scale "Mild," "Moderate," and "Severe."
While writing this article, I recalled a note describing the current environment in banking, government, insurance, and health care organizations, quoted from Professor David J Hand in his book titled "Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques": "Fraud will always be with us". This line
What do you get when you add together: Two basketballs; six people wearing black or white t-shirts; and a chest-beating gorilla? Oddly enough, a great analogy for the challenges information security professionals constantly face (more on that in a minute ...). We'll be talking about security challenges of all kinds
If you're a SAS administrator, you probably know that you can use SAS Management Console to view active SAS processes. These are the SAS sessions that have been spawned by clients such as SAS Enterprise Guide or SAS Add-In for Microsoft Office, or those running SAS stored processes. But did
Did you ever wonder how much fuel all the jets in the world use? Perhaps these graphs will help you get a handle on it... Before we get started, here's a jet-fuel related picture to entertain you. This is a picture I took from the airplane when I went to
In addition to the 40th anniversary of the founding of the software company SAS, 2016 also brings the 20th anniversary of the KSFE (Kooperation der SAS Anwender in Forschung und Entwicklung) to celebrate, and it will be celebrated at the oldest university in Sweden.
Many simulation and resampling tasks use one of four sampling methods. When you draw a random sample from a population, you can sample with or without replacement. At the same time, all individuals in the population might have equal probability of being selected, or some individuals might be more likely
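The four combinations mentioned in this excerpt (with or without replacement, equal or unequal probability) can be sketched in plain Python using the standard `random` module. This is an illustrative sketch, not the SAS approach from the post. The function names and the `weights` values are hypothetical, and the weighted draw without replacement uses a simple sequential scheme (pick one item in proportion to the remaining weights, remove it, repeat), which is only one of several possible definitions.

```python
import random


def sample_with_replacement(pop, k, weights=None, rng=random):
    """Draw k items with replacement; weights=None means equal probability."""
    return rng.choices(pop, weights=weights, k=k)


def sample_without_replacement(pop, k, weights=None, rng=random):
    """Draw k distinct items; weights=None means equal probability."""
    if k > len(pop):
        raise ValueError("k cannot exceed the population size")
    if weights is None:
        return rng.sample(pop, k)
    # Assumption: sequential weighted draws, removing each selected item.
    items, w, picked = list(pop), list(weights), []
    for _ in range(k):
        i = rng.choices(range(len(items)), weights=w, k=1)[0]
        picked.append(items.pop(i))
        w.pop(i)
    return picked


# Hypothetical population and weights, with a seeded generator for repeatability.
rng = random.Random(0)
population = ["a", "b", "c", "d"]
weights = [0.1, 0.2, 0.3, 0.4]
print(sample_with_replacement(population, 3, rng=rng))
print(sample_with_replacement(population, 3, weights, rng=rng))
print(sample_without_replacement(population, 3, rng=rng))
print(sample_without_replacement(population, 3, weights, rng=rng))
```

Passing a seeded `random.Random` instance keeps the draws reproducible, which is usually what you want when checking a simulation.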
I was standing in the greeting card and seasonal gift aisle at my local grocery store recently surrounded by Valentine’s Day goodies. All the other shoppers standing in the aisle with me looked as perplexed as I felt. Should I throw in the towel, succumb to pressure, and get my
Back before storage became so affordable, cost was the primary factor in determining what data an IT department would store. As George Dyson (author and historian of technology) says, “Big data is what happened when the cost of storing information became less than the cost of making the decision to
North Carolina has over 300 miles of wide, flat Atlantic beaches as well as the highest mountain in the eastern United States, Mount Mitchell. The variety is impressive for a state that isn't even in the top half of the 50 states by size. One key reason is geometric: North Carolina
A recent survey by Capgemini found that 78% of insurance executives interviewed cited big data analytics as the disruptive force that will have the biggest impact on the insurance industry. That’s the good news. The bad news is that unfortunately traditional data management strategies do not scale to effectively govern
"The Role of Model Interpretability in Data Science" is a recent post on Medium.com by Carl Anderson, Director of Data Science at the fashion eyewear company Warby Parker. Anderson argues that data scientists should be willing to make small sacrifices in model quality in order to deliver a model that