At the KDD conference this week I heard a great invited presentation called How to Create a $1 billion Model in 20 days: Predictive Modeling in the Real World – A Sprint Case Study. It was presented by Tracey de Poalo from Sprint and former Kaggle President and well known
The ODS ExcelXP tagset has served us well over the years. It provides a reliable method to get formatted SAS output into Microsoft Excel workbooks, where the business world seems to like to live. And it's available in Base SAS, which means that you don't need SAS/ACCESS to PC Files
Have you ever thought about what it would be like if conflicts could be predicted? There are many examples of how big data can be used for the good of the world. In this post, Jim Davis, Senior Vice President and CMO of SAS, highlights five recent cases in which the use of Big
A while back The Wall Street Journal published the article “Corporate Economists Are Hot Again” that chronicles the resurgence of in-house economists in corporate America. The role of a corporate economist may bring about visuals of classic economist stereotypes (watch Ben Stein play to this stereotype as a teacher in
In his pithy style, Seth Godin’s recent blog post Analytics without action said more in 32 words than most posts say in 320 words or most white papers say in 3200 words. (For those counting along, my opening sentence alone used 32 words.) Godin’s blog post, in its entirety, stated: “Don’t measure
Last Monday I discussed how to choose the bin width and location for a histogram in SAS. The height of each histogram bar shows the number of observations in each bin. Although my recent article didn't mention it, you can also use the IML procedure to count the number of
“The best data scientists are those that combine deep statistical / data / machine learning skills with domain knowledge.” “[Most companies] haven't properly addressed the need for cultural change!... There's still this prevailing perception that it's a technology & skills problem.” “Analytics only ever tells you one of two things—it
Because you are already halfway there and you should want the entire process to be data-driven, not just the historical reporting and analysis. You are making decisions and using data to support those decisions, but you are leaving value on the table if the analytics don't carry through to forecasting. In the
How many projects have you worked on that forgot to test size, volume, and conduct load balancing in a newly converted environment? I have worked on a few of those types of projects. I know in a data warehousing effort, we always check any servers and databases, based on load,
What is true customer loyalty? And how can you achieve it without compromising data privacy? According to Peter Hedberg, a senior customer relationship manager with SAS, true customer loyalty programs put the customer at the center of the relationship and use data in ways that are designed to please - not panic
When you create a histogram with statistical software, the software uses the data (including the sample size) to automatically choose the width and location of the histogram bins. The resulting histogram is an attempt to balance statistical considerations, such as estimating the underlying density, and "human considerations," such as choosing
Looking forward, ten of my SAS colleagues and I are heading to New York City this weekend for KDD 2014: Data Science for the Social Good, which runs August 24-27. This event’s full name is the 20th Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining,
It's easy to plot events that happened at a certain time, but what about events that extended over a range of dates, such as recessions? ... This blog post teaches you a nice trick to use for that! Let's say you have a plot of the labor force participation rate
A lot of data quality projects kick off in the quest for root-cause discovery. Sometimes they’ll get lucky and find a coding error or some data entry ‘finger flubs’ that are the culprit. Of course, data quality tools can help a great deal in speeding up this process by automating
My wife got one of those electronic activity trackers a few months ago and has been diligently walking every day since then. At the end of the day she sometimes reads off how many steps she walked, as measured by her activity tracker. I am always impressed at how many
Did you hear that Prince William is getting a new job? Next year, he’ll fly emergency helicopters for the East Anglian Air Ambulance. The prince, who’ll donate his salary to charity, called his new gig “one of the finest forms of public service.” The Duke of Cambridge won’t get any
If you read SAS blogs but never click through to the comment sections, you're missing some great information. Need proof? Check out some of these comments from the last few weeks. And then leave one of your own. Kelly McGuire has been studying the effects of negative hotel reviews online.
For Hadoop to be successful as part of the modern data architecture, it needs to integrate with existing tools. This integration allows you to reuse existing resources (licenses and personnel) and is typically 60% of the evaluation criteria for integration of Hadoop into the data center. One of the most
Even though it sounds like something you hear on a Montessori school playground, this theme “Share your cluster” echoes across many modern Apache Hadoop deployments. Data architects are plotting to assemble all their big data in one system – something that is now achievable thanks to the economics of modern
My previous post explained how confirmation bias can prevent you from behaving like the natural data scientist you like to imagine you are by driving your decision making toward data that confirms your existing beliefs. This post tells the story of another cognitive bias that works against data science. Consider the following scenario: Company-wide
In a previous blog post, I showed how to use the graph template language (GTL) in SAS to create heat maps with a continuous color ramp. SAS/IML 13.1 includes the HEATMAPCONT subroutine, which makes it easy to create heat maps with continuous color ramps from SAS/IML matrices. Typical usage includes
If you're a really big company, with many locations around the country, how do you keep track of all that? ... With a great map, of course! I recently read a CNN article about the Community Health Systems network being hacked - exposing the names, Social Security numbers, physical addresses, birthdays
Introduction Understanding the behavior of your customers is key to improving and maintaining revenue streams. It is an important part of crafting successful marketing campaigns. With SAS Visual Analytics 7.1 you can analyze, explore and visualize user behavior, click paths and other event-based scenarios. Monitoring the customer journey by visualizing
What kind of security do we need for this conversion? In fact, where are the security people? Including security personnel upfront in any conversion project can sure save some time and heartache later. It is important to include security for the following: Source system access – You must be able
Before I started my internship with SAS, my only experience with data or analysis came from an “Introduction to Statistics” course I took freshman year to satisfy my math requirement. If I’d known then that statistics and knowing SAS programming would be the #1 skill for a bigger paycheck, or
A guest post by Anne Belder, SAS Netherlands. "The future of banks is digital." The independent author and financial markets analyst Chris Skinner writes in his book "Digital Bank" about the radical change in banking. Many approaches can be derived from it that, especially from the perspective of a big data analytics expert, are of particular
Heat maps have many uses. In a previous article, I showed how to use heat maps with a discrete color ramp to visualize matrices that have a small number of unique values, such as certain covariance matrices and sparse matrices. You can also use heat maps with a continuous color
Without data there is no customer experience, without data quality there is no true loyalty, and without data integration there is no hope of an omnichannel experience. And it is through this experience that companies seek to connect with the customer and create a personalized strategy that covers their
Find out which state you'll live in, if the US state borders are redrawn so we have 50 states with equal population! (Don't worry! - This is just a fun/hypothetical "what if" blog!) To get you in the mood for this topic, here's a picture of one of the many
In my industry of data and computer science, precision is typically regarded as a virtue. The more exact that you can be, the better. Many of my colleagues are passionate about the idea, which isn't surprising for a statistical software company. But in social media, precision is a stigma --