All Posts
Sometimes I get contacted by SAS/IML programmers who discover that the SAS/IML language does not provide built-in support for multiplication of matrices that have missing values. (SAS/IML does support elementwise operations with missing values.) I usually respond by asking what they are trying to accomplish, because mathematically matrix multiplication with
The SG Procedures do not support creating a 3D scatter plot. GTL has some support for 3D graphs, including a 3D Bi-variate Histogram and a 3D Surface, but still no 3D point cloud. The lack of such a feature is not due to any difficulty in doing this as
Today in manufacturing there has been a lot of investment in automation, supervisory controls, quality, and execution systems. The amount of data produced and now being captured is staggering. The data captured in industry will re-define what is “big” in big data. Yet, for all this investment: Equipment still fails. Scrap
Beyond traditional clustering and predictive models lies social network analysis. It can help describe customers’ behaviors in new ways, but what exactly is it and how can businesses use it? To find out more, I interviewed Carlos Andre Reis Pinheiro. He’s been working in social network analysis around the world
In my last two posts, we concluded two things. First, because of the need for broadcasting data across the internal network to enable the complete execution of a JOIN query in Hadoop, there is a potential for performance degradation for JOINs on top of files distributed using HDFS. Second, there are
“When it comes to the Internet of Things, the future clearly belongs to the Things”. I made this brash statement in a previous post (“Cloud encounters of the Fifth Kind”) referring to machine-to-machine (M2M) being the fastest growing component of non-human traffic on the Web. I say “brash” because that
In my previous post, I talked about how a bank realized that data quality was central to some very basic elements of its initiatives, such as know your customer (KYC), customer on-boarding and others. In this blog, let’s explore what this organization did to foster an environment of data quality
I often blog about the usefulness of vectorization in the SAS/IML language. A one-sentence summary of vectorization is "execute a small number of statements that each analyze a lot of data." In general, for matrix languages (SAS/IML, MATLAB, R, ...) vectorization is more efficient than the alternative, which is to
For Parent’s Weekend this year, I needed to choose a restaurant for dinner in my son’s college town. Our extended family was attending the college football game and spending the weekend with our son. Before making my decision, I searched the internet for all the restaurants located within a reasonable
♦ We learned this week that SAS is ranked #4 on Fortune's 100 Best Companies to Work For in 2015. This makes six straight years ranking in the top four (including twice at #1). ♦ The March/April 2015 issue of Analytics Magazine includes a SAS company profile by my colleague Kathy Lange. As
You may be intrigued to know how the average person compares to a gold-medal-winning Olympic athlete when it comes to things like height, body mass, resting heart rate, arm span, body fat, etc. Or, perhaps more frightening, how you measure up? I know this will resonate with my
Public educators have increasingly been told to produce the “workforce of the future.” States are striving for alignment between what students learn and the jobs that ultimately will be available to them. This alignment is critical for students so they have the right skills and knowledge to excel at college
Does your forecast look like a radio? No? Then don't treat it like one. A radio's tuning knob serves a valid purpose. It lets you make fine adjustments, improving reception of the incoming signal, resulting in a clearer and more enjoyable listening experience. But just because you can make fine adjustments to
When I saw Robert Kosoro's cool ZIPScribble map, I knew I had to create a SAS version - and of course I had to add a few enhancements along the way.... I was perusing some of the examples on dadaviz.com, and Kosoro's ZIPScribble map caught my attention. It wasn't a particularly useful
One of the common traps I see data quality analysts falling into is measuring data quality in a uniform way across the entire data landscape. For example, you may have a transactional dataset that has hundreds of records with missing values or badly entered formats. In contrast, you may have