The CLUSTER procedure in SAS/STAT software creates a dendrogram automatically. The black-and-white dendrogram is nice, but plain. A SAS customer wanted to know whether it is possible to add color to the dendrogram to emphasize certain clusters. For example, the plot at the left emphasizes a four-cluster scenario for clustering
Q: Could you send me the presentation? With audio if possible. If you'd like a PDF of the slides, email me directly: mike.gilliland@sas.com. For the audio, the webinar recording is available for free on-demand review: FVA: A Reality Check on Forecasting Practices. Q: Can we get the case study referred here
How do you count the number of unique rows in a matrix? The simplest algorithm is to sort the data and then iterate down the rows, comparing each row with the previous row. However, this algorithm has two shortcomings: it physically sorts the data (which means that the original locations
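The simple sort-and-compare approach mentioned in this excerpt can be sketched as follows. This is a minimal illustration in Python, not the code from the original post (the excerpt shows none); the function name `count_unique_rows` and the sample data are hypothetical. The sketch sorts a copy of the rows, so the input matrix keeps its original order.

```python
def count_unique_rows(matrix):
    """Count distinct rows by sorting a copy and comparing each row with the previous one."""
    if not matrix:
        return 0
    # Convert rows to tuples so they sort lexicographically; sorted() returns a new list,
    # so the caller's matrix is not reordered.
    rows = sorted(tuple(row) for row in matrix)
    count = 1  # the first sorted row is always new
    for prev, curr in zip(rows, rows[1:]):
        if curr != prev:
            count += 1
    return count


# Hypothetical sample data: rows 0 and 2 match, and rows 1 and 4 match.
data = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]
unique_count = count_unique_rows(data)
```

Sorting a copy costs extra memory, which is the trade-off for leaving the original row locations intact.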
As promised in yesterday's Foresight-SAS sponsored webinar on "Forecast Value Added: A Reality Check on Forecasting Practices," here is Part 1 of my written response to the over 25 questions that were submitted during the event. (Note: It may take a week or so to get through all of them.)
The importance of data and its analysis is growing in every industry, just as the complexity of business processes and their interdependencies is increasing. Whether the goal is better customer understanding, more accurate risk calculation, or optimization of production and logistics: tomorrow's success factor lies in the analysis of the constantly
I am not a big fan of the macro language, and I try to avoid it when I write SAS/IML programs. I find that the programs with many macros are hard to read and debug. Furthermore, the SAS/IML language supports loops and indexing, so many macro constructs can be replaced
A bar-line graph is commonly used in many domains. The SGPLOT procedure makes it easy to create bar-line graphs that the user can customize in many different ways. This post is prompted by a recent question on the communities page on creating such a graph, with one bar and
David Loshin (@davidloshin) on using analytics to pinpoint your best customers.
If an organization is spending time and money to have a forecasting process, is it not reasonable to expect the process to make the forecast more accurate and less biased (or at least not make it any worse!)? But how would we ever know what the process is accomplishing? To
A regular reader noticed my post on initializing vectors by using repetition factors and asked whether that technique would be useful to expand data that are given in value-frequency pairs. The short answer is "no." Repetition factors are useful for defining (static) matrix literals. However, if you want to expand
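To make the task in this excerpt concrete, here is a minimal sketch of expanding value-frequency pairs into a flat list. It is written in Python for illustration only and is not the technique from the original post, which is cut off before it describes one; the function name `expand_pairs` and the sample values are hypothetical.

```python
def expand_pairs(values, freqs):
    """Repeat each value according to its frequency and return the expanded list."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    expanded = []
    for value, freq in zip(values, freqs):
        if freq < 0:
            raise ValueError("frequencies must be nonnegative")
        expanded.extend([value] * freq)  # a frequency of 0 contributes nothing
    return expanded


# Hypothetical sample data: 10 occurs twice, 20 once, 30 three times.
expanded = expand_pairs([10, 20, 30], [2, 1, 3])
```

Unlike a static literal, this expansion works when the frequencies are only known at run time.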
Imagine a business offering a multitude of products and services that seemingly have little relationship to one another, and all are supported by different data systems. This is the plight of local governments. The products and services produced and managed by local governments range from utilities, solid waste and recycling to parks
This week's SAS tip is from Art Carpenter and his latest book Carpenter's Guide to Innovative SAS Techniques. Art is a talented SAS user and prolific author--and was just recognized in the SAS Circle of Excellence for 30 years of using SAS software. After taking a look at this week's book
If you've watched any of the demos for SAS Visual Analytics (or even tried it yourself!), you have probably seen this nifty exploration of multiple measures. It's a way to look at how multiple measures are correlated with one another, using a diagonal heat map chart. The "stronger" the color
In a previous blog post, I described how to use a spread plot to compare the distributions of several variables. Each spread plot is a graph of centered data values plotted against the estimated cumulative probability. Thus, spread plots are similar to a (rotated) plot of the empirical cumulative distribution
David Loshin (@davidloshin) offers a new approach to addressing customer classification.
Suppose that you have several data distributions that you want to compare. Questions you might ask include "Which variable has the largest spread?" and "Which variables exhibit skewness?" More generally, you might be interested in visualizing how the distribution of one variable differs from the distribution of other variables. The
As part of my follow-up to SAS Global Forum 2013, I've posted a few articles about how to create your own client apps with SAS Integration Technologies. This article shows how to use Microsoft .NET -- the same approach used for SAS Enterprise Guide and SAS Add-In for Microsoft Office
This week's SAS tip is from Michele Burlew and her book SAS Macro Programming Made Easy, Second Edition. Michele is the author of several extremely helpful SAS books. Visit her author page to learn more about her work and for additional free content. The following excerpt is from SAS Press
Last week I showed how to use simulation to estimate the power of a statistical test. I used the two-sample t test to illustrate the technique. In my example, the difference between the means of two groups was 1.2, and the simulation estimated a probability of 0.72 that the t
One of the great things about SAS libraries is that you can write your programs to read and write data without having to worry about where the data lives. SAS data set on a file system? Oracle table in a database server? Hadoop data in Hive? For many SAS applications,
Data expert David Loshin (@davidloshin) delves into value pricing and customer analytics.
Once a month from now on, we will summarize studies, videos, and publications about big data, business analytics, and data visualization for you. The most important news on handling data in companies and on data-driven decisions, found on the web, compiled, and selected by leading analysts, our partners, and SAS experts. To the current issue: "The
A SAS user told me that he computed a vector of values in the SAS/IML language and wanted to use those values on a statement in a SAS procedure. The particular application involved wanting to use the values on the ESTIMATE and CONTRAST statements in a SAS regression procedure, but
The big data discussion has made it clear to many companies that data has become an important factor of production. Without a targeted and consistent strategy, the value contained in the data cannot be unlocked. What is needed are methodological and organizational decisions, supported by software that builds the bridge between the business department
This week's SAS tip is from Gerhard Svolba and his latest book Data Quality for Analytics Using SAS. For additional bonus book content and to learn more about Gerhard and his work, visit his author page. The following excerpt is from SAS Press author Gerhard Svolba's book "Data Quality for Analytics Using SAS"
They say "Imitation is the most sincere form of flattery"... Therefore when I imitate Hans Rosling's famous world-data animation, it's not that I'm jealous, but that I'm paying homage to him! (OK, and maybe also a little bit jealous! LOL) Well, anyway, for those of you who haven't seen it,
The power of a statistical test measures the test's ability to detect a specific alternative hypothesis. For example, educational researchers might want to compare the mean scores of boys and girls on a standardized test. They plan to use the well-known two-sample t test. The null hypothesis is that the
Last time we saw two situations where you wouldn't bother trying to improve your forecast: (1) when forecast accuracy is "good enough" and is not constraining organizational performance, and (2) when the costs and consequences of a less-than-perfect forecast are low. (Another situation was brought to my attention by Sean Schubert of
If you write a blog, you deal with spam comments. That's just part of the deal. Spammers are forever inventing new and creative methods for "tricking" you into accepting their spam comments. These comments have nothing to do with your blog topic but do contain trackback links to their own
Has anyone noticed that the REG procedure in SAS/STAT 12.1 produces heat maps instead of scatter plots for fit plots and residual plots when the regression involves more than 5,000 observations? I wasn't aware of the change until a colleague informed me, although the change is discussed in the "Details"