SAS/IML software is used by many SAS programmers, primarily for creating custom algorithms and macros that implement statistical analyses that are not built into any SAS procedure. I know that PROC IML is used regularly by pharmaceutical companies, by the financial and insurance industries, and by researchers in medical colleges and business schools, among others.
However, when I was recently asked how many research papers are published each year that use SAS/IML, I had no idea. I conjectured "a few dozen," figuring that most SAS/IML programmers work in a corporate setting or in government, and that these researchers are less likely than academics to publish their results in journals.
Over the weekend I used Google Scholar to try to get a better estimate. I constructed the following query:
SAS +IML OR "proc iml" -site:sas.com -author:wicklin
The query omits any papers written by me or that appear on the sas.com domain. The idea was to exclude any white papers, conference proceedings, or marketing material that is created or hosted at SAS. The results also exclude articles posted to the SAS/IML File Exchange.
The results surprised me: I was wrong by an order of magnitude! I clicked on the links for a few dozen publications to ascertain how many of the hits were false positives. For example, an article that says "we decided not to use SAS/IML in this study" would appear on the list, even though the authors did not actually use the programming language. The list contained a few false positives and a few SAS manuals. However, the vast majority of the Google list consisted of scholarly journal articles, books, or conference proceedings that used SAS/IML in a nontrivial way.
I encourage you to submit the query yourself and to look at the variety of applications and the wide range of journals. More than 2,500 entries were published prior to 1995. In addition to those papers, the following SAS DATA step gives the number of Google Scholar entries for the SAS/IML query for the past 20 years:
/* Results from Google Scholar. Downloaded 6/28/2015
   "sas" +IML OR "proc iml" -site:sas.com -author:wicklin
   There were 2510 results when year <= 1994 */
data IMLPub;
input Year Publications;
datalines;
1995 231
1996 255
1997 311
1998 301
1999 299
2000 355
2001 361
2002 424
2003 465
2004 493
2005 543
2006 568
2007 526
2008 611
2009 633
2010 580
2011 589
2012 512
2013 603
2014 552
;

title "Number of Publications that Mention SAS/IML Software";
title2 "Data from Google Scholar";
proc sgplot data=IMLPub;
   series x=Year y=Publications / markers;
   yaxis min=0 grid values=(0 to 600 by 100) valueshint;
   xaxis grid;
run;
The number of publications appears to increase until about 2005 and has been approximately constant since then. For the past 10 years (2005 through 2014), Google Scholar reports about 570 publications per year. I had no idea that the number was that high.
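If you want to check the plateau level numerically rather than by eye, the following is a minimal sketch. It assumes that the IMLPub data set from the DATA step above already exists in the WORK library; the names y and avg are chosen only for illustration.

```sas
proc iml;
/* read the publication counts for 2005 and later */
use IMLPub where(Year >= 2005);
read all var {Publications} into y;
close IMLPub;

avg = mean(y);   /* mean of the yearly counts */
print avg;
quit;
```

Changing the WHERE clause lets you compare the average over other ranges of years, such as 1995 through 2004.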
I do not advocate using internet search engines to rank software based on the number of web sites that mention the software. I have argued that using the number of search results as a proxy for popularity is of dubious value and is fraught with statistical perils.
However, I found it intellectually interesting to read the titles and excerpts of the scholarly publications that mention SAS/IML software. Browse the list yourself, or see the list that includes papers on the sas.com domain for a more complete perspective.
If you are thinking about using SAS/IML software in your next research project, you might want to search Google Scholar first. Someone else might have already written a scholarly paper that solves your problem!
2 Comments
Rick,
Maybe one reason is that after 2005 more and more people started using R instead of IML. R is totally free. You could do the same study for R and see what happens (I am very interested to see).
If you check the following URL, you will find that more people have R skills than SAS skills.
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html