Harness the power of the cloud for learning SAS

Since this is my first post on The SAS Training Post blog, please allow me to introduce myself.  My name is Kathy and I am an instructor at SAS Headquarters in Cary, NC.   I teach SAS courses in our on-campus training center, at regional training centers, at customer sites and in our Live Web environment.  Maybe you’ve been in one of my classes – if so, it’s great to have the opportunity to “talk” to you again!

People who attend our SAS courses often ask, “Is there a way that I can practice SAS programming after class?” Now, I can say, “Yes.” After you attend a SAS course, we give you all the class programs and data files you need to continue practicing what you learned, but you need a SAS environment in order to run them.  A lot of folks get to use SAS at work, but others are not as fortunate.  For those folks, I have good news! Now you can connect to a learning version of SAS!

A new product, SAS on Demand for Professionals: Enterprise Guide, provides a learning tool for those who want to practice what they learned in class, run programs from a SAS book or prepare for SAS Certification.

So, how does it work? Well, you purchase a six- or twelve-month license, then download the product to your local computer and connect to the analytical power of SAS over the Web. You write the code locally and submit it; SAS® Enterprise Guide sends it to a SAS Server in the Cloud for execution; and the results are sent back to your local client. Cool, huh? Not only the Power to Know, but the Power of the Cloud too!

We provide many learning options and a lot of data for you. There is a point-and-click approach where you can select data sources and tasks from menus, as well as an interactive SAS programming environment. Choose the path that’s right for you, as Chris Hemedinger pointed out in his recent post in The SAS Dummy blog. The product’s Getting Started Tutorial takes you from the basics through intermediate and advanced examples.  You will also find SAS data sets, Excel workbooks, CSV files, text files, and more on the server to guide your learning.

The product website contains lots of support information, videos and FAQs.  So check out the website, watch some videos, and let me know what you think. As my former students know, I’d love to hear from you!

Want to see more now? Check out this introductory video.

How to open a SAS table in SAS Web Report Studio

Business users of SAS are finding the Web Report Studio capabilities incredibly beneficial for viewing, creating, and sharing reports on the Web.  The easy-to-use query and reporting software provides a point-and-click interface for building reports from several different data sources. Once created, SAS reports can be viewed by many SAS applications.

SAS Web Report Studio provides several ways of gathering information and creating reports.  This quick SAS Training Tip video shows users how efficient opening a table directly in SAS Web Report Studio 4.3 can be!

Learn more about our training for SAS Web Report Studio.

With SAS and Facebook, who needs Meetup?

In my previous blog post I talked about how to map your Outlook contacts and create a list of the 3 people nearest the zip code you were traveling to. That’s all fine and dandy, but my friends often don’t update me when they move and we just connect via Facebook.  Wouldn’t it be nice to know where your Facebook friends are?

Much like the earlier blog post, I created a SAS program and used the tasks within SAS Enterprise Guide to create 2 outputs.

  1. A listing report showing me the distance between where I’m going and the 3 friends who are closest to that location.
  2. A map of the US pinpointing my Facebook friends.

My Facebook and real-life friend Roger wrote a blog post on how to do this with R and Python scripts. I figured I could do this with SAS.  Roger encountered several obstacles, and he helped me greatly with some of the access token stuff. I encountered my own obstacles; for example, one of my friends has an apostrophe in his first name. And you SAS programmers know how much SAS loves to wreak havoc with unbalanced quotes!
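As a minimal sketch of the apostrophe problem, the snippet below stores a name with an unmatched quote in a macro variable and prints it safely. The macro variable name friend and the value D'Angelo are made up for illustration; %BQUOTE is the same masking function the program further down uses in its %PUT statement.

```sas
/* Hypothetical value: a first name containing an unmatched apostrophe. */
data _null_;
  call symputx('friend', "D'Angelo");
run;

/* %BQUOTE masks the apostrophe when &friend resolves, so the macro   */
/* processor does not treat it as the start of a quoted string.       */
%put Working on %bquote(&friend);
```

Without the masking function, the lone apostrophe can leave the session waiting for a closing quote that never arrives.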

The code is below, but first a few disclaimers:

  1. This is not production level code. You may look at it and say “oh, you should have done it this way”, and “that part isn’t very efficient,” and you are absolutely right. Sometimes it’s important just to get things done, not necessarily to get things done perfectly.
  2. Facebook gives you what it gives you. I have at least one instance where I can clearly see a friend’s location in his profile, but it is not being returned by the access token. This seems to be a rare occurrence, but be aware that it happens.
  3. SAS gives you what it gives you. The SASHELP.ZIPCODE data set doesn’t list all cities everywhere. So friends that live in small towns might not get picked up. And of course my friend who lives in London, while it’s a fairly large city, doesn’t get picked up either.
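If you want to see ahead of time whether a friend's town will be matched, a quick lookup along these lines may help. It is only a sketch: the macro variables check_city and check_state are hypothetical names, and the city and state shown are example values. The CITY and STATENAME columns are the same ones the program below joins on.

```sas
/* Hypothetical inputs: the city and state to look for. */
%let check_city=Cary;
%let check_state=North Carolina;

/* Count the rows in SASHELP.ZIPCODE that match the city/state pair. */
/* A count of 0 means the join in the main program will drop it.     */
proc sql;
  select count(*) as matching_zips
  from sashelp.zipcode
  where upcase(city)=upcase("&check_city")
    and upcase(statename)=upcase("&check_state");
quit;
```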

Big thanks go out to Roger, Cat Truxillo, and Andy Ravenna for help and testing!

So here is how to map your US Facebook friends using SAS, as well as find the 3 friends that are nearest a particular zip code.  Here are the steps you need.

Get the appropriate Facebook info

  1. Go to this Facebook developers page
  2. Click on "Get Access Token" and choose the fields
    • from the User Data Permissions tab, choose user_hometown and user_location
    • from the Friends Data Permissions tab, choose friends_hometown and friends_location

NOTE: Not all of these fields are used in the program, but the code assumes that these fields, and only these fields, are coming in.

  3. Click Get Access Token, then click Log In with Facebook.
  4. If necessary, click Submit.  (Sometimes the website will freeze on you.)
  5. Copy the access token returned. NOTE: I think this expires after 24 hours.
  6. Go to https://graph.facebook.com/me/friends?access_token=????? (paste in the access token where the question marks are).
  7. You will be asked whether you want to open or save the file. Click on the down arrow next to Save and choose Save As.  Save it to C:\temp and name it facebookfriendsid.txt   NOTE: you may have to use Windows Explorer to rename it.

The only reason for steps 6 and 7 is that the token expires, so it's nice to have the file of your Facebook friends on your C drive. You can skip steps 6 and 7 by using the URL directly in your FILENAME statement, but the token in that URL would need to be changed every day.
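For anyone who prefers the shortcut, here is a rough sketch of pointing a FILENAME statement at the URL instead of the saved file. The fileref fbids and the macro variable fb_token are hypothetical names, and the sketch assumes the endpoint still responds as described in this post. It only echoes the raw response to the log so you can inspect it before adapting the INFILE and INPUT statements in the program.

```sas
/* Hypothetical macro variable: paste the token you copied earlier. */
%let fb_token=PASTE_YOUR_TOKEN_HERE;

/* Read the friends list straight from the URL rather than from C:\temp. */
filename fbids url "https://graph.facebook.com/me/friends?access_token=&fb_token";

data _null_;
  infile fbids lrecl=32767;
  input;
  put _infile_;   /* echo each raw record to the log for inspection */
run;

filename fbids clear;
```

Keeping the token in one %LET statement means there is only one place to update when it expires.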

  8. Open the SAS program below and, in the 1st %LET statement, plug in your zip code or one you are traveling to.
  9. About 1/3 of the way down the program, change the token to be the same one that you copied and pasted earlier.  I really haven’t found a way around this.
  10. Submit the program and enjoy dinner with your friends!

SAS Program

%let me_go_to=98101;

data fball (keep=name id);
infile 'c:\temp\facebookfriendsid.txt' dlm=':' firstobs=3;
input  crap $ name_field $ name :$40.
       id_field $ id :$17./
;
name=substr(name, 2,length(name)-3);
id=substr(id,2);
id=compress(id,'"');
run;

proc sql noprint;
select nobs into :numfriends
from dictionary.tables where LIBNAME='WORK' and memname='FBALL';
%let numfriends=&numfriends;
%put numfriends is &numfriends;

select name, id into :name1-:name&numfriends, :id1-:id&numfriends
from fball;

options nomprint nosymbolgen;
%*let numfriends=5; /*using for testing */
%macro loop;
%do i=1 %to &numfriends;
%put working on %bquote(&&name&i) whose id is %bquote(&&id&i);

filename fbfriend url "https://graph.facebook.com/&&id&i?access_token=AAACEdEose0cBAPAvLcJMjMAJPD6c45X25sZA2lCdQPJDypZAY5J6ZCTy8yOG8ysD1FbrjSouTZBgg3iEdmPmDDCilez9t90edSw6DeeeQ1HBhY5PsWBI";

data fbfriend&i ;
keep name location;
length location $50;

infile fbfriend length=len lrecl=500;
input record $varying1000. len @;

   put record $varying1000. len;
   namestart=index(record, 'name');
   namepart=substr(record,namestart+7);
   endlocation=index(namepart,'"');
   name=substr(namepart,1,endlocation-1);

   locationstart=index(record,'location');
   if locationstart>0; /*otherwise friend does not publish location */
partial=substr(record,locationstart);
locationnext=index(partial,'name');
startlocation=substr(partial,locationnext+ 7);
endlocation=index(startlocation,'"');
location=substr(startlocation,1,endlocation-1);
run;

filename fbfriend clear;
%end;

data combine;
length city $ 35 state $25; 
drop location;
set
  %do j=1 %to &numfriends;
  fbfriend&j
  %end;
; /*end set statement */
city=scan(location,1, ',');
state=scan (location, 2, ',');
state=substr(state,2);
run;
%mend;
%loop

proc sql;
create table merged as
select name, a.city, a.state, zip
from combine a, sashelp.zipcode z
where trim(upcase(a.city))=trim(upcase(z.city))
    and trim(upcase(a.state))=trim(upcase(z.statename))
order by name;

data merged2;
set merged;
by name;
if first.name;
run;

proc sort data=merged2 out=myzip;
  by zip;
run;

/* Create a data set containing the */
/* X and Y values for my ZIP codes. */
data longlat;
/* In case there are duplicate ZIP codes, rename 
    X and Y from the SASHELP.ZIPCODE data set. */
  merge myzip(in=mine) 
        sashelp.zipcode(rename=(x=long y=lat)keep=x y zip);
  by zip;
/* Keep if the ZIP code was in my data set. */
  if mine;
/* Convert longitude, latitude in degrees to radians */
/* to match the values in the map data set. */
  x=atan(1)/45*long;
  y=atan(1)/45*lat;
/* Adjust the hemisphere */
  x=-x;
  /* Keep only the ZIP, X and Y variables */
  keep zip x y;
run;

/* Create an annotate data set to place a symbol at the
   ZIP code locations. */
data anno;
/* Use the X and Y values from the LONGLAT data set. */
  set longlat;
/* Set the data value coordinate system. */
/* Set the function to label. */
/* Set the size of the symbol to .75. */
/* Set a FLAG variable to signal annotate observations. */
  retain xsys ysys '2' function 'label' size .75 flag 1 when 'a';
/* Set the font to the Special font. */
  style='special';
/* The symbol is a star. */
  text='M';
/* Specify the color for the symbol. */
  color='red';
/* Output the observation to place the symbol. */
  output;
run;

/* Combine the map data set with the annotate data set. */
data all;
  /* Subset out the states that you do not want. */
  /* The FIPS code of 2 is Alaska, 15 is Hawaii, */
  /* and 72 is Puerto Rico.  */
  set maps.states(where=(state not in(2 15 72))) anno;
run;

goptions reset=all border;

/* Project the combined data set. */
proc gproject data=all out=allp;
  id state;
run;
quit;

/* Separate the projected data set into a map and an annotate data set. */
data map dot;
  set allp;
/* If the FLAG variable has a value of 1, it is an annotate  */
/* observation; otherwise, it is a map data set observation. */
  if flag=1 then output dot;
  else output map;
run;

/* Define the pattern for the map. */
pattern1 v=me c=black r=50;

/* Define the title for the map. */
title 'My Facebook Friends';
title2 'Based on City locations';

/* Generate the map and place the symbols at ZIP code locations. */
proc gmap data=map map=map;
  id state;
  choro state / anno=dot nolegend;
run;
quit;

/* create listing report of 3 closest friends */
proc sql outobs=3;
  title '3 friends closest to where I''ll be';
  select name, city, state,
      zipcitydistance(zip, &me_go_to) as distance 

  from myzip
  order by distance;

/* find friends that have a location but are not getting mapped */
  proc sort data=combine out=full;
by name;

proc sort data=merged2 (drop=zip) out=most;
by name;

data missing;
merge full (in=all) most(in=most);
by name;
if all and most then delete;
run;
proc print;
title 'not in list';
run;

How to use SAS to strengthen friendships

As SAS Instructors we travel fairly frequently, and like most of you we have friends scattered throughout the country. High school friends, college friends, family we like (or pretend to). And if I’m on a business trip, I like to see if they are available for drinks or dinner.

As time goes on and people move, and my brain cells deteriorate, I realize I have no idea where people actually live.  I know that my friend Troy used to live in Elida, NM, then moved to Dallas, and then Bethesda, MD, and then to Bellevue, CO. Colorado is a big state, though, so if I teach in our SAS office in Greenwood Village, CO, I have no idea whether he is nearby or a hundred miles away.

So I wanted to write a SAS program to read my Outlook contacts, and then determine the distance between me and my friends.  So two obstacles needed to be overcome:

  1. Getting at my contacts info.
  2. Figuring out the distance.

So with some sleuthing, I learned how to take my Outlook contacts and write them out to a file, so obstacle 1 was complete.   Obstacle 2 was solved with the SAS zipcitydistance function.
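As a quick, hedged illustration of that function on its own: ZIPCITYDISTANCE takes two ZIP codes and returns the distance in miles between them, using the locations stored in SASHELP.ZIPCODE. The two ZIP codes below are just examples (27513 is a Cary, NC ZIP code, and 98101 is the value used in the program).

```sas
data _null_;
  /* Example ZIP codes only: Cary, NC and the program's default destination. */
  distance=zipcitydistance(27513, 98101);
  put distance=;   /* writes the distance in miles to the log */
run;
```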

Once those obstacles were overcome, I created a SAS program and used the tasks within SAS Enterprise Guide to create 2 outputs.

  1. A listing report showing me the distance between where I’m going and the 3 friends who are closest to that location.
  2. A map of the US pinpointing my Outlook contacts.


So I’m going to share my efforts with you in case you want to do the same. Here are the steps you need.

Export your Outlook Contacts

  1. Open your Outlook Contacts window.
  2. From the File tab, click on Open then Import (yes I know we are exporting our contacts not importing. Complaints can be sent to Microsoft.)
  3. Select “Export to a File”. Click Next.
  4. Select “Comma Separated Values (Windows)”. Click Next.
  5. Choose your contacts folder. I have categorized mine to have a separate subfolder that is my personal friends (excludes business contacts).  Click Next.
  6. Specify the name and location to be c:\temp\mapcontacts.CSV. Obviously, you can put it anywhere you want to, but this is what my SAS code below assumes you named it.
  7. Click on Next and replace if prompted.
  8. Click on “Map Custom Fields…”
  9. Choose the fields of
    • First Name
    • Last Name
    • Home Street
    • Home City
    • Home Postal Code
  10. Click OK. Then click Finish.

Run Your SAS Program

  1. Open a SAS session.
  2. Bring in the following program.
  3. Change the value of the macro variable in the 1st %LET statement to be the zip code that you are going to.  I know it’s a horrible macro variable name, but I was not in a frame of mind to come up with something better.
  4. Run the program and enjoy dinner with your friends!

%let me_go_to=98101;
    data WORK.myzip  (rename=(home_postal_code=zip))             ;
    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
    infile 'C:\temp\mapcontacts.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
       informat First_Name $20. ;
       informat Last_Name $23. ;
       informat Home_Street $28. ;
       informat Home_City $20. ;
       informat Home_Postal_Code 5. ;
       format First_Name $20. ;
       format Last_Name $23. ;
       format Home_Street $28. ;
       format Home_City $20. ;
       format Home_Postal_Code z5. ;
    input
                First_Name $
                Last_Name $
                Home_Street $
                Home_City $
                Home_Postal_Code
    ;
    /* Home_Postal_Code is numeric (informat 5.), so test for a missing value */
    if not missing(Home_Postal_Code);
        distance=zipcitydistance(Home_Postal_Code, &me_go_to);

  run;

goptions reset=all border;
proc sql outobs=3;
  title '3 friends closest to where I''ll be';
  select first_name, Last_name, distance
  from myzip
  order by distance;
/* Sort the data set by ZIP codes. */
proc sort data=myzip;
  by zip;
run;

/* Create a data set containing the */
/* X and Y values for my ZIP codes. */
data longlat;
 /* In case there are duplicate ZIP codes, rename
    X and Y from the SASHELP.ZIPCODE data set. */
  merge myzip(in=mine)
        sashelp.zipcode(rename=(x=long y=lat)keep=x y zip);
  by zip;
 /* Keep if the ZIP code was in my data set. */
  if mine;
 /* Convert longitude, latitude in degrees to radians */
 /* to match the values in the map data set. */
  x=atan(1)/45*long;
  y=atan(1)/45*lat;
 /* Adjust the hemisphere */
  x=-x;
  /* Keep only the ZIP, X and Y variables */
  keep zip x y;
run;

/* Create an annotate data set to place a symbol at the
   ZIP code locations. */
data anno;
 /* Use the X and Y values from the LONGLAT data set. */
  set longlat;
 /* Set the data value coordinate system. */
 /* Set the function to label. */
 /* Set the size of the symbol to .75. */
 /* Set a FLAG variable to signal annotate observations. */
  retain xsys ysys '2' function 'label' size .75 flag 1 when 'a';
 /* Set the font to the Special font. */
  style='special';
 /* The symbol is a star. */
  text='M';
 /* Specify the color for the symbol. */
  color='red';
 /* Output the observation to place the symbol. */
  output;
run;

/* Combine the map data set with the annotate data set. */
data all;
  /* Subset out the states that you do not want. */
  /* The FIPS code of 2 is Alaska, 15 is Hawaii, */
  /* and 72 is Puerto Rico.  */
  set maps.states(where=(state not in(2 15 72))) anno;
run;

/* Project the combined data set. */
proc gproject data=all out=allp;
  id state;
run;
quit;

/* Separate the projected data set into a map and an annotate data set. */
data map dot;
  set allp;
 /* If the FLAG variable has a value of 1, it is an annotate  */
 /* observation; otherwise, it is a map data set observation. */
  if flag=1 then output dot;
  else output map;
run;

/* Define the pattern for the map. */
pattern1 v=me c=black r=50;

/* Define the title for the map. */
title 'My Friends';
title2 'Based on ZIP Code locations';

/* Generate the map and place the symbols at ZIP code locations. */
proc gmap data=map map=map;
  id state;
  choro state / anno=dot nolegend;
run;
quit;

6 questions with data mining expert, John Elder

John F. Elder IV, Ph.D., is President of Elder Research Inc. (ERI), a data mining consulting team.  He has authored innovative data mining tools, is a frequent keynote speaker and was co-chair of the 2009 Knowledge Discovery and Data Mining conference in Paris. His courses on analysis techniques – taught at dozens of universities, companies and government labs – are noted for their clarity and effectiveness. John was honored to serve for five years on a US presidential panel to guide technology for national security. His book with Bob Nisbet and Gary Miner, Handbook of Statistical Analysis & Data Mining Applications, won the PROSE award for Mathematics in 2009. His book with Giovanni Seni, Ensemble Methods in Data Mining: Improving Accuracy through Combining Predictions, was published in February 2010.  We recently caught up with John to ask him a few questions as he prepares to teach his new course, Data Mining: Principles and Best Practices.

 

1. In your opinion, what have been the biggest advancements in data mining over the past 10 years?

Technically, the greatest accuracy is coming from ensembles of models.  The Netflix Prize was a great example of this; the two top teams were actually both ensembles of people building ensembles of models.  The idea was invented by non-statisticians (including me), but once it was seen to work so well real statisticians (especially Jerry Friedman) broke it down and rebuilt it stronger than before.  I helped Giovanni Seni write a clear, short book translating that work to where most scientists can use it.

On the practice side, it’s the continued improvement of software, like SAS Enterprise Miner, which helps analysts automatically do increasingly more necessary but complex steps so they can focus on the hard part of the problem, including translating the squishy business task into crispy technical steps.

2. What predictions do you have for how data mining techniques and technologies will evolve in the next 10 years?

The big new thing is the emergence of two exciting new sources of data:  text and links.  Six of us just wrote a book about Practical Text Mining (a 1,000+ page monster).  Elder Research has had a lot of success working with text, even with early tools, partly as there are “low hanging fruit” whenever new data is explored.  And the link information in Social and other complex networks is fascinating.  My colleague Andy Fast got written up in ESPN magazine after predicting which football teams will make the playoffs using just the coaching network (who worked for who).  In fact, he did better than all four professional ESPN analysts (who used little facts like players, record, etc.)!  The power of looking at information in a way that others aren’t yet was also depicted well in the recent film “Moneyball.”

3. You've been a past keynote speaker at SAS' data mining conference (now titled The Analytics Conference Series) and soon you'll be teaching a new SAS Business Knowledge Series course titled Data Mining: Principles and Best Practices.  What's your favorite part of presenting and teaching?

At SAS keynotes, it’s the rock music blaring as you bound on stage!   For a few seconds, geeks are cool.

Seriously, the greatest thing about teaching is seeing the lightbulbs go off over folks’ heads.  This technology can really help mankind.  And I get to tell people about how to do it well!

4. What is your favorite industry to work with?  Give us an example of a problem you've helped to solve using data mining in that industry.

We do a lot of work to help analysts catch bad actors – such as fraudsters, insider threats, even terrorists.  My team finds it very satisfying to help our country stay safe, and help firms and government agencies save tens of millions of dollars (and sometimes, even lives).

5. Can you share a tip from your new course?

It uses SAS Enterprise Miner, and our book on practical data mining  – winner of the top award for a mathematics book in 2009 (maybe because it’s huge, color illustrated, and has the fewest equations of any analytic book!)  We’ve decided to focus on teaching how to know the true quality of your model.  It’s very hard, without training, to avoid overfit, and the great majority of work we see in the wild is not done as well as it could be.  Most analysts think their models are much better than they really are, which can be devastating when they are put to use and under-perform.  Our course will show how to get the science right and have realizable gains.  (We include code to use in those cases when the software streams don’t have all the functionality needed.)  We conclude by reviewing the Top 10 Data Mining mistakes, so folks can recognize when they’re going down the wrong path and correct efficiently.

6. Final advice?

This is a great field.  Learn to do it well, and your work will generate huge return for its investment.  This tends to make folks like to see you!

Reading your mind: Writing SAS code in the new Enterprise Guide editor

I taught my first SAS Enterprise Guide course more than 10 years ago using version 1.2.  At that time, I would estimate 95% of my students did not know how to write SAS code and had no desire to learn the syntax!  They just wanted to take advantage of the great point-and-click features to query and summarize their data.  But over the years, I have watched a significant shift occur…  More and more experienced SAS programmers are converting to Enterprise Guide as their primary programming interface.  The good news is Enterprise Guide is actually designed to fulfill the needs of both types of users – those who think the DATA step is one of man’s greatest creations, and those who have no clue what the significance of the semi-colon is in a SAS program.

I like to ask SAS programmers attending my classes, “Do you use Enterprise Guide?”  I will often hear, “Oh, I write my own code, so I don’t need to use Enterprise Guide.”   A common misconception is that Enterprise Guide is strictly for point-and-click users who need to have code generated for them.  Although that is an extremely powerful (and very cool) part of the software, many SAS programmers are truly missing out on some of the amazing features included in the Enterprise Guide programming editor.  Significant effort has been devoted by developers in Enterprise Guide 4.3 to facilitate the work of our beloved programmers through the addition of some really modern and useful features.  This short video demonstrates the Enterprise Guide program editor and some of its unique functionality.

Why I think SAS is the best place to work

Since we're all pretty excited about it around here, you may have already heard that today SAS ranked #3 in the 2012 FORTUNE 100 Best Companies To Work For list.  This is the 15th consecutive year that SAS has prominently ranked on the list - since its beginning!

I thought I would take a quick moment to tell you what it means to me to work at SAS and why I love my job. Professionally, SAS has allowed me to grow by continuing to provide challenging and creative work.  Every day I learn something new.  And I'm continually blown away by the people I work with.  They are some of the most intelligent, innovative, funny and kind people I've ever met.  I feel lucky to be able to say that I get to work with my friends.  Personally, SAS has become my extended family.  My SAS family has seen me through my marriage, the birth of my children (who attend the on-site daycare here at SAS), and provided an incredible amount of support and comfort during a time of personal tragedy. Of course I love the gym, healthcare center, cafes and all the other perks, but what really makes SAS the best place to work is the people.

Okay, that's enough of that mushy stuff.  If you're interested in learning more about the history of SAS check out this corporate timeline video.

Kill two birds (or three!) with one stone at SAS Global Forum

Did you know that a selection of our most popular SAS training classes is offered in conjunction with SAS Global Forum at a 15% discount?  If you’re planning to attend SAS Global Forum, arrive a few days early and take a class.  These training classes could be the argument you need to convince your manager to send you to SAS Global Forum!

Statistics 1: Introduction to ANOVA, Regression
SAS Programming 3: Advanced Techniques and Efficiencies
Bayesian Analyses Using SAS
SAS Macro Language 1: Essentials
New Features in Statistical Graphics Procedures and Graph Template Language

And while you’re there, consider taking a SAS Certification exam.  All exams are being offered on April 21, also for a 15% discount.

Take a class, get certified and attend the conference all in one place!

Jedi SAS Tricks: These aren't the droids... Episode 1


How the power of the Force makes ridding yourself of problematic characters so much easier! I recently was invited to become an alternate instructor for Ron Cody’s SAS Business Knowledge Series class, "SAS Functions by Example", and had the privilege of taking the class as a student under Ron’s tutelage. As Ron was introducing the advanced features of the COMPRESS function, I remembered how I've often needed to rid myself of problematic characters in a SAS program.  I sure wish I’d known about the advanced features of COMPRESS sooner! And my next thought was: I must share this incredible power with all the SAS Jedi out there, and what better place to do it than right here on the Jedi SAS Tricks blog!
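As a small taste, and not necessarily the examples from Ron's class, here is a sketch of COMPRESS with and without its third (modifier) argument. The phone number and name are made-up values.

```sas
data _null_;
  phone='(919) 555-0123';
  /* Modifiers: k = keep the listed characters, d = add digits to the list. */
  digits_only=compress(phone, , 'kd');

  name="D'Angelo";
  /* Second argument only: remove every apostrophe. */
  no_apostrophe=compress(name, "'");

  put digits_only= no_apostrophe=;
run;
```

The log should show digits_only=9195550123 and no_apostrophe=DAngelo.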

Three statistics books for introductory SAS students

Recently, this question was asked of our SAS training instructors:

For a SAS Programming 1: Essentials student, what would be a good book recommendation for a Statistics book?

Here are the recommendations (in no particular order):

Step-by-Step Basic Statistics Using SAS: Student Guide and Exercises
By: Larry Hatcher

A Handbook of Statistical Analyses Using SAS
By: Geoff Der and Brian S. Everitt

SAS Statistics by Example
By: Ron Cody

Do you have a book recommendation?  Add to the list by leaving us a comment below.

 

 
