I've used SAS with a bunch of different REST APIs: GitHub, Brightcove, Google Analytics, Lithium, LinkedIn, and more. For most of these I have to send user/password or "secret" application tokens to the web service so that it knows who I am and what data I can retrieve. I do not want to keep this secret information in my SAS program files -- that would be a bad idea. If my credentials were part of the program -- even if they were obfuscated and not stored in clear text -- then anyone who managed to get a copy of my program could run it. And they could gain access to my data, as if they were me.
I've written about this topic for SAS-related passwords. In this article, I'll share the approach that I use for API credentials and tokens.
REST APIs: Each service requires different types of secrets
My REST API services don't require just simple user ID and password combos. It depends on the API, but usually the information is in the form of one or more tokens that I've generated using the vendor's developer console, or perhaps that have been granted by an administrator.
For example, to access the Google Analytics API, I need three things: a client ID, a client secret token, and a valid "refresh" token. I can send these three items to the Google OAuth2 API, and in return I'll receive a live "access" token that I can use to request my data. I think of this like checking into a hotel. I show my ID and a credit card at the front desk, and in exchange I receive a room key. Just like my hotel room key, the access token doesn't last forever and cannot be reused on my next visit.
Other APIs are simpler and require just a single token that never expires. That's more like a house key -- it's mine to use forever, or until someone decides to change the locks.
Whether a static token or a token-for-token exchange, I don't want to leave these keys lying around for just anyone to find and use.
Hide your tokens in a file that only you can read
My simple approach is to store the token values in a text file within my home directory. Then, I change the permissions on the file such that only my account can read it. Whether I submit my program interactively (in SAS Enterprise Guide or SAS Studio) or as a scheduled batch job, it's running under my account. I'm showing the instructions here for UNIX/Linux, but Windows users can accomplish something similar with Windows permissions.
On Linux, I've used the chmod command to specify the octal value that says "only the owner can read/write." That's "chmod 600 filename". The "ls -l" command shows that this permissions mask has been applied.
chmod 600 ./.google_creds.csv ls -l ./.google_creds.csv > -rw------- 1 myid mygroup 184 Jan 15 12:41 ./.google_creds.csv
I stored my tokens in a standard CSV format because it's easy for SAS to read and it's easy for me to read if I ever need to change it.
Use INFILE to read the tokens dynamically
With this critical data now stored externally, and the file permissions in place, I can use SAS to read the credentials/tokens within my program and store the values in SAS macro variables. In the following SAS program, I assigned a macro variable to my user root folder. Since I might run this program on Linux or Windows, I used this trick to determine the proper path notation. I also used the &SYSUSERID macro variable to make my program more portable. If I want to supply this program to any colleagues (or to you!), the only thing that's needed is to create and store the token CSV files in the proper location.
/* My path is different for UNIX vs Windows */ %let authpath = %sysfunc(ifc(&SYSSCP. = WIN, \\netshare\root\u\&sysuserid., /u/&sysuserid.)); /* This should be a file that only YOU or trusted group members can read */ /* Use "chmod 0600 filename" in UNIX environment */ /* "dotfile" notation is convention for on UNIX for "hidden" */ filename auth "&authpath./.google_creds.csv"; /* Read in the secret account keys from another file */ data _null_; infile auth firstobs=2 dsd delimiter=',' termstr=crlf; length client_id $ 100 client_secret $ 30 refresh_token $ 60; input client_id client_secret refresh_token; call symputx('client_id',client_id); call symputx('client_secret',client_secret); call symputx('refresh_token',refresh_token); run;
When I run this code in my production job, I can see the result:
NOTE: The infile AUTH is: Filename=/u/myid/.google_creds.csv, Owner Name=myid,Group Name=mygroup, Access Permission=rw-------, Last Modified=Mon Jan 15 12:41:58 2018, File Size (bytes)=184 NOTE: 1 record was read from the infile AUTH. The minimum record length was 145. The maximum record length was 145. NOTE: DATA statement used (Total process time): real time 0.01 seconds user cpu time 0.01 seconds
For this example, my next step is to call the Google API to get my access token. I'll use the macro variables that my program created with CALL SYMPUTX to build the proper API call.
/* Call Google API to exchange the refresh token for an active access token */ %let oauth2=https://www.googleapis.com/oauth2/v4/token; filename rtoken temp; proc http method="POST" url="&oauth2.?client_id=&client_id.%str(&)client_secret=&client_secret.%str(&)grant_type=refresh_token%str(&)refresh_token=&refresh_token." out=rtoken; run;
See the full explanation of this Google Analytics example in this article.