Using SAS to access Google Analytics APIs

38

Every day before I even wake up, I have little "SAS robots" that do work for me. These are SAS batch jobs that gather data from external services and build data marts, generate reports, and send e-mail. One of those SAS jobs gathers Google Analytics data about our SAS blogs at blogs.sas.com.

Read the complete SAS Global Forum paper: Using Base SAS to Access Google Analytics APIs

With SAS 9.4 Maintenance 4, it's finally easy (well, relatively speaking) to connect to complicated APIs like those supported by Google, and to gather information with an unattended batch process. It's made possible by recent enhancements in PROC HTTP and the new JSON library engine in SAS. The PROC HTTP enhancements make it easier to negotiate multi-step authentication schemes like OAuth2. And of course, the JSON engine makes it easier to parse JSON results into SAS data sets. If you scour the Internet and SAS conference papers, you might find examples that use PROC GROOVY or other tricks to call outside of SAS to drive these OAuth2 APIs and parse JSON. The recent enhancements make such tricks unnecessary, and thus provide a cleaner approach with fewer moving parts.

Joseph Henry, a senior software developer at SAS (and primary developer for PROC HTTP), wrote a useful SAS Global Forum paper about PROC HTTP and OAuth2. Joseph definitely understands this better than I do, so I recommend that you read his paper first. Joseph's example uses Google Drive. Come back to this post when you want to see the instructions I've hacked together -- it's what I've got working for Google Analytics.

How to use the Google APIs

There are four main steps to use the Google APIs. Two of these steps need to be performed just once, and don't need to be automated. The other two steps are necessary each time you access a Google API from a program, and they can be scripted. Here are the steps:

  1. Grant permission and obtain an authorization token for your "app" (a SAS program, in our case). This must be performed in a browser while logged into your Google account.
  2. Obtain an access token and refresh token. You can accomplish this with a SAS program that you run just once. You then save the refresh token (in a safe place!) for subsequent runs.
  3. Convert your saved refresh token into an access token. Do this at the start of every SAS job that needs to get Google Analytics data.
  4. And finally, use the Google Analytics API to get actual data!

There is a lot to learn about how Google APIs work and how to provision your Google account with access to resources like Google Analytics. That's more than I can go into here, so I'm going to assume a few things and skip ahead. I'm going to assume that you already have an API project created, and that you have a "client ID" and "client secret". The Google API documentation is very good on this topic, and there are many Internet resources to help you learn this part of the process.

Also, some Google APIs are free to use, while others cost money. And some are rate-limited -- that is, you can call the APIs only so many times within a given period. Your specific limits might differ depending on the type of accounts you have with Google. If you have a corporate relationship with Google, you might have an admin who has to provision your specific Google account to get access to the APIs. I'm glossing over all of these nuances -- this post is simply about the code.

Step 1. Grant permission and retrieve auth code

Do this step only once per account, per API permission. You perform this step in the browser while logged into your Google account. Enter the following URL, substituting your client-id as indicated. The URL needs to be all on one line, but I've broken it up here for readability.

 https://accounts.google.com/o/oauth2/v2/auth?
         scope=https://www.googleapis.com/auth/analytics.readonly
         &redirect_uri=urn:ietf:wg:oauth:2.0:oob
         &response_type=code
         &client_id=<your-client-id>.apps.googleusercontent.com 

You will be prompted to allow the "app" access to your Google Analytics data (read only). This is similar to allowing another app to access your Facebook or Twitter profile -- you've probably seen similar prompts during your own Internet citizenship. Permissions are an important component of allowing apps to act on your behalf via REST APIs.

Then, you'll be redirected to a web page with an auth code that you should copy and save. We'll use it in the next step. Setting the redirect_uri properly is very important: redirect_uri=urn:ietf:wg:oauth:2.0:oob. Otherwise the API won't generate a code that you can use in a tool-based app like SAS.

Step 2. Exchange the auth code for an access token

Next, run this PROC HTTP step in SAS with the POST method to exchange that auth code from Step 1 for an access token. It will return a JSON response with a valid Bearer access token. That token expires in 3600 seconds (1 hour). It also returns a refresh_token, which you can exchange again for a new access token after the first one expires. The refresh_token never expires (though it can be revoked via the developer console or API). Thus, you usually need to perform this step just once, unless your token is revoked for some reason.

/* file to store your result */
filename token "c:\temp\token.json";
%let code_given =<code-returned-from-step-1> ;
%let oauth2=https://www.googleapis.com/oauth2/v4/token;
%let client_id=<your-client-id>.apps.googleusercontent.com;
%let client_secret=<your-client-secret>;
proc http
/* put this all on one line! */
 url="&oauth2.?client_id=&client_id.%str(&)code=&code_given.
      %str(&)client_secret=&client_secret.%str(&)redirect_uri=urn:ietf:wg:oauth:2.0:oob
      %str(&)grant_type=authorization_code%str(&)response_type=token"
 method="POST"
 out=token
;
run;

Note: the refresh_token and client-id/secret values should be protected! Anyone who has access to these can get to your Google API data as if they were you. Consider storing them in a file that only you have read access to, and programmatically pull them in when running your SAS program under your host account. (Read this article to learn how I implement this technique.)

Step 3. Exchange the refresh_token for a valid access token

Typically, you'll include this step just once at the beginning of your SAS job. This takes your saved refresh_token value and asks Google to grant an access token for use in the rest of your program. The Google APIs run very fast -- you should not need to renew the token again within the same job.

/* STEP 3. Do this every time you want to use the GA API */
/* Turn in a refresh-token for a valid access-token      */
/* Should be good for 60 minutes                         */
/* So typically run once at beginning of the job.        */
%let oauth2=https://www.googleapis.com/oauth2/v4/token;
%let client_id=<your-client-id>.apps.googleusercontent.com;
%let client_secret=<your-client-secret>;
%let refresh_token=<refresh-token-from-step-2>;
 
filename rtoken temp;
proc http
 method="POST"
 /* Again, put this all on one line */
 /* broken here for readability */
 url="&oauth2.?client_id=&client_id.
     %str(&)client_secret=&client_secret.
     %str(&)grant_type=refresh_token%str(&)refresh_token=&refresh_token."
 out=rtoken;
run;
 
/* Read the access token out of the refresh response  */
/* Relies on the JSON libname engine (9.4m4 or later) */
libname rtok json fileref=rtoken;
data _null_;
 set rtok.root;
 call symputx('access_token',access_token);
run;

Step 4. Use the Google Analytics API to gather data

Finally, we're to the point where we can retrieve data from this service! And this is where those new features in PROC HTTP come into play. In SAS 9.4m3, Joseph added support for an inline HEADERS statement, which is perfect for injecting OAuth2 information like "Bearer" token into the HTTP header.

proc http
  url="<REST-api-url-call>"
  method="GET" out=ga_resp;
  /* Headers statement makes this easy */
  headers 
    "Authorization"="Bearer &access_token."
    "client-id:"="&client_id.";
 run;

I've noticed that a lot of REST API docs use cUrl (command-line URL) as in their examples. In my experience, PROC HTTP can do just about anything that cUrl can do, and these new options make it even easier to make cUrl-based examples work in SAS.

The Google Analytics API is complex on its own, so I suggest that you spend some time in the Google API Explorer to see what types of requests yield what sort of results. Google Analytics allows you to fetch metrics across different dimensions and dates, so the results can come back summarized across the dimensions that you request. Since we're bringing this data into SAS and I probably want to further summarize in my reports, I try to get as much "unsummarized" data as possible, and I avoid aggregations such as "Average views" or "Average time on page" -- as I don't want to risk calculating new summaries based on these.

I've included my program for retrieving the page views, total time on page, entrances and exits at the daily level. This is granular enough for me to create meaningful reports all of our blog authors. The program is a SAS macro that issues one API call per day in the range specified.

/* Metrics and dimensions are defined in the Google Analytics doc */
/* Experiment in the developer console for the right mix          */
/* Your scenario might be different and would require a different */
/* type of query                                                  */
/* The GA API will "number" the return elements as                */
/* element1, element2, element3 and so on                         */
/* In my example, path and title will be 1 and 2 */
%let dimensions=  %sysfunc(urlencode(%str(ga:pagePath,ga:pageTitle)));
/* then pageviews, uniquepageviews, timeonpage will be 3, 4, 5, etc. */
%let metrics=     %sysfunc(urlencode(%str(ga:pageviews,ga:uniquePageviews,ga:timeOnPage,ga:entrances,ga:exits)));
/* this ID is the "View ID" for the GA data you want to access   */
%let id=          %sysfunc(urlencode(%str(ga:<your-view-ID>)));
 
%macro getGAdata;
%do workdate = &enddate %to &startdate %by -1;
	%let urldate=%sysfunc(putn(&workdate.,yymmdd10.));
	filename ga_resp temp;
	proc http
         /* again, put this url= value ALL on a single line! */
	 url="https://www.googleapis.com/analytics/v3/data/ga
            ?ids=&id.%str(&)start-date=&urldate.
            %str(&)end-date=&urldate.
            %str(&)metrics=&metrics.
            %str(&)dimensions=&dimensions.%str(&)max-results=20000"
	 method="GET" out=ga_resp;
	 headers 
	   "Authorization"="Bearer &access_token."
	   "client-id:"="&client_id.";
	run;
 
        /* Love this JSON libname engine! */
	libname garesp json fileref=ga_resp;
 
	data ga.ga_daily%sysfunc(compress(&urldate.,'-')) (drop=element:);
		set garesp.rows;
		drop ordinal_root ordinal_rows;
		length date 8 url $ 300 title $ 250 
	          views 8 unique_views 8 time_on_page 8 entrances 8 exits 8
	          ;
		format date yymmdd10.;
		date=&workdate.;
		/* Corerce the elements into data variables */
		/* Basic on expected sequence               */
		url = element1;
		title = element2;
		views = input(element3, 5.);
		unique_views = input(element4, 6.);
		time_on_page=input(element5, 7.2);
		entrances = input(element6, 6.);
		exits = input(element7, 6.);
	run;
%end;
%mend;
 
%getGAdata;
 
/* Assemble the daily files into one data set */
data alldays_gadata;
  set ga.ga_daily:;
run;

When it's all finally working, SAS pulls in data that looks like this:
Google Analytics data via API

Full code: I've placed all of the code in a single public gist on GitHub. To use it, you will need to create your own API project with Google Analytics, then modify the code to add your own client-id, client-secret, and other values retrieved from the API.

Re-entry and disclaimers

This has been a pretty deep post about Google APIs and how to make them work from SAS programs. If you have experience with the Google APIs already, then I'm optimistic that this article is enough for you to get things working in SAS. If you've never used OAuth2 or any of the Google APIs, then this post might not be quite enough. Don't forget to read Joseph's paper about OAuth2 and PROC HTTP. You might also like his paper about using RESTful APIs.

Please leave a comment if you have questions or if you think that I left something out. I'll do my best to correct and clarify.

Share

About Author

Chris Hemedinger

Senior Manager, SAS Online Communities

+Chris Hemedinger is the manager of SAS Online Communities. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies.  He also hosts the SAS Tech Talk webcasts each year from SAS Global Forum, connecting viewers with smart people from SAS R&D and the impressive work that they do.

38 Comments

  1. I'm surprised to see in that final step you get variables named element1-element7, and have to coerce them into appropriate names and types. Is that a limitation of this particular API (that it returns JSON where everything is character and fields are not named?) or is the JSON libname engine not automatically reading these attributes? Is there an analog of XMLMAP that could be passed on the JSON engine libname statement, to define attributes? Thx!

    • Chris Hemedinger
      Chris Hemedinger on

      Quentin, yes, there is a MAP= option on the JSON engine. I show an example in another post. The Google Analytics API does return metadata about the fields it returns -- in another part of the response. Since I inspected the results and know the sequence, I didn't bother with adding code that works it out dynamically. The JSON engine is pretty good at working out char vs. numbers, but fields like dates need INFORMAT treatment. You can handle this in a JSON map or else covert the values in code as I've done here.

  2. Chris, when we attempt to run this solution we encounter a couple of SSL certificate errors, specifically "error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed," on the first two steps. Is there anything we can do? We're on an AIX Server with SAS 9.4.3 and Enterprise Guide 7.13. Thanks!

  3. Hi!

    I've tried to follow this tutorial but I get stuck on step 2 with this error:

    ERROR: No logical assign for filename .

    I've tried a bunch of things but I have no idea what the problem is, has anyone gotten the same error?

    • Chris Hemedinger
      Chris Hemedinger on

      You might need to change that "FILENAME token" path assignment to fit the paths on your machine. "C:\temp" is a valid path on my machine, but you might need to select a different folder to target. If on UNIX, you'll need a path like '/u/yourid/folder'.

      • Thanks for the answer! I use a different path which works fine, SAS doesn't seem to have a clue what the problem is.

        Since I wrote the question I've dug a little deeper and the error seems to be in my url string. When I paste it into the browser (after changing the macro variables to the real values, of course) I get "error 400 bad request".

        When I paste "https://www.googleapis.com/" into the browser I just get a message "not found", I have no idea why.

        • Chris Hemedinger
          Chris Hemedinger on

          That URL (googleapis) is not meant to be a browser destination -- it's for use by API calls and needs to be fully qualified with all of your client ID and secret and token. Are you able to use the Google developer console to use the API interactively (outside of SAS)? See if you can get that working first, then try to transcribe those settings to the SAS program.

          • Ok! I haven't upgraded SAS to Maintenance 4 yet, that may be the issue. Thank you for the help!

          • Chris Hemedinger
            Chris Hemedinger on

            Definitely need M4 for the JSON support. But PROC HTTP part *should* work even in 9.4m3.

          • Hi again! I've now updated to M4 and it all works fine! That was the issue, thanks!

  4. Dear Mr. Hemedinger,

    I am working in a SAS environment with SAS 9.3. on OS: IBM-AIX, and EG 7.1.3. As far as I understand you need to have the SAS version 9.4.M4 in order to be able to follow the steps outlined above. Do you have any idea or "how-to-guide" for being able to establish a connection to Google Analytics when working on the afore mentioned environment?

    Thanks in advance for any idea / suggestion.

    FK

    • Chris Hemedinger
      Chris Hemedinger on

      FK,

      Much of the PROC HTTP steps should work. However, you'll have to use HEADERSIN and point to a SAS fileref instead of using the HEADERS statement in order to pass some info to the REST API.

      However, the JSON parsing will be trickier. Prior to SAS 9.4 many people used simple DATA step to pull the information they needed from a JSON response. That will require trial and error on your part. However, you should be able to use the Google Developer Console to generate some representative JSON content that you can practice on.

  5. Hi Chris,

    Thanks for your knowledge sharing!

    When I ran step 2, SAS prompted an error "NOTE: The quoted string currently being processed has become more than 262 bytes long. You might have unbalanced quotation
    marks". Any ideas?

    This is the code I ran (I put all on one line)

    url="&oauth2.?client_id=&client_id.%str(&)code=&code_given.%str(&)client_secret=&client_secret.%str(&)redirect_uri=urn:ietf:wg:oauth:2.0:oob%str(&)grant_type=authorization_code%str(&)response_type=token"
    method="POST"
    out=token;

    • Chris Hemedinger
      Chris Hemedinger on

      I agree with you about REST APIs being popular, but for SAS users who need to access SQL Server or other RDBMSs, they aren't the most efficient. SAS/ACCESS engines are available for just about every database, including cloud-based systems. These engines allow SAS users to access data via SAS libraries and data members. For SAS users, REST APIs are most important for services that provide data, but no direct database layer except via HTTP.

  6. Hi Chris!
    Thanks for your post!

    I've tried to follow this tutorial but on step 2, SAS prompted this :
    ERROR: Error connecting to xxx.xx.xxx.xxx:xxx. (The connection has timed out.)
    ERROR: Unable to establish connection to http://www.googleapis.com.
    ERROR: Unable to connect to Web server.

    Any ideas?

    Thx!

    Aude

    • Chris Hemedinger
      Chris Hemedinger on

      Aude, are you able to use PROC HTTP at all to get to the web? Try this simple program with the free httpbin.org service:

      filename resp TEMP;
      
      proc http 
         url="http://httpbin.org/get" 
         out=resp;
      run;
      

      If not, it's possible that you need to specify a proxy host/port as part of the PROC HTTP -- in the case that your company uses a firewall/internet proxy to protect web traffic in your organization.

      • Thx Chris, it's ok when I specify proxy host/port in proc http option.
        However, there's a new error "411 (Length Required)". Any ideas?

      • Nicolas Royer on

        Hello Chris, This simple program doesn't work even if I specify the proxy options (ERROR: No logical assign for filename .).
        However, I can get the data using a simple url filename (see below).
        I'm using SAS9.4 TS1M2 . Do I have any ideas to explain the issue ?

        filename resp TEMP;

        proc http
        url="http://httpbin.org/get"
        out=resp
        proxyhost="10.3.6.18"
        PROXYPORT=9191
        PROXYUSERNAME="psrvsyssas"
        PROXYPASSWORD="MyPwd"
        ;
        run;
        /* => ERROR: No logical assign for filename . */

        filename isg url 'http://httpbin.org/get' proxy='http://10.3.6.18:9191/' puser='psrvsyssas' ppass='MyPwd';
        data reponse;
        infile isg MISSOVER DSD lrecl=32767;
        input a;
        run;
        /* It's OK */

        • Chris Hemedinger
          Chris Hemedinger on

          The only thing that's odd is the error you see: "No logical assignment for filename". The fileref RESP is assigned above so I wouldn't expect that error in this case, even if the proxy didn't work. You should get something like

          ERROR: Error connecting to 10.3.6.18:9191. (The connection has timed out.)
          ERROR: Unable to establish connection to httpbin.org via proxy 10.3.6.18.
          ERROR: Unable to connect to Web server.
          

          You might need to work with SAS Tech Support to see what's going on.

  7. Hi Chris,

    Thanks for the detailed post, it has been very useful.
    I'm getting stuck on the last step, This is what I'm getting:

    ERROR: File GARESP.ROWS.DATA does not exist.

    GARESP was successfully assigned, and it has some other data sets there, just not rows.

    Do you know what's going on?

    • Chris Hemedinger
      Chris Hemedinger on

      The table structure from the JSON library depends on the JSON response. Your Google Analytics query might yield a different format of response from mine. You'll have to explore the contents of the library and see what makes sense for your data processing.

      Also, if there is an error in the response or no rows of data, tables could be missing. If you expect a certain table in the response, it's a good idea to check for it with an EXIST function first, and then handle the case of missing data more gracefully.

  8. Hi,
    I am stuck! My fault I am sure.
    1) I went to api consolo and creat a project (eg. ABC_PROJECT)
    2) to this project I add the api CustomSearch
    3) I generated a key, lets call it ABC_LONG_KEY

    In sas: I wrote
    options noquotelenmax;
    filename token "c:\temp\token.json";
    %let code_given =ABC_LONG_KEY;
    %let oauth2=https://www.googleapis.com/oauth2/v4/token;
    %let client_id=ABC_PROJECT.apps.googleusercontent.com;
    %let client_secret=;

    I cannot find info for client_secret.

    When I run I get "NOTE: 401 Unauthorized"

    Any suggestion?

    My goal is to run a google query and get the results in a dataset

    Thanks a million

  9. This is amazing thanks so much

    Any chance you could provide details on how to access data from the google search console?

  10. Hi,

    First, thanks for this very useful and interesting article .
    I used your method and code instructions to connect SAS to the GA API. The problem is that I am stuck in the last step.
    I produced a JSON file before launching the data step in order to have visibility on what I'm getting back.
    I have this message :

    {
    "error": {
    "errors": [
    {
    "domain": "usageLimits",
    "reason": "accessNotConfigured",
    "message": "Access Not Configured. Google Analytics API has not been used in project ****** before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/analytics.googleapis.com/overview?project=****** then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.",
    "extendedHelp": "https://console.developers.google.com/apis/api/analytics.googleapis.com/overview?project=****** "
    }
    ],
    "code": 403,
    "message": "Access Not Configured. Google Analytics API has not been used in project ****** before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/analytics.googleapis.com/overview?project=****** then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry."
    }
    }

    I don't understand because i think i have set good parameters for my account. Do you have an idea about this problem ?

    Thanks in advance for your help.

    Best regards,

    • Chris Hemedinger
      Chris Hemedinger on

      From the messages, I'd guess that the owner of this Google Analytics project has not enabled the API use. Do your API tests work from the API console/explorer? I recommend testing everything in the console first, before trying to use from another client (like SAS).

Leave A Reply

Back to Top