Reading data with the SAS JSON libname engine

16

JSON is the new XML. The number of SAS users who need to access JSON data has skyrocketed, thanks mainly to the proliferation of REST-based APIs and web services. Because JSON is structured data in text format, we've been able to offer simple parsing techniques that use DATA step and most recently PROC DS2. But finally*, with SAS 9.4 Maintenance 4, we have a built-in LIBNAME engine for JSON.

Simple JSON example: Who is in space right now?

Speaking of skyrocketing, I discovered a cool web service that reports who is in space right now (at least on the International Space Station). It's actually a perfect example of a REST API, because it does just that one thing and it's easily integrated into any process, including SAS. It returns a simple stream of data that can be easily mapped into a tabular structure. Here's my example code and results, which I produced with SAS 9.4 Maintenance 4.

filename resp temp;
 
/* Neat service from Open Notify project */
proc http 
 url="http://api.open-notify.org/astros.json"
 method= "GET"
 out=resp;
run;
 
/* Assign a JSON library to the HTTP response */
libname space JSON fileref=resp;
 
/* Print result, dropping automatic ordinal metadata */
title "Who is in space right now? (as of &sysdate)";
proc print data=space.people (drop=ordinal:);
run;

JSON who is in space
But what if your JSON data isn't so simple? JSON can represent information in nested structures that can be many layers deep. These cases require some additional mapping to transform the JSON representation to a rectangular data table that we can use for reporting and analytics.

JSON map example: Most recent topics from SAS Support Communities

In a previous post I shared a PROC DS2 program that uses the DS2 JSON package to call and parse our SAS Support Communities API. The parsing process is robust, but it requires quite a bit of fore knowledge about the structure and fields within the JSON payload. It also requires many lines of code to extract each field that I want.

Here's a revised pass that uses the JSON engine:

/* split URL for readability */
%let url1=http://communities.sas.com/kntur85557/restapi/vc/categories/id/bi/topics/recent;
%let url2=?restapi.response_format=json%str(&)restapi.response_style=-types,-null,view;
%let url3=%str(&)page_size=100;
%let fullurl=&url1.&url2.&url3;
 
filename topics temp;
 
proc http
 url= "&fullurl."
 method="GET"
 out=topics;
run;
 
/* Let the JSON engine do its thing */
libname posts JSON fileref=topics;
title "Automap of JSON data";
 
/* examine resulting tables/structure */
proc datasets lib=posts; quit;
proc print data=posts.alldata(obs=20); run;

Thanks to the many layers of data in the JSON response, here are the tables that SAS creates automatically.

json Auto tables
There are 12 tables that contain various components of the message data that I want, plus the ALLDATA member that contains everything in one linear table. ALLDATA is good for examining structure, but not for analysis. You can see that it's basically name-value pairs with no data types/formats assigned.

json ALLDATA
I could use DATA steps or PROC SQL to merge the various tables into a single denormalized table for my reporting purposes, but there is a better way: define and apply a JSON map for the libname engine to use.

To get started, I need to rerun my JSON libname assignment with the AUTOMAP option. This creates an external file with the JSON-formatted mapping that SAS generates automatically. In my example here, the file lands in the WORK directory with the name "top.map".

filename jmap "%sysfunc(GETOPTION(WORK))/top.map";
 
proc http
 url= "&fullurl."
 method="GET"
 out=topics;
run;
 
libname posts JSON fileref=topics map=jmap automap=create;

This generated map is quite long -- over 400 lines of JSON metadata. Here's a snippet of the file that describes a few fields in just one of the generated tables.

"DSNAME": "messages_message",
"TABLEPATH": "/root/response/messages/message",
"VARIABLES": [
{
  "NAME": "ordinal_messages",
  "TYPE": "ORDINAL",
  "PATH": "/root/response/messages"
},
{
  "NAME": "ordinal_message",
  "TYPE": "ORDINAL",
  "PATH": "/root/response/messages/message"
},
{
  "NAME": "href",
  "TYPE": "CHARACTER",
  "PATH": "/root/response/messages/message/href",
  "CURRENT_LENGTH": 19
},
{
  "NAME": "view_href",
  "TYPE": "CHARACTER",
  "PATH": "/root/response/messages/message/view_href",
  "CURRENT_LENGTH": 134
},

By using this map as a starting point, I can create a new map file -- one that is simpler, much smaller, and defines just the fields that I want. I can reference each field by its "path" in the JSON nested structure, and I can also specify the types and formats that I want in the final data.

In my new map, I eliminated many of the tables and fields and ended up with a file that was just about 60 lines long. I also applied sensible variable names, and I even specified SAS formats and informats to transform some columns during the import process. For example, instead of reading the message "datetime" field as a character string, I coerced the value into a numeric variable with a DATETIME format:

{
  "NAME": "datetime",
   "TYPE": "NUMERIC",
  "INFORMAT": [ "IS8601DT", 19, 0 ],
  "FORMAT": ["DATETIME", 20],
  "PATH": "/root/response/messages/message/post_time/_",
  "CURRENT_LENGTH": 8
},

I called my new map file 'minmap.map' and then re-issued the libname without the AUTOMAP option:

filename minmap 'c:\temp\minmap.map';
 
proc http
 url= "&fullurl."
 method="GET"
 out=topics;
run;
 
libname posts json fileref=topics map=minmap;
proc datasets lib=posts; quit;
 
data messages;
 set posts.messages;
run;

Here's a snapshot of the single data set as a result.

JSON final data
I think you'll agree that this result is much more usable than what my first pass produced. And the amount of code is much smaller and easier to maintain than any previous SAS-based process for reading JSON.

Here's the complete program in a public GitHub gist, including my custom JSON map.


* By the way, the JSON libname engine actually made its debut as part of SAS Visual Data Mining and Machine Learning, part of SAS Viya. This is a good example of how work on SAS Viya continues to benefit the users of the SAS 9.4 architecture.

Share

About Author

Chris Hemedinger

Senior Manager, SAS Online Communities

+Chris Hemedinger is the manager of SAS Online Communities. He’s also co-author of the popular SAS for Dummies book, author of Custom Tasks for SAS Enterprise Guide using Microsoft .NET, and a frequent participant on the SAS Enterprise Guide discussion forum.

Related Posts

16 Comments

  1. Pingback: Using SAS DS2 to parse JSON - The SAS Dummy

  2. Really useful post, Chris. Thank you.

    I’ve been using a lot of web services that return JSON over the last couple of years. I had used Groovy then moved to Python.

    Your post prompted me to update to SAS9.4 M4 and try the new JSON engine.

    To test the engine, I adapted your ‘space’ example to query the Google Distance Matrix API. I had previously written this in Python and SAS. Now it’s a few lines of SAS code. Job done.

    This makes enriching data with external web services a lot simpler. There's lots of potential to be creative here.

    Thanks again.

    • Chris Hemedinger
      Chris Hemedinger on

      So glad it helped, Neil! With a robust PROC HTTP and now the JSON libname engine, I think we'll see lots of cases where we can reduce the context switching of processes. Fewer moving parts == fewer points of failure!

  3. Peter Johansson on

    Come on...!
    I have been on a mission trying to get people to understand how great PROC Lua is and my big finale has been my solution for parsing JSON with Lua.

    Chris, do you have any insights on change in Lua support in 9.4M4?

    • Chris Hemedinger
      Chris Hemedinger on

      Sorry to steal your thunder here, Peter. PROC LUA is of course still supported and I see some minor fixes went in, but no big changes that I'm aware of.

  4. Peter Johansson on

    An alternative for those of us struggling without the M4-release if I may?
    You can download and parse JSON using Powershell.
    Use the FILENAME PIPE with a powershell-command of Invoke-WebRequest/ConvertFrom-JSON. Then its easy to put things into place using PRX-commands.

  5. Pingback: Reporting on GitHub accounts with SAS - The SAS Dummy

  6. I have a json file that has been handed to me. I am trying to read it using the libname engine

    libname bmk JSON '/json/test.json';

    I have also tried
    filename resp '/json/test.json';

    /* Assign a JSON library to the HTTP response */
    libname bmk JSON fileref=resp;

    but get the same error either way.

    NOTE: JSON data is only read once. To read the JSON again, reassign the JSON LIBNAME.
    BMK Ø]Î
    ERROR: Error in the LIBNAME statement.

    what am I missing?

    We are running 9.4.4.0

    • Chris Hemedinger
      Chris Hemedinger on

      Is the JSON file very large, or does it contain very long records (lines)? You might need to specify a large LRECL value:

      filename resp '/json/test.json' LRECL=1000000;

      Another possibility is that the file contains special characters that cannot be processed in your current session encoding. You might need to try running SAS with ENCODING=UTF8, or at least try:

      filename resp '/json/test.json' LRECL=1000000 encoding='utf-8';

      • I am able to get this working with a simple json. But when I use the one I was given I still get the issue. I noticed that when I do a proc print and you have the 'p' variables, it works to p9, but when I go to p10 that is when it dies.

        Any thoughts on that?

        • Chris Hemedinger
          Chris Hemedinger on

          I found that bringing the data into a SAS data set (using DATA step and a SET statement) is the best approach, before trying to run any PROCs on it. Use the automap and map= options to see the structure of the data first, and perhaps shape the data you get coming in.

          If you continue having issues, you might need to work with SAS Tech Support and supply an example of the JSON that's failing. I'm happy to take a look too if you send it to me (chris.hemedinger@sas.com).

  7. Pingback: Using SAS to access Google Analytics APIs - The SAS Dummy

  8. Hi, I'm trying to run this code in EG 7.15, but getting the error below. What am I missing?

    filename resp temp;

    /* Neat service from Open Notify project */
    proc http
    url="http://api.open-notify.org/astros.json"
    method= "GET"
    out=resp;
    run;

    /* Assign a JSON library to the HTTP response */
    libname space JSON fileref=resp;

    /* Print result, dropping automatic ordinal metadata */
    title "Who is in space right now? (as of &sysdate)";
    proc print data=space.people (drop=ordinal:);
    run;

    ERROR:

    ERROR: The JSON engine cannot be found.
    ERROR: Error in the LIBNAME statement.
    56
    57 /* Print result, dropping automatic ordinal metadata */
    58 title "Who is in space right now? (as of &sysdate)";
    59 proc print data=space.people (drop=ordinal:);
    ERROR: Libref SPACE is not assigned.

    • Chris Hemedinger
      Chris Hemedinger on

      I'd guess your SAS session is not running SAS 9.4 Maint 4 or later. Run "proc product_status; run;" to verify the level of SAS that you have -- the answer will be in the log.

Leave A Reply

Back to Top