Using built-in Git operations in SAS

It seems that everyone knows about GitHub -- the service that hosts many popular open source code projects. The underpinnings of GitHub are based on Git, which is itself an open-source implementation of a source management system. Git was originally built to help developers collaborate on Linux (yet another famous open source project) -- but now we all use it for all types of projects.

There are other free and for-pay services that use Git, like Bitbucket and GitLab. And there are countless products that embed Git for its versioning and collaboration features. In 2014, SAS developers added built-in Git support for SAS Enterprise Guide.

Since then, Git (and GitHub) have grown to play an even larger role in data science operations and DevOps in general. Automation is a key component for production work -- including check-in, check-out, commit, and rollback. In response, SAS has added Git integration to more SAS products, including:

  • the Base SAS programming language, via a collection of SAS functions
  • SAS Data Integration Studio, via a new source control plugin
  • SAS Studio (v3.8 for SAS 9.4, and SAS Viya 3.5 and later)

I've recorded a tutorial (12 minutes or so) that you can watch to learn how to get started quickly!

You can use this Git integration with any service that supports Git (GitHub, GitLab, etc.), or with your own private Git servers and even just local Git repositories.

Watch related webinar: Using SAS® With Git: Bring a DevOps Mindset to Your SAS® Code

SAS functions for Git

Git infrastructure and functions were added to SAS 9.4 Maintenance 6. The new SAS functions all have the helpful prefix of "GITFN_" (signifying "Git fun!", I assume). Here's a partial list:

GITFN_CLONE  Clones a Git repository (for example, from GitHub) into a directory on the SAS server.
GITFN_COMMIT  Commits staged files to the local repository.
GITFN_DIFF  Returns the number of diffs between two commits in the local repository and creates a diff record object for the local repository.
GITFN_PUSH  Pushes the committed files in the local repository to the remote repository.
GITFN_NEW_BRANCH  Creates a Git branch.

The function names make sense if you're familiar with Git lingo. If you're new to Git, you'll need to learn the terms that go with the commands: clone, repo, commit, stage, blame, and more. This handbook provided by GitHub is friendly and easy to read. (Or you can start with this xkcd comic.)

You can learn about the SAS functions from the SAS documentation -- including important details about how to connect SAS to Git.

Here's an example program that clones (that is, copies into a local space) a repository that contains code samples from my blog:

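data _null_;
 /* report the version of the Git library (libgit2) that SAS loaded */
 version = gitfn_version();
 put version=;

 /* clone the remote repository into a local folder; rc=0 indicates success */
 rc = gitfn_clone("https://github.com/sascommunities/sas-dummy-blog/",
   "c:\Projects\sas-dummy-blog");
 put rc=;
run;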

In one line, this function fetches an entire collection of code files from your source control system. Here's a more concrete example that fetches the code to a work space, then runs a program from that repository. (This is safe for you to try -- here's the code that will be pulled/run. It even works from SAS University Edition.)

/* DLCREATEDIR lets the LIBNAME statement create the folder if it doesn't exist */
options dlcreatedir;
%let repoPath = %sysfunc(getoption(WORK))/sas-dummy-blog;
libname repo "&repoPath.";
libname repo clear;
 
/* Fetch latest code from GitHub */
data _null_;
 rc = gitfn_clone("https://github.com/sascommunities/sas-dummy-blog/",
   "&repoPath.");
 put rc=;
run;
 
/* run the code in this session */
%include "&repoPath./rng_example_thanos.sas";

You could use the other GITFN functions to stage and commit the output from your SAS jobs, including log files, data sets, ODS results -- whatever you need to keep and version.
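As a rough sketch of that idea, the step below stages and commits whatever has changed in the local repository. It relies on GITFN_STATUS, GITFN_STATUS_GET, and GITFN_IDX_ADD, which aren't in the partial list above. The argument order shown is an assumption based on the SAS 9.4M6 documentation, so check it against your release. The author name, email address, and commit message are placeholders, and the step assumes that your job has already written its output files under &repoPath.

```sas
/* Sketch only: stage and commit everything that Git reports as changed. */
/* Assumes &repoPath. points to a local clone, as in the example above.  */
data _null_;
 length path $ 1024 status $ 32;

 /* number of new, modified, or deleted files in the working folder */
 n = gitfn_status("&repoPath.");
 put n=;

 do i = 1 to n;
  /* look up the path and status of the i-th changed file */
  rc = gitfn_status_get(i, "&repoPath.", "PATH", path);
  rc = gitfn_status_get(i, "&repoPath.", "STATUS", status);

  /* stage the file, passing along the status that Git reported */
  rc = gitfn_idx_add("&repoPath.", strip(path), strip(status));
  put path= status= rc=;
 end;

 /* commit only when something was staged */
 if n > 0 then do;
  rc = gitfn_commit("&repoPath.", "HEAD",
     "Your Name", "you@example.com",   /* placeholder author details */
     "Add output from SAS job");       /* placeholder commit message */
  put rc=;
 end;
run;
```

A GITFN_PUSH call would then send the commit to the remote repository. That call needs credentials for your Git service, so it's left out of this sketch.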

Using Git in SAS Data Integration Studio

SAS Data Integration Studio has supported source control integration for many years, but only for CVS and Subversion (still in wide use, but they aren't media darlings like GitHub). By popular request, the latest version of SAS Data Integration Studio adds support for a Git plug-in.

Example of Git in SAS DI Studio

See the documentation for details: How to use the Git plug-in for SAS Data Integration Studio. Or, see this very detailed SAS communities article, with a tutorial video included!

Using Git in SAS Studio

Beginning with SAS Studio 3.8, you can manage your SAS programs in a Git repository. This integration requires a bit of setup to allow SAS Studio to connect to your repository "as you" using the standard mechanism of SSH public/private keys. Once configured, you can add repositories to your SAS Studio session, fetch the latest versions of files, stage new files, commit files, and see history. You'll see the Git content set apart with a special icon, indicating that it's managed in Git.

Read more about setup and use in the SAS Studio documentation

Add SAS Studio custom tasks from Git

Did you know that you can add custom tasks to SAS Studio? And that you can share these tasks in a central location using Git? This feature has been available for several releases. You can configure this in the Task Repositories pane of the Preferences window.

You can try this with a collection of SAS-supplied custom tasks, available here as part of our "Custom Tasks Tuesday" series.

Using Git in SAS Enterprise Guide

This isn't new, but I'll include it for completeness. SAS Enterprise Guide has built-in Git repository support for SAS programs that are stored in your project file. You can use this feature without having to set up any external Git servers or repositories. Also, SAS Enterprise Guide can recognize when you reference programs that are managed in an external Git repository. This integration enables features like program history, compare differences, commit, and more. Read more and see a demo of this in action here.

program history

If you use SAS Enterprise Guide to edit and run SAS programs that are managed in an external Git repository, here's an important tip. Change your project file properties to "Use paths relative to the project for programs and importable files." You'll find this checkbox in File->Project Properties.

With this enabled, you can store the project file (EGP) and any SAS programs together in Git, organized into subfolders if you want. As long as these are cloned into a similar structure on any system you use, the file paths will resolve automatically.

SAS Enterprise Guide v8.2 includes even more Git integration, with support for cloning repositories, pull, push, and managing branches.

About Author

Chris Hemedinger

Director, SAS User Engagement

Chris Hemedinger is the Director of SAS User Engagement, which includes our SAS Communities and SAS User Groups. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies.

37 Comments

  1. Thanks for this great post! I tried running the cloning code you supplied but got this error:

    ERROR: Unable to load libgit2 module.

    I googled that error to no avail; have you ever seen it?

    • Chris Hemedinger on

      I'm checking into that -- looks like SAS can't find the Git library that it needs to interface with Git. You didn't mention your OS and SAS version. Assuming SAS 9.4m6 -- but Windows or Linux or what? And using Base SAS, EG, or SAS Studio?

      • Oops, sorry for omitting that info. Looks like 9.4 m5.

        Current version: 9.04.01M5P091317
        Operating System: LIN X64 .

  2. Danny Zimmerman on

    Jed,

    If you're running on Linux, there was an issue with loading the git libraries that was addressed with a 9.4m6 hotfix.

    The other possibility is that your version of SAS is 9.4m5 which has the first iteration of the functions but the git libraries were not being shipped with this version of SAS. The functions in 9.4m5 are not considered production and were not documented for this reason.

  3. Hi Chris,
    Very interesting post. I have a question about SAS Enterprise Guide.
    I have a project with 3 process flows and an ordered list to execute them: process flow1 -> process flow2 -> process flow3
    Is there any option to stop the execution if one of the process flows gives an error?
    I would like to stop execution if process flow1 gives errors, and not execute process flow2 and process flow3

    Thanks in advance

    • Chris Hemedinger on

      The only way that I can think of would be to add a new node, probably a Program node, to the start of each flow. Then add a Condition to that node to check for an error or perhaps a macro variable flag that you define.

  4. Patrick O'Neill on

    Chris and SAS,

    Is there a way for me to load the Git functions and try them out even before my very large organization upgrades us to 9.4M6?

    Saw you in person in Dallas- thanks for all your good work.

    • Chris Hemedinger on

      You could try them in SAS University Edition -- free to download and install for noncommercial purposes. Use that as a POC and to help justify the upgrade!

  5. Pingback: Gifts to give the SAS fan in your life - SAS Users

  6. Fredrik Hansson on

    Hi.

    Git integration in EG 8 looks very promising!

    My organization only permits access to our own Git installation through ssh. That prevents me from using the built-in Git integration in EG (http(s) only).
    Are there plans to let EG users connect through ssh anytime soon?

    Best regards

    • Chris Hemedinger on

      Fredrik, with the latest changes in SAS Enterprise Guide 8.2, you can't Clone a repository with ssh, only http. But if you clone using another tool, you can use the Git Repositories->Add feature to point EG to your local Git repo and work with it that way.

  7. Fredrik Hansson on

    Yes. That's very nice! If only I could push/pull (to ssh) it would be perfect! :-)

    And another thing. If I could see the whole git-working tree (not only changed files), I could browse and open all my programs from one place.

    • Chris Hemedinger on

      Push/pull is now supported in the latest EG and SAS Studio clients. EG supports cloning with HTTPS only, and SAS Studio supports SSH and HTTPS. But you can connect EG to a local repository you've already cloned another way.

      SAS Studio lets you add the repo as a Folder Shortcut and then you can navigate your content. EG uses the standard File->Open approach to getting to your content.

  8. Gabor Szentesi on

    Dear Chris,

    thank you for this post! I'd have a question. With CI MA 6.5 (used with DI for data loading and EG for custom code management) a near future upgrade from 9.4m4 to m6 or m7 is planned along with the possible introduction of a version control system for the custom code base. Would you recommend introducing git before or after/during the maintenance level migration?

    Thank you much,
    Gabor

    • Chris Hemedinger on

      Git (and the practice of using version control) can take some time to get used to, so I recommend introducing it sooner...and then when the new tools arrive, you'll be able to accelerate your time to adopt them.

  9. Pingback: How to organize your SAS projects in Git - The SAS Dummy

  10. We are using SAS Enterprise Guide 8.3 (8.3.0/103). Our company only allows projects (code) to be stored on our SAS server (not GitHub or GitLab). What needs to be done on our SAS server to have repositories and do version control internally? Thanks

    • Chris Hemedinger on

      You can use Program History in EG, but it requires that your programs are embedded in EG projects, not external files. You can use Git without a Git server (like GitLab or GitHub) by managing your code in a local repository. That allows for program history and versions, but does not support collaboration or backup features. To use Git features from EG, that local repository needs to be managed from the machine/network where EG is installed. If your code is always on the SAS server file system, then you cannot use Git directly from EG. However, SAS Studio (which runs in a central server) can use Git with the server file system.

  11. Richard Paterson on

    Hi All,

    I have followed the instructions, but when trying to initialise the repository I get a 'Failed to initialize local repository' error. I can initialize, push, and pull from all other applications to this repository, but not from SAS Data Integration Studio. Any ideas?

    regards,

    Richard

    • Chris Hemedinger on

      Richard, you should probably contact SAS Technical Support for help on this and to track it.

  12. Hi Chris, thanks for the tutorial. I'm working in SAS Data Integration Studio and I successfully completed the configuration for my GitHub, but when I try to send something to GitHub by clicking on "Archive as SAS Package", that package is not sent to the Git repository; instead it is exported to a local directory. So the Git configuration seems to be OK, but nothing can be sent to the repository. Do you have any idea about a possible resolution?

    • Chris Hemedinger on

      Git works by allowing you to work with a repository in a local directory, and then use Git tools to commit/push the artifacts to your Git server. So this may be working as expected, but I suggest working with SAS Technical Support to ensure that you're following the correct steps and seeing the expected result.

      • Hi, thanks for the reply.

        I created a repository directory, but the location where the exported package is stored is not that one. Of course the path of the repository directory was entered in the settings panel of the Git plug-in, and the initialization of the repository was successful, but it seems to make no difference. When I try to archive a package, SAS asks me to enter a Name and a Description, and that seems to confirm that GitHub is ready to receive my versioning, but unfortunately the package is not sent where I want and no synchronization between GitHub and my directory is done.

  13. Valerio Quattrini on

    Hi Chris, good evening.

    I have a question about the storing of versions in Git, using SAS Data Integration Studio. As you know, the way to permanently delete a version is to use the "Archived SAS Package" window to eliminate the object you want to delete. With this approach, the file named "Archivies.xml" is updated and any reference to the deleted version disappears from it. If instead I delete a version manually from the RepositoryGit folder or in my centralized Git URL, I still see this version in the "Archived SAS Package" history, and I can still import that version back. So my question is: where are these versions stored? Same question about version merging: if I have a version A (the old one) and a version B (the new one), where is the old version located? I can't find it physically, but I can use it for a merge, so it must exist somewhere!

    Thanks,
    Valerio.

    • Chris Hemedinger on

      Valerio, good questions. I don't know a lot about how SAS DI Studio works in this case. I suggest opening a question with SAS Technical Support to get the best answer.

  14. karolina touwen on

    thank you for this article. I was wondering if more SAS programs are integrated with Git now, as of March 2022, for example Visual Analytics? Could you post an update on that? Regards

  15. Hi Chris, always enjoy reading your blog! Is there a recommended approach to using the Enhanced Editor with GitHub? Maybe just keeping all the code local but have an automatic sync up to GitHub nightly?

    • Chris Hemedinger on

      Yes, if using Base SAS on Windows, I think that's the way to go. SAS Display Manager won't have any built-in Git awareness. You could also consider using VS Code and the new SAS extension to manage your files (even if you use SAS on Windows to run them).

  16. Shima Pishnamaz on

    Hi Chris, thank you for the helpful blog. I am trying to use Git within SAS Studio (Viya4). After connecting my GitHub account I can successfully clone my remote repo. When I change something in Viya and try to push my changes to the remote repo, I get a "remote 'origin' already exists" error. How can I fix this in the Studio interface, given there is no command line in which I can do the things I used to do to fix these types of errors when working from my local computer?

    • Chris Hemedinger on

      Hi Shima, this was a problem with an earlier version of SAS Viya but was fixed in 2022.1.1. Do you have a later version than that or maybe this is something different?
