Performance tips for viewing data in SAS Enterprise Guide

28

SAS users love to look at data. And the data grid in SAS Enterprise Guide is a convenient way to view the contents of a data set. While small data sets can be rendered lickity-split for quick viewing, sometimes people get justifiably anxious when opening very large data. Perhaps they've been burned with a bad experience, because opening a huge transaction table can hang up a system. When dealing with very large data, it's wise to be cautious.

In general, the SAS Enterprise Guide data grid can open and show data quickly, even if the data sets are very large. This is because the data grid does not read in all of the data records at once, but reads in only the records that are needed to fill the rows of the grid. The data grid uses a "virtual mode", which allows you to scroll quickly around your data set, no matter the size.

datagrid
In addition to the scroll bar, which shows you approximately where you are in the data set, there are a few keyboard shortcuts that can help you to navigate:

  • Ctrl+End - takes you to the last record, last column in the data set.
  • Ctrl+Home - takes you to the first record, first column in the data set.
  • Ctrl+G - brings up a "Go to" window, in which you specify exactly which record number to bring to the top of the view.

Note: The concept of "record number" is a specific to SAS data sets, since traditional SAS files store records sequentially. When you view a database table (accessed through a SAS/ACCESS library), the idea of "record number" might not make as much sense, since the database rules determine the sequence. However, you can still use the keyboard shortcuts to navigate your view.

Even for all of its built-in smarts, the data grid can still perform slowly in certain situations. If you're experiencing slow performance, here are some tips to help.

Understand your network latency. SAS Enterprise Guide retrieves data records from your SAS session, which is typically on another machine in your network. If your network connection is slow or the remote machine is very remote (such as in another country), fetching those records will take more time.

Avoid opening data unnecessarily. When you add data to your project, whether using File->Open Data or by running a task that produces output data, SAS Enterprise Guide will automatically open the data for viewing. Simply opening the data can cause a noticable delay when there is network latency or a slow database connection. If you're already familiar with the data and don't need to see the records every time, you can change this default behavior. From Tools->Options, select Data General and uncheck the option for "Automatically open data when added to the project."

Avoid scrolling through very wide data. As I stated earlier, large data isn't generally a problem in SAS Enterprise Guide. However, when the data are very wide, the grid can be slow. Why? Because even though the data grid reads only enough data records to fill the grid, it reads only entire records -- even if some of the variables in the record aren't visible on the screen at the time.

I define "wide" data as hundreds or thousands of variables, or maybe fewer variables that each have very large lengths. Consider these two LENGTH statements, each of which allocate 2000 variables but yield dramatically different record lengths:

/* 2000 numeric vars, 8 bytes each = 16,000 bytes per record */
/* a data set with just these fields can render quickly in SAS EG */
length x1-x2000 8;
 
/* 2000 character vars, 1000 bytes each = 2,000,000 bytes per record! */
/* a data set with these fields can really slow things down */
length c1-c2000 $ 1000;

Use the Data Explorer with care. The Data Explorer (added in SAS Enterprise Guide 5.1) is a popular new feature (see Marje Fecht's enthusiastic review). The Data Explorer can provide at-a-glance "Quick Stats" of variable distribution and frequency. However, the behind-the-scenes analyses that support those stats can be expensive for very large data. SAS Enterprise Guide sets some automatic constraints in order to limit the impact, but you should set your expectations accordingly.

Understand the special treatment for database tables. When you access data in a library using a SAS/ACCESS engine (for example, connecting to Oracle or Teradata), SAS Enterprise Guide automatically limits the data grid view to 10,000 records. You can configure that number in the application options, but the reason for the constraint is to limit the impact on the database connection and to keep your DBA happy (well, as happy as DBAs tend to be, anyway). 10,000 records should be enough for you to "eyeball" the shape of the data. If you need to see rows that meet specific criteria, consider using the Query Builder.

To paraphrase King Burger's Bon Qui Qui (from the famous MAD TV sketch): You can view your data your way, but don't get crazy. I hope that with a little knowledge about what happens behind the scenes, you can click through your data with more confidence.

Share

About Author

Chris Hemedinger

Director, SAS User Engagement

+Chris Hemedinger is the Director of SAS User Engagement, which includes our SAS Communities and SAS User Groups. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies

28 Comments

  1. Hello Chris,

    Great reading all your posts, very informative. Wanted to ask you, if it is best practice to register SAS lbraries in SAS SMC for the SAS users at my site. Or is it better if every user has their own user id and password to connect to their data sources like Oracle, DB2, SQL etc.

    Please let me know, appreciate your opinion.
    Ritesh

    • Chris Hemedinger
      Chris Hemedinger on

      Ritesh,

      Thanks for the kind words.

      Regarding your question, I don't think this is an "either/or" situation. If you use DBMS libraries at your site, it's best to provide access control at the database level. Whether that's by using group permissions or individual accounts -- that's up to you and the DBAs. But it's not the recommended practice to rely on SAS metadata to "secure" access to these DBMS resources.

      That said, there are advantages to registering libraries and tables in SAS metadata. When you have registered tables, you can organize the tables into the SAS Folder structure that might make them easier to find and work with. Also, you can then use these tables as the basis for SAS information maps and reports in SAS Web Report Studio.

  2. Chris thk u for the opinion.

    Our site has 400 users and each user having their own access to the DBMS libraries was being an issue. We have SAS EG, SAS DI. We are slowly moving to a Teradata shop and was planning to follow this SAS link which suggests this preferred method:

    http://support.sas.com/resources/papers/93286_TS808.pdf

    "As a SAS software user, you might need to query a Teradata table from SAS®
    business-intelligence (BI) clients. There are several ways that you can access a Teradata database from a BI client application. However, the preferred method, which is described in this document, is to define a Teradata library in SAS® 9.2 Management Console or later. SAS
    Management Console provides a point of control for security management of this library."

    Please let us know, Thanks, Ritesh

    • Chris Hemedinger
      Chris Hemedinger on

      Ritesh -- yes, that's a good document with good guidance. Note that it also lays out your possibilities: use a group login or use individual accounts, each of which can be managed using Auth Domains in SAS Metadata. If you use a group account, then anyone who knows the connection details and credentials can access the library via SAS code, even if the library definition is "hidden" and controlled with SAS Metadata permissions.

  3. I just upgraded to SAS 9.4 and EPG 6.100

    When I open a node in a project that had previously been executed it now seems to take an extremely long time to open and even tells me "Not Responding" for up to a minute or so.
    Then the program appears and works fine. Until I go to another node and return to that node.
    Any ideas?

    • Chris Hemedinger
      Chris Hemedinger on

      Jim,

      It's certainly not supposed to be slow...so something is wrong.

      Is this a program node? I know that there was a hotfix for some specific performance issues (and a few other bugs) in July -- have you applied it?

      Otherwise -- is the program large? Is it stored in the project or on the file system, and if the file system -- is it local or remote? You should contact SAS Tech Support for better diagnosis and steps.

  4. Phillip Higgins on

    Hi Chris,

    You've posted some great little tips here - thanks for sharing.

    Cheers,
    Phillip

  5. Hi there,
    This is the first time I work with SAS EG and I'm having trouble to familiarize with the way the data is viewed here, I was custom to the "old" way with the explorer menu on the left side (in SAS basic) and now I don't know how to open the temporary data created in my work space, they are just showed when I re run the whole code, but if for any reason I run just a piece of code, the rest of the datasets seems to "disappear", I'd like to know the way to view all the datasets created in the work space while running just pieces of code.

    Thanks!

    • Chris Hemedinger
      Chris Hemedinger on

      Ruth, you can add your data sets to the project explicitly by selecting them from File->Open->Data or exploring the Libraries in the Servers List. By adding them this way, they won't "disappear" when they aren't referenced in a SAS program as output data -- especially when you run just a portion of the program that doesn't include that data as output.

      Use the View->Workspace Layout options to split your screen so you can see two data sources at once.

  6. Hi Chris, our SAS server is in AWS EC2 cloud. The server itself is fast enough to run programs, but if I want to browse the dataset on EG, it's very slow for the dataset to be opened, even a very small dataset, say sashelp.class. Any idea how to make it faster ?

    • Chris Hemedinger
      Chris Hemedinger on

      Hi Alex,

      I know that SAS Enterprise Guide v7.11 (the update to 7.1) introduced a few targeted improvements to address latency in a WAN or cloud environment. However, I don't think that the process for reading data records has improved all that much. I don't know if there are any settings that you can tweak in the cloud instance to make it better.

      You could post your question to the Administration community to learn if any other users have seen this and found a remedy that works for them.

      • Thanks for the quick reply. You mentioned 7.11 changes, does that mean 6.1 is better for the connection over WAN ? Thanks

        • Chris Hemedinger
          Chris Hemedinger on

          No, I think 7.11 would be better -- that's where the improvements are. But I don't think that the efficiency of reading data records was greatly improved; it was mostly around the overhead of making a connection to the server, retrieving SAS library information, etc. I think you'll find 7.11 a little better for this, but I wouldn't expect a dramatic difference.

          • Thanks Chris for the information. I think if we upgrade the infrastructure (network, etc) , I think it would definitely help the dataset browsing speed on EC2. But my question, how much this hardware improvement could help, or do you think EG itself is a bigger factor? In other words, hardware could help but not too much ? If that's the case, what's your suggestion ?

          • Chris Hemedinger
            Chris Hemedinger on

            Hi Alex, I know that reading data in the data grid within EG is slower over a WAN or Cloud instance. I don't know how much of a difference that a network upgrade would make for this particular behavior, but no doubt it would help performance / user experience in general.

  7. It appears that there is a know bug *feature* in Enterprise Guide 7.1 with the data grid that makes viewing datasets slower, if not impossible depending on the size of your datasets. http://support.sas.com/kb/55/102.html .This is compounded when opening datasets over a WAN. I've been experiencing slow performance with EnterpriseGuide 7.1 with a client in the UK, and as a short term measure we're going to use Enterprise Guide 6.1 with SAS 9.4, until the release of Enterprise Guide 7.1.2, that we've been advised will have improved data grid performance.

    • Chris Hemedinger
      Chris Hemedinger on

      David, that's true -- and the developers have been working hard to address this. I'm using the (under-development) 7.12 version right now, and it's really very nice. In addition to the performance boost, there are some cool usability things added (like a zoom +/- feature!). I think you're going to like it.

      • Hi Chris, do we have a ETA for Enterprise Guide 7.12? I was led to believe it would be around the end of February 2016 which isn't too far away now. Although we chose not to use 7.1 with SAS 9.4 due to the data grid issue we're now find that Enterprise Guide 6.1 is unstable and crashes out on a regular basis for a number of end users. Bizarrely this doesn't happen with 7.1? Any news would be much appreciated.

        • Chris Hemedinger
          Chris Hemedinger on

          David, timely question. I think that today is the day! Hopefully if you check back tomorrow, you'll find that new software is available. Note that 7.12 is not an automatic update (like the hotfix packages) -- you'll have to update your software depot and install it.

          • Chris Hemedinger
            Chris Hemedinger on

            Oh, and the updated data grid is very nice -- can handle large data very smoothly and has some great new features, including a ZOOM! Ctrl+ and Ctrl- (or Ctrl+mousewheel) will zoom you in and out!

  8. Thanks for the information Chris, You wouldn't happen to know if Enterprise Guide 7.1.2 would be available in depots for SAS 9.4 M2?

    I'll take a look in our latest depot, as we downloaded one as of the 19th February.

    It will be good to catch up again, in Vegas :)

    Cheers

    David

    • Chris Hemedinger
      Chris Hemedinger on

      The official release happened Feb 21, I think. You can run EG 7.12 with SAS 9.4m2 (or any SAS release back to 9.2), but a 9.4m2 depot won't have it. You'll need a software order that contains the latest versions of SAS, or a separate order just for the EG portion. Just to be clear, you don't need to install 9.4m3 (or the latest) to use EG 7.12, but you do need an updated software depot.

      • Hi Chris,

        Many thanks for the information, as always. Does SAS 'officially' support Enterprise Guide 7.12 with SAS 9.4 M2?

        It's just that someone in the UK mentioned 7.12 is only officially supported with SAS 9.4 M3. Is that incorrect? It would be nice to confirm this, as it means a large UK SAS client will be left using Enterprise Guide 6.1 (due to the data grid issues with 7.1 & 7.11), if Enterprise Guide 7.12 isn't officially supported on 9.4 M2.

        Thanks

        David

    • Chris Hemedinger
      Chris Hemedinger on

      Larry, sorry to hear that! EG 7.13 has more data grid performance improvements, including loading data on a background thread so that it won't hang up a session. Have you tried that?

Back to Top