When can too much memory hurt SAS?

By Margaret Crevar on SAS Users February 15, 2016

Now that memory is affordable, customers constantly ask us about doubling or tripling the amount of RAM that SAS recommends. More is better, right?

Often, but we have found a specific scenario using SAS datasets where that is not the case. Remember that increasing memory generally means a commensurate increase not only in operational and computational memory, but in the host system file cache as well. The host system file cache is the portion of memory through which SAS pages all of its reads and writes to and from storage.
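To get a feel for how much memory the host file cache holds at a given moment, the numbers can be read from the operating system. The following is a minimal Python sketch that assumes a Linux host, where /proc/meminfo reports fields such as Cached, Dirty and Writeback; other operating systems expose this information differently. The function names are hypothetical and only serve the illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (for example HugePages_Total)
    are plain counts and carry no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def file_cache_summary(path="/proc/meminfo"):
    """Return the fields most relevant to the file cache (Linux only)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    # Cached: memory held by the page (file) cache.
    # Dirty: cached pages modified but not yet written to storage.
    # Writeback: pages currently being written to storage.
    wanted = ("MemTotal", "Cached", "Dirty", "Writeback")
    return {key: info.get(key) for key in wanted}
```

Calling file_cache_summary() before and after a large DATA step gives a rough view of how much of the added RAM ends up as file cache rather than as working memory for SAS.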

Consider this example that arose recently. A customer had a host with a lot of memory and, hence, hundreds of gigabytes of host system file cache, all to himself. This was a quiet system in which he enjoyed the spoils of excess.

Here is what can backfire when a large host cache is populated with very large SAS files. Consider the SAS program:

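               /* Step 1: create newds as a new file from testds */
               DATA newds;
                  SET testds;
               RUN;

               /* Step 2: rewrite newds under the same name ("in place") */
               DATA newds;
                  SET newds;
               RUN;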

The first DATA step creates a new file by reading (setting) an existing file. In the second DATA step, we are updating the dataset, newds, “in place.” The second DATA step runs significantly longer than the first, even though the two appear to be doing the same type of operation.

The second DATA step takes longer because the newds.sas7bdat.lck file that the DATA statement creates in SAS WORK cannot simply be committed and closed as newds.sas7bdat until all the data associated with the original newds.sas7bdat named in the SET statement has been flushed from the file cache (i.e., RAM). The pages of the original newds.sas7bdat that reside in the host cache are not being updated; they are being used to create a copy of that file in the new locked file, newds.sas7bdat.lck. So we can’t commit the new file until the pages from the old file with the same name are flushed from the host cache and the original file is deleted on storage.

If this file is hundreds of gigabytes in size and most of its pages reside in the host system file cache, this flush can take a considerable amount of time, much longer than the simple rename of the file that completes the first DATA step above. In the second DATA step, the original file must be emptied from cache by the page flush daemons, and removed from storage, so that the newds.sas7bdat.lck version can replace it before being closed and committed to storage.
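The difference between the two DATA steps can be modeled at the file level with a short Python sketch. It is only an illustration of the sequence described above (write a .lck copy, remove the original if there is one, rename the copy), not SAS's actual implementation; the function names and the 1 MiB copy buffer are assumptions made for the example, and the files it writes are stand-ins rather than real SAS datasets.

```python
import os
import shutil
import tempfile

CHUNK = 1024 * 1024  # copy buffer size in bytes (arbitrary choice for this sketch)


def copy_to_new_name(src, dst):
    """Model of `DATA newds; SET testds;`: no file named dst exists yet."""
    lock_path = dst + ".lck"
    with open(src, "rb") as fin, open(lock_path, "wb") as fout:
        shutil.copyfileobj(fin, fout, CHUNK)
    # Nothing has to be removed first, so the commit is just a rename.
    os.rename(lock_path, dst)


def rewrite_same_name(path):
    """Model of `DATA newds; SET newds;`: the output replaces its own input."""
    lock_path = path + ".lck"
    with open(path, "rb") as fin, open(lock_path, "wb") as fout:
        shutil.copyfileobj(fin, fout, CHUNK)
    # The original must be gone before the .lck copy can take over its name.
    # Per the article, this is where a large, cache-resident original costs time.
    os.remove(path)
    os.rename(lock_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        testds = os.path.join(work, "testds.sas7bdat")
        newds = os.path.join(work, "newds.sas7bdat")
        with open(testds, "wb") as f:
            f.write(os.urandom(4 * CHUNK))  # stand-in data, not a real SAS file
        copy_to_new_name(testds, newds)
        rewrite_same_name(newds)
        print(sorted(os.listdir(work)))
```

On a small test file both functions finish almost instantly. The delay discussed here only shows up when the original file is very large and most of its pages are still sitting in the host file cache.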

So, very large SAS data files that fit into the host system file cache, and have to be flushed before SAS can update the file of the same name, can lead to much longer response times for that operation. This delay is commensurate with the size of the file and how many of its pages reside in the host cache. To avoid this type of behavior, it is generally not a good idea to update a very large file “in place,” i.e., to write the file back under the same name.

About Author

Margaret Crevar
Manager, SAS R&D Performance Lab

Margaret Crevar has worked at SAS since May 1982. She has held a variety of positions since then, working in sales, marketing and now research and development. In her current role, Crevar manages the SAS Performance Lab in R&D. This lab has two roles: testing future SAS releases while they're still in development to make sure they're performing as expected; and helping SAS customers who are experiencing performance issues overcome their challenges.

14 Comments

Simon Dawson on February 20, 2019 9:03 pm

I've observed that different systems and filesystems behave very differently with respect to their interactions with the page cache.

When I was using ext4 and overwrote an existing file (e.g. dd if=source of=target, where target already exists), the close() system call took a code path in the kernel that went into ext4_release_file(). That caused the pages for the file to be flushed synchronously when ext4 called filemap_flush(), and it killed performance. When I mimicked the behavior that SAS has when you update a dataset in place, this didn't happen. The IO used the page cache and I was basically limited by the speed of the CPU as the kernel shuffled data pages around in the page cache. Everything was super fast. When I used ZFS on Linux, the throughput fluctuated slightly depending on how close to ZFS's periodic sync I began the write.

I'd say the moral of the story is experiment, trace and tune.

- Margaret Crevar on February 21, 2019 9:04 am
  
  You are correct. ext4 does lots of journaling as well, and this is the reason we prefer that customers use XFS from Red Hat over ext4. Did you test XFS?
  
  - Simon Dawson on February 21, 2019 8:50 pm
    
    Took a peek at XFS out of interest. The XFS throughput was quite a bit higher on the system I'm experimenting on. For those who are interested, the system I'm using for my tinkering is a Dell Latitude 7480 with an Intel i7-7600U @ 2.8 GHz running Ubuntu 18.10 with the 4.18.0-15-generic kernel.
    
    XFS 3.3 GB/s
    ext4 2.4 GB/s
    
    I'd take these measurements with a grain of salt, though: I did zero tuning for either configuration. The behavior I described, with filemap_flush() blocking the close, occurred in XFS also. In the case where the IO got blocked behind writing pages from the page cache, the throughput was basically the same for XFS and ext4. There wasn't much in it.
    
    The line in XFS where filemap_flush() is triggered is here https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/cosmic/tree/fs/xfs/xfs_inode.c#n1669 This causes the close syscall to block and it takes ages. There is commentary in the source that explains why this occurs.
    
    This code path in the kernel, where the close syscall gets blocked, wasn't something I could trigger from any SAS application. It only occurs when a process calls open with the O_TRUNC flag. SAS opens a new file with the .lck extension, writes to it, then replaces the original file with an unlink and a rename, so the slow code path I found, where pages are flushed synchronously, doesn't get called.
    
Chris on April 13, 2016 10:43 pm

Sorry for the gibberish. Posting aligned monospaced text doesn't seem possible. Never mind the benchmark.

My first point stands though. 🙂

Chris on April 12, 2016 7:10 pm

Hi Margaret,

My point was that the additional RAM should make the second step faster *than if there was no RAM* (not compared to the first step), since the table is read from RAM rather than from disk (even if the cache flush during output negates some of the gains).

On the topic of optimising access speed: in my experience, any procedure accessing large data sets can be sped up by properly setting the reading options. Here is a benchmark (details in the book) where the read time using SAS defaults is 1.96 seconds. The wrong options can slow it down to almost 10s (top left of table); the right ones can speed it up to 0.4s (bottom right).

Read times in seconds. SAS defaults (BUFSIZE 0, SGIO no): 1.96

BUFSIZE  SGIO  BUFNO=1  BUFNO=5  BUFNO=25  BUFNO=100  BUFNO=500
4k       no       2.41     2.62      1.66       1.65       1.67
4k       yes      9.62     6.09      1.55       1.18       1.39
8k       no       2.19     1.13      1.68       1.22       1.25
8k       yes      5.40     3.20      1.19       1.02       0.45
16k      no       1.04     1.04      2.77       1.15       2.60
16k      yes      3.18     1.91      0.62       1.09       0.38
32k      no       1.93     1.81      0.94       1.97       0.93
32k      yes      2.16     1.42      0.91       0.79       0.42
64k      no       1.86     0.89      2.96       1.01       1.27
64k      yes      0.82     1.96      0.96       0.71          .
128k     no       0.89     1.09      1.06       3.30       1.23
128k     yes      1.22     0.99      0.38       0.40          .

Chris on April 10, 2016 7:00 pm

Hi Margaret,
SGIO would avoid this issue under Windows. But is it really an issue?
I would argue that a large OS cache may still be beneficial in the case you highlight. While there is time wasted in flushing the cache's contents for output, there is time saved by reading the file from the cache for input. And intuitively (I haven't benchmarked it), the time saved by not reading the file from disk (a very slow medium) is probably greater than the time used to flush the RAM cache (a much faster medium).
So yes, step 2 is slower than step 1, but all the RAM still makes step 2 faster than if there was no RAM cache at all. No?
I did a lot of benchmarks on direct IO and buffers when using SAS data sets in http://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490

- Margaret on April 11, 2016 7:49 pm
  
  Chris,
  
  Thank you for your comment. The issue is that we have several SAS customers who are adding memory to make SAS DATA steps run faster and questioning why it is not helping. That is the reason for the initial post: to explain why they are seeing what they are seeing.
  
  I am also glad you are seeing good performance using direct IO and buffers with SAS. My experience has been that this does not help most customers, unless they heavily use the SAS DATA step.
  
Kirk Paul Lafler on February 19, 2016 7:03 am

Nice article, Margaret!

Kirk

SF on February 17, 2016 8:59 pm

This post is misleading. The amount of memory has nothing to do with this issue. This behavior is observable on even the smallest table build. In my opinion, it is a deep rooted design flaw.

- Margaret Crevar on February 18, 2016 3:21 am
  
  Thank you for your post. Please note that the performance issue is associated with how long it takes to flush the original file out of memory. Flushing memory is not done by SAS, but by the operating system's flush daemon. If you feel that this daemon should work faster, I suggest that you work with your operating system's support team to see if there are any tuning parameters that can be set to make this task run faster. If you learn of any, please share them with this community.
  
Margaret on February 16, 2016 10:32 am

You are correct.

Prashant on February 15, 2016 10:24 pm

Hi Margaret,

It would have been helpful to represent both DATA step cases above visually, to demonstrate how the new dataset is being created.

So in the first DATA step case, the newds.sas7bdat.lck file will also need to wait for the entire data to be read from testds.sas7bdat, and only then will it be committed as a new file, i.e. newds.sas7bdat. Also, in this case the commit need not wait for any deletion of the input dataset (testds.sas7bdat) because it has a different name. Hence the time for creating the new dataset is less. Am I right or wrong? Please advise.

Thanks

Nikola Markovic on February 15, 2016 6:54 pm

Hi Margaret,

Not being able to control the way the kernel manages the cache is frustrating. This is where I tend to dynamically reassign some of that underutilised RAM as tmpfs, and use it properly instead of depending on the kernel to figure it out and sporadically flush to disk. Would using the USEDIRECTIO option actually improve real-life performance in the scenario you describe here? I know it's not an option with SASWORK, but what if it were another libname?

FYI - we're presenting a paper on the technique at SGF. I think you'll find it interesting.

Nik

- Margaret on February 16, 2016 10:32 am
  
  Direct IO would solve this issue, but like you said, it does not work with the SAS WORK library. I am very interested in your SAS Global Forum paper and will do my best to attend your presentation.
  