With the growing use of SAS on commodity hardware, many organizations are running many SAS servers on separate operating system instances within a single SAS infrastructure. This configuration is great for optimizing resources, but when these SAS servers have to share data, SAS recommends the use of a clustered file system.
This recommendation presents an issue for some companies. Because clustered file systems are not part of their standard operating system, they are an additional expense. So, to avoid driving up the cost of the hardware infrastructure for SAS, some IT administrators are proposing the use of NFS to share files among the SAS servers running on different instances of an operating system. Let’s look in more detail at the pros and cons of NFS as a shared file system with SAS.
When NFS is a wise choice
So, let’s quickly discuss why using NFS may be wise. NFS is great in a mostly “read” environment or for SAS shops that have small (less than 1GB) SAS data files. So, it can be a good home for permanent SAS data files that SAS jobs access primarily in a read-only manner.
NFS can be affected by network bandwidth, but network speed and capacity are not the real issue per se. The real issue is largely one of NFS metadata cache coherency: the cached file system metadata is “dumped” very frequently. NFS does this every time a read or write lock is placed on a file or the file’s attributes, such as its size, change. This dumping of the cached metadata drastically interrupts large sequential writes and slows data processing because the file system is constantly re-reading metadata over the network and updating its cache.
When NFS is not optimal
NFS file sharing may not perform adequately when SAS users are making lots of updates to files or when data manipulation requires a significant amount of temporary storage. The SAS WORK file system is 50% write, 50% read and 100% delete when the SAS session is properly terminated. This type of I/O does not work well with NFS, as we’ve learned from SAS customer experience and testing within SAS. (Details on why this does not work well can be found in this SAS paper: A Survey of Shared File Systems, updated October 2014.)
Please note that there are several storage arrays (many from Network Appliance and EMC Isilon) that only support an NFS-based file system. We are sure the underlying storage array is great, but because the file system associated with them is NFS-based, we strongly recommend they not be used for SAS WORK or for permanent SAS data files where lots of writes occur.
As always, please let us know if you have any questions or comments on the above.
13 Comments
Hello Margaret and Fellow SAS users,
Thanks for the information regarding NFS in a SAS platform.
I would never use NFS for any SAS datasets, neither permanent SAS libraries nor WORK/UTILLOC file systems.
But I do have questions regarding the possibility to use NFS on some SAS deployment directories.
For example, our SAS platform is on a Solaris operating system and is composed of 8 servers:
1. SAS Metadata Server Cluster: has 3 SAS Metadata Servers. All three servers:
- share SAS Metadata server backup directory;
- share SAS Metadata server installation directory;
2. SAS Compute Server Cluster (GRID): has 3 SAS GRID nodes. All three GRID nodes:
- share SAS GRID compute server installation directory;
- share SAS GRID compute server configuration directory;
3. SAS Mid-tier Server Cluster: has 2 SAS Mid-tier Servers. The two servers:
- share SAS Mid-tier installation directory;
4. All of the above SAS tiers and servers:
- share a centralized SAS Configuration Backup Vault;
- share SAS Software Depot directory;
The above shared directories or file systems are not for SAS datasets or WORK/UTILLOC file systems, but for SAS deployment, installation, configuration and backup purposes.
My question is, could NFS be used for these shared directories or file systems?
I appreciate your prompt reply and expert opinion !
Alex
In general, the answer is yes to all of the above.
My only concern is how much writing the backups will do. The main issue with NFS is doing writes to it.
Hope this helps.
Margaret
Just an update to my previous comment around CIFS and the cache entries. These changes alleviate the issue BUT do not make it completely go away. There still appear to be windows of time, probably measured in microseconds, where two hosts can get the same directory entry and one host's data disappears.
Margaret's advice, of Jan 2015, not to use CIFS as a shared file-system is very valid.
Customers who are using CIFS/SMB to access data on Windows 2008R2 must make sure that the hotfix described in Microsoft kb article 2646563 is installed. This fixes a known issue, driven by a timing problem, when SAS data sets are deleted and immediately recreated in the same remote directory.
In one customer's situation, failures were intermittent and occurred about 1 time in 1000 delete/create sequences. For this customer, the problem showed up as an incorrect "Authorization refused" message.
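For anyone who wants to check whether a share shows this kind of intermittent failure, here is a minimal sketch in Python (not SAS, and not the customer's actual test). It deletes and immediately recreates a single file in a loop and records any OS-level errors. The function name, the file name `probe.dat` and the cycle count are assumptions made for illustration; point `directory` at the remote share you want to probe.

```python
import os
import tempfile


def run_delete_create_cycles(directory, cycles=1000, name="probe.dat"):
    """Delete and immediately recreate one file; return a list of (cycle, error) failures."""
    path = os.path.join(directory, name)  # hypothetical probe file name
    failures = []
    for i in range(cycles):
        try:
            # Create the file, delete it, then recreate it right away.
            with open(path, "wb") as f:
                f.write(b"x")
            os.remove(path)
            with open(path, "wb") as f:
                f.write(b"y")
        except OSError as exc:
            failures.append((i, repr(exc)))
    # Clean up the probe file if the last cycle left it behind.
    if os.path.exists(path):
        os.remove(path)
    return failures


if __name__ == "__main__":
    # Assumption: a local temporary directory stands in for the remote share.
    with tempfile.TemporaryDirectory() as target:
        failed = run_delete_create_cycles(target, cycles=1000)
        print(f"{len(failed)} failures in 1000 delete/create cycles")
```

On a local disk this should report no failures; a nonzero count against a CIFS/SMB share would be consistent with the timing problem described above, though it would not prove the same root cause.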
A further update to the Microsoft kb article above. Even with the above hotfix installed there is still a slight window of opportunity for the DELETE/CREATE sequence described in the kb article to fail. This can be addressed by setting: DirectoryCacheLifetime to 0 under
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters
See https://technet.microsoft.com/en-us/library/ff686200%28v=ws.10%29.aspx for more details. This may have performance implications but it should prevent the problem occurring.
Further to my previous note in this space. Extensive customer testing has shown that only changing DirectoryCacheLifetime to 0 reduces the issue BUT to solve the problem completely these THREE registry values must be set to zero:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters
1. DirectoryCacheLifetime to ZERO.
2. FileNotFoundCacheLifetime to ZERO.
3. FileInfoCacheLifetime to ZERO.
See https://technet.microsoft.com/en-us/library/ff686200%28v=ws.10%29.aspx for full details on this matter.
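As a rough illustration of the three settings above, this small Python sketch only builds the `reg add` command lines for an administrator to review; it does not touch the registry itself. The `REG_DWORD` value type and the function name `build_reg_commands` are assumptions on my part, so check them against the Microsoft article before running anything.

```python
# Registry key from the comment above, written with the HKLM abbreviation used by reg.exe.
KEY = r"HKLM\System\CurrentControlSet\Services\Lanmanworkstation\Parameters"

# The three cache lifetime values that the comment says must be set to zero.
VALUES = (
    "DirectoryCacheLifetime",
    "FileNotFoundCacheLifetime",
    "FileInfoCacheLifetime",
)


def build_reg_commands(key=KEY, values=VALUES, data=0):
    """Return one 'reg add' command string per value (assumes REG_DWORD values)."""
    return [
        f'reg add "{key}" /v {name} /t REG_DWORD /d {data} /f'
        for name in values
    ]


if __name__ == "__main__":
    # Print the commands for review; run them from an elevated prompt if they look right.
    for command in build_reg_commands():
        print(command)
```

Generating the commands rather than writing the registry directly keeps the change reviewable, which matters here because the comment notes these settings may have performance implications.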
Margaret,
You're right on the money. Again. As Bart already illustrated, NFS has very cumbersome metadata management. I paraphrase a famous fellow countryman: "Use of NFS considered harmful". When you see the great lengths clustered file system solutions have to go to in order to overcome the issues of file metadata and locking, a subject largely disregarded by NFS or left to the client, it is amazing to see NFS coming up again and again. Of course a NAS is more manageable for large generic IT shops as we find them in Government and Oil & Gas. But for us data pumping punters it's just bad news. We are currently developing our SAS grid and discussing our options for shared storage.
So, much to my surprise, I was recently confronted with an EMC Isilon publication titled "EMC Isilon scale-out storage for SAS grid computing". No surprise, it concludes that it works. At 60MB/s/core no less! (Which imho is rather underwhelming.) It can't be very new, as it uses SAS 9.2 for the benchmarks. It is also full of marketing poetry that evokes a feeling of SAS endorsement without ever confirming it.
I look forward to your thoughts on how I should read this in the light of this blog post.
Thank you for your comments.
The paper you referenced has been pulled, so I am surprised you were able to get a copy. The most recent paper on using SAS with EMC Isilon is this one: http://support.sas.com/resources/papers/Advisory-Regarding-SAS-Grid-Manager-with-Isilon.pdf. Please let me know if you have any questions on what is in this paper.
Could you please tell me if there is any good reading material which discusses pros/cons of having SAS on a NAS vs SAN environment. Basically SAS on NAS vs SAS on SAN. Your help is appreciated.
Not sure this article is correct. You're talking about NFS and SAS datasets, but then you blast NFS for the work directories, so you are covering 2 very different topics.
1. NFS: basically, don't use NFS from a server. Using NFS from, say, Isilon is preferred, especially with its massive InfiniBand and 8x10GbE trunks.
2. Work directories: these should always be local to the machine and on the fastest drives possible, whether that is a LUN on fiber with local flash I/O, a local mix of flash and RAID 0+1, or just RAID 0, etc.
NetApp also supports windows file services - CIFS.
Please note that SAS does not like using CIFS for a clustered file system for Windows for many of the same reasons we do not like NFS. Details can be found in the A Survey of Shared File Systems paper. http://support.sas.com/rnd/scalability/papers/SurveyofSharedFilepaper_20131010.pdf
I would like to put it a bit more bluntly: never use NFS on the SAS server, at least not in Linux (we tried both SUSE and RHEL).
IT implemented SAS on NFS on our Linux environment, against our advice.
It was a disaster. When one process was writing a data set to a disk, all other processes that read from that same disk stalled.
And this was not SAS-related; it was easy to replicate:
1. Start a dd in one shell that writes a big file to an NFS file system.
2. Start another shell and, while the dd is running, do an ls on that same file system. It takes > 30 seconds before it comes back with the file list.
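The same two steps can be approximated in a single Python script, for anyone who prefers a repeatable measurement over two shells. This is only a sketch: a background thread stands in for the dd, `os.listdir` stands in for the ls, and the function name, the file name `bigfile.tmp` and the default sizes are assumptions chosen for illustration.

```python
import os
import tempfile
import threading
import time


def time_listing_during_write(directory, total_bytes=64 * 1024 * 1024, chunk=1024 * 1024):
    """Write a large file in a background thread and time one directory listing meanwhile."""
    path = os.path.join(directory, "bigfile.tmp")  # hypothetical scratch file name
    started = threading.Event()

    def writer():
        block = b"\0" * chunk
        with open(path, "wb") as f:
            started.set()  # the file now exists; the listing may begin
            written = 0
            while written < total_bytes:
                f.write(block)
                written += len(block)
            f.flush()
            os.fsync(f.fileno())  # push the data out to the file system

    thread = threading.Thread(target=writer)
    thread.start()
    started.wait()

    # Time a single directory listing while the writer is (most likely) still running.
    begin = time.perf_counter()
    names = os.listdir(directory)
    elapsed = time.perf_counter() - begin

    thread.join()
    os.remove(path)
    return elapsed, names


if __name__ == "__main__":
    # Assumption: replace the temporary directory with a directory on the NFS mount.
    with tempfile.TemporaryDirectory() as target:
        seconds, _ = time_listing_during_write(target)
        print(f"listing took {seconds:.3f} s while writing")
```

On a local disk the listing should return almost immediately. With small sizes the write may finish before the listing starts, so use a file large enough to keep the writer busy when testing an NFS mount.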
The NAS system never made it to production. After 2 years IT agreed that we should have a SAN; works great.