With the growing use of SAS on commodity hardware, many organizations run numerous SAS servers on separate operating system instances within a single SAS infrastructure. This configuration is great for optimizing resources, but when these SAS servers have to share data, SAS recommends the use of a clustered file system.
This recommendation presents an issue for some companies. Because a clustered file system is not part of their standard operating system, it is an additional expense. So, to avoid driving up the cost of the hardware infrastructure for SAS, some IT administrators are proposing the use of NFS to share files among the SAS servers running on different instances of an operating system. Let’s look in more detail at the pros and cons of NFS as a shared file system with SAS.
When NFS is a wise choice
So, let’s quickly discuss why using NFS may be wise. NFS works well in a mostly “read” environment or for SAS shops that have small (less than 1 GB) SAS data files. It can therefore be a good home for permanent SAS data files that SAS jobs access primarily in a read-only manner.
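As a rough illustration of the 1 GB guideline above, the following Python sketch walks a directory tree and lists the SAS data sets that are at or above that size, which are the ones less likely to be a comfortable fit for NFS. The function name large_sas_datasets and the choice of 1024^3 bytes for "1 GB" are assumptions made for this example, and the scan says nothing about whether a file is mostly read or mostly written.

```python
import os

ONE_GB = 1024 ** 3  # assumed byte value for the "1 GB" guideline in the text


def large_sas_datasets(root, limit_bytes=ONE_GB):
    """Return (path, size) pairs for .sas7bdat files of at least limit_bytes.

    Hypothetical helper for illustration; results are sorted largest first.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".sas7bdat"):
                continue  # only look at SAS data set files
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= limit_bytes:
                hits.append((path, size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling large_sas_datasets on the directory that holds your permanent SAS libraries gives a quick list of candidates to review before placing that directory on NFS.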
NFS can be affected by network bandwidth, but network speed and capacity are not the real issue per se. The real issue is largely one of NFS metadata cache coherency, which causes the cached file system metadata to be “dumped” very frequently. NFS does this every time a read or write lock is placed on a file or one of the file’s attributes, such as its size, changes. This dumping of the cached metadata drastically interrupts large sequential writes and slows data processing because the file system is constantly re-reading the metadata over the network and updating its cache.
When NFS is not optimal
NFS file sharing may not perform adequately when SAS users are making lots of updates to files or when data manipulation requires a significant amount of temporary storage. The SAS WORK file system is 50% write and 50% read, and 100% of it is deleted when the SAS session is properly terminated. This type of I/O does not work well with NFS, as we’ve learned from SAS customer experience and testing within SAS. (Details on why this does not work well can be found in this SAS paper: A Survey of Shared File Systems, updated October 2014.)
Please note that there are several storage arrays (many from Network Appliance and EMC Isilon) that only support an NFS-based file system. We are sure the underlying storage array is great, but because the file system associated with them is NFS-based, we strongly recommend they not be used for SAS WORK or for permanent SAS data files where lots of writes occur.
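To check whether a directory such as SAS WORK already sits on NFS, one option on Linux is to compare its path against the mount table. The Python sketch below is a minimal example under stated assumptions: the function names are made up for illustration, the mount table is passed in as text in the /proc/mounts format, only the file system types "nfs" and "nfs4" are treated as NFS, and mount points containing escaped characters (such as spaces written as \040) are not handled.

```python
import posixpath

NFS_TYPES = {"nfs", "nfs4"}  # assumed set of type names to treat as NFS


def filesystem_type(path, mounts_text):
    """Return (mount_point, fs_type) for the deepest mount containing path.

    mounts_text uses the /proc/mounts layout: device, mount point, type, ...
    Returns None if no mount point matches.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # Prefer the longest mount point; on a tie, the later entry wins
            # because it was mounted on top of the earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_on_nfs(path, mounts_text):
    """True if the deepest mount containing path has an NFS type."""
    found = filesystem_type(path, mounts_text)
    return found is not None and found[1] in NFS_TYPES
```

On a Linux host, you could read /proc/mounts into a string and call is_on_nfs with the path configured for SAS WORK (for example a hypothetical /saswork). A True result suggests that the location deserves a second look in light of the recommendation above.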
As always, please let us know if you have any questions or comments on the above.