Recently my wife and I took our annual anniversary trip – this time we went to the Grand Canyon, staying in Las Vegas. In researching our options to fly from Raleigh-Durham (RDU) to Las Vegas (LAS), we had several different selection criteria:
- what time we wanted to leave
- what time we wanted to arrive
- price
- number of stops
- layover time
- airline – loyalty program
- type of aircraft – seating, amenities, food, wifi
All of the flights we looked at would get us from RDU to LAS and back. So the destination wasn’t the issue – it was how much value we placed on each of the attributes: arriving in the afternoon (hotel check-in is 3:00pm) versus spending more money for a nonstop flight, for example. We made our decisions based on our specific needs at that time. We also had different opinions about what was important (I’m basically cheap, and my wife refuses to take the red-eye flight).
The evaluation of storage for a SAS solution can be viewed in a similar fashion. There are tradeoffs to be made, or certainly criteria which will be evaluated and prioritized. This blog posting will briefly examine three such attributes and how they may impact storage in a SAS environment.
Who says you can’t have it all?
This diagram highlights three of the more common attributes that are considered when evaluating storage: price, performance, and availability. While there are certainly other considerations (capacity, interfaces, architecture), these three figure in most storage decisions. The diagram also suggests that there are tradeoffs to be considered: for example, between price and performance (higher performance may require a higher price). Let’s briefly examine each of these, and where we may see tradeoffs in a SAS environment.
Price
Price is usually among the first attributes that come up in any discussion of storage. Everyone is looking to save money, and unfortunately storage often gets compromised. Consider this scenario: our SAS deployment will need about 5 terabytes (TB) of storage. In terms of raw capacity, a new 5TB disk drive can be bought from a number of online vendors for around $150.00 USD. While this drive may meet the capacity requirements, it most likely is not the best selection for a SAS deployment – especially if there are performance or availability considerations. Typical enterprise-class SAS storage may involve configurations with multiple disks and controllers, and perhaps shared storage such as Network Attached Storage (NAS) or Storage Area Networks (SAN). Factoring in these, and possibly other, considerations would most likely (significantly!) increase the price of our storage.
Performance
SAS applications are heavy consumers of storage, and have significant performance expectations for I/O throughput. Many SAS field consultants can share stories of under-performing storage leading to failed deployments and unhappy customers. SAS publishes minimum recommended I/O throughput rates for file systems that are to be used in a SAS environment, and the Performance Evaluation team within SAS R&D has written several papers that document best practices and tuning guidelines. There is even a usage note about testing throughput for your SAS 9 file systems. These papers review multiple configuration options, ranging from shared file systems to external SAN or NAS arrays.
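To make the idea of testing throughput concrete, here is a rough sketch in Python. It is not the method described in the SAS usage note, just a simple sequential-write timing. The function names, the default sizes, and the per-core target are all placeholders I made up for illustration; take the real target rate from the SAS guidance for your environment.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=64, block_kb=64):
    """Time a sequential write into `directory` and return MB/s.

    The timing includes flush + fsync so the OS cache does not hide
    the cost of actually getting the data to the device.
    """
    blocks = (total_mb * 1024) // block_kb
    if blocks <= 0:
        raise ValueError("total_mb must be at least one block in size")
    block = b"\0" * (block_kb * 1024)
    written_mb = blocks * block_kb / 1024

    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    return written_mb / elapsed


def meets_target(measured_mb_s, target_mb_s_per_core, cores):
    """Compare a measured rate with a per-core target (hypothetical inputs)."""
    return measured_mb_s >= target_mb_s_per_core * cores


if __name__ == "__main__":
    # Point this at the file system you want to test, e.g. the SASWORK location.
    rate = measure_write_throughput(tempfile.gettempdir())
    print(f"sequential write: {rate:.1f} MB/s")
```

A single small write like this only gives a first impression. A realistic test uses files larger than memory, includes reads, and runs several streams at once, since that is closer to how SAS jobs hit the storage.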
Availability
Deploying SAS applications into a business-critical environment, or one with availability requirements such as a Service Level Agreement (SLA), requires careful attention to the type and configuration of storage used. Since SAS is implemented on the host OS file systems, commonly used high availability strategies can be applied effectively. From simple strategies, such as configuring local storage with RAID mirroring, to more complex enterprise-class solutions, such as redundancy through a SAN, the appropriate level of high availability can be designed and deployed so that the storage meets the needs of the business.
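A little arithmetic shows why mirroring helps and what it costs. The sketch below assumes disk failures are independent, which is a simplification (shared controllers, power, and firmware can fail together), and the availability figures are invented for illustration, not measurements of any real device.

```python
HOURS_PER_YEAR = 8760


def mirrored_availability(disk_availability, copies=2):
    """Availability of a mirror set: data is lost only if every copy is down.

    Assumes independent failures, which real hardware only approximates.
    """
    return 1 - (1 - disk_availability) ** copies


def raw_capacity_needed(usable_tb, copies=2):
    """Mirroring stores every block `copies` times, so raw capacity scales up."""
    return usable_tb * copies


def expected_downtime_hours(availability):
    """Expected unavailable hours per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.99  # invented figure for one disk
    mirror = mirrored_availability(single)
    print(f"single disk downtime: {expected_downtime_hours(single):.1f} h/year")
    print(f"mirrored downtime:    {expected_downtime_hours(mirror):.2f} h/year")
    print(f"raw TB for 5 TB usable: {raw_capacity_needed(5)}")
```

With these made-up numbers, mirroring takes expected downtime from roughly 87.6 hours a year to under an hour, at the price of buying twice the raw capacity. That is the price versus availability tradeoff in miniature.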
So how does all this fit together?
As you can see, none of these criteria should be considered independently of the others when designing and evaluating storage solutions for SAS environments. There will be tradeoffs made in the evaluation process, and priorities will be established. For some areas, such as performance, there are guidelines established by SAS R&D. In other areas, specific needs of the customer (a specific SLA, for example) may dictate specific design decisions. In addition, there’s some flexibility in certain areas: file systems containing SAS permanent data should be allocated to more available, better-protected storage than the SASWORK file system, which holds only temporary data. A detailed analysis of the storage needs of the SAS deployment, as a part of the overall architecture design, will consider these three criteria in addition to others.
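One simple way to make those priorities explicit is a weighted score, much like ranking the flights at the top of this post. The sketch below is only an illustration: the candidate options, the 1-to-5 ratings (higher is better, so a high price rating means cheaper), and the weights are all invented, and a real evaluation would use measured throughput, quoted prices, and the customer's actual SLA.

```python
# Hypothetical ratings on a 1-5 scale, higher is better on every attribute.
OPTIONS = {
    "single local disk": {"price": 5, "performance": 2, "availability": 1},
    "mirrored local disks": {"price": 4, "performance": 3, "availability": 4},
    "SAN": {"price": 1, "performance": 5, "availability": 5},
}


def score(ratings, weights):
    """Weighted average of the ratings, using the attributes named in weights."""
    total_weight = sum(weights.values())
    return sum(weights[attr] * ratings[attr] for attr in weights) / total_weight


def rank(options, weights):
    """Return option names ordered from best to worst weighted score."""
    return sorted(options, key=lambda name: score(options[name], weights), reverse=True)


if __name__ == "__main__":
    budget_first = {"price": 5, "performance": 1, "availability": 1}
    uptime_first = {"price": 1, "performance": 2, "availability": 3}
    print("budget first:", rank(OPTIONS, budget_first))
    print("uptime first:", rank(OPTIONS, uptime_first))
```

The same three options come out in a different order depending on the weights, which is the point: the storage doesn't change, but what the business values does.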
In case you were wondering, we didn’t take the red-eye.
2 Comments
Could you please tell me if there is any good reading material which discusses pros/cons of having SAS on a NAS vs SAN environment. Basically SAS on NAS vs SAS on SAN. Your help is appreciated.
Thanks for your question.
Here are a few papers that are available on support.sas.com:
Virtualized Environment for SAS® High-Performance Analytics
http://support.sas.com/resources/papers/proceedings13/459-2013.pdf
Best Practices for Data Sharing in a Grid Distributed SAS® Environment
https://support.sas.com/rnd/scalability/grid/Shared_FileSystem_GRID.pdf
A Survey of Shared File Systems: Determining the Best Choice for your Distributed Applications
https://support.sas.com/rnd/scalability/papers/SurveyofSharedFilepaper_20131010.pdf
Usage Note 42197: A list of papers useful for troubleshooting system performance problems
http://support.sas.com/kb/42/197.html