SAS recently performed testing using the Intel Cloud Edition for Lustre* Software - Global Support (HVM) available on AWS marketplace to determine how well a standard workload mix using SAS Grid Manager performs on AWS. Our testing demonstrates that with the right design choices you can run demanding compute and I/O applications on AWS. You can find the detailed results in the technical paper, SAS® Grid Manager 9.4 Testing on AWS using Intel® Lustre.
In addition to the paper, Amazon will be publishing a post on the AWS Big Data Blog that will take a look at the approach to scaling the underlying AWS infrastructure to run SAS Grid Manager to meet the demands of SAS applications with demanding I/O requirements.
System design overview – network, instance sizes, topology, performance
For our testing, we set up the following AWS infrastructure to support the compute and I/O needs for these two components of the system:
- the SAS workload that was submitted using SAS Grid Manager
- the underlying Lustre file system required to meet the clustered file system requirement of SAS Grid Manager.
The SAS Grid nodes in the cluster are i2.8xlarge instances. The 8xlarge instance size provides proportionally the best network performance to shared storage of any instance size, assuming minimal EBS traffic. The i2 instance also provides high performance local storage, which is covered in more detail in the following section.
The use of an 8xlarge size for the Lustre cluster is less impactful since there is significant traffic to both EBS and the file system clients, although an 8xlarge is still the better choice. The Lustre file system has a caching strategy, and you will see higher throughput to clients when cache hits are frequent, which effectively reduces the network traffic to EBS.
Steps to maximize storage I/O performance
The shared storage for SAS applications needs to be high speed temporary storage. Typically temporary storage has the most demanding load. The high I/O instance family, I2, and the recently released dense storage instance, D2, provide high aggregate throughput to ephemeral (local) storage. For the SAS workload tested, the i2.8xlarge has 6.4 TB of local SSD storage, while the D2 has 48 TB of HDD.
Throughput testing and results
We wanted to achieve a throughput of at least 100 MB/sec/core to temporary storage and 50-75 MB/sec/core to shared storage. The i2.8xlarge has 16 cores (32 virtual CPUs; each virtual CPU is a hyperthread, and each core has two hyperthreads). Testing with lower-level tools (fio and a SAS tool, iotest.sh) showed a throughput of about 3 GB/sec to ephemeral (temporary) storage and about 1.5 GB/sec to shared storage. The shared storage figure does not take into account file system caching, which Lustre does well.
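To make the comparison against the per-core targets explicit, the measured aggregate figures can be divided by the core count. The following is only a rough sketch of that arithmetic: it assumes 1 GB = 1000 MB and uses the approximate throughput numbers quoted above, and the variable and function names are illustrative rather than taken from the test harness.

```python
# Rough per-core throughput check for one i2.8xlarge instance.
# Assumption: 1 GB/sec is treated as 1000 MB/sec; measured values are
# the approximate figures quoted in the post, not exact test output.

CORES = 16                          # physical cores (32 vCPUs / 2 hyperthreads)
TARGET_TEMP_MB = 100                # minimum MB/sec/core to temporary storage
TARGET_SHARED_MB = (50, 75)         # target range MB/sec/core to shared storage

measured_temp_mb = 3000             # about 3 GB/sec to ephemeral storage
measured_shared_mb = 1500           # about 1.5 GB/sec to shared storage


def per_core(total_mb_per_sec, cores=CORES):
    """Convert an aggregate throughput into MB/sec per physical core."""
    return total_mb_per_sec / cores


temp_per_core = per_core(measured_temp_mb)
shared_per_core = per_core(measured_shared_mb)

# Compare each per-core figure with its target.
temp_meets_target = temp_per_core >= TARGET_TEMP_MB
shared_meets_target = shared_per_core >= TARGET_SHARED_MB[1]

print(f"temporary storage: {temp_per_core} MB/sec/core, meets target: {temp_meets_target}")
print(f"shared storage: {shared_per_core} MB/sec/core, meets target: {shared_meets_target}")
```

Under these assumptions, both measured figures come out above their respective per-core targets, before any benefit from Lustre caching is counted.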
This testing demonstrates that with the right design choices you can run demanding compute and I/O applications on AWS. For full details of the testing configuration and results, please see the SAS® Grid Manager 9.4 Testing on AWS using Intel® Lustre technical white paper.
12 Comments
Hi Margaret, as a new SAS Admin coming into the environment, the architecture group would have to relay some of the initial information you are discussing here, correct? I would like to know where to start to gather all the vital parts. What should I run or review to gather all the initial facts? Thanks for your availability.
What information are you looking for? General administration information, or information specific to administrating SAS 9.4 on the public cloud?
Hi,
Nice information about running SAS Grid Manager in the AWS cloud. Can you explain: what are the keys to Amazon executing so well on AWS?
Thanks,
BenStokes,
AWS Developer
I don't understand the question: "What are the keys to Amazon executing so well on AWS?"
Is Lustre file system the only file system supported by a SAS 9.4 grid on AWS?
What are the other supported file systems for a SAS 9.4 grid on AWS?
Yes, Lustre is the only one that is available and approved from Amazon for SAS Grid.
Quick question about the diagram on (http://blogs.sas.com/content/sgf/2015/04/27/can-i-run-sas-grid-manager-in-the-aws-cloud/), what is the licensed core count? I.e. that configuration requires XX core licenses. Thanks!
The i2.8xlarge AWS EC2 instance has 32 vCPUs which equates to a 16 core SAS license per EC2 instance.
More details on each AWS EC2 instance can be found on this web page. https://aws.amazon.com/ec2/instance-types/ To determine the SAS core license value, divide the vCPU number by 2.
Margaret,
Is this rule applied to Exalogic, which works with the same VCPU architecture?
Are you referring to Oracle's public cloud offering? The answer is yes if their definition of a VCPU is actually a hyperthread on their system. SAS Foundation and SAS Grid only run on physical cores.
In fact, I was referring to Oracle Exalogic on-premises, but the definition of a VCPU is the same as in Oracle's public cloud when the on-premises hardware has a virtualized deployment. So, do I just need to count the physical cores for licenses?
Best Regards.
That is correct. SAS Foundation and SAS Grid are licensed for the number of physical cores in your hardware infrastructure.