Most organizations enjoy a plethora of SAS user types—batch programmers and interactive users, power users and casual—and all variations in between. Each type of SAS user has its own needs and expectations, and it’s important that your SAS Grid Manager environment meets all their needs.
One common solution to this dilemma is to set up separate configurations based on a mix of requirements for departments, client applications and user roles. The grid options set feature in SAS 9.4 makes this task much easier. A grid options set is a convenient way to name a collection of SAS system options, grid options and required grid resources that are stored in metadata.
Why it’s important to tune SAS Grid Manager for interactive users
SAS Enterprise Guide users running interactive programs typically expect results to be returned almost immediately. However, the out-of-the-box grid options are set for long-running batch jobs. These options include a latency of 20 seconds on the start of every server session, so SAS Enterprise Guide users may experience frustrating delays.
More good news: SAS Enterprise Guide and other SAS software products are grid-aware. Once the optimum grid options set is defined and named, it is applied automatically whenever a user accesses the application and submits a job.
In this post, I’ll use Platform RTM for SAS to walk you through a few simple steps and provide a set of options that you can use as a baseline for tuning your SAS grid for SAS Enterprise Guide users.
1) Reduce grid services sleep times.
The first tuning to perform is usually at the cluster level, to reduce grid services sleep times so that the interactive session starts faster. In Platform RTM, select Config►LSF►Batch Parameters and edit these settings:
Never set these values to 0. You should tailor the actual values to your grid, considering factors such as number of nodes, number of concurrent users, patterns of utilization and so forth. You may need multiple iterations to tune performance to suit the needs of your SAS user type. Figure 1 shows a recommended starting point.
2) Increase the number of job slots.
SAS Enterprise Guide and SAS Add-In for Microsoft Office are designed to keep the server session open for the full duration of the client session unless a user explicitly chooses to disconnect from the server. For SAS Grid Manager, this open session means that one job slot on that server is taken.
Therefore, for SAS Enterprise Guide use, you have to increase the number of job slots for each machine (use the MXJ parameter) from a default of 1 per core up to 5 or even 10 per core, depending on volume of usage. This step will increase the number of simultaneous SAS sessions on each grid node.
Interactive workloads are usually sporadic and intermittent, with short CPU bursts followed by periods of inactivity while the user reviews the results or explores the data. Because these jobs are not I/O- or compute-intensive like large batch jobs, more jobs can safely run on each machine.
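As a rough illustration of the arithmetic in this step, the following Python sketch computes a candidate Max Job Slots (MXJ) value from a core count and a slots-per-core factor. The function name and the 16-core host are hypothetical; only the 1, 5 and 10 slots-per-core figures come from the text above.

```python
def max_job_slots(cores: int, slots_per_core: int) -> int:
    """Return a candidate MXJ value: total job slots for one host."""
    if cores < 1 or slots_per_core < 1:
        raise ValueError("cores and slots_per_core must be positive")
    return cores * slots_per_core


if __name__ == "__main__":
    cores = 16  # hypothetical grid node, not taken from the article
    for per_core in (1, 5, 10):
        # 1 per core is the default; 5 to 10 per core suits interactive use
        print(f"{per_core} slot(s) per core -> MXJ = {max_job_slots(cores, per_core)}")
```

The right factor still depends on how many sessions are idle at any given time, so treat the result as a starting point to refine over several tuning iterations.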
3) Implement CPU utilization thresholds for each machine.
Next, it is advisable to implement CPU utilization thresholds for each machine to prevent servers from being overloaded. With this limit in place, even if many users submit CPU-intensive work at the same time, SAS Grid Manager can manage the workload by suspending some jobs and resuming them when resources are available.
The changes in Step 2 and Step 3 are made at the host level. In RTM, select Config►LSF►Batch Hosts►default, edit the Max Job Slots value and add the Advanced Attribute ut. See Figure 2.
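To make the effect of a CPU utilization threshold more concrete, here is a simplified Python model of the decision a scheduler makes for one host. It assumes the ut attribute is given as a pair of thresholds, one for scheduling and one for suspending; the function name and the 0.7 and 0.9 values are illustrative assumptions, not the values shown in Figure 2, and the real scheduler logic is more involved.

```python
def host_action(ut: float, load_sched: float, load_stop: float) -> str:
    """Classify a host by its current CPU utilization (0.0 to 1.0).

    Simplified model, not the actual scheduler code:
      - above load_stop:  running jobs may be suspended
      - above load_sched: no new jobs are dispatched to the host
      - otherwise:        the host accepts new jobs
    """
    if not 0.0 <= load_sched <= load_stop <= 1.0:
        raise ValueError("expected 0 <= load_sched <= load_stop <= 1")
    if ut > load_stop:
        return "suspend"
    if ut > load_sched:
        return "hold"
    return "dispatch"


if __name__ == "__main__":
    # Hypothetical thresholds: stop dispatching at 70%, suspend at 90%
    for ut in (0.50, 0.80, 0.95):
        print(f"ut={ut:.2f} -> {host_action(ut, 0.7, 0.9)}")
```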
4) Create dedicated queues.
Even with this tuning, one user can easily use up all of the slots of a grid by starting many SAS Enterprise Guide sessions or by writing code that uses all the available slots for a single SAS session. When a machine runs out of slots, it is closed for use and work is routed to the next available slot. If all machines are closed and no machine has a free slot, no user can get another workspace. It doesn’t matter that the user with many open sessions is not actually using the resources. That user might go to lunch, leaving a session open on a results page with no CPU, no I/O, nothing used on the server.
The best way to prevent this is to create a dedicated queue called EGDefault, with a sufficiently low UJOB_LIMIT parameter (for example, 3 slots as shown in Figure 3). Each user is then limited to 3 concurrent server sessions, whether started from the same client or from different SAS Enterprise Guide instances. When using SAS Enterprise Guide parallel features, the value of UJOB_LIMIT should be higher, provided that proper server sizing has been performed to accommodate the additional resources required.
In RTM, you can create this queue by selecting Config►LSF►Queues►Add. To make this the default queue for SAS Enterprise Guide users, all you have to do is create a grid options set in SAS Management Console and add the EGDefault queue to it as a grid option.
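The effect of a per-user limit such as UJOB_LIMIT can be sketched in a few lines of Python. This is a toy model under stated assumptions: the function and user names are hypothetical, sessions never end, and host-level slot limits are ignored. It only shows that requests beyond the limit wait instead of taking more slots.

```python
from collections import Counter


def admit_sessions(requests, ujob_limit=3):
    """Decide, in arrival order, whether each session request runs or pends.

    requests: iterable of user names, one entry per requested session.
    Returns a list of (user, decision) tuples.
    """
    running = Counter()  # sessions currently running, per user
    decisions = []
    for user in requests:
        if running[user] < ujob_limit:
            running[user] += 1
            decisions.append((user, "run"))
        else:
            # The user already holds ujob_limit slots, so this request waits
            decisions.append((user, "pend"))
    return decisions


if __name__ == "__main__":
    # Hypothetical users: ann opens four sessions, bob opens one
    for user, decision in admit_sessions(["ann", "ann", "ann", "ann", "bob"]):
        print(user, decision)
```

With a limit of 3, the fourth request from the same user pends while other users can still obtain a slot.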
5) Create other grid options sets as needed.
There will always be ad hoc users or projects that do not fit into default categories (for example, they might be running jobs that have a high priority or jobs that require a large number of computing resources). For users who need higher priority for their jobs or more computing resources, it is just a case of defining a new queue such as EGPower. To prevent misuse, it's common to limit access to this special queue to selected users.
In previous releases, access to such additional queues would have been restricted by defining a special user group and then adding it to the USERS parameter in the queue definition. While effective, this has the disadvantage of duplicating user-related management both in metadata and in grid configuration files. With SAS 9.4, it is possible to apply metadata security to grid options sets and keep everything in one place, namely in metadata.
6) Set options for other interactive and batch queues.
Finally, if you have other queues, for example, ones dedicated to SAS® Data Integration Studio users or to batch processing, put job slot limits there, too, to compensate for the large increase to the Max Job Slots parameter we made for default hosts. Figure 4 shows the Advanced Attribute PJOB_LIMIT added to a batch queue, to enforce the limit of one batch job per physical core on every host.
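The interaction between the raised host limit and a per-core queue limit can be checked with a small calculation. The sketch below assumes that a queue's usable slots on a host are the smaller of the per-core limit times the core count and the host's Max Job Slots value; the function name and the 8-core host are hypothetical.

```python
def queue_slots_on_host(cores: int, pjob_limit: int, host_mxj: int) -> int:
    """Slots one queue can use on one host under a per-core limit.

    Simplified model: the queue gets at most pjob_limit slots per core,
    and never more than the host-wide Max Job Slots (MXJ) value.
    """
    if cores < 1 or pjob_limit < 0 or host_mxj < 0:
        raise ValueError("invalid host or queue settings")
    return min(cores * pjob_limit, host_mxj)


if __name__ == "__main__":
    cores = 8              # hypothetical host
    host_mxj = cores * 5   # MXJ raised to 5 slots per core, as in Step 2
    # PJOB_LIMIT of 1 keeps batch work at one job per core despite MXJ = 40
    print(queue_slots_on_host(cores, 1, host_mxj))
```

In this example the batch queue stays at 8 concurrent jobs on the host, leaving the remaining slots for interactive sessions.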
When you have all queues defined, your final configuration may look like the following:
For more details about using SAS Enterprise Guide in a SAS Grid Manager environment, you can refer to my SAS Global Forum 2014 presentation: Effective Use of SAS® Enterprise Guide® in a SAS® 9.4 Grid Manager Environment.