Help, I lost my SAS server!

Like most SAS users and administrators, you usually don't know where your backend SAS servers are located: probably in some basement server farm, or perhaps in another building or even another town. You do know, though, that your SAS client application must have a way to reach services running on those hosts. But what happens if the server fails? How will your application find the lost server? How can grid-enabled SAS servers be configured to cause the least disruption to services?

Let's review what we know about client applications and their communication with host servers:

The SAS client applications must know the SAS server name (or at least an alias) and port where services are listening.
The SAS client applications store the SAS server name and port for services that are required in configuration files or in metadata.

For most grid-enabled installations, when a SAS server is taken offline or fails to start, its services are protected by failover mechanisms and restarted on a different server. If server names and identities are stored in the client application or in metadata, how can the connection be restored? Well, as the SAS administrator, you would have to manually update the SAS client application’s configuration information or the SAS Metadata definitions with the new location. That is not an easy task to hand to a business user, and it is not manageable for large enterprises.

There are two solutions to enable client applications to access lost services regardless of whether the service is running on a primary or a failover server:

a hardware solution using a virtual IP switch or IP load balancer
a software solution using DNS resolution

In this post, I'll look in detail at the hardware solution for avoiding lost servers!

Normal connection sequence

Finding the IP address for a server name is referred to as resolving the IP address and is built into the operating system’s networking software. Let’s see what happens under the covers when a client such as SAS Management Console tries to resolve the SAS Metadata Server's name.

SAS Management Console makes a request to connect to the SAS Metadata Server on sgcwin071.exnet.xyz.com on port 5555.
The networking software determines the physical IP address for the sgcwin071.exnet.xyz.com name and returns it to SAS Management Console.
The connection request is properly routed to the physical server where the SAS Metadata Server is running.
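
The steps above can be sketched in Python. This is only an illustration built on the standard socket module, not how SAS Management Console is actually implemented, and the function names resolve_name and open_connection are made up for this example.

```python
import socket


def resolve_name(host: str, port: int) -> list:
    """Ask the operating system's resolver for the IP addresses behind host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first element of sockaddr is the IP address
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def open_connection(host: str, port: int, timeout: float = 3.0) -> socket.socket:
    """Resolve host and open a TCP connection to the first address that answers."""
    # create_connection performs the name resolution and the connect in one call.
    return socket.create_connection((host, port), timeout=timeout)
```

With these helpers, a call such as open_connection("sgcwin071.exnet.xyz.com", 5555) would follow the same resolve-then-connect path described in the list.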

Connection sequence with a virtual IP switch

The virtual IP switch, or IP load balancer, is a hardware device that serves as an intermediary between a client application and the services running on the grid. In effect, it helps decouple grid operation from the physical structure of the grid. When the client application wants to connect to the service, it connects to the load balancer, which then redirects the request to the appropriate host machine running the service. Common IP switches on the market include products from Cisco, F5 BIG-IP, and Barracuda Load Balancer.

Here's the connection sequence between SAS Management Console and SAS Metadata Server with a virtual IP switch in place:

SAS Management Console makes a request to connect to the SAS Metadata Server on meta_alias on port 5555.
The networking software determines the IP address for the meta_alias name, which is actually an alias set up to point to the virtual IP switch, and returns it to SAS Management Console.
SAS Management Console sends the connection request to the virtual IP switch for the IP address: 10.0.0.2.

The virtual IP switch is configured to forward requests for the IP address 10.0.0.2 to the primary or secondary server, depending on which server is listening on port 5555. In this case, sgcwin071 is running the SAS Metadata Server on port 5555, so the request is forwarded to the IP address 123.456.78.90, and the connection is completed.
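
To make the "which server is listening" decision concrete, here is a minimal Python sketch. It assumes the simplest possible probe, a plain TCP connect; a real IP switch is configured through the vendor's own health-check settings, so is_listening and pick_backend are hypothetical names used only for illustration.

```python
import socket
from typing import List, Optional, Tuple

Backend = Tuple[str, int]  # (host, port)


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolved
        return False


def pick_backend(backends: List[Backend]) -> Optional[Backend]:
    """Return the first backend that accepts connections, or None if none do."""
    for host, port in backends:
        if is_listening(host, port):
            return (host, port)
    return None
```

Called with the primary listed first, for example pick_backend([("sgcwin071", 5555), ("sgcwin072", 5555)]), the function returns whichever of the two hosts currently accepts connections on port 5555.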

Connection sequence with IP switch with failover

Now let's look at the connection sequence when one of the servers in the configuration fails, for example, sgcwin071. The grid has been configured to restart the SAS Metadata Server on the failover server, in this example, sgcwin072.

The SAS Management Console client application follows the same connection pathway as always; however, the virtual IP switch recognizes that port 5555 is open on sgcwin072, not sgcwin071. Therefore, the switch forwards connection requests for 10.0.0.2 to IP address 123.456.78.91.

The end user never notices that the application has now reached a different server! There's no disruption in service, and no action is required from the SAS administrator.
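
The failover scenario can be modeled with a small in-memory simulation in Python. It is a toy model under stated assumptions, not a description of any vendor's device: the Host and VirtualIPSwitch classes are hypothetical, the probe is injected as a plain function, and the host names and addresses are the placeholders used in this post.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Host:
    name: str
    address: str  # placeholder from this post, not a valid IPv4 address


class VirtualIPSwitch:
    """Toy model: forwards traffic for one virtual IP to a listening host."""

    def __init__(self, virtual_ip: str, port: int, hosts: List[Host],
                 is_listening: Callable[[Host, int], bool]) -> None:
        self.virtual_ip = virtual_ip
        self.port = port
        self.hosts = hosts
        self.is_listening = is_listening

    def forward(self, destination_ip: str) -> Optional[Host]:
        """Return the host that should receive the request, or None."""
        if destination_ip != self.virtual_ip:
            return None
        for host in self.hosts:
            if self.is_listening(host, self.port):
                return host
        return None


hosts = [Host("sgcwin071", "123.456.78.90"), Host("sgcwin072", "123.456.78.91")]
running_on = {"sgcwin071"}  # where the SAS Metadata Server currently runs
switch = VirtualIPSwitch("10.0.0.2", 5555, hosts,
                         lambda host, port: host.name in running_on)

before = switch.forward("10.0.0.2")  # primary is up

# Simulate the failure of sgcwin071 and the restart on sgcwin072.
running_on.clear()
running_on.add("sgcwin072")
after = switch.forward("10.0.0.2")  # same virtual IP, different target
```

The client side of the model never changes: both calls use 10.0.0.2, and only the forwarding target returned by the switch differs before and after the failure.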

Advantages of a hardware solution

The advantages of a virtual IP switch are evident:

It provides a single virtual IP address for critical applications to clients.
It provides quick failover because switching incoming traffic to the failover host is nearly instantaneous.
SAS clients don’t need to know where applications are running.

This solution may be more expensive than a software approach, since organizations have to purchase and maintain additional hardware. Additionally, to keep the virtual IP switch itself from becoming a single point of failure, the installation needs at least two of them.

In my next post, I’ll show how to achieve the same result with a software implementation. Stay tuned!

--Edoardo

3 Comments

john on June 9, 2015 12:14 am

I was thinking of getting started with a high-spec supercomputer ( http://www.spectra.com/hitachi/ ). I probably won't have any issues with it.

Jeff on November 6, 2014 2:57 pm

How is the health check implemented from the load balancer? I've seen this done on other services, where the load balancer sends a specific connection string to TCP port 5555 and expects a specific value in return. The return value would enable the load balancer to determine which server is best able to handle the traffic requests from the client.

Pingback: Help, I lost my SAS server again! - SAS Users Groups
