SQL is an important language for any programmer working with data. In SAS Cloud Analytic Services (CAS) you can execute SQL queries using the fedSQL.execDirect CAS action!
Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. I've broken the series into logical, consumable parts. If you'd like to start by learning a little more about what CAS Actions are, please see CAS Actions and Action Sets - a brief intro. Or if you'd like to see other topics in the series, see the overview page.
In this example, I will use the CAS procedure to execute the execDirect action. Be aware, instead of using the CAS procedure, I could execute the same action with Python, R, or even a REST API with some slight changes to the syntax for the specific language.
Why not PROC SQL?
The FedSQL language is the SAS implementation of the ANSI SQL:1999 core standard. FedSQL provides a scalable, threaded, high-performance way to query data and create tables. FedSQL also enables you to process and join data using industry-standard query expressions and SQL expressions. In SAS Viya, the execDirect CAS action submits a SAS FedSQL language statement for immediate execution in the CAS server. Moreover, you can also use PROC FEDSQL to query tables in CAS. Behind the scenes the CAS API converts the procedure to the execDirect action.
For my PROC SQL fans, be careful. If you use PROC SQL to query a CAS table, the entire CAS table is sent back to the client for processing. This can cause slow performance or cause a data transfer error because PROC SQL does not execute in CAS.
Execute SQL in SAS Viya
Consider the code below. To execute fedSQL directly in CAS, use the fedSQL.execDirect CAS action with the query parameter. In the query parameter specify the query as a string. In the FROM clause use a two level naming convention to specify the CAS table name and caslib. Here I'll specify the cars table in the Casuser caslib.
proc cas; fedSQL.execDirect / query="select Make, Model, MSRP, mean(MPG_City,MPG_Highway) as MPG_Avg from casuser.cars where Make='Toyota' order by MPG_Avg desc"; quit;
The execDirect action executes the query in the distributed CAS environment and returns the expected results.
Using a SOURCE block
While the query we executed was simple, this is not always the case. Adding a complicated query as a string can make writing the query difficult.
Instead of specifying the query as a string in the CAS procedure, use the SOURCE statement to embed text in a variable. In the following example, I'll execute the same query as before. However, this time I'll nest the query inside a SOURCE block by specifying the SOURCE statement and name the variable MPG_toyota. Then I'll add the query inside the SOURCE block and use the ENDSOURCE statement to end the block.
proc cas; source MPG_toyota; select Make, Model, MSRP, mean(MPG_City,MPG_Highway) as MPG_Avg from casuser.cars where Make='Toyota' order by MPG_Avg desc; endsource; fedSQL.execDirect / query=MPG_toyota; quit;
After the SOURCE block is complete, you can reference the variable as the value to the query parameter in the execDirect action.
And the results:
The results returned are the same, but using a SOURCE block makes the code easier to write and maintain.
In SAS Viya, FedSQL provides a scalable, threaded, high-performance way to query data and create new CAS tables from existing tables in the CAS server. In this example we saw two distinct ways to run SQL code on SAS Viya. This is only the beginning. See the resources below for more details on PROC FedSQL.
SAS® Viya®: FedSQL Programming for SAS® Cloud Analytic Services
SAS® Viya®: How to Emulate PROC SQL Using CAS-Enabled PROC FedSQL
execDirect CAS Action
SAS® Cloud Analytic Services: Fundamentals