Many SAS Enterprise Guide users practically live in the Query Builder. For those who understand their data tables, the Query Builder provides a tremendous amount of flexibility to pull and manipulate data. The Query Builder produces SQL programs behind the scenes, which translates well for database-centric work.
Sometimes a complex query requires a two-pass process. For example, suppose that you want to create a table of CARS that contains only those models with a higher-than-average price. First you need a query to calculate the average price across all cars. Then you need a second query to filter the records with a price that's higher than that result.
A SAS Enterprise Guide user might build something like this:
This method has obvious downsides: it requires two passes through the data, and two sets of results are brought into your SAS session. If the source table is in a database, this can be especially inefficient.
SAS Enterprise Guide 5.1 supports the idea of subqueries, a concept familiar to SQL programmers. With a subquery (sometimes called "subselect" or "inner query"), you can express the entire operation in one SQL step. That's one trip to the database, and only one set of results brought back to SAS.
I'll briefly describe how this works. For comprehensive documentation and examples about the subquery feature in SAS Enterprise Guide 5.1 and later, read Michael Burke's excellent conference paper on Finding Your Inner Query with SAS Enterprise Guide.
The subquery feature relies on the ability to define query templates (also new in SAS Enterprise Guide 5.1), which are query definitions that you can reuse as building blocks for other query steps. (Query templates are useful on their own, as a way to store a query within your task menus so that you can use it across different projects.)
Using the subquery feature
First, use the Query Builder to define the "inner query". In my example, that query contains just one item, the calculated average of MSRP.
Instead of running the query, use the drop-down arrow on the Run button to select Create Template. The Create Template window allows you to assign a name to this definition, and it's automatically added to your Tasks->Task Templates menu. Note that you can create the query template without even running this part of the query. That can be a time saver if you're a Query Builder Ninja with high confidence in your result.
Next, design the outer query with a new Query Builder task. Select all of the fields that you want to include in the result. In the Filter tab, add a new filter for MSRP. Instead of comparing MSRP to a constant or other expression, you'll select from a list of available Subquery templates.
In your Filter tab, you'll see a partial view of the inner SELECT expression.
When you complete your selections and have the other query options just the way that you want them, click Run to execute the query.
If you examine the SAS program that the task creates, you'll see the inner SELECT clause:
PROC SQL; CREATE TABLE WORK.expensive AS SELECT t1.Make, t1.Model, t1.MSRP, FROM SASHELP.CARS t1 WHERE t1.MSRP >= ( SELECT /* AVG_of_MSRP */ (AVG(t1.MSRP)) AS AVG_of_MSRP FROM SASHELP.CARS t1 ); QUIT; |
You can also use subqueries as a way to filter/compare a column to a list of values. In that case, your inner query simply needs to have a single-column list of results (instead of a single number, as in my example). Read Michael's subquery paper for all of the possibilities and limitations.
From Michael's paper, here are the types of subqueries that you can create using query-based templates:
- Subqueries that return a single value
- Subqueries that return multiple values (multiple rows of a single field)
- Subqueries that appear as part of a filter of the raw data (on the WHERE clause)
- Subqueries that appear as part of a filter on the grouped data (on the HAVING clause)
- Subqueries that appear as part of a recode condition as part of recoding a column in a computed column (on the SELECT clause)
Here are the types of subqueries that you cannot create using query-based templates:
- Subqueries that form a derived table (subqueries that appear on the FROM clause)
- Subqueries that refer to columns on the outer query (correlated subqueries)
More than one way to skin this data
There are many other ways to accomplish this simple example in SAS. For example, you may be able to use the HAVING clause feature in the Query Builder, or use SQL joins and special join conditions to filter a table based on another result set. But the ability to create a subquery has been long-requested from SAS Enterprise Guide users who understand databases. I hope you'll agree that it's an important addition to your query and reporting toolbox.
2 Comments
Hello Chris.
I always do subqueries using the approach you describe above. But after installing SAS EG 7.1, I can't get it to work. When I try to use template in a Query Builder task, I get an error.
I still have SAS EG 5.1 on my machine, and it works fine there. I can also create a project using subquery in 5.1, open it in 7.1 and run it.
Do you know anything about it?
Error is as follows:
Exception Details:
----------------------------------------
Exception type: SAS.EG.SDS.SDSException
Message: Server-udførelse mislykkedes
Source: SAS.EG.SDS.Model
Target Site: SASWorkspace
Stack Trace:
at SAS.EG.SDS.Model.Server.SASWorkspace()
at SAS.EG.ProjectElements.Forms.FilterBasicPage.LaunchSubqueryUI(String templateName, String templateXML)
at SAS.EG.ProjectElements.Forms.FilterBasicPage.ShowValuePicker(Control ctrlToUse)
at SAS.EG.ProjectElements.Forms.FilterBasicPage.cmdAddToList_Click(Object sender, EventArgs e)
Best regards, Olga Kazarina.
Olga,
I don't know why it's not working. This is something that you should track with SAS Technical Support. One possible difference is the way that SAS EG 7.1 connects to the SAS workspace and passes locale/encoding options. I see that your SAS session is probably running in Danish (from the error message) and perhaps that's a difference from the way EG 5.1 connects. I'll also send this to the R&D team, but definitely open a track to make sure you get a resolution.