Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. I've broken the series into logical, consumable parts. If you'd like to start by learning a little more about what CAS Actions are, please see CAS Actions and Action Sets - a brief intro. Or if you'd like to see other topics in the series, see the overview page. Otherwise, let's learn how to create calculated columns in CAS tables.
In this example, I will use the CAS procedure to create columns in CAS tables for use in CAS actions. I could execute similar code in Python, R and more with some slight changes to the syntax for the specific language. The other languages also include specific methods to accomplish the same objective.
I'll start by executing the following code to create a CAS table named PRODUCTS in the Casuser caslib with 80 million rows. Then I'll create a variable named productsTbl to reference the PRODUCTS CAS table, and use the variable as the value to the table parameter in the fetch CAS action.
proc cas; productsTbl = {name = 'products', caslib = 'casuser'}; table.fetch / table = productsTbl; quit; |
Create a Calculated Column
To create a calculated column you add the computedVarsProgram sub parameter to the CAS table reference. The computedVarsProgram parameter specifies an expression for each computed variable and takes a string as it's value. The string can be:
- an assignment statement that can include most SAS functions.
- a SAS IF/THEN/ELSE statement
- a SAS LENGTH or LABEL statement
I'll start by creating a simple calculated column named Total_Price that multiplies Price by Quantity. This calculation triggers when referencing the productsTbl CAS table in the table parameter of a CAS action. Here I'll use the fetch action to view 20 rows of the table.
proc cas; productsTbl = {name = 'products', caslib = 'casuser', computedVarsProgram = 'Total_Price = Price * Quantity;'}; table.fetch / table = productsTbl; quit; |
Notice in the results above, the new column Total_Price was created during the execution of the fetch action. The original CAS table was not modified.
Create Multiple Calculated Columns
Next, I'll create a second column by adding a new assignment statement in the string. The second calculated column, Product_fix renders upper case values in the original Product column. In the string you must end each assignment statement with a semicolon. Above all, a new column is required since the computedVarsProgram sub parameter cannot overwrite existing columns in the table. To update values in a CAS table column, use the table.update action.
proc cas; productsTbl = {name = 'products', caslib = 'casuser', computedVarsProgram = 'Total_Price = Price * Quantity; Product_fix = upcase(Product);'}; table.fetch / table = productsTbl; quit; |
Notice in the results above, the fetch action created the Total_Price and Product_fix columns.
Use Conditional Logic
In addition, you can also use a traditional SAS IF/THEN/ELSE statement in the computedVarsProgram sub parameter. For example, I'll create a new column named Return_fix that replaces missing values in the Return column with the value No.
proc cas; productsTbl = {name = 'products', caslib = 'casuser', computedVarsProgram = 'Total_Price = Price * Quantity; Product_fix = upcase(Product); if Return = "" then Return_fix = "No"; else Return_fix = "Yes";' }; table.fetch / table = productsTbl; quit; |
In the results above there are three new calculated columns: Total_Price, Product_fix and Return_fix.
Storing the Calculated Columns in a Variable
Lastly, instead of adding the calculated columns as a string, I like to store them in a variable using the CAS language SOURCE block statement. Storing the string in the the SOURCE block enables you to embed the calculated columns as a text string and avoid quotation issues that can occur. I'll name my SOURCE block variable createColumns and use it as the value in the computedVarsProgram sub parameter.
proc cas; source createColumns; Total_Price = Price * Quantity; Product_fix = upcase(Product); if Return = "" then Return_fix = "No"; else Return_fix = "Yes"; endsource; productsTbl = {name = 'products', caslib = 'casuser', computedVarsProgram = createColumns }; table.fetch / table = productsTbl; quit; |
The results show the same result as the previous section.
While this is great, how can I format the new Total_Price column using the traditional SAS dollar format? Look closely at the Return_fix column. Do you see an issue with the Yes values? Well, stay tuned for part 2!
Summary
In summary, using the computedVarsProgram sub parameter allows you to easily create calculated columns in a CAS tables during the execution of CAS actions. A few key points:
- The computedVarsProgram sub parameter takes a string as it's value and uses traditional SAS assignment statements, IF/THEN/ELSE statements, and you can also use the LABEL and LENGTH statements.
- Using the computedVarsProgram sub parameter does not modify the original CAS table. It is simply applied during the execution of a CAS action.
- If you create a dictionary variable with the CAS table reference, you can then use that variable in a variety of actions within the table parameter.
- Using a SOURCE block enables you to embed your text in a variable without using a string.
Additional resources
fetch Action
SAS® Cloud Analytic Services: Fundamentals
Code
2 Comments
Great question Arne. So in this example I'm showing to how to create a column for ad hoc analysis for any action. However, to create a new table and promote it using the new columns you will have to use the table.copyTable action (if you are using Viya 3.5 you will have to use the table.partition action in the same way). The copyTable action will reference the table, create the new columns you specify, filter columns if you want and so on and create a new CAS table with that information.
Try this code. It's using most of the code from the last example of this blog. Let me know if this works.
* Storing the Calculated Columns in a Variable *;
proc cas;
source createColumns;
Total_Price = Price * Quantity;
Product_fix = upcase(Product);
if Return = "" then Return_fix = "No";
else Return_fix = "Yes";
endsource;
productsTbl = {name = 'products',
caslib = 'casuser',
computedVarsProgram = createColumns
};
table.fetch / table = productsTbl;
* Create and promote a new table in CASL *;
table.copyTable /
table = productsTbl,
casOut = {name = 'final_table', /*New table information */
caslib = 'casuser',
promote = TRUE};
* Confirm the new table exists. It'll also identify the table is promoted *;
table.tableInfo / caslib = 'casuser';
quit;
That's great.
But I'm struggling to promote the update table.
So I want the added columns to persist.