The SAS® Viya® platform provides multiple methods to append to CAS tables

Appending data tables is a common task for data analysis. The release of the SAS Viya platform offers several new methods to append CAS tables. This article provides examples of three methods that you can use.

DATA step

Before the SAS Viya platform was released, the only way you could append to a CAS table was to use the DATA step. For example, the following code uses the APPEND=YES option in the DATA statement to append CASUSER.B to CASUSER.A:

caslib _all_ assign;
data casuser.a(append=yes);
   set casuser.b;
run;

This code is straightforward, and it lets you modify the data in the same step that appends it. However, as familiar and flexible as the DATA step is, new methods are now available that can improve append performance.
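
As a minimal sketch of modifying data while appending, the following step filters and changes rows from CASUSER.B before they are written to CASUSER.A. The column names n1 and c1 are assumptions for illustration (they match the sample program later in this article), and the step assumes that a CAS session is active and that the caslibs are already assigned:

```sas
/* Sketch: filter and modify rows from B while appending them to A.        */
/* Assumption: both tables contain the columns n1 (numeric) and c1 (char). */
data casuser.a(append=yes);
   set casuser.b;
   where n1 > 5000;      /* append only a subset of the rows   */
   c1 = upcase(c1);      /* change a value before it is written */
run;
```

Because the filter and the assignment run in the same step as the append, no intermediate table is needed.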

The CASUTIL procedure

PROC CASUTIL is versatile and allows you to perform several tasks within SAS® Cloud Analytic Services (CAS). PROC CASUTIL was available in SAS® Viya® 3.x, but the APPEND statement was not added until the SAS Viya platform was released.

Here is the syntax for the APPEND statement in PROC CASUTIL.

PROC CASUTIL <options>;
   APPEND SOURCE="table-1" <SRCCASLIB="caslib"> <DATASOURCEOPTIONS=(data-source-options)>
          TARGET="table-2" <TGTCASLIB="caslib">;

Append a table to a table

The table listed in the SOURCE= option is appended to the table listed in the TARGET= option. Here is the code to append the B table in CASUSER to the A table in CASUSER:

proc casutil;
  append source='b' srccaslib='casuser' target='a' tgtcaslib='casuser';
quit;

Append a data set to a table

PROC CASUTIL also allows you to append a SAS data set to a CAS table using the APPEND option in the LOAD statement. This syntax appends the SAS data set MYSASLIB.B to the A table in CASUSER:

proc casutil;
  load data=mysaslib.b casout='a' outcaslib='casuser' append;
quit;

The CAS procedure

The ability to append to a CAS table was also added to PROC CAS in the SAS Viya platform. You can use the table.Append action to append one CAS table to another. Here is the syntax for the APPEND action:

table.append <result=results> <status=rc> /
   *source={
      caslib="string",
      dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>},
      *name="table-name",
      singlePass=TRUE | FALSE,
      where="where-expression"
   },
   *target={
      caslib="string",
      *name="table-name"
   }
;
* indicates a required parameter

Similar to the DATA step, PROC CAS is extremely versatile. It can run a large number of actions, and it can take advantage of CASL to make your code much more dynamic. Refer to the documentation for PROC CAS to see all the functionality that it offers.

Here is example code that appends the B table in CASUSER to the A table in CASUSER:

 proc cas;
     table.append /
     source={caslib='casuser', name='b'}
     target={caslib='casuser', name='a'};
 quit;
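
The syntax listing above includes a where= parameter on the source table. As a hedged sketch, that parameter could be used to append only a subset of the rows in B. The column name n1 is an assumption for illustration (it matches the sample program later in this article), so substitute a column that exists in your own tables:

```sas
proc cas;
   /* Assumption: tables A and B in CASUSER both contain a numeric column n1. */
   /* Only the rows of B that satisfy the where expression are appended to A. */
   table.append /
      source={caslib='casuser', name='b', where='n1 > 5000'}
      target={caslib='casuser', name='a'};
quit;
```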

Method choice

You can choose different methods to append tables together based on what you need to accomplish within the job. Here are some example choices:

  • If you only need to append one table to another, with no other processing, then I recommend using PROC CASUTIL with the APPEND statement.
  • If you need to run multiple actions against the appended tables, then I recommend using PROC CAS with the table.Append action.
  • If the data needs to be modified, then the DATA step would be the best choice.

Here is a sample program that you can run to test the different methods of appending. Depending on your environment, PROC CASUTIL and PROC CAS might process faster than the DATA step:

cas;
caslib _all_ assign;
options msglevel=i fullstimer;
 
/* random sample data */
data casuser.b;
array chars(5) $5 c1-c5;
array nums(5) n1-n5;
do i=1 to 5000000;
  do j=1 to 5;
    chars(j)=repeat(byte(rand('integer',65,90)),rand('integer',1,5));
    nums(j)=rand('integer',1,9999);
  end;
  output;
end;
drop i j;
run;
 
proc casutil;
  copy casdata='b' incaslib='casuser' casout='a1' outcaslib='casuser' replace;
  copy casdata='b' incaslib='casuser' casout='a2' outcaslib='casuser' replace;
  copy casdata='b' incaslib='casuser' casout='a3' outcaslib='casuser' replace;
quit;
 
/* DATA step */
data casuser.a1(append=yes);
  set casuser.b;
run;
 
/* PROC CASUTIL with the APPEND statement */
proc casutil;
  append source='b' srccaslib='casuser' target='a2' tgtcaslib='casuser';
quit;
 
/* PROC CAS using the table.Append action */
proc cas;
  table.append / 
  source={caslib='casuser', name='b'}
  target={caslib='casuser', name='a3'};
quit;
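
To confirm that all three methods produced the same result, one option is to compare row counts after the program runs. This is a sketch that assumes the sample program above has just run in the same CAS session. It uses the table.recordCount action, and each target table would be expected to report twice as many rows as B, because each one started as a copy of B before B was appended to it:

```sas
proc cas;
   /* Row count of the source table */
   table.recordCount / table={caslib='casuser', name='b'};

   /* Row counts of the three target tables, one per append method */
   table.recordCount / table={caslib='casuser', name='a1'};   /* DATA step     */
   table.recordCount / table={caslib='casuser', name='a2'};   /* PROC CASUTIL  */
   table.recordCount / table={caslib='casuser', name='a3'};   /* table.append  */
quit;
```

With FULLSTIMER set, the log also shows the elapsed time for each append step, which is what lets you compare the methods in your own environment.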

SAS usually provides more than one way to accomplish a task. In the SAS Viya platform, there are multiple ways to append CAS tables together to improve performance.

About Author

Kevin Russell

SAS Technical Support Engineer, CAS and Open Source Languages

Kevin Russell is a Technical Support Engineer in the CAS and Open Source Languages group in Technical Support. He has been a SAS user since 1994. His main area of expertise is the macro language, but he also provides general support for the DATA step and Base procedures. He has written multiple papers and presented them at various SAS conferences and user events.
