Customizing the output data set created using the OUTPUT statement
When you request statistics on the PROC MEANS statement, the default printed output creates a nice table with the analysis variable names in the left-most column and the statistics forming the additional columns. Even if you create an output data set with no statistics listed on the OUTPUT statement, the default statistics, N, MIN, MAX, MEAN, and STD, are output in a nice table format as values of the _STAT_ variable and the analysis variables form the other columns. Once you start requesting specific statistics on the OUTPUT statement, the format of the data set changes. A wide output data set is the result with only one observation and a variable for each analysis variable-statistic combination. If you use a BY or CLASS statement, you will get multiple observations for each unique value of the BY or CLASS variable combinations, but the analysis variable structure is the same. The structure of this data set can be hard to view. It would be nice if the output data set could maintain the format of the printed output or the default output data set. You can do that with a few simple code modifications using PROC TRANSPOSE and DATA Step logic.
If you want the output data set to look like the default printed output, use the steps below.
/* create a sample data set */ data test; input x1 x2 x3; cards; 1 2 3 4 5 6 7 8 9 ; run; /* Request statistics on the OUTPUT statement. */ /* Use the AUTONAME option to create variable names in the form: */ /* variable_statistic */ proc means noprint; var _numeric_; output out=new(drop=_type_ _freq_) mean= median= std= q1= q3= / autoname; run; /* Transpose the data set so that each statistic becomes an observation. */ proc transpose data=new out=out; run; /* Create new variables that contain the statistic name and the */ /* original variable name. */ data out1; set out; varname=scan(_name_,1,'_'); stat=scan(_name_,2,'_'); drop _name_; run; proc sort data=out1; by varname; run; /* Transpose the data to get one observation for each */ /* original variable name and one variable for each */ /* statistic. This mimics the default printed output. */ proc transpose data=out1 out=out2(drop=_name_); by varname; id stat; var col1; run; proc print data=out2; title 'Looks like default printed output'; run; |
Looks like default printed output
varname | Mean | Median | StdDev | Q1 | Q3 |
x1 | 4 | 4 | 3 | 1 | 7 |
x2 | 5 | 5 | 3 | 2 | 8 |
x3 | 6 | 6 | 3 | 3 | 9 |
If you want the output data set to look like the default output data set, use the steps below in place of the last three steps above.
proc sort data=out1; by stat; run; /* Transpose the data to get one observation for each */ /* statistic name and one variable for each */ /* original variable. This mimics the default */ /* output data set. */ proc transpose data=out1 out=out3(drop=_name_); by stat; id varname; var col1; run; proc print data=out3; title 'Looks like default output data set'; run; |
Looks like default output data set
stat | x1 | x2 | x3 |
Mean | 4 | 5 | 6 |
Median | 4 | 5 | 6 |
Q1 | 1 | 2 | 3 |
Q3 | 7 | 8 | 9 |
StdDev | 3 | 3 | 3 |
If your variable names contain underscores (_) in the name, replace the DATA OUT1 step above with the following DATA OUT1 step:
/* Depending on your variable name, you may need to increase */ /* the length $11 assignment below. */ data out1; set out; length varname $11; lenvar=length(_name_); lenstat=index(reverse(trim(_name_)),'_')-1; varname=substr(_name_,1,(lenvar-lenstat)-1); stat=scan(_name_,-1,'_'); drop _name_ lenvar lenstat; run; |
Customizing the output using PROC TEMPLATE
For the default printed output from PROC MEANS, in the past we said that the only option for formatting the statistics was with the MAXDEC= option which controls the number of decimal places. We are no longer limited to this level of formatting since we can use PROC TEMPLATE. We can format the statistics in the printed output by modifying the Base.Summary table template.
/* create sample data */ data test; input amount ; cards; 1000 12345 8500 ; run; /* Modify the base.summary table template using */ /* EDIT statements. */ ods path(prepend) work.templat(update); proc template; edit base.summary; edit mean; format=dollar12.2; end; edit sum; format=dollar12.; end; end; run; proc means data=test n mean sum; var amount; run; /* restore default template */ proc template; delete base.summary; run; |
Custom format for printed output
The MEANS Procedure |
Analysis Variable : amount | ||
N | Mean | Sum |
3 | $7,281.67 | $21,845 |
Customizing the output data using STACKODSOUTPUT
In SAS 9.3, a new option was added that only affects the output data set created using the ODS OUTPUT statement. This option is STACKODSOUTPUT (alias: STACKODS) and it is placed on the PROC MEANS statement. This option allows the data set to resemble the default printed output.
/* Use the STACKODSOUTPUT option on the PROC MEANS statement */ /* and use the ODS OUTPUT statement to create output data set. */ ods output summary=with_stackods(drop=_control_); proc means data=sashelp.class stackodsoutput sum mean std nway; class sex; var height weight; run; proc print data=with_stackods noobs; title 'Output data set with STACKODSOUTPUT'; run; |
Output data set with STACKODSOUTPUT
Sex | NObs | Variable | Sum | Mean | StdDev |
F | 9 | Height | 545.300000 | 60.588889 | 5.018328 |
F | 9 | Weight | 811.000000 | 90.111111 | 19.383914 |
M | 10 | Height | 639.100000 | 63.910000 | 4.937937 |
M | 10 | Weight | 1089.500000 | 108.950000 | 22.727186 |
If you are not satisfied with the output from PROC MEANS, either the default printed output or the output data set, you are just a few steps away from reshaping it into the format you desire.
WANT MORE GREAT INSIGHTS MONTHLY? | SUBSCRIBE TO THE SAS TECH REPORT
17 Comments
Hi Kathryn,
Thanks for the post! I'm a beginner and still not quite understand.. 🙁
I have a dataset with 15 variables in 8 quarters (2 years) and 19 locations. I want to list the means by each location and the means need to be arranged by quarter so that we can tell the change pattern across time. Could you please write a sample code for this purpose. Thank you so much!!
Sophie
Sophie,
Thanks for your comments. For a custom question, it would be best to open a tracking number with Technical Support using the link below:
Technical Support
Thanks,
Kathryn
Hello Kathryn McLawhorn,
Greetings...!
Hope you are doing great!
Your post is much Informative, I just would like to know can we able to import Excel (.XLSX format) file to SAS 9.1?. I tried many codes but aren't working. So, I have changed excel file format XLSX to .CSV and then it's working. But I want .XLSX file as it is in SAS 9.1 without changing the Format. Is there any way to Import .XLSX format to SAS 9.1?
Thanks in advance.
Regards,
Vijay
I am glad the blog post was informative. Since your question about importing .XLSX files is not related to the blog content, I would suggest opening a tracking entry with Technical Support.
Kathryn
In addition to using PROC MEANS and the STACKODSOUTPUT option, you can use PROC UNIVARIATE and the OUTTABLE= option. PROC UNIVARIATE outputs 46 descriptive statistics, some of which are not available in PROC MEANS. For details and a comparison with PROC MEANS, see "Save descriptive statistics for multiple variables in a SAS data set."
That's one of the great things about SAS - there are always multiple ways to approach a task.
Kathryn
The code runs correctly for me. If you are talking about the replacement DATA step if your variable names contain _, then that comes after the PROC TRANSPOSE in the prior code. If you are getting errors, I would suggest opening a tracking entry with Technical Support and include your log.
Kathryn
The example references _name_ in a data step before it actually exists (it is created by proc transpose). Could you please fix your example so that it works without errors? or, could a moderator delete or fix it?
Kathryn,
Your post is outstanding and easy to understand. I love it and I am really, but really brand new to SAS and I am learning it by myself with the help of various posts and excellent explanations like yours. However, I wonder if you could guide in the following situation.
Data: 16 subjects divided in 2 groups ('exper' and 'control') with the same 2 var ('rest' and 'work').
Proc needed: ttest for the 2 var ('rest' and 'work') in the 'exper' group only.
I used Proc MEANS to obtained the 5-statistics for each group, and have obtained this table:
-------------------------------------------------------------------------------------------------------
Group----variable----label----5-statistics values
control----rest---------rest-----
------------work--------work----
exper-----rest---------rest-----
------------work--------work----
I know that I can conduct the ttest manually by working out the statistics and obtaining the standard error of the mean, etc., etc.
However, I would like to create a subset 'name' control' and use the proc ttest for this subset only and obtain the ttest for the variables 'rest' and 'work'.
How can I do that?
Thank you for your help.
Mike
Mike,
Thanks for your kind comments. PROC TTEST is supported by the SAS/STAT group. I would suggest opening a tracking entry with Technical Support and describe your situation above. The appropriate statistician should be able to help you with that.
http://support.sas.com/ctx/supportform/createForm
Kathryn
Preeti,
It is not clear to me what you are trying to do. When I look at the output data set created by the ODS OUTPUT statement, I see the Label variable; and I have been able to rename it in a subsequent DATA step. If you have a particular problem, I would suggest opening a tracking entry with Technical Support with your code and the particular issue you are having. I would be happy to help you address it there.
Kathryn
For some reason, when I use STACKODS, I am not able to use the variable for labels ("Label"). I want to be able to pull in a number of STACKODS datasets and have the variable names and labels show but I can't call the outputted variable for label despite removing variable labels using PROC DATASETS so that I could read the original variable names, and checking the DICTIONARY TABLES. I can see that the variable name is still "Label", it shows up when I do descriptives using _CHAR_, but I cannot directly call it. Do you have any thoughts as to why?
Thanks for sharing the link to the paper.
Kathryn
See also the SAS Global Forum paper explaining a macro to do this at
http://support.sas.com/resources/papers/proceedings12/231-2012.pdf
Any reason why I'm getting an error on the yaxis ranges clause??
31 proc sgplot data=new;
32 vbar type / response=value;
33 yaxis ranges=(0-20 200-230);
______
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, DISCRETEORDER, DISPLAY, FITPOLICY,
GRID, INTEGER, INTERVAL, LABEL, LABELATTRS, LOGBASE, LOGSTYLE, MAX, MIN, MINOR,
NOTIMESPLIT, OFFSETMAX, OFFSETMIN, REFTICKS, REVERSE, THRESHOLDMAX, THRESHOLDMIN,
TICKVALUEFORMAT, TYPE, VALUEATTRS, VALUES, VALUESHINT.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
You are welcome. I hope you found this helpful.
Kathryn
I did not know about STACKODS. Thank you.