Users often ask how to create a histogram on a log x axis. One such question came up this weekend, where a user wanted a histogram of her data on a log axis. Before we get into her specific case, let us first clarify what we may want to see when we say "Histogram on Log Axis". This could mean one of these two cases:
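1. Histogram of the linear values, displayed on a log x axis. This histogram has equal-width bins in linear data space. When displayed on a log axis, the bins are drawn with varying pixel widths.
2. Histogram of the log-transformed values, displayed on a linear x axis. This histogram has equal-width bins in log space, so the bins are drawn with a constant pixel width.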
Using the sashelp.cars data set, the graph on the right shows the first case: a histogram of the original data in linear space, on a LOG x axis. Note that each bin spans the same range of data values, but the widths of the bins in pixels shrink as we go to the right, because a log axis compresses larger values.
proc sgplot data=sashelp.cars;
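A complete step for this first case might look like the sketch below. It assumes mpg_city as the analysis variable (the column used for the transform later in this post). The BINSTART=10 and BINWIDTH=5 values and the axis label are illustrative assumptions, not taken from the original program; they are chosen so that the lower edge of the first bin, 10 - 5/2 = 7.5, stays positive, which a log axis requires.

```sas
proc sgplot data=sashelp.cars;
  /* Equal-width bins in linear data space.                        */
  /* BINSTART is the midpoint of the first bin; both values below  */
  /* are assumptions picked for illustration.                      */
  histogram mpg_city / binstart=10 binwidth=5;
  /* Draw the linear bins on a log x axis */
  xaxis type=log label='MPG (City)';
run;
```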
On the right is a graph of the log-transformed data on a default (linear) axis. First, we create a new data column logMpg=Log10(mpg_city). Then we use logMpg as the analysis variable for the histogram. This creates the graph shown here, where each bin now has a constant pixel width. Note the x axis tick values and axis label.
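The transformation step can be sketched as a short DATA step. The output data set name cars is taken from the data=cars reference in the plotting code; the rest is a minimal sketch rather than the exact original program.

```sas
/* Copy sashelp.cars and add the log-transformed column */
data cars;
  set sashelp.cars;
  logMpg = log10(mpg_city);  /* base-10 log of city mileage */
run;
```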
proc sgplot data=cars;
  histogram logMpg / fillattrs=graphdata1;
run;
In the graph on the right, we have replaced the x axis tick values (log scale) with their respective untransformed linear values at equal intervals (10, 20, ..., 100) using the ValuesDisplay option. A tick is drawn at each value provided in the Values option, but the text displayed for it comes from the ValuesDisplay option. The axis label is now "MPG".
proc sgplot data=cars;
  histogram logMpg / fillattrs=graphdata1;
  xaxis fitpolicy=none valueattrs=(size=7) values=(0.90 1 1.3 1.47 1.6 1.7 1.78 1.85 1.90 1.954 2)
        valuesdisplay=(" " "10" "20" "30" "40" "50" "60" "70" "80" "90" "100") label='MPG';
run;
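The tick positions in the Values option are the base-10 logs of the linear values shown in ValuesDisplay. A small helper step like the one below can generate them for any set of linear values; the data set name ticks and the rounding to three decimals are assumptions made for this sketch.

```sas
/* Compute the log-space position for each linear tick label */
data ticks;
  do mpg = 10 to 100 by 10;
    logMpg = round(log10(mpg), 0.001);  /* position to list in VALUES= */
    output;
  end;
run;

/* List each linear value next to its log10 position */
proc print data=ticks noobs;
run;
```

Each logMpg value goes into the Values list, and the matching mpg value goes into the ValuesDisplay list in the same position.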
After discussing this with experts here at SAS, I learned that the second and third graphs above have more practical uses than the first one. Often users want to see if their data, or some transform of it, has a normal distribution. So, it is useful to view the histogram of the transformed data. Then, subsequent processing can be done on the transformed data. It turns out that the user who asked the original question about using the log axis also really wants the last case shown above.
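One way to judge visually whether the transformed data looks normal is to overlay a fitted normal curve on the histogram. The sketch below is an illustration rather than part of the original program; it uses the SGPLOT DENSITY statement and rebuilds the cars data set so that it runs on its own.

```sas
/* Rebuild the transformed data so this example is self-contained */
data cars;
  set sashelp.cars;
  logMpg = log10(mpg_city);
run;

proc sgplot data=cars;
  histogram logMpg / fillattrs=graphdata1;
  /* Overlay a normal density fitted to the transformed values */
  density logMpg / type=normal;
run;
```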
Now, let us talk about using TYPE=LOG on the x axis. When you do this, particularly when the data spans more than two or three orders of magnitude, you may see a note in the log saying:
NOTE: Log axis cannot support zero or negative values in the data range. The axis type will be changed to LINEAR.
This happens despite the fact that all the data is positive, which is a bit confusing. What is going on is that the histogram computes the BinStart (BS) and BinWidth (BW) values internally, and the default numbers can cause the lower edge of the first bin to have a zero or negative x value. This is the reason for the note when you then ask for TYPE=LOG. If you must set TYPE=LOG and get a graph like the first one above, make sure your BinStart and BinWidth combination satisfies the following criterion:
0 < BinStart - BinWidth/2 < minimum value in the data
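The criterion can be checked before plotting. The sketch below is a minimal example under stated assumptions: the macro variables bs, bw, and minval are hypothetical names, the values 10 and 5 are chosen only for illustration, and mpg_city from sashelp.cars stands in for your analysis variable.

```sas
/* Hypothetical bin settings, for illustration only */
%let bs = 10;   /* BinStart: midpoint of the first bin */
%let bw = 5;    /* BinWidth */

/* Find the smallest value of the analysis variable */
proc sql noprint;
  select min(mpg_city) into :minval trimmed
  from sashelp.cars;
quit;

/* The lower edge of the first bin must lie strictly between */
/* zero and the data minimum                                  */
data _null_;
  lowerEdge = &bs - &bw / 2;
  if 0 < lowerEdge < &minval then
    put 'First bin lower edge ' lowerEdge 'satisfies the criterion.';
  else
    put 'First bin lower edge ' lowerEdge 'violates the criterion.';
run;
```

If the check passes, the same values can be passed to the BINSTART= and BINWIDTH= options of the HISTOGRAM statement together with TYPE=LOG on the XAXIS statement.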
Full SGPLOT program: