You’ve seen the notices on the SAS website or your company bulletin board. Perhaps you’ve gotten a meeting invitation via email or even heard a colleague talking about it. Still, you haven’t taken the plunge yet – you haven’t attended a SAS Users Group meeting, maybe because you’re uncertain about which group would be best for you.
That’s the great thing about SAS Users Groups – we know you aren’t a one-size-fits-all user. And because of that, we offer different types of groups to fit your needs.
Learn about the benefits of joining a SAS Users Group – as explained by SAS users themselves – in the video below, then read the rest of this blog post to determine which user group is right for you!
The SAS Visual Analytics 7.2 release introduces a new look and feel to the Hub with the modern (HTML5) version. It also introduces a new feature, Shared Collections, which allows an administrator to push a Shared Collection to all users’ Hub views, including Guest access. And as a bonus, it will also be pushed down to any SAS Mobile BI users!
The idea of a Collection is not new; it was available in previous versions of Visual Analytics. A Collection can include a variety of content – tables, reports, explorations and stored processes – grouped together under a single Collection name. New in VA 7.2, you can publish a Collection, which makes it a Shared Collection.
Let’s look at some screenshots to walk us through how to configure Shared Collections.
On July 14, 2015 SAS released SAS Factory Miner, our latest advanced data mining and machine learning product. This new product provides automated predictive modeling by segment in a high-velocity, grid-enabled environment, allowing modelers to run hundreds of models within minutes and find champion models by segment quickly. Below is a short description of how it works and a top ten list of things to know about SAS Factory Miner.
Data sources for modeling are defined in metadata, just like when using SAS Enterprise Miner. Target variables and segment variables are identified, and variable roles and levels are defined; once the data source is saved in metadata, it is available to everyone in your organization.
Customizing the output data set created using the OUTPUT statement
When you request statistics on the PROC MEANS statement, the default printed output creates a nice table with the analysis variable names in the left-most column and the statistics forming the additional columns. Even if you create an output data set with no statistics listed on the OUTPUT statement, the default statistics – N, MIN, MAX, MEAN, and STD – are output in the same table format as values of the _STAT_ variable, with the analysis variables forming the other columns.

Once you start requesting specific statistics on the OUTPUT statement, however, the structure of the data set changes. The result is a wide output data set with only one observation and a variable for each analysis variable-statistic combination. If you use a BY or CLASS statement, you will get one observation for each unique combination of BY or CLASS variable values, but the analysis variable structure is the same. This wide structure can be hard to view. It would be nice if the output data set could maintain the format of the printed output or the default output data set. You can do that with a few simple code modifications using PROC TRANSPOSE and DATA step logic.
If you want the output data set to look like the default printed output, use the steps below.
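The steps can be sketched as follows. This is a minimal sketch, using SASHELP.CLASS as a stand-in data set and MEAN, MIN, and MAX as example statistics; data set and variable names are illustrative only:

```sas
/* 1. Request specific statistics; AUTONAME builds names
      such as Height_Mean, Weight_Max in one wide row.    */
proc means data=sashelp.class noprint;
   var height weight;
   output out=wide mean= min= max= / autoname;
run;

/* 2. Turn the single wide observation into one row
      per variable-statistic pair (_NAME_ and COL1).      */
proc transpose data=wide(drop=_type_ _freq_) out=long;
run;

/* 3. Split _NAME_ (e.g. Height_Mean) into the analysis
      variable name and the statistic name.               */
data long2;
   set long;
   length variable _stat_ $32;
   variable = scan(_name_, 1, '_');
   _stat_   = scan(_name_, -1, '_');
run;

/* 4. Transpose again so the analysis variables form the
      columns and each statistic is a row, like the
      default printed output.                             */
proc sort data=long2;
   by _stat_;
run;

proc transpose data=long2 out=final(drop=_name_);
   by _stat_;
   id variable;
   var col1;
run;
```

The final data set has one row per statistic (_STAT_) and one column per analysis variable, matching the layout of the default output data set.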
While the InfoWindow approach is fine by most measures, in some cases it can be rather clunky. When SAS output is quite large, displaying it in a Google map’s InfoWindow either requires a larger InfoWindow, which obstructs the Google map area on the screen, or requires horizontal and vertical scrolling within the InfoWindow – also not the best approach from a usability standpoint.
Here, I am going to demonstrate a different design solution, in which the screen is divided into two panes: one pane displays a Google map with all its interactive features; the other displays a SAS report driven by interaction with the map pane.
In a SAS environment there is a lot of metadata: metadata about configuration, such as server definitions, users, groups and roles, and metadata about content, such as data, reports and jobs.
SAS administrators often want to report on metadata. They want to know what reports have been developed and where they are stored, which jobs have been modified (perhaps within a certain time period), where data is located, or which users are members of which groups and roles.
This documentation of SAS metadata is often very important prior to any upgrade to a new release. An inventory of configuration and content may influence the upgrade approach selected and will help to validate the success of the upgrade.
There are many ways to accomplish this in SAS, but until recently none of them have been particularly easy. In the past we have used programming interfaces to metadata, such as the metadata DATA step functions or PROC METADATA, to develop SAS programs that create these reports. The SAS® 9.4 Language Interfaces to Metadata documentation provides two examples of this in the section “Examples: Using Metadata Language Elements to Create Reports.”
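For a sense of what that programming approach looks like, here is a hedged sketch using the metadata DATA step functions to list every Person object registered in metadata. It assumes a connection to the metadata server has already been established via the usual options (METASERVER=, METAPORT=, METAUSER=, METAPASS=):

```sas
/* Sketch: build a data set of all users defined in metadata. */
data users(keep=name displayname);
   length uri name displayname $256 query $64;
   call missing(name, displayname);
   query = "omsobj:Person?@Id contains '.'";

   /* First call returns the number of matching objects. */
   nobj = metadata_getnobj(query, 1, uri);
   do n = 1 to nobj;
      rc = metadata_getnobj(query, n, uri);       /* nth Person  */
      rc = metadata_getattr(uri, "Name", name);
      rc = metadata_getattr(uri, "DisplayName", displayname);
      output;
   end;
run;
```

Extending this to group and role memberships means walking associations with METADATA_GETNASN as well, which is where the code starts to get less simple.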
In SAS 9.4 there are a couple of ways that are quicker and simpler than writing your own code (I know, it is not as much fun). In this blog we will look at satisfying two of the most commonly requested metadata reports: first, a report of all users and their group and role memberships, and second, a list of all user-developed content.
The Analytics 2015 conference in Las Vegas, Oct. 26 and 27, is designed for you. So why wouldn’t you help choose the content? New this year, we’re asking the analytics community to vote on one data mining and one forecasting topic that they want to hear at the conference. The voting takes place on AllAnalytics.com.
The sessions you can choose from include:
Data mining - open source integration with SAS
Data mining - video data mining
Data mining - ensemble modeling
Forecasting - count series forecasting (for time series that are discretely valued)
Forecasting - multistage models for highly seasonal and/or sparse demand series
I asked our forecasting expert, Ken Sanford, and the data mining-meister, Patrick Hall, to break it all down for us. These guys are serious about their areas of expertise. Just watch…
One of the great things that the new Data Mart will do for you is combine data from all the machines found in a multi-machine deployment into one storage area, where it is used to create many of the reports found in the Report Center. This capability began with the 14w41 release (M2) of SAS 9.4. The SAS Environment Manager Data Mart (hereafter called the “data mart”) is part of the new Service Architecture Framework, and it is the part that provides forensic analysis of system usage and capacity, resource allocations, and more. The APM (Audit, Performance, Measurement) part of the data mart contains many of the same functions as the older APM package from previous SAS releases.
But combining data from more than one machine into one data mart requires a little fiddling. First, you have to set up file mounts to be able to read the files from the other machines. The files needed are the log files generated by all the SAS servers, from all the different machines in a deployment. Of course, it’s assumed that the machines can all see each other on the network via their domain names.
This is my second blog on the topic of anonymization, which I’ve spent some time over the past several months researching. My first blog, Anonymization for data managers, focused on the technical process. Now let’s dive into the role for analysts, report designers and information owners.
To analysts and reporting experts, anonymization means something quite different: in much more general terms, it is the process of rendering personal information non-personal.
Anonymization in this context includes a wider set of practices for ensuring that personal data is well-governed, and that when summary statistics about a group of individuals are published, the number of individuals making up any one aggregate group is large enough that those individuals cannot be personally identified from the characteristics of the group.
For example, suppose a high school publishes details of its exam results, and breaks them down in several ways as part of a national initiative to be open about its performance, allowing parents and local government to see how the school is performing. The school is at risk of breaching the anonymity of some of its pupils, by inadvertently revealing sensitive personal information about them, if it is not cautious about choosing which figures to publish and which to keep confidential. (There is nothing wrong with calculating those figures, just with publishing them.)
Now suppose there are 30 students in a particular class in the school, of whom 15 are male and 15 are female. In the whole class, 6 (1 boy and 5 girls) are in an ethnic group which is in a minority in the school’s local area. The remaining 24 students in the class are from the local majority ethnic group.
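A common safeguard in this situation is small-cell suppression: calculate the aggregates freely, but suppress any published cell whose headcount falls below a minimum threshold. A minimal sketch of the idea – the table name CLASS_RESULTS, its variables, and the threshold of 5 are all hypothetical:

```sas
/* Summarize pass rates by sex and ethnic group; _FREQ_
   gives the headcount behind each cell.                 */
proc summary data=class_results nway;
   class sex ethnic_group;
   var passed;                          /* 1 = passed, 0 = failed */
   output out=cells(rename=(_freq_=headcount)) mean=pass_rate;
run;

/* Suppress any cell based on fewer than 5 students
   before the figures are published.                     */
data publishable;
   set cells(drop=_type_);
   if headcount < 5 then pass_rate = .;
run;
```

In the class described above, the cell for boys in the minority ethnic group would contain a single student, so its pass rate would be suppressed rather than published.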
I’ve spent some time over the past couple of months learning more about anonymization.
This began with an interest in the technical methods used to protect sensitive personally-identifiable information in a SAS data warehouse and analytics platform we delivered for a customer. But I learned that anonymization has two rather different meanings; one in the context of data management and another in the context of data governance for reporting, sharing or publishing information.
I think both sides of the topic are interesting enough to be worth writing about – I hope you will find them interesting too. Overall, this is a subject that anyone who handles sensitive or personal data should know something about.
To data managers, anonymization often means the technical process of obscuring the values in sensitive fields by replacing them with equivalent but non-sensitive values that are still useful for purposes such as joining tables or representing individuals in time-series or transactional data. In some SAS products, e.g. SAS Federation Server, this is called ‘masking’.
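As a rough illustration of the idea (a sketch of the general technique, not the SAS Federation Server feature itself), one simple form of masking replaces an identifier with a one-way hash: the masked value reveals nothing, but because it is deterministic it can still serve as a join key. This assumes SAS 9.4, where the SHA256 function is available, and a hypothetical CUSTOMERS table:

```sas
/* Replace the sensitive identifier with a one-way digest.
   SHA256 returns a 32-byte binary value; $HEX64. renders
   it as a 64-character hexadecimal string.               */
data masked(drop=customer_id);
   set customers;                       /* hypothetical input */
   length customer_key $64;
   customer_key = put(sha256(strip(customer_id)), $hex64.);
run;
```

The same customer_id always yields the same customer_key, so tables masked this way can still be joined to one another – but note that a bare hash of a guessable identifier is not, on its own, strong anonymization.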
All code examples are provided as is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.