A beginner’s guide to SAS Users Groups: How to choose the group that’s right for you

You’ve seen the notices on the SAS website or your company bulletin board. Perhaps you’ve gotten a meeting invitation via email or even heard a colleague talking about it. Still, you haven’t taken the plunge yet – you haven’t attended a SAS Users Group meeting, maybe because you’re uncertain about which group would be best for you.

That’s the great thing about SAS Users Groups – we know you aren’t a one-size-fits-all user. And because of that, we offer different types of groups to fit your needs.

Learn about the benefits of joining a SAS Users Group – as explained by SAS users themselves – in the video below, then read the rest of this blog post to determine which user group is right for you!

 


Use Hub Shared Collections to automatically push VA reports to your users

The SAS Visual Analytics 7.2 release introduces a new look and feel to the Hub with the modern (HTML5) version. It also introduces a new feature, Shared Collections, which allows an administrator to push a Shared Collection to every user’s Hub view, including Guest access. As a bonus, the Shared Collection is also pushed down to any SAS Mobile BI users!

The idea of a Collection is not new; it was available in previous versions of Visual Analytics. You can include a variety of content in a Collection, such as tables, reports, explorations and stored processes, and the Collection groups it all together under a Collection name. In VA 7.2, we can now publish a Collection, which makes it a Shared Collection.

Let’s look at some screenshots to walk us through how to configure Shared Collections.


What you should know about SAS Factory Miner

On July 14, 2015, SAS released SAS Factory Miner, our latest advanced data mining and machine learning product. This new product provides automated predictive modeling by segment in a high-velocity, grid-enabled environment, allowing modelers to run hundreds of models within minutes and quickly find champion models by segment. Below is a short description of how it works and a top ten list of things to know about SAS Factory Miner.

Data sources for modeling are defined in metadata, just like when using SAS Enterprise Miner. Target variables and segment variables are identified, and variable roles and levels are defined. Once the data source is saved in metadata, it is available to everyone in your organization.


Customizing output from PROC MEANS

Customizing the output data set created using the OUTPUT statement

When you request statistics on the PROC MEANS statement, the default printed output creates a nice table with the analysis variable names in the left-most column and the statistics forming the additional columns. Even if you create an output data set with no statistics listed on the OUTPUT statement, the default statistics (N, MIN, MAX, MEAN, and STD) are output in a nice table format as values of the _STAT_ variable, and the analysis variables form the other columns. Once you start requesting specific statistics on the OUTPUT statement, the format of the data set changes. The result is a wide output data set with only one observation and a variable for each analysis variable-statistic combination. If you use a BY or CLASS statement, you will get multiple observations, corresponding to the unique combinations of BY or CLASS variable values, but the analysis variable structure is the same. This wide structure can be hard to read. It would be nice if the output data set could keep the format of the printed output or the default output data set. You can do that with a few simple code modifications using PROC TRANSPOSE and DATA step logic.
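
To make the two layouts concrete, here is a small illustrative sketch in Python. It is not the PROC TRANSPOSE and DATA step approach this post refers to. The variable names (`height_mean` and so on), the underscore naming pattern, and the numeric values are all made up for illustration. It only shows how one wide row, with a variable per analysis variable-statistic combination, maps onto the `_STAT_`-style layout with one row per statistic.

```python
# Hypothetical wide row: one variable per "<analysis variable>_<statistic>"
# combination. Names and values are invented for illustration only.
wide_row = {
    "height_mean": 62.0,
    "weight_mean": 100.0,
    "height_std": 5.0,
    "weight_std": 22.0,
}


def wide_to_stat_rows(row, sep="_"):
    """Regroup a wide row into one row per statistic, keyed by _STAT_."""
    rows = {}
    for name, value in row.items():
        # Split on the last separator: "height_mean" -> ("height", "mean")
        variable, stat = name.rsplit(sep, 1)
        stat = stat.upper()
        rows.setdefault(stat, {"_STAT_": stat})[variable] = value
    return list(rows.values())


stat_rows = wide_to_stat_rows(wide_row)
for stat_row in stat_rows:
    print(stat_row)
```

Each resulting row holds one statistic, and the analysis variables become the remaining columns, which mirrors the default output data set described above.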

If you want the output data set to look like the default printed output, use the steps below.


Integrating SAS reports with Google maps: two-pane solution

In my previous post, SAS ODS destination - Google maps, I showed how to incorporate SAS ODS output into Google maps using the Google Maps InfoWindow – that ubiquitous bubble window that opens up on a Google map and can be used to display all sorts of different information (see also The power of SAS-generated InfoWindows in Google maps).

Two-pane solution

While the InfoWindow approach is fine by most measures, in some cases it can be rather clunky. When SAS output is quite large, displaying it in a Google map’s InfoWindow requires either a larger InfoWindow, which obstructs the map area on the screen, or horizontal and vertical scrolling within the InfoWindow, which is also not the best approach from a usability standpoint.

Here, I am going to demonstrate a different design solution in which the computer screen is divided into two panes: one pane displays a Google map with all its interactive features; the other pane displays a SAS report, based on the interaction with the Google map pane.


Vote on the Analytics 2015 sessions you want to hear

The Analytics 2015 conference in Las Vegas, Oct. 26 and 27, is designed for you. So why wouldn’t you help choose the content? New this year, we’re asking the analytics community to vote on one data mining and one forecasting topic that they want to hear at the conference. The voting takes place on AllAnalytics.com.

The sessions you can choose from include:

  • Data mining - open source integration with SAS
  • Data mining - video data mining
  • Data mining - ensemble modeling
  • Forecasting - count series forecasting (for time series that are discretely valued)
  • Forecasting - multistage models for highly seasonal and/or sparse demand series

I asked our forecasting expert, Ken Sanford, and the data mining-meister, Patrick Hall, to break it all down for us. These guys are serious about their areas of expertise. Just watch…


Providing multi-machine data for the SAS Environment Manager Data Mart

One of the great things that the new Data Mart will do for you is combine data from all the machines found in a multi-machine deployment into one storage area, where it is used to create many of the reports found in the Report Center. This capability began with the 14w41 release (M2) of SAS 9.4. The SAS Environment Manager Data Mart (hereafter called the “data mart”) is part of the new Service Architecture Framework, and it is the part that provides forensic analysis of system usage and capacity, resource allocations, and more. The APM (Audit, Performance, Measurement) part of the data mart contains many of the same functions as the older APM package from previous SAS releases.

A previous blog post by Gilles Chrzaszcz described how to initialize the Service Architecture Framework and its various components. That process is also described in the document that comes with the installation, which can also be found at this link: SAS_Environment_Manager_Service_Architecture_Quickstart.pdf.

But combining data from more than one machine into one data mart requires a little fiddling. First, you have to set up file mounts to be able to read the files from the other machines. The files needed are the log files generated by all the SAS servers, from all the different machines in a deployment. Of course, this assumes that the machines can all see each other on the network via their domain names.


Anonymization for analysts, report designers and information owners

This is my second blog on the topic of anonymization, which I’ve spent some time over the past several months researching. My first blog, Anonymization for data managers, focused on the technical process. Now let’s dive into the role for analysts, report designers and information owners.

To analysts and reporting experts, anonymization means something quite different. For them, it means the process of rendering personal information non-personal, in much more general terms.

Anonymization in this context includes a wider set of practices for ensuring that personal data is well-governed, and that when summary statistics about a group of individuals are published, the number of individuals making up any one aggregate group is large enough that those individuals cannot be personally identified from the characteristics of the group.

For example, suppose a high school publishes details of its exam results, and breaks them down in several ways as part of a national initiative to be open about its performance, allowing parents and local government to see how the school is performing. The school is at risk of breaching the anonymity of some of its pupils, by inadvertently revealing sensitive personal information about them, if it is not cautious about choosing which figures to publish and which to keep confidential. (There is nothing wrong with calculating those figures, just with publishing them.)

Now suppose there are 30 students in a particular class in the school, of whom 15 are male and 15 are female. In the whole class, 6 (1 boy and 5 girls) are in an ethnic group that is a minority in the school’s local area. The remaining 24 students in the class are from the local majority ethnic group.
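
The group-size idea described above can be sketched as a simple small-count check on this class. This is a minimal illustration in Python and not a SAS feature or an official disclosure-control rule. The threshold `MIN_GROUP_SIZE = 5` is an assumed value chosen only for the example, and the majority-group split (14 boys, 10 girls) is derived from the figures in the text.

```python
from collections import Counter

# Assumed publication threshold: cells with fewer students than this are
# treated as too small to publish. The value 5 is hypothetical.
MIN_GROUP_SIZE = 5

# The class from the text: 15 boys and 15 girls; 6 minority students
# (1 boy, 5 girls), so the 24 majority students are 14 boys and 10 girls.
students = (
    [("male", "minority")] * 1
    + [("female", "minority")] * 5
    + [("male", "majority")] * 14
    + [("female", "majority")] * 10
)


def small_cells(records, min_size):
    """Return the (sex, ethnic group) cells whose count is below min_size."""
    counts = Counter(records)
    return sorted(cell for cell, n in counts.items() if n < min_size)


cell_counts = Counter(students)
risky_cells = small_cells(students, MIN_GROUP_SIZE)
print(cell_counts)
print(risky_cells)
```

Under this assumed threshold, only the cell containing the single minority boy is flagged: any figure published for that cell would describe one identifiable pupil.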


Anonymization for data managers

I’ve spent some time over the past couple of months learning more about anonymization.

This began with an interest in the technical methods used to protect sensitive personally-identifiable information in a SAS data warehouse and analytics platform we delivered for a customer. But I learned that anonymization has two rather different meanings: one in the context of data management and another in the context of data governance for reporting, sharing or publishing information.

I think both sides of the topic are interesting enough that it’s worth writing about them – I hope you will find both of them interesting too, as overall, this is a subject about which anyone who handles sensitive or personal data should know something.

To data managers, anonymization often means the technical process of obscuring the values in sensitive fields in the data, by replacing them with equivalent but non-sensitive values that are still useful for tasks such as joining tables or representing individuals in time-series or transactional data. In some SAS products, e.g. SAS Federation Server, this is called ‘masking’.
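
As a rough illustration of why consistent replacement values keep tables joinable, here is a minimal Python sketch that masks an identifier with a keyed hash (HMAC-SHA256). It is not how SAS Federation Server implements masking. The key, the field names and the sample records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored and managed securely.
SECRET_KEY = b"example-key-for-illustration-only"


def mask(value, key=SECRET_KEY):
    """Replace a sensitive value with a stable, non-reversible token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


# Two made-up tables that share a sensitive customer identifier.
customers = [
    {"customer_id": "C001", "region": "North"},
    {"customer_id": "C002", "region": "South"},
]
orders = [
    {"customer_id": "C001", "amount": 20},
    {"customer_id": "C001", "amount": 15},
    {"customer_id": "C002", "amount": 7},
]

# Mask the identifier in both tables with the same key.
masked_customers = [
    {"customer_key": mask(c["customer_id"]), "region": c["region"]}
    for c in customers
]
masked_orders = [
    {"customer_key": mask(o["customer_id"]), "amount": o["amount"]}
    for o in orders
]

# The masked key still joins the tables: total order amount per region.
region_by_key = {c["customer_key"]: c["region"] for c in masked_customers}
totals_by_region = {}
for order in masked_orders:
    region = region_by_key[order["customer_key"]]
    totals_by_region[region] = totals_by_region.get(region, 0) + order["amount"]

print(totals_by_region)
```

Because the same input always yields the same token under a given key, the masked tables can be joined and aggregated without exposing the original identifiers.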
