Tips for reading XML files into SAS® software

XML has become one of the major standards for moving data across the Internet. Two of XML’s strengths are that it describes data better and is more extensible than predecessors such as CSV. Because XML is increasingly popular for moving data, this article provides a few tips that will help when you need to read XML files into SAS software.

Reading XML Files

You can read XML files into SAS using either the XML engine or the XMLV2 engine. The XML engine was the first engine that SAS created to read XML files. The XMLV2 engine includes new functionality and enhancements and is aliased to the XML92 engine in SAS® 9.2.

It is easy to read XML files using the XMLV2 engine when the XML file that you read uses the GENERIC markup type and conforms to a very rectangular definition. Here is an example of the XML file layout that is required to be read natively using the XMLV2 engine:

If the file is not in this format, the following informative error message is generated in the SAS log:


Need an aggregation of an aggregation? Use Visual Data Builder to pre-aggregate your data

Report designers often discover after aggregating data by groups in the Visual Analytics Designer that it would also be nice to see additional aggregations of the data, for example, a maximum or minimum of that sum across groups. This means creating an ‘aggregation of an aggregation.’ If you plan your report objectives in advance of loading your data, you can do this by creating an aggregation initially in the Visual Data Builder, rather than the Designer. And in many cases, it’s possible to get the desired report results by simply loading a small additional table to memory.

Here’s an example of a report with a sum aggregation on the profit measure and Facility Region as the category data item. The report also shows the maximum, minimum, and average of the regional profit values, along with the difference between each region’s profit sum and the average. It’s possible to assign a SUM aggregation to the Profit measure, but the challenge appears when trying then to create a MAX aggregation across the regions.
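To make the idea of an "aggregation of an aggregation" concrete, here is a minimal Python sketch of the arithmetic involved. It is only an illustration, not how Visual Data Builder or the Designer computes anything; the region names, profit values, and the function name `aggregate_of_aggregate` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical detail rows: (facility region, profit). Values are invented.
rows = [
    ("East", 120.0),
    ("East", 80.0),
    ("West", 200.0),
    ("West", 100.0),
    ("North", 100.0),
]


def aggregate_of_aggregate(rows):
    """Sum profit per region, then summarize those regional sums."""
    # First level: SUM of profit within each region.
    sums = defaultdict(float)
    for region, profit in rows:
        sums[region] += profit

    # Second level: MAX, MIN, and AVG taken across the regional sums.
    values = list(sums.values())
    summary = {
        "max": max(values),
        "min": min(values),
        "avg": sum(values) / len(values),
    }

    # Difference between each region's sum and the average of the sums.
    diffs = {region: total - summary["avg"] for region, total in sums.items()}
    return dict(sums), summary, diffs


sums, summary, diffs = aggregate_of_aggregate(rows)
print(sums)
print(summary)
print(diffs)
```

The first level is the small pre-aggregated table (one row per region); the second level works only on that table, which is why loading it separately can be enough.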


Auditing using the SAS Environment Manager Report Center – tips and tricks

As an addendum to my previous two blogs on using the SAS Environment Manager Report Center, this blog illustrates further tips and tricks you can use to help make the creation of your custom reports easier.

The Ad-Hoc Reporting section of the Report Center is specifically designed to provide a “testing ground” for reports you may want to try. It can be your most useful tool in designing and creating the reports you want. Notice that the first four “reports” are most useful if you have a good idea of what content you want your report to have; they present specific report content, and it’s easy to see what type of data these reports would contain, just based on the titles or names shown.


3 good resources for humans who want to learn more about machine learning

"Shall we play a game?"

If you’re a child of the ’80s like me, you might recognize this famous line from the movie WarGames. This innocent-sounding question comes not from one of the movie’s human stars, but from a military super-computer named Joshua, after a bored high school student, played by Matthew Broderick, accesses the computer’s hard drive.

Thinking he’s hacked into a video game company, Broderick’s character accepts Joshua’s challenge and chooses the most intriguing game he can find: global thermonuclear war. To Joshua, though, it’s not just a game. Joshua is an intelligent computer programmed to learn through simulations like the one Broderick’s character initiates. And because the computer actually does control the arsenal of U.S. nuclear weapons, it’s a “game” that puts the planet on the brink of World War III.

If you want to see how the movie ends, I’d encourage you to check it out.

I mention the movie because, in addition to scaring the heck out of me, WarGames was my first exposure to machine learning, the idea that computers can learn and adapt based on the data they collect. Of course, machine learning has changed a lot since my 1983 Hollywood introduction.


Using Multiple Quality Knowledge Base Locales in a DataFlux Data Management Studio Data Job

In DataFlux Data Management Studio, the data quality nodes (e.g., Parsing, Standardization, and Match Codes) in a data job use definitions from the SAS Quality Knowledge Base (QKB). These definitions are based on a locale (a language and country combination). Sometimes you need to work with multi-locale data within the same data job. To help you do this, the data quality nodes have LOCALE attributes as part of their Advanced Properties.

For example, you may want to work with data from the United States, Canada, and the United Kingdom within the same data job. Note: You must have the QKB locale data installed and be licensed for any locales that you plan to use in your data job.

The Advanced properties you will need to use are LOCALE_FIELD and LOCALE_LIST. LOCALE_FIELD specifies the column name that contains the 5-character locale value to use for each record. LOCALE_LIST specifies the list of locales that should be loaded into memory for use within the node.
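The division of labor between the two properties can be pictured with a small Python analogy. This is a conceptual sketch only: it does not call DataFlux or the QKB, the toy postal-code "definitions" are invented stand-ins, and the locale codes `ENUSA`, `ENCAN`, and `ENGBR` as well as the column names `locale` and `postal` are assumptions made for the example.

```python
# Toy stand-ins for per-locale definitions (NOT real QKB definitions).
AVAILABLE_DEFINITIONS = {
    # Keep the first five digits of a US ZIP code.
    "ENUSA": lambda s: "".join(ch for ch in s if ch.isdigit())[:5],
    # Uppercase a Canadian postal code and format it as "A1A 1A1".
    "ENCAN": lambda s: (lambda c: c[:3] + " " + c[3:])(s.replace(" ", "").upper()),
    # Uppercase a UK postcode and collapse repeated spaces.
    "ENGBR": lambda s: " ".join(s.upper().split()),
}


def load_locales(locale_list):
    """Analog of LOCALE_LIST: load only the locales named in the list."""
    return {loc: AVAILABLE_DEFINITIONS[loc] for loc in locale_list}


def standardize(records, locale_field, value_field, loaded):
    """Analog of LOCALE_FIELD: choose the definition per record from a column."""
    results = []
    for rec in records:
        locale = rec[locale_field]
        if locale not in loaded:
            # How a real node treats an unloaded locale is not covered here.
            raise KeyError(f"locale {locale!r} was not loaded")
        results.append(loaded[locale](rec[value_field]))
    return results


records = [
    {"locale": "ENUSA", "postal": " 27513-2414"},
    {"locale": "ENCAN", "postal": "k1a0b1"},
    {"locale": "ENGBR", "postal": "sw1a  1aa"},
]
loaded = load_locales(["ENUSA", "ENCAN", "ENGBR"])
standardized = standardize(records, "locale", "postal", loaded)
print(standardized)
```

The point of the analogy is that the list is fixed once for the node, while the field is consulted again for every record.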


Seven of my favorite big data presentations from SAS Global Forum 2016


Nowadays, nearly every organization analyzes data to some degree, and most are working with “Big Data.”  At SAS Global Forum 2016 in Las Vegas, a vast number of papers were presented to share new and interesting ways our customers are using data to IMAGINE. CREATE. INNOVATE., as this year’s conference tagline reminds us.

Below you’ll find a collection of a few of my favorites on big data topics, ranging from SAS Grid Manager to Hadoop to SAS Federation Server. The common point? It’s easier than ever to modernize your architecture now. I hope these papers help you continue to advance your organization.

Paper 2020-2016: SAS® Grid Architecture Solution Using IBM Hardware
Whayne Rouse and Andrew Scott, Humana Inc.

This paper examines Humana’s journey from SAS® 9.2 to SAS® 9.4M3 and from a monolithic environment to a SAS Grid Manager environment on new hardware and new storage. You can find tips such as the importance of understanding the old environment before starting and applying that understanding to building the new environment.


Use a stack container to pick your category in SAS Visual Analytics Reports

Pick your category? If this title seems familiar, that’s because in my last blog, Use parameters to pick your metric in VA Reports, I covered how to use parameters to allow your users to pick which metric they want to view in their visualizations. This is a great technique that offers a solution to many report requirements.

But what if your users require specific axis labels and titles for your visualizations? What if your users require reference lines? If you encounter these requirements, consider using a stack container to meet them.

Let’s take a look; but first, here is a breakdown of the report we will be looking at in this blog. This report does not have any report-level prompts, but it does have two section prompts. Section prompts filter the data for every object in the section. There is a drop-down list control object that prompts the user for Year, and there is also a button bar control object that prompts the user for Continent.

Then in the report body there is a list control object, a text box and a stack container. The list control object prompts the user for Country. The text box provides the first half of the report title. And the stack container provides a way to organize multiple visualizations on your report; it layers or “stacks” the objects as if they were in a slide deck. The stack container provides navigation options to cycle through the visualization objects that were added. In this example, I added two bar charts and one line chart object to the stack container.


Are you solving the wrong problem?

Being a SAS consultant is about solving problems. In our day-to-day work we solve all sorts of problems – technical problems, data problems, programming problems, optimization problems – you name it. And in the grand scheme of things we solve business problems.

But without a well-defined business problem, all our problem-solving efforts become irrelevant. What is the benefit of optimizing a SAS program so it runs 2 seconds instead of 20? Well, you can claim a ten-fold improvement, but “so what?” if that program is intended to run just once! Given the number of hours you spent on such an optimization, you were definitely solving the wrong problem.

The ice cream maker

There was one event early in my life that made an unforgettable impression on me and forever changed my problem-solving mindset. When I was a teenager, my father and I bought Mom a present for her birthday – an ice cream maker – a little bowl with a slow electrical mixer that you place into a fridge. Yes, it was a cleverly self-serving gift on our part, but, hey, that was what she really wanted!


Seeing the FREQ procedure's one-way tables in a new light

PROC FREQ is often the first choice when you want to generate basic frequency counts, but it is the last choice when it is compared to other statistical reporting procedures. People sometimes consider PROC FREQ last because they think they have little or no control over the appearance of the output. For example, PROC FREQ does not allow style options within the syntax, which the REPORT and TABULATE procedures do allow. Also, you cannot control the formats or headings with statements in the procedure step.

Sometimes, a simple frequency (via a one-way table) is all you want, and you don’t want to have to create an output data set just to add a format. A one-way table in PROC FREQ is also unique in that it includes a cumulative count and percent. These calculations cannot be done in other procedures without additional code or steps. However, there is a simple way to make some basic modifications to your table. By adding a PROC TEMPLATE step to modify either the Base.Freq.OneWayFreqs or the Base.Freq.OneWayList table template, you can change the formats of the statistics, move the label of the variable, change the labels of the statistics, and suppress the Frequency Missing row that appears below the table. These changes apply to all output destinations, including the traditional listing output. You can also use PROC TEMPLATE to make small modifications to the nodes that are generated by one-way tables in the table of contents for non-listing ODS destinations.
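For readers less familiar with what a one-way table contains, the following Python sketch computes the same four statistics (frequency, percent, cumulative frequency, cumulative percent) for a small made-up list of values. It illustrates the calculation only; it is not PROC FREQ or PROC TEMPLATE code, and the function name `one_way_freq`, the dictionary keys, and the sorted ordering of levels are assumptions of the example.

```python
from collections import Counter


def one_way_freq(values):
    """Build one-way table rows: frequency, percent, and running totals."""
    counts = Counter(values)
    total = sum(counts.values())

    table = []
    running = 0
    # Levels are listed in sorted order, an assumption made for this sketch.
    for level in sorted(counts):
        running += counts[level]
        table.append({
            "level": level,
            "frequency": counts[level],
            "percent": 100.0 * counts[level] / total,
            "cum_frequency": running,
            "cum_percent": 100.0 * running / total,
        })
    return table


# Invented sample data.
table = one_way_freq(["A", "B", "A", "C", "A", "B", "C", "A"])
for row in table:
    print(row)
```

The running totals are the part that other summaries would need an extra pass to produce, which is the convenience the one-way table provides.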


Creating custom reports from the Data Mart

My previous blog discussed the SAS Environment Manager Report Center, its organization, and how to start using some of the prompts to get the reports you want. The next step is to learn to use some of the example reports provided to help you design your own production-level reports.

It’s helpful to distinguish ad hoc, or exploratory, reporting from standardized, production-style reports, which might be run daily, weekly, or on any other schedule. You can experiment with many ad hoc reports and create hundreds of variations and different reports “on the fly,” just by manipulating the various parameters provided within the Report Center interface. Perhaps more importantly, you can use knowledge gained from this process to design more permanent production reports, using some of the provided examples as “templates.”

First, you need to understand the basic structure: All the SAS Environment Manager Reports are SAS Stored Processes available from the Stored Process web application. Most of them accept user prompts that specify many details about the data to include and the way the output should appear. Then, the stored process passes these parameters to a small set of reporting macros, which perform the data manipulations and produce the actual report using well-known SAS Procedures.

In addition to all the pre-defined reports, there are two sets of example reports specifically intended to be used as templates from which you can develop your own custom report. Often, this only requires a few changes to parameters being passed to the reporting macro. These example reports are located here:
