Let’s chat about big data and innovation

“The best data scientists are those that combine deep statistical / data / machine learning skills with domain knowledge.”

“[Most companies] haven't properly addressed the need for cultural change!... There's still this prevailing perception that it's a technology & skills problem.”

“Analytics only ever tells you one of two things—it either confirms what you knew or it suggests that you were wrong.”

“Lots of companies are getting value out of information, analytics, and big data, I think it's more a question of whether they can keep getting *more* value of their investment and capability. I prefer to look at it in terms of potential and relative performance rather than an absolute measure.”

These are just some of the insights Evan Stubbs shared during yesterday’s All Analytics Book Club e-chat about his new book, Big Data, Big Innovation: Enabling Competitive Differentiation through Business Analytics. Evan is the Chief Analytics Officer for SAS Australia / New Zealand, and he sits on the board of the Institute of Analytics Professionals of Australia. His book takes a hands-on approach, addressing not only the practicalities of big data and analytics but also the culture an organization needs in order to succeed with them. The author of two previous titles, Evan says he wrote this new book “for the ‘decision makers,’” in an effort to answer the question, “How do I innovate?”

As one participant put it, Evan has “taken the discussion to a level beyond where a lot of people seem to be focused, past the ‘you need analytics, accept it’ to ‘here's what you can do with it to move your company forward.’”

Do you think you could benefit from Evan’s expertise? Join the next e-chat on Tuesday, Sept. 2, at 3:00 p.m. ET, where the discussion will focus on “making it real” and “making it happen.” The Book Club will wrap up with a live, on-air interview and Q&A with Evan on Tuesday, Sept. 9, at 3:00 p.m. ET (register now).

You can also get the full story – now at 20 percent off. AllAnalytics community members who order a copy of Big Data, Big Innovation from the SAS store between now and Sept. 21, 2014, will get 20 percent off the retail price along with free shipping. Use promo code A2BCPP. (Store and discount are US only at this time. International community members can find purchasing options listed by country here.)

SAS Press makes a stop in The Golden State

In just a few short days, I’ll fly cross-country to attend the Western Users of SAS Software conference (WUSS). I make no secret of the fact that I love California: San Diego, San Francisco, Los Angeles, among others, are all great cities to visit.

This year WUSS will be held Sept. 3-5 in San Jose, the third-largest city in California (in fact, according to Wikipedia, it is the tenth-largest city in the United States). I look forward to checking out the city and also meeting and greeting the wonderful attendees at the conference.

I know a lot of veteran SAS Press authors will be there, and I hope to recruit other potential authors to join their ranks. We are currently looking for authors interested in writing both full-length books and shorter pieces. You can visit this link for more information. Stop by and see me at the SAS Press booth. I hope to see you there!

The Most Unusual Way You’ve Learned JMP

How do you learn best? In your sleep, when the unconscious mind is most receptive to suggestions?

Calling to the “powers that be” for one of those “aha” moments when everything just sinks in?

Swearing to your parents that you were actually studying when you came up with this plan?

When I was in school, way back when, my favorite way to learn was in the most serene place possible.

Well, in Robert Carver’s latest book, “Practical Data Analysis with JMP®, Second Edition,” he makes learning JMP as easy and relaxing as possible. With his help, things like “initial rendering of 3D Scatterplot with Density Ellipsoid” will seem like child’s play.

You can leverage JMP’s visual and intuitive environment in the service of *understanding* statistical concepts and still relax. Believe me, Robert will be your HERO!

So, find that “happy and unusual way” to learn, grab Robert Carver’s latest book as an instrument to build your JMP knowledge, and kick back for some intellectually stimulating reading.

I’m ready. How about you?

SAS author’s tip: Wrangling specific statistical values from SAS output

This week’s author tip is from Jack Shostak’s new book SAS Programming in the Pharmaceutical Industry, Second Edition.

If you're interested in this week's free tip and want to learn more about the topic or book, visit our online catalog. You'll find a free book excerpt, example code and data, and more.

General Approach to Obtaining Statistics

The previous sections show you how to extract p-values for a commonly used set of statistical tests. This section describes a general step-by-step approach for getting your statistics from a SAS procedure into data sets for clinical trial table or graph reporting. Here are the steps to follow:

  1. Determine which statistics you need in your table by looking at the listing destination output of your statistical procedure.
  2. Check the SAS procedure syntax to see whether there is an output data set that will provide you with the statistical values that you need. The output data sets from the SAS procedures are usually easier to use than the ODS OUTPUT data sets.
  3. If you cannot find what you need in an output data set from the statistical procedure, use ODS OUTPUT to send your statistics to a data set. To determine the name of the data set object to output, perform an ODS TRACE on your SAS procedure like this:
ods trace on;
proc ...
run;
ods trace off;

Then go to your SAS log to see which “tables” or data sets the SAS procedure makes. Each block of text in your SAS listing output typically translates into a SAS data set in ODS. You can see what each table is called by looking at the “Output Added” blocks in your SAS log. These blocks look something like this:

Output Added:
-------------
Name:       ShortName
Label:      Dataset Label
Template:   3 level name
Path:       2 level name
-------------

  4. The “ShortName” value from step 3 is the name of your ODS object and, in this case, of the data set that you want. Simply wrap an ODS OUTPUT statement around your SAS procedure to create the needed data set:

ods output ShortName = yourdatasetname;
proc ...
run;
ods output close;

The statistics that you need are now in the data set called “yourdatasetname.”
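To make the steps concrete, here is a minimal sketch that is not part of the book excerpt. It assumes PROC TTEST run against the SASHELP.CLASS sample data set, where the trace is expected to list a table named TTests. The data set name work.ttest_stats is made up for illustration. Confirm the table name in your own log before relying on it.

```sas
/* Step 3: list the ODS tables that the procedure creates (check the log). */
ods trace on;
proc ttest data=sashelp.class;
  class sex;
  var height;
run;
ods trace off;

/* Step 4: capture the table that holds the t statistics and p-values.  */
/* TTests is the table name expected from the trace; work.ttest_stats   */
/* is a hypothetical data set name chosen for this sketch.              */
ods output TTests = work.ttest_stats;
proc ttest data=sashelp.class;
  class sex;
  var height;
run;
ods output close;

/* Inspect the captured statistics. */
proc print data=work.ttest_stats;
run;
```

Running the procedure once under ODS TRACE and a second time under ODS OUTPUT keeps the discovery step separate from the capture step, which can make the log easier to read.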

Note that when you obtain statistics from an ODS output data set, the results that you see there may appear different from what you see in your ODS listing destination (LST file).  This is because a SAS procedure may round to a different precision in the ODS listing destination from the precision at which you present your ODS output statistics. The numbers in the data set are the same, but the way they are rounded may make the statistic appear different.
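The rounding effect can be seen in a small sketch that is not from the book. The value of p below is invented for illustration; the point is only that one stored number can be displayed at several precisions.

```sas
/* Hypothetical p-value: the stored value never changes, only its display. */
data _null_;
  p = 0.04567891;
  put "Full precision: " p best12.;
  put "Four decimals:  " p 6.4;
  put "Two decimals:   " p 4.2;
run;
```

In the log, the four-decimal line should show 0.0457 and the two-decimal line 0.05, even though both come from the same underlying value.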

(The preceding excerpt is from SAS Press author Jack Shostak’s book, “SAS Programming in the Pharmaceutical Industry, Second Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. Please note that results may vary depending on your version of SAS software.)

 

SAS author's tip: Using the %SYSFUNC and %QSYSFUNC macro functions

This week’s author tip is from Michele Burlew and her new book SAS Macro Programming Made Easy, Third Edition. Burlew chose this tip because she says the %SYSFUNC and %QSYSFUNC functions allow you to use SAS language functions in macro programming. Access to these functions greatly increases macro programming power and can simplify writing macro code.

We hope you find this tip useful. You can also read an excerpt from Burlew’s book.

The functions %SYSFUNC and %QSYSFUNC apply SAS programming language functions to text and macro variables in your macro programming. Providing access to the many SAS language functions in your macro programming applications, %SYSFUNC and %QSYSFUNC greatly extend the power of your macro programming.

Since these two functions are macro language functions and the macro facility is a text-handling language, the arguments to the SAS programming language function are not enclosed in quotation marks; it is understood that all arguments are text. Also, the values returned through the use of these two functions are considered text.

SAS language functions cannot be nested within a single call to %SYSFUNC or %QSYSFUNC. Each function must have its own %SYSFUNC or %QSYSFUNC call, and these %SYSFUNC and %QSYSFUNC calls can themselves be nested.
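As a rough illustration that is not part of the book excerpt, the sketch below nests two calls, each wrapping exactly one SAS language function. The macro variable name report_date is hypothetical, and the WORDDATE18. format is chosen only because its result can contain leading blanks and a comma.

```sas
/* Inner call: DATE() formatted with WORDDATE18. The result can contain a */
/* comma, so %QSYSFUNC is used to mask it.                                */
/* Outer call: LEFT() gets its own %SYSFUNC call and left-aligns the text. */
%let report_date = %sysfunc(left(%qsysfunc(date(),worddate18.)));
%put Report date: &report_date;
```

Using %QSYSFUNC for the inner call keeps the comma in the formatted date from being read as an argument separator by the LEFT function.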

Using %SYSFUNC to Format a Date in the TITLE Statement

The TITLE statement in Example 6.5 shows how the elements of a date can be formatted using %SYSFUNC and the DATE SAS language function.

title 
  "Sales for %sysfunc(date(),monname.) %sysfunc(date(),year.)";

On January 30, 2014, the title statement would resolve to

Sales for January 2014

The excerpt is from SAS Press author Michele Burlew’s book, “SAS Macro Programming Made Easy, Third Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. (Please note that results may vary depending on your version of SAS software.)

Using Big Data & Analytics to Fight Fraud!

It is estimated that a typical organization loses 5% of its revenues to fraud each year (www.acfe.com).  The total cost of insurance fraud (non-health insurance) in the US is estimated to be more than $40 billion per year (www.fbi.gov).  The advent of Big Data & Analytics has provided new and powerful tools to fight fraud.  In my new book, Analytics in a Big Data World, I discuss fraud detection as one important application area.  Furthermore, I have also recently partnered with SAS to develop a new course on the topic of Fraud Analytics using Supervised, Unsupervised and Social Network Methods.

What are the current challenges in fraud detection?

The first challenge is finding the right data. Analytical models need data, and in a fraud detection setting obtaining it is not always straightforward. Collected fraud data are often highly skewed, typically with fewer than 1% fraudsters, which seriously complicates the detection task. The asymmetric costs of missing fraud versus harassing non-fraudulent customers also make modeling difficult. Furthermore, fraudsters constantly try to outsmart the analytical models, so these models must be continuously monitored and reconfigured.

What analytical approaches are being used to tackle fraud?

Most of the fraud detection models in use nowadays are expert-based models. Once data become available, one can start doing analytics. A first approach is supervised learning, which analyses a labelled data set of historically observed fraud behavior. It can be used to predict both fraud and its amount. Unsupervised learning starts from an unlabeled data set and performs anomaly detection. Finally, social network learning analyses fraud behavior in networks of linked entities. Throughout my research, I have found this approach to be superior to all others!

What are the key characteristics of successful analytical models for fraud detection?

A successful analytical model should first have good statistical accuracy in terms of hit rate: it should detect as many of the fraudsters as possible. Besides this, analytical models should be interpretable. By understanding the fraud patterns, we can start developing new fraud prevention strategies. Finally, the models should also be operationally efficient. This is especially relevant in, e.g., a credit card fraud setting where a fraud decision needs to be made in a few seconds.

For more information about this topic, please see my new book, Analytics in a Big Data World. I also teach a new course on the topic.

For an interview with me and my PhD student Véronique van Vlasselaer working on social networks for fraud detection, watch this video:

You can read more about my work at www.dataminingapps.com.

SAS author’s tip: What is Clinical Endpoint Committee data?

This week’s author tip is from Jack Shostak’s new book SAS Programming in the Pharmaceutical Industry, Second Edition.

Shostak has been using SAS for nearly 30 years. In that time, he’s co-authored two other books, Common Statistical Methods for Clinical Research with SAS Examples, Third Edition, and Implementing CDISC Using SAS: An End-to-End Guide.

(The following excerpt is from SAS Press author Jack Shostak’s book, “SAS Programming in the Pharmaceutical Industry, Second Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. Please note that results may vary depending on your version of SAS software.)

Clinical Endpoint Committee (CEC) Data

It is often the case that the endpoint/event form captures data that are not entirely objective because they contain some level of clinical judgment. For instance, when precisely is a cold cured, was an event truly a myocardial infarction, or did any given event truly occur? The clinical site investigator may decide, using his or her clinical judgment, that a given event occurred, but often it is necessary to have an independent assessment of that event by another physician. This independent review helps to ensure that events are reported in a consistent way across multiple clinical sites for a clinical trial. Usually what happens is that a condition on the regular case report form “triggers” the release of a CEC form to be sent to the CEC. The CEC then takes the CEC form and verifies whether or not an actual event occurred based on the data available in the patient’s clinical records at the given site. A sample CEC form follows:

CEC Data

In this CEC form, “event” would be replaced by some clinical finding such as “myocardial infarction,” “stroke,” “seizure,” or the like. Once again, this form is extremely simplified, and there are usually a number of associated data variables captured that help to support the existence of the event.

The biggest problem for the statistical programmer when using CEC data is reconciling these data against the regular CRF endpoint/event data. This can be a difficult task, especially when you consider that a patient may have more than one event on a given day. Fortunately, because the endpoint/event data are so critical to a clinical trial, the quality of the reconciliation from the CEC form to the CRF form is not often relegated to some form of fuzzy data join. Usually there will be a definitive linkage via a key mapping data set that links the CEC event data to the CRF event data. However, if that key data set does not exist, then the statistical programmer must prepare for some difficult programming. It is also worth noting that the data from the adverse event forms, laboratory forms, and other forms, as well as a specific “event” form, may in fact trigger clinical events. This may add to the complexity of the reconciliation programming.

Make yourself known at the 2015 regional conferences

The U.S. Regional SAS Users Group Conferences are smaller, intimate conferences where you can show off your SAS expertise. The conference coordinators are always looking for strong presenters and keynote speakers. Many of our authors use these conferences to gather immediate feedback on their book topic. And let’s not forget about the promotional opportunities. You can be like Art Carpenter and proudly wear your “SAS Author. Ask Me About My Book” button.

If you’re ready to be noticed at a regional conference, our web site gives you an overview of how to become a SAS author. And we’re always here to answer your questions and help you get started.

We’re looking for SAS authors now. And if you submit a book proposal by early October, we will review that proposal, get feedback to you as quickly as possible, and, if your proposal is accepted, get your project started. Then your book can be either sold or promoted at the regional conferences in 2015.

Now’s the time to think about your 2015 conference presentations. Let’s get that book idea started and you’ll already have a talk or two prepared. And, we’ll even give you a fancy Author button to wear, too!

Contact us today to get started.

SAS author's tip: Combining macro variables with text

This week’s author tip is from Michele Burlew and her new book SAS Macro Programming Made Easy, Third Edition. Burlew chose this tip because she says it’s important to understand how SAS determines where a macro variable reference starts and stops, and often a delimiter is needed to tell SAS when to stop.

We hope you find this tip helpful. You can also read an excerpt from Burlew’s book online.

When you combine macro variable references with text or with other macro variable references, you can create new macro variable references. These new macro variable references are resolved before the SAS language statements in which they are placed are tokenized.

A concatenation operator is not needed to combine macro variables with text. However, periods (.) act as delimiters of macro variable references and might be needed to delimit a macro variable reference that precedes text.

When placing text before a macro variable reference or when combining macro variable references, you do not have to separate the references and text with a delimiter.

%let mosold=4;
%let level=12;
data book&mosold&level;
  set books.ytdsales(where=(month(datesold)=&mosold));
  attrib over&level length=$3 label="Cost > $&level";
  if cost > &level then over&level='YES';
  else over&level='NO';
run;
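As a quick check that is not part of the book excerpt, %PUT statements can show how the combined references in the program above resolve, without running the DATA step.

```sas
%let mosold=4;
%let level=12;

/* No delimiter is needed when text precedes a reference or when */
/* one reference directly follows another.                       */

/* Expected to write: Data set name: book412 */
%put Data set name: book&mosold&level;

/* Expected to write: Variable name: over12 */
%put Variable name: over&level;
```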

When you follow a macro variable reference with text, you must place a period at the end of the macro variable reference to terminate the reference. The macro processor recognizes that a period signals the end of a macro variable name and determines that the name of the macro variable is the text between the ampersand and the period. While not required unless you follow a macro variable reference with text, all macro variable references can be terminated with periods.

%let prefix=QUESTION;
proc freq data=books.survey;
  tables &prefix1 &prefix2 &prefix3 &prefix4 &prefix5;
run;

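Without delimiters, the macro processor looks for macro variables named PREFIX1 through PREFIX5. None of them exist, so the references cannot be resolved, and after the macro processor attempts resolution the program is unchanged: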

proc freq data=books.survey;
  tables &prefix1 &prefix2 &prefix3 &prefix4 &prefix5;
run;

The program is revised below.  This newer version contains the necessary delimiters that tell the macro processor when the macro variable references end. Now the macro variable references resolve as desired, and the text that follows the references is concatenated to the results of the resolution.

%let prefix=QUESTION;
proc freq data=books.survey;
  tables &prefix.1 &prefix.2 &prefix.3 &prefix.4 &prefix.5;
run;

The macro processor substitutes QUESTION for the &PREFIX macro variable reference. After macro variable resolution, the program becomes:

proc freq data=books.survey;
   tables QUESTION1 QUESTION2 QUESTION3 QUESTION4 QUESTION5;
run;
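A related case, shown here as a sketch that is not from the book, arises when the text that follows a reference itself begins with a period, as in a two-level data set name. The macro variable names lib and member are hypothetical.

```sas
%let lib=sashelp;
%let member=class;

/* The first period ends the &lib reference; the second is the literal */
/* period that separates the libref from the data set name.            */
/* Expected to write: sashelp.class */
%put &lib..&member;

/* With a single period, the period is consumed as the delimiter. */
/* Expected to write: sashelpclass */
%put &lib.&member;
```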

The excerpt is from SAS Press author Michele Burlew’s book, “SAS Macro Programming Made Easy, Third Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. (Please note that results may vary depending on your version of SAS software.)

You, and your book, can stand out at JSM 2015

It’s easy to get lost in the crowd of thousands of Joint Statistical Meetings (JSM) attendees. How can you make yourself stand out in a sea of statisticians? You could prepare an outstanding, standing-room-only presentation, give a half-day workshop, or present a poster. But what if you did all those things and also showed off a SAS book that you authored? SAS Press can help make that happen.

Our web site gives you an overview of how to become a SAS author. And we’re always here to answer your questions and help you get started.

We’re looking for statistical authors now. And if you submit a book proposal by early September, we will review that proposal, get feedback to you as quickly as possible, and, if your proposal is accepted, get your project started. Then your book can be promoted at JSM 2015.

If you’re ready to be a stand-out at JSM, let’s get started. Statistical books from SAS are big hits with our readers. Make sure you’re listed as one of our statistical experts. And together, we’ll get you ready to be the hit of the conference!

Contact us today to get started.

**If you need a little more time, tune in next week to learn what you need to do to have your book ready by the U.S. Regional SAS Users Group Conferences in 2015.**
