Share your SAS analytics story in Las Vegas

Everyone has a story to tell – or so they say! What better place to tell yours than in Vegas? If you have a SAS analytics story, then please come and share it with me at the SAS Press booth at Analytics 2014. Do you have an interesting or unusual application of SAS software? Did you have a problem at work which you solved using SAS? We would love to hear your stories and discuss the possibility of developing them into a short book or case study.

We are interested in publishing books covering the myriad of applications of SAS software and user experiences – if you have any publishing ideas and would like more information you could also drop me an email and we can arrange a short meeting.

If you are presenting and think your presentation could make a good book then please email me and I’ll add it to my agenda and we can meet up afterwards for a quick chat.

Even if you have no publishing plans right now, do come and see me and let me know what books you would like to see published, and enjoy a special conference discount on our existing titles.

The New Normal is Strange

The first time I used the Internet it blew my mind. As a diplomat brat, at any point in time everyone I knew was everywhere but where I was. Thanks to the miracles of Gopher, Veronica, IRC and email, the tyranny of distance didn’t seem so oppressive any more. When I managed to track down a friend from high school half-way around the world, it was like the heavens opened.

It might as well have been magic as far as I understood how it worked. And, even though I couldn’t explain how, I could tell the world would never be the same again.

That’s not to say that people believed me. Most didn’t - after all, who really wanted to use Pine to talk to people around the world when you could just write a letter? The idea of chatting with people online about anything that you were into? Bizarre. And don’t get me started on trying to use Emacs.

Times change though. And, eventually, the future caught up with that vague “something” I had sensed.

It’s easy to lose track of how much things have changed. It’s been almost forty years since Star Wars hit the cinemas. Back then, TCP/IP, the lingua franca through which devices now talk to each other, wasn’t even being used on ARPANET.

Today, we’re talking of the Internet of Things. We’re expecting over 26 billion new devices to be connected by 2020. We’re already generating more data than we know what to do with, all of which is ripe for analysis.

Whether we realise it or not, we’re rocketing through another tipping point. What big data implies is not greater insight. It’s disruption.

Systems don’t need to be intelligent to make good decisions. They just need to have well-designed rules by which they can operate. What this data enables is drastically more complex rules, rules which allow ever-increasing levels of automation. And, it’s these rules that are already changing our world.

Flying cars are still a pipe dream. Self-driving cars though? They’re here. And they’re only made possible through data. They’re so good, they’re even being designed to drive faster than the speed limit to improve safety.

Analytics is creating a new world, one that’s very different to the one we’re used to. One that’s very strange.

Picture this: in a world populated with self-driving cars, who needs buses? The very nature of civil engineering changes and, with almost total shared vehicular utilisation across the traffic network, all our congestion issues disappear almost overnight. Why take a bus when you can book a car to arrive at your house whenever you need one and take you directly to your destination?

Even better, it’s drastically cheaper. With car pooling, mileage-based pricing, and 24/7 vehicle utilisation being the norm, travel costs would collapse. It’s not hard to see a future where owning a car becomes an anachronism. Much like owning a horse today, it becomes the realm of hobbyists and fanatics. As direct ownership declines, the nature of asset insurance changes drastically as risk premiums weigh more and more heavily on human drivers than on our fellow robots, further discouraging car ownership. Unfortunately, that also leads to a collapse of the car insurance market, damaging the underwriting business.

This rapid decline in accident rates also sparks a collapse in organ donations, accelerating research into and demand for bio-engineering and 3-D organ printing. Medical general practitioners become largely unemployable thanks to automated & constant real-time analysis of epidemiological trends, drug interactions, and case-level differential diagnosis. Rising levels of unemployment across insurers, taxi drivers, truck drivers, and medical practitioners spark an increase in default rates, creating another credit crunch.

The logistics and taxi industries implode almost overnight, replaced by the retailers that already own the supply chains and intelligent assets needed to route their stock across the country. Much like cloud computing, logistics becomes a time-shared resource, priced dynamically based on space availability across each link in the supply chain.

And all this simply because someone asked, “You know all this data we’re collecting … why don’t we use it for something?”

Futurism is like trying to wrestle an octopus in the dark; even though you’re pretty sure you’ve grabbed onto something important, you’re never quite sure whether you’re winning or losing. Still, the teenager in me is pretty confident of one thing - whether people realise it or not, the world’s changed yet again.

And, the new normal is strange.

____________________________________________________________________________________________________________________________________________

For more work by Evan Stubbs, check out his new book Big Data, Big Innovation: Enabling Competitive Differentiation through Business Analytics.

Let’s chat about big data and innovation

“The best data scientists are those that combine deep statistical / data / machine learning skills with domain knowledge.”

“[Most companies] haven't properly addressed the need for cultural change!... There's still this prevailing perception that it's a technology & skills problem.”

“Analytics only ever tells you one of two things—it either confirms what you knew or it suggests that you were wrong.”

“Lots of companies are getting value out of information, analytics, and big data, I think it's more a question of whether they can keep getting *more* value of their investment and capability. I prefer to look at it in terms of potential and relative performance rather than an absolute measure.”

These are just some of the insights Evan Stubbs shared during yesterday’s All Analytics Book Club e-chat about his new book, Big Data, Big Innovation: Enabling Competitive Differentiation through Business Analytics.  Evan is the Chief Analytics Officer for SAS Australia / New Zealand and he sits on the board of the Institute of Analytics Professionals of Australia. His book takes a hands-on approach, addressing not only the practicalities of big data and analytics, but also the culture that needs to exist to be successful. The author of two previous titles, Evan says he wrote this new book “for the ‘decision makers,’” in an effort to answer the question, “How do I innovate?”

As one participant put it, Evan has “taken the discussion to a level beyond where a lot of people seem to be focused, past the ‘you need analytics, accept it’ to ‘here's what you can do with it to move your company forward.’”

Do you think you could benefit from Evan’s expertise? Join the next e-chat on Tuesday, Sept. 2, at 3:00 p.m. ET, where the discussion will focus on “making it real” and “making it happen.” The Book Club will wrap up with a live, on-air interview and Q&A with Evan on Tuesday, Sept. 9, at 3:00 p.m. ET (register now).

You can also get the full story – now at 20 percent off. AllAnalytics community members who order a copy of Big Data, Big Innovation from the SAS store between now and Sept. 21, 2014, will get 20 percent off the retail price along with free shipping. Use promo code A2BCPP. (Store and discount are US only at this time. International community members can find purchasing options listed by country here.)

SAS Press makes a stop in The Golden State

In just a few short days, I’ll fly cross-country to attend the Western Users of SAS Software conference (WUSS). I make no secret of the fact that I love California: San Diego, San Francisco, and Los Angeles, among others, are all great cities to visit.

This year WUSS will be held Sept. 3-5 in San Jose, the third-largest city in California (in fact, according to Wikipedia, it is the tenth-largest city in the United States). I look forward to checking out the city and also meeting and greeting the wonderful attendees at the conference.

I know a lot of veteran SAS Press authors will be there, and I hope to recruit other potential authors to join their ranks. We are currently looking for authors interested in writing both full-length books and shorter pieces. You can visit this link for more information. Stop by and see me at the SAS Press booth. I hope to see you there!

The Most Unusual Way You’ve Learned JMP

How do you learn best? In your sleep, when the unconscious mind is most receptive to suggestions?

[Image: fox]

Calling to the “powers that be” for one of those “aha” moments when everything just sinks in?

[Image: mountain]

Swearing to your parents that you were actually studying when you came up with this plan?

[Image: waterfall]

When I was in school, way back when, my favorite way to learn was in the most serene place possible.

[Image: sunset]

Well, in Robert Carver’s latest book, “Practical Data Analysis with JMP®, Second Edition,” he makes learning JMP as easy and relaxing as possible. With his help, things like “initial rendering of 3D Scatterplot with Density Ellipsoid” will seem like child’s play.

[Image: 3D scatterplot]

You can leverage JMP’s visual and intuitive environment in the service of *understanding* statistical concepts and still relax. Believe me, Robert will be your HERO!

So, find that “happy and unusual way” to learn, grab Robert Carver’s latest book as an instrument to build your JMP knowledge and kick back for some intellectually stimulating reading.

I’m ready. How about you?

SAS author’s tip: Wrangling specific statistical values from SAS output

This week’s author tip is from Jack Shostak’s new book SAS Programming in the Pharmaceutical Industry, Second Edition.

If you're interested in this week's free tip and want to learn more about the topic or book, visit our online catalog. You'll find a free book excerpt, example code and data, and more.

General Approach to Obtaining Statistics

The previous sections show you how to extract p-values for a commonly used set of statistical tests. This section describes a general step-by-step approach for getting your statistics from a SAS procedure into data sets for clinical trial table or graph reporting. Here are the steps to follow:

  1. Determine which statistics you need in your table by looking at the listing destination output of your statistical procedure.
  2. Check the SAS procedure syntax to see whether there is an output data set that will provide you with the statistical values that you need. The output data sets from the SAS procedures are usually easier to use than the ODS OUTPUT data sets.
  3. If you cannot find what you need in an output data set from the statistical procedure, use ODS OUTPUT to send your statistics to a data set. To determine the name of the data set object to output, perform an ODS TRACE on your SAS procedure like this:
ods trace on;
proc ...
run;
ods trace off;

Then go to your SAS log to see which “tables” or data sets the SAS procedure makes. Each block of text in your SAS listing output typically translates into a SAS data set in ODS. You can see what each table is called by looking at the “Output Added” blocks in your SAS log. These blocks look something like this:

Output Added:
-------------
Name:       ShortName
Label:      Dataset Label
Template:   3 level name
Path:       2 level name
-------------

  4. “ShortName” from step 3 is what your ODS object, and, in this case, data set, is called. Simply wrap an ODS OUTPUT statement around your SAS procedure to create the needed data set:

ods output ShortName = yourdatasetname;
proc ...
run;
ods output close;

The statistics that you need are now in the data set called “yourdatasetname.”

Note that when you obtain statistics from an ODS output data set, the results that you see there may appear different from what you see in your ODS listing destination (LST file).  This is because a SAS procedure may round to a different precision in the ODS listing destination from the precision at which you present your ODS output statistics. The numbers in the data set are the same, but the way they are rounded may make the statistic appear different.

(The preceding excerpt is from SAS Press author Jack Shostak’s book, “SAS Programming in the Pharmaceutical Industry, Second Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. Please note that results may vary depending on your version of SAS software.)
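
As a concrete illustration of the four steps (a minimal sketch, not taken from the book), the example below runs a two-sample PROC TTEST on the SASHELP.CLASS sample data set. The output data set name work.ttest_stats is made up for this example, and TTests is the table name assumed for the t-test results; confirm the name in the “Output Added” blocks of your own log before relying on it.

```sas
/* Step 3: trace the procedure to see which ODS tables it creates. */
ods trace on;
proc ttest data=sashelp.class;
  class sex;
  var height;
run;
ods trace off;

/* Step 4: wrap ODS OUTPUT around the same procedure call.         */
/* TTests is the table name assumed here; confirm it in your log.  */
ods output TTests = work.ttest_stats;
proc ttest data=sashelp.class;
  class sex;
  var height;
run;
ods output close;

/* Inspect the captured statistics as they are stored in the data set. */
proc print data=work.ttest_stats;
run;
```

Printing the data set without formats is a quick way to see the rounding difference described in the note above: the stored values are the same, but they are displayed at a different precision than in the listing output.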

 

SAS author's tip: Using the %SYSFUNC and %QSYSFUNC macro functions

This week’s author tip is from Michele Burlew and her new book SAS Macro Programming Made Easy, Third Edition. Burlew chose this tip because she says the %SYSFUNC and %QSYSFUNC functions allow you to use SAS language functions in macro programming. Access to these functions greatly increases macro programming power and can simplify writing macro code.

We hope you find this tip useful. You can also read an excerpt from Burlew’s book.

The functions %SYSFUNC and %QSYSFUNC apply SAS programming language functions to text and macro variables in your macro programming. Providing access to the many SAS language functions in your macro programming applications, %SYSFUNC and %QSYSFUNC greatly extend the power of your macro programming.

Since these two functions are macro language functions and the macro facility is a text-handling language, the arguments to the SAS programming language function are not enclosed in quotation marks; it is understood that all arguments are text. Also, the values returned through the use of these two functions are considered text.

SAS language functions cannot be nested within a single call to %SYSFUNC or %QSYSFUNC. Each function must have its own %SYSFUNC or %QSYSFUNC call, but those calls can themselves be nested.

Using %SYSFUNC to Format a Date in the TITLE Statement

The TITLE statement in Example 6.5 shows how the elements of a date can be formatted using %SYSFUNC and the DATE SAS language function.

title 
  "Sales for %sysfunc(date(),monname.) %sysfunc(date(),year.)";

On January 30, 2014, the title statement would resolve to

Sales for January 2014

(The excerpt is from SAS Press author Michele Burlew’s book, “SAS Macro Programming Made Easy, Third Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. Please note that results may vary depending on your version of SAS software.)
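
The nesting rule is easier to see in a short sketch (not from the book). The macro variable name up, the sample text abcdef, and the title wording are invented for this example; SUBSTR, UPCASE, LEFT, and DATE are ordinary SAS language functions.

```sas
/* Not allowed: two SAS functions inside a single %SYSFUNC call. */
/*   %let up = %sysfunc(upcase(substr(abcdef,1,3)));             */

/* Allowed: each function has its own %SYSFUNC call, and the     */
/* calls are nested.                                             */
%let up = %sysfunc(upcase(%sysfunc(substr(abcdef,1,3))));
%put up=&up;

/* %QSYSFUNC masks the comma that the WORDDATE. format puts in   */
/* the date, so the outer LEFT function still receives a single  */
/* argument.                                                     */
title "Sales as of %sysfunc(left(%qsysfunc(date(),worddate.)))";
```

The inner call in the TITLE statement uses %QSYSFUNC rather than %SYSFUNC because a formatted date such as “January 30, 2014” contains a comma, which LEFT would otherwise read as an argument separator.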

Using Big Data & Analytics to Fight Fraud!

It is estimated that a typical organization loses 5% of its revenues to fraud each year (www.acfe.com).  The total cost of insurance fraud (non-health insurance) in the US is estimated to be more than $40 billion per year (www.fbi.gov).  The advent of Big Data & Analytics has provided new and powerful tools to fight fraud.  In my new book, Analytics in a Big Data World, I discuss fraud detection as one important application area.  Furthermore, I have also recently partnered with SAS to develop a new course on the topic of Fraud Analytics using Supervised, Unsupervised and Social Network Methods.

What are the current challenges in fraud detection?

The first challenge is finding the right data. Analytical models need data, and in a fraud detection setting this is not always straightforward. Collected fraud data are often highly skewed, typically with fewer than 1% fraudsters, which seriously complicates the detection task. Also, the asymmetric costs of missing fraud versus harassing non-fraudulent customers pose significant modeling difficulties. Furthermore, fraudsters constantly try to outsmart the analytical models, so these models need to be continuously monitored and reconfigured.

What analytical approaches are being used to tackle fraud?

Most of the fraud detection models in use nowadays are expert-based models. When data becomes available, one can start doing analytics. A first approach is supervised learning, which analyses a labelled data set of historically observed fraud behavior. It can be used to predict both the occurrence of fraud and its amount. Unsupervised learning starts from an unlabeled data set and performs anomaly detection. Finally, social network learning analyses fraud behavior in networks of linked entities. Throughout my research, I have found this approach to be superior to all others!

What are the key characteristics of successful analytical models for fraud detection?

A successful analytical model should first have good statistical accuracy in terms of hit rate: it should detect as many of the fraudsters as possible. Besides this, analytical models should be interpretable. By understanding the fraud patterns, we can start developing new fraud prevention strategies. Finally, the models should also be operationally efficient. This is especially relevant in, e.g., a credit card fraud setting where a fraud decision needs to be made in a few seconds.

For more information about this topic, I am happy to refer you to my new book Analytics in a Big Data World. I also teach a new course on the topic.

For an interview with me and my PhD student Véronique van Vlasselaer working on social networks for fraud detection, watch this video:

You can read more about my work at www.dataminingapps.com.

SAS author’s tip: What is Clinical Endpoint Committee data?

This week’s author tip is from Jack Shostak’s new book SAS Programming in the Pharmaceutical Industry, Second Edition.

Shostak has been using SAS for nearly 30 years. In that time, he’s co-authored two other books, Common Statistical Methods for Clinical Research with SAS Examples, Third Edition, and Implementing CDISC Using SAS: An End-to-End Guide.

(The following excerpt is from SAS Press author Jack Shostak’s book, “SAS Programming in the Pharmaceutical Industry, Second Edition.” Copyright © 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. Please note that results may vary depending on your version of SAS software.)

Clinical Endpoint Committee (CEC) Data

It is often the case that the endpoint/event form captures data that are not entirely objective because they contain some level of clinical judgment. For instance, when precisely is a cold cured, was an event truly a myocardial infarction, or did any given event truly occur? The clinical site investigator may decide, using his or her clinical judgment, that a given event occurred, but often it is necessary to have an independent assessment of that event by another physician. This independent review helps to ensure that events are reported in a consistent way across multiple clinical sites for a clinical trial. Usually what happens is that a condition on the regular case report form “triggers” the release of a CEC form to be sent to the CEC. The CEC then takes the CEC form and verifies whether or not an actual event occurred based on the data available in the patient’s clinical records at the given site. A sample CEC form follows:

[Image: sample CEC form]

In this CEC form, “event” would be replaced by some clinical finding such as “myocardial infarction,” “stroke,” “seizure,” or the like. Once again, this form is extremely simplified, and there are usually a number of associated data variables captured that help to support the existence of the event.

The biggest problem for the statistical programmer when using CEC data is reconciling these data against the regular CRF endpoint/event data. This can be a difficult task, especially when you consider that a patient may have more than one event on a given day. Fortunately, because the endpoint/event data are so critical to a clinical trial, the quality of the reconciliation from the CEC form to the CRF form is not often relegated to some form of fuzzy data join. Usually there will be a definitive linkage via a key mapping data set that links the CEC event data to the CRF event data. However, if that key data set does not exist, then the statistical programmer must prepare for some difficult programming. It is also worth noting that the data from the adverse event forms, laboratory forms, and other forms, as well as a specific “event” form, may in fact trigger clinical events. This may add to the complexity of the reconciliation programming.
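
Where a key mapping data set does exist, the reconciliation can be sketched as a pair of joins. This is only an illustration, not code from the book: every data set and variable name below (crf_events, cec_events, event_key, subjid, and so on) is hypothetical, as are the sample records, and a real study would carry many more supporting variables.

```sas
/* Hypothetical CRF event records; subject 101 has two events on one day. */
data work.crf_events;
  input subjid crf_event_id event_date :date9.;
  format event_date date9.;
  datalines;
101 1 05JAN2014
101 2 05JAN2014
102 1 17FEB2014
;
run;

/* Hypothetical CEC adjudication results. */
data work.cec_events;
  input subjid cec_event_id confirmed $;
  datalines;
101 11 Y
101 12 N
;
run;

/* Hypothetical key mapping data set linking CRF events to CEC events. */
data work.event_key;
  input subjid crf_event_id cec_event_id;
  datalines;
101 1 11
101 2 12
;
run;

/* Link each CRF event to its CEC review through the key data set. */
proc sql;
  create table work.reconciled as
  select crf.subjid,
         crf.crf_event_id,
         crf.event_date,
         cec.cec_event_id,
         cec.confirmed
  from work.crf_events as crf
       left join work.event_key as k
         on  crf.subjid = k.subjid
         and crf.crf_event_id = k.crf_event_id
       left join work.cec_events as cec
         on  k.subjid = cec.subjid
         and k.cec_event_id = cec.cec_event_id
  order by crf.subjid, crf.crf_event_id;
quit;
```

Because both joins are left joins, a CRF event with no entry in the key data set (subject 102 in the sample records) still appears in the result, with missing CEC values that flag it for follow-up.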

Make yourself known at the 2015 regional conferences

Shelley Sessoms

The U.S. Regional SAS Users Group Conferences are smaller, intimate conferences where you can show off your SAS expertise. The conference coordinators are always looking for strong presenters and keynote speakers. Many of our authors use these conferences to gather immediate feedback on their book topic. And let’s not forget about the promotional opportunities. You can be like Art Carpenter and proudly wear your “SAS Author. Ask Me About My Book” button.

If you’re ready to be noticed at a regional conference, our web site gives you an overview of how to become a SAS author. And we’re always here to answer your questions and help you get started.

We’re looking for SAS authors now. And if you submit a book proposal by early October, we will review that proposal, get feedback to you as quickly as possible, and, if your proposal is accepted, get your project started. Then your book can be either sold or promoted at the regional conferences in 2015.

Now’s the time to think about your 2015 conference presentations. Let’s get that book idea started and you’ll already have a talk or two prepared. And, we’ll even give you a fancy Author button to wear, too!

Contact us today to get started.
