Controlling our destiny: Real-time, visual analytics can combat the spread of disease

The recent outbreak of the Zaire Ebola virus has garnered much media attention and calls for action at all levels of government. The current outbreak is already the gravest in history, and the CDC’s worst-case scenario predicts up to 1.4 million cases by late January (correcting for underreporting). The epidemic in West Africa has become so widespread that Ebola could become a permanent presence – and thus pose a persistent threat to other parts of the world.

We have seen these pandemics before: smallpox in 1633, the Spanish flu in 1918, and syphilis in antiquity. In fact, viral outbreaks throughout history have been so common, and so prevalent, that some scientists have hypothesized that viruses played a role in our own evolution.

It is time to evolve again. We now have the tools to slow down and even stop epidemics, and those tools start with analytics. Not the wonky, dissertation-style analytics that show up in obscure statistical reports. I’m referring to analytics done in real time, by people who are on the front lines fighting the epidemic. To stop Ebola we must have the capability to deploy real-world analysis to non-expert users instantly so they can act on results immediately.

Welcome to the world of visual analytics.

Epidemiologists have referred to it as “a technique aiding data analysis and decision making that allows for a better understanding of the context of complex systems.” That it certainly is. And it has the potential to make a difference in this epidemic. In the US, rules have gone into effect requiring that all travelers flying in from Liberia, Sierra Leone or Guinea undergo strict screening procedures. But will that be enough? Perhaps not.

At SAS I have the privilege of interacting with data scientists from a wide variety of disciplines that specialize in the real-time analysis of large datasets. SAS develops tools that enable real-time detection of fraudulent activity, mostly for the financial services industry. These tools combine a wide variety of approaches such as social network analysis, business rules, forecasting and predictive analytics to determine in near real time where and when fraud happens. These same types of analytics can be deployed in the fight against viral epidemics. A screener detects a traveler with a high temperature. A school nurse finds a fever in a school child. An emergency room sees a spike in feverish patients. Are these cases Ebola? If not, what are they? And more importantly, are they contagious?

By combining data such as flu trends, disease trajectories, and geospatial information together with passport records, financial transactions (such as where and when an airline ticket was purchased), and information gleaned from social networks, it’s possible to build models to predict the cause of that fever. Doing this could not only help stop Ebola, it would also help stop the spread of any contagious disease. Developing this capacity would usher in a new era in our relationship with pathogens. Unlike our ancestors who had to resign themselves to fate, we can, through clinical analytics and rapid diagnostic testing, actively engage and control the viruses that make us.
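To make the idea concrete, here is a toy sketch (not SAS software) of how such combined signals could update the probability that a detected fever is Ebola rather than, say, seasonal flu. Every number and feature name below is hypothetical, chosen only for illustration:

```python
# Toy Bayesian update combining several data sources into one fever risk score.
# All priors and likelihood ratios are fabricated for illustration.

def posterior(prior, likelihood_ratios):
    """Update a prior probability with a set of likelihood ratios
    (one per observed signal) using Bayes' rule in odds form."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

# Hypothetical base rate of Ebola among travelers presenting with fever.
prior = 0.001

# Hypothetical likelihood ratios: how much more likely each signal is
# in an Ebola case than in a routine fever.
signals = {
    "recent_travel_from_affected_region": 50.0,   # passport/ticket records
    "local_flu_activity_low": 3.0,                # flu-trend data
    "contact_reported_on_social_network": 4.0,    # social network analysis
}

p = posterior(prior, signals.values())
print(f"Posterior probability of Ebola: {p:.2%}")
```

Even with a tiny base rate, a few independent corroborating signals move the probability enough to justify isolating one patient while clearing another – which is precisely the triage decision a front-line screener needs in real time.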


A new world of trust and transparency for clinical trial information

On Monday September 29th, the European Ombudsman organized a panel discussion on “International Right to Know Day.” This day was established in 2002 by access to information advocates from around the world. This year, the panel’s theme was “Transparency and public health – how accessible is scientific data?”

This topic was well chosen for a week in which the board of the European Medicines Agency (EMA) published their long-awaited policy on publication of clinical data (1). The panel at the European Parliament consisted of representatives from all stakeholders who gave input to EMA’s draft policy over the preceding years. The European Ombudsman, Emily O’Reilly, opened the discussion by saying that while there is much good coming from the pharmaceutical industry, more trust is needed to convince patients that therapies are working, and the only way to create that trust is by opening up clinical trial results and data.

The actions of the industry will silence the critics

Both Ben Goldacre (physician and author of the book Bad Pharma) and Margrete Auken (European Parliament member and shadow rapporteur for the new 2014 clinical trial regulation of the European Union) expressed a lingering distrust of the pharmaceutical industry’s approach to releasing data. Richard Bergström of the European Federation of Pharmaceutical Industries and Associations (EFPIA) presented the great progress made in the last year by the European pharmaceutical industry in its drive to release clinical trial data in a controlled manner. “This is unprecedented,” said Bergström, with many EFPIA members going beyond the principles that his organization has laid out. These principles include sharing all information produced during a clinical trial, including Clinical Study Reports (CSRs) and the complete set of (anonymized) individual patient data (IPD).

The European Medicines Agency releases their transparency policy

The EMA policy describes what clinical trial information will be released, when it will be released, and also that EMA itself intends to make it available to interested researchers. Guido Rasi, the Executive Director of EMA, received most of the attention with the publishing of the EMA policy the same week. Rasi pointed out that EMA was under no legal obligation to release the clinical trial information owned by the pharmaceutical companies, but did so to increase public trust in regulatory decisions about new products. According to Rasi, the policy intends to strike a balance between releasing clinical trial data and the commercial interests of pharmaceutical companies. Meanwhile, GSK, as the first global pharmaceutical company, decided last year to open up all clinical trial information – including anonymized IPD – to external researchers. This pharmaceutical giant now provides external researchers the ability to apply for access to a clinical trial, hosts an independent review panel, and offers a secure online data and analysis environment (SAS® Clinical Trial Data Transparency) that allows applicants to access and re-analyze the patient-level clinical trial data.

Europe leads the way in transparency of clinical trials

The “EMA policy on publication of clinical data for medicinal products for human use” – as it is titled – will become effective January 1, 2015, and reflects a legal obligation for transparency in the new European Clinical Trials Regulation No 536/2014, adopted in late May 2014. The European regulatory agency will implement this policy step by step, releasing Clinical Study Reports (or parts of them) initially; IPD might follow as the result of a follow-on policy. EMA will release only certain modules of the CSRs, such as:

  • Clinical overviews (module 2.5 of ICH E3 guidelines), clinical summaries (module 2.7), and clinical study reports (module 5: 16.1.1 – protocol and protocol amendments, 16.1.2 – sample CRF, and 16.1.9 – statistical methods).

Sponsors can redact commercially confidential information (CCI), and these redactions need to be approved by the EMA. At a later date, the EMA will detail how and when individual patient data will be released. The policy released on October 2nd, however, defines two levels of access:

  • A simple registration process will provide access to the information in screen-only mode (no print capability).
  • A second level (for academics and non-commercial users) will require proof of identity and enable downloading and saving information.

Towards full transparency of clinical trial information, step-by-step

In my view, the EMA policy is a great step forward that will contribute to a better understanding of the regulatory decisions that resulted in approval or rejection of marketing authorization applications (MAAs). It should, however, only be seen as complementary to the industry’s initiatives in providing complete information, including complete (but redacted) CSRs, blank CRFs, IPD, protocol information and other types of supporting information from historical clinical trials (2). For example, at least 19 organizations are listed on EFPIA’s transparency website and currently, nine organizations are joining GSK in allowing access to anonymized patient-level data. After an independent review board approves their requests, researchers can access an advanced statistical computing environment and a multi-sponsor repository where they can analyze and compare trials from different sponsors and extract new clinical knowledge about the medicinal products and devices.

If you are an academic researcher, you can now turn to different organizations for information about medical products: to the regulators for regulatory decisions and submitted reports and to pharmaceutical companies for the detailed trial information – including IPD and the ability to re-analyze the data and compare competing or complementary products.

The future of data transparency

I believe that both access and information sharing systems will continue to thrive in the long term and provide complementary benefits to the public and external researchers. A growing list of pharmaceutical companies is now fully committed to providing detailed trial information and encouraging secondary analysis; for example, they are discussing how to apply clinical data standards such as CDISC to bring de-identified data together, and methods to de-identify the data (with help from industry groups like PhUSE and TransCelerate).

I’m hoping that academic trial research centers will now open up their information as well and consider providing centralized access to the data of clinical trials they’re running – preferably in the same multi-sponsor environment the industry is currently using. While much progress has been made, all involved stakeholders will gather maturity and experience as researchers start making discoveries. Researchers can now make full use of these different complementary possibilities, start mining clinical trials for all important confirmatory and secondary findings, and publish high-quality research to further increase the trust of patients and physicians in medicines and devices approved for use by health care providers. After all, “the right to know” – the theme of the European Parliament panel – can only be realized if researchers can make sense of the data in an advanced analytical environment.

  1. Publication of clinical reports: EMA adopts landmark policy to take effect on 1 January 2015.
  2. Krumholz HM, et al. Sea Change in Open Science and Data Sharing: Leadership by Industry. Circ Cardiovasc Qual Outcomes. 2014;7:499-504.



Predicting the one percent in health care

Episode analytics is a method of using patient-centric data to define episodes of care. These episodes of care can be used to define standards of care – from both a cost and quality perspective – and then project these standards forward to establish bundled payment budgets and quality targets. This can be considered a global method of controlling costs. But what if episode analytics could be used in a predictive sense to determine the next top spenders?

Health care spending is not equal. For the civilian* population, 20 percent of spend is on behalf of one percent of the population, and five percent of the population is responsible for nearly 50 percent of all spend. These members are easily identifiable through claims analytics, and are often the focus of case management efforts to help control their costs. While these care management efforts are effective, they can’t reverse historical spending – nor can they ameliorate the episodes of care that drove the spending. The question is, can episode analytics be used to identify the episode characteristics that can predict the next one percent in order to practice preventive care?
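As a concrete sketch of that concentration statistic, we can sort members by annual spend and measure the share attributable to the top one and five percent. The spend figures below are fabricated, chosen only to roughly reproduce the quoted shares:

```python
# Toy spend-concentration calculation: what share of total spend do the
# top 1% and top 5% of members account for? The distribution is fabricated.

def top_share(spend, fraction):
    """Share of total spend attributable to the top `fraction` of members."""
    ordered = sorted(spend, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

# Fabricated annual spend for 1,000 members: a handful of very expensive
# members, a middle tier, and a long tail of low spenders.
spend = [20_000] * 10 + [7_500] * 40 + [527] * 950

print(f"Top 1% share: {top_share(spend, 0.01):.0%}")
print(f"Top 5% share: {top_share(spend, 0.05):.0%}")
```

With this distribution the top one percent accounts for roughly 20 percent of spend and the top five percent for roughly 50 percent – the same skew the NIHCM figures describe, which is what makes the highest spenders such an attractive target for prediction.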

Because SAS® Episode Analytics is patient-centric, it provides a full view of the episodes of care the patient has experienced. This view is unique. Not only is all care included, but it is categorized in several ways. First, care is associated with all appropriate episodes. If a follow-up visit after surgery includes diagnosis codes indicating chronic care, the chronic care episode(s) are associated with the visit in addition to the surgical episode. This identification is hierarchical in nature as well. If the care initiates an episode of care, it is fully allocated to that episode, but it can also be associated – not allocated – with another episode. Additionally, care can be split equally in the allocation. This hierarchical categorization of care is unique and allows insight into connections – or lack thereof – in care.
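The allocation-versus-association distinction can be sketched with a toy data model. The class and field names here are hypothetical illustrations, not the actual SAS Episode Analytics data model:

```python
# Toy model of hierarchical claim categorization: a claim's cost is
# *allocated* to one or more episodes (split equally when shared),
# while it can also be *associated* with other episodes at no cost.
from dataclasses import dataclass, field

@dataclass
class Claim:
    claim_id: str
    cost: float
    allocated_to: list                                    # episodes sharing the cost
    associated_with: list = field(default_factory=list)   # linked, no cost

def episode_cost(episode, claims):
    """Sum each claim's equal cost share across its allocated episodes."""
    return sum((c.cost / len(c.allocated_to)
                for c in claims if episode in c.allocated_to), 0.0)

claims = [
    # Surgery claim allocated entirely to the surgical episode.
    Claim("c1", 12_000.0, allocated_to=["knee_surgery"]),
    # Follow-up visit split between the surgical and chronic-care episodes,
    # and associated (not allocated) with a diabetes episode.
    Claim("c2", 300.0, allocated_to=["knee_surgery", "chronic_care"],
          associated_with=["diabetes"]),
]

print(episode_cost("knee_surgery", claims))  # 12150.0
print(episode_cost("diabetes", claims))      # 0.0 (associated only)
```

The association list is what preserves the connections between episodes: the diabetes episode carries no cost from the follow-up visit, yet the link is still there to be mined.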

With SAS® Episode Analytics, the stacked graph at the bottom shows total cost by member and by condition. The upper right breaks out cost by category (T = typical, C = complication, TC = typical with complication), and the upper left shows cost by condition for each potentially avoidable complication (PAC) reason.

Another feature of comprehensive episode analytics is categorizing care as typical care or a potentially avoidable complication (PAC). This is not only a method to quantify quality but also a way to identify future, undesirable member health implications. With SAS, these PACs are categorized based on clinical criteria, such as adverse effect of drug or peripheral embolism. There are over 200 PAC categories identifiable today. These complication categories have the full claim history – including not only the procedures but also the diagnoses – behind them.

The combination of hierarchical associations and complication categorizations provides a valuable tool for analyzing historical claims. This new insight into member claims history provides new inputs for analytics and predictive engines. These engines can, in turn, be used to predict the members – and providers – that can benefit from future actions.


*Civilian excludes residents of institutions – such as long-term care facilities and penitentiaries – as well as military and other non-civilian members of the population. “Care” reflects personal health care and does not include medical research, public health spending, school health or wellness programs. From “The Concentration of Health Care Spending,” National Institute for Health Care Management (NIHCM) Foundation.



Will the health data you’re using truly answer your question?

Computer processors have undergone stable and consistent growth since Alan Turing and his contemporaries invented the first “modern” computers. One way of quantifying this growth is Moore’s Law, which says that the number of transistors on integrated circuits doubles roughly every two years. While this is a bit too technical to mean much to me, to Intel it means a new processor generation every two years. I couldn’t find a direct benchmark comparison, but try to remember the cutting-edge Pentium III you used in 2000 and compare that to the Intel Haswell chip in your ultra-thin MacBook Air (notwithstanding the high-end quad cores in performance machines).
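A quick back-of-the-envelope on that comparison: under Moore's Law's two-year doubling period, a 2000-era chip and a 2014-era chip are seven doublings apart:

```python
# Moore's Law arithmetic: transistor-count growth factor between two years,
# assuming a doubling every two years.

def moores_law_factor(start_year, end_year, doubling_period=2):
    """Growth factor implied by one doubling per `doubling_period` years."""
    return 2 ** ((end_year - start_year) / doubling_period)

print(moores_law_factor(2000, 2014))  # seven doublings -> 128x
```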

The ubiquity of advanced analytics
This growth in computing capability has dramatically and positively changed the face (and pace) of analytics. Concepts like machine learning aren’t just hypotheticals or relegated to academia anymore; they are reality, they are powerful, and they are everywhere. The value we get from using advanced analytics is immense, and now, more than ever, modern tools are highly accessible to a wider array of users. Users may not know (or even need to know) how the wheels turn behind the scenes, but with very simple interfaces they’re able to start those complex wheels turning.

First building block: Data
While all this technology has opened up amazing possibilities with respect to easily accessible insight, we should not forget the lessons that traditional statistical methods can provide. While the notion of stating a “formal” hypothesis may seem limiting (e.g., why test one thing when I can explore a thousand?), taking the time to formulate a research hypothesis makes you think critically about what you’re doing. One of the most important questions you can ask yourself during this process is whether the health data you’re using is even appropriate to answer the questions you want to consider. Lots of data sources may collect similar data elements, but they collect them in different ways and for different reasons.

The myriad of health care data

For instance, medical diagnoses can be captured from billing claims, EMRs, patient histories or public health surveys (e.g., NHANES). Each of these sources could potentially be used to power similar insights – but they do so with differing qualities and caveats. Claims and EMRs come from an “expert” clinical source, so diagnoses may be more accurate, whereas patient histories may include information outside the view of the treating physician but are based on a patient’s own biased recall. All three of these sources are limited to a self-selecting population and lack the coverage that a general population survey might provide, though there you are limited by data use restrictions, questionnaire limitations and the bias of those pesky respondents.

The art of statistics
Perhaps the most confusing part, and what makes statistics more of an art than a science, is that all of the above scenarios can be right depending on your needs.

I don’t bring up this issue to deride or lampoon the prevalence and utility of highly accessible analytic tools or those who use them. I’m a strong believer that broader access to these tools will open us up to insights we wouldn’t otherwise uncover. At the same time, we can easily forget that not all insights are created equal. As you look at the results and information you uncover, before you evaluate the impact they may have on your business, first evaluate the underlying quality with which they were created.

An example comes from a former colleague who worked on a study profiling pilots and trying to predict who would make a good pilot. In the end, the only significant factor they found was whether you liked strawberry ice cream. I would guess that a fear of heights and motion sickness are better indicators that I wouldn’t be a good pilot, but maybe it’s been the ice cream all along.


It takes all kinds (of data): Going beyond our comfort zone with clinical models

When I’m working with new customers or on a new project, there are a handful of questions I typically ask. These help me set the stage, understand needs, and most importantly – learn the customer’s expectations. Almost always, I spend some time talking about what an acceptable model looks like to them. Does it need to have certain characteristics or can the data speak for itself?

“Let the data speak” is the gist of the typical answer, but that usually isn’t reality. It’s like telling someone to “help yourself to anything in the fridge”; you really don’t mean for him or her to grab the steaks you were planning on eating for dinner. They can have anything they want, inside of a predefined, unspoken set of boundaries.

We want to explore the data, but often, we want the data to speak to us in terms of what we already know. An endocrinologist isn’t likely to accept a model predicting diabetes trajectory that doesn’t include HbA1c. A cardiology researcher is going to want to see a QT interval. And an epidemiologist specializing in pulmonary diseases is going to want FEV.

We convince ourselves – due to research, expert opinion, or simply habit – that models must include certain concepts or be rendered invalid. I definitely advocate for the consideration of these known factors in model creation. They’re not only elements that will help to define a robust model, but given our current clinical knowledge, they represent mechanisms by which we can effect a change.

However, while creating models with such considerations is necessary to provide value in a certain context, I would also raise three counterpoints to this. I challenge you to consider these the next time you start a modeling process:

  1. Health care and medicine (like most industries) are sciences, and while that carries with it the scientific method and its inherent rigor, it also brings with it fallibility. Unlike mathematics, the sciences represent our best understandings and not necessarily truth. While I doubt the relationship of HbA1c to diabetes will go the way of “phlogiston,” I don’t doubt that a sufficient span of time will make many of our current scientific truths seem equally preposterous.

    A statistical model built on valid and robust data that defies current clinical knowledge may be a statistician’s contribution to science. I’m not saying to throw out current knowledge and create off-the-wall models. But rather, we have an opportunity through the exploration of data to bring up new ideas or challenge old ones.

  2. Highly predictive but clinically illogical models may still have utility, though perhaps not in the traditional sense. A model derived from magazine subscription history and peanut butter brand-switching habits – completely devoid of any traditional cardiovascular risk indicators – that can calculate a reliable 30-day risk score for a heart attack has value.

    It doesn’t give us actionable information we can use to mitigate that risk, but it does alert us to its presence – whatever the cause. Often we may not have the luxury of ideal data to derive a model. A patient who hasn’t had a heart attack may never have seen a cardiologist, had an EKG, or even had a recent cholesterol panel or CBC. And even if you do have this data, how often is it collected? But if Cat Fancy and a recent purchase of a jar of Peter Pan crunchy can send up a red flag, why not listen to it?

  3. Many tests are biased, most people lie (at least a little), and all systems are imperfect. We cannot necessarily assume that a data point which attempts to capture a particular concept is able to do so perfectly – especially in fields like medicine, where our most valuable observations aren’t based on static and easily measured concepts. A cashier can count the rolls of toilet paper you purchase, a bank teller can count the dollars and cents in a transaction, but even the best lab tech can’t count the number of white blood cells in a drop of blood.

    Generally speaking, the data points we use are at best highly correlated to the concepts they represent, and at worst, a set of random values. Perfection cannot be reached and bias is often impossible to mitigate, but if we can have consistent bias, we can still have useful information. We can capture directional trends and consistent results. I may not be willing to believe someone who says they took their prescribed statin 300 of the last 365 days, but if I can assume a consistent trend in bias, then that answer still has value to me (just not necessarily as an accurate measure of adherence).
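The point about consistent bias can be demonstrated in a few lines: if self-reports overstate adherence by a roughly constant amount, the reported values still preserve the ordering of patients, and so remain useful as a relative signal. The adherence figures below are fabricated for illustration:

```python
# Toy demonstration that a consistent (monotone) bias preserves ranking:
# self-reported adherence overstates the truth, but orders patients the same.

true_adherence = [0.95, 0.80, 0.60, 0.40]   # actual days taken / 365 (fabricated)
reported = [min(1.0, a + 0.15) for a in true_adherence]  # consistent +15-point bias

# Patients ranked by reported adherence match the ranking by true adherence.
rank_true = sorted(range(len(true_adherence)), key=lambda i: true_adherence[i])
rank_reported = sorted(range(len(reported)), key=lambda i: reported[i])
print(rank_true == rank_reported)  # True
```

The reported numbers are wrong as absolute measures of adherence, yet any model that only needs to know *who* is least adherent loses nothing.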

Modern computing resources are powerful and in many ways our data is plentiful. There is no reason not to explore every model we can, no matter how ridiculous or counter-intuitive it might seem to be at first. From this we might discover something new (or refute something old), or even create a new early warning system for heart attacks. Just as correlation doesn’t imply causation, we should also remember that an element of causation (especially as a part of a highly complex and not fully understood system) doesn’t necessarily give us high correlation.

Remember, a statistical or predictive model is a tool. We can use it in many ways, from detecting a signal amidst the noise to finding areas where we can effect a change for better health. Tools can be constructed in many ways, and two that seem similar may have drastically different uses. Understanding how a tool was made and what it was made for is how we come to use it properly and ultimately derive maximum value.


Do you have holes in your socks? Maybe you can help improve health care!

Sometimes I wear socks with holes in them. More often than I care to admit. Why? Because of a man named Eldon Richardson. Eldon was a Great Depression-era electrician – and my maternal grandfather. I hope you have memories like mine, listening to the stories of overcoming hardship with grit and determination. People who lived through the Great Depression thought differently. They were practical – unbelievably practical. They wore socks with holes in them because they focused on more important and more practical things.

Maybe health care can learn a thing or two from holey socks. We could think differently and act practically. What if we, as leaders in health care, took bold measures of practicality: hard-nosed, Great Depression style practicality? Health and health care would advance more quickly than ever before.

Health care is NOT health (according to Lauren Taylor, author of The American Health Care Paradox). Since health is 60 percent socioeconomic/environmental/behavioral and 20 percent genetic, but only 20 percent health care, it will obviously take more than doctors to improve our health. If the average American spends one hour per year with their primary doctor, but 240 hours per year in a store or online retail setting (Vaughn Kauffman, Principal, Health Industries Advisory Services, PwC), we could learn from retailers to help Americans develop behaviors that improve their health. Or consider that Americans check their smartphones over 100 times per day (Dr. Joseph Kvedar, Director, Center for Connected Health, Partners HealthCare) – what a great place to interject ideas to improve their health!

Sure, improving health and health care is a challenge, but an insignificant one compared to what grandpa Ed and millions more faced 80 years ago. If they did it, so can we. But it will take more than a new pair of socks! We’ll have to think differently and act practically!

How do we do that in the modern world? By using what’s available to us now to make progress just as our grandparents did – such as changing our thinking to be analytically driven, a proven approach across most other industries. That means using our data to discover insight, predicting the future based on the past, deploying the insights we gain to drive proactive action, and monitoring the results for the continual improvement of health care.


EHR systems should enable the triple aim, not prevent it

A recent news headline read, “Bipartisan committee wants government-subsidized electronic records systems scrutinized for ‘information blocking.’” *

The question before the US Senate Appropriations Committee is whether taxpayer-funded EHR software solutions are now preventing the unrestricted exchange of medical records between health care organizations. If this is in fact the case, it undermines the Affordable Care Act and is a considerable waste of taxpayers’ money… and that money is considerable. The US$27B authorized as financial incentives for providers that implement EHR systems has driven broad adoption. However, one of the main goals behind broad EHR adoption was to expose the health care data that historically has been buried in paper charts in filing cabinets.

Source: CDC/NCHS


The question before the Senate Appropriations Committee should be much broader. It's not just whether the EHR systems are “preventing or inhibiting” the unrestricted exchange of medical records. It's also whether these systems enable providers to readily store, access and mine the breadth and depth of all of the patient data generated within their own systems – lab values, medical device data, and unstructured data, for example. Additionally, to be clear, we are not talking about obtaining access to the EHR data through a multi-week or multi-month consulting engagement in which the EHR vendor extracts the data from the provider’s EHR system and then delivers that data to the provider for exploration. This is certainly feasible, and may be a lucrative business for the EHR vendors, but it undermines the intent of the EHR systems. We know what e-Patient Dave would say: “Give me my darn data!”

The US health care system needs unencumbered real-time access to all of the data in the EHR systems – and this includes date/time stamps and facility/clinician signatures on all elements of a patient record. This will enable insights to be mined from the data under the premise that the data is freely accessible, and it enables data to be funneled back into the EHR and appended to individual patient records.

The promise of the triple aim relies in part on the ability to leverage medical and other data to build a holistic picture of patients and identify which treatment approach is best for which patient at the lowest possible cost. However, if data is inaccessible, realizing the goal of the triple aim will be in jeopardy.

If you work in health care, make sure that your patients’ data is easily accessible to you, your colleagues, and other providers in the care continuum of your patients for any purpose associated with improving care, decreasing costs and improving your patients’ experience.

If you work outside health care in the US, call the bipartisan appropriations committee. These are your tax dollars at risk of not delivering on the goal of improved care at a lower total cost for all of us.

Hopefully, the Senate Appropriations Committee is coached on the fact that it is not just about data “exchange.” It is about much more, including:

  • Ensuring that the data within the EHRs is of robust quality and can be easily exposed for the purposes of combining with data outside the EHR;
  • Enabling the mining of combined data sets, both structured and unstructured, to capture insights about patient outcomes;
  • Surfacing insights as to how clinical variability impacts patients; and
  • Facilitating the appending of data to patient records such that externally generated data (e.g. patient scores) can be embedded back into the patient workflow.

As the 2009 American Recovery and Reinvestment Act authorizes a net $27 billion in spending to support EHR adoption through 2017, lawmakers must fund the program each year. These are our tax dollars at work – or not at work – due to data being potentially locked in the vaults of EHR software solutions. We need to see the EHR systems delivering on their true potential to enable improved care and lower costs, versus being the most sophisticated and expensive filing cabinets in history.

* USF Health, Morsani College of Medicine, University of South Florida, “Senate Committee Calls for EHR Interoperability Investigation,” Aug. 5, 2014.


Health analytics - Rapidly gaining ground

Having long ago witnessed the power of analytics to improve performance, efficiency, cost and quality of online banking and investment services, I have been an advocate and evangelist of its power to do the same in health care. That’s one reason why I’m excited about the recent tidal wave of news, articles, blogs, announcements and public dialogue about the value of analytics in health care.

The recent Health Affairs article captured my attention partly because it was authored exclusively by industry clinicians and academicians, not by technology vendors. The article, Big Data In Health Care: Using Analytics To Identify And Manage High-Risk And High-Cost Patients^ acknowledges “unprecedented opportunities” to use big data to reduce the costs of health care. Perhaps most importantly, the authors identify six specific opportunities where analytics could and should be used to reduce cost:

  1. Identify early – and proactively manage – high-cost patients.
  2. Tailor interventions to reduce readmissions.
  3. Estimate risk to improve triage.
  4. Filter out noise to detect valid signals of decompensation.
  5. Predict severity of illness to prevent adverse events.
  6. Optimize treatment for diseases affecting multiple organ systems. 
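To make item 3 (estimate risk to improve triage) concrete, here is a minimal sketch of the widely used LACE index for 30-day readmission risk, one of the simplest analytically derived scores a provider can compute at discharge. The point values follow the published LACE scheme; treat this as an illustration of the technique, not clinical guidance.

```python
def lace_score(length_of_stay_days, acute_admission, charlson_index, ed_visits_6mo):
    """Compute the LACE index for 30-day readmission risk (higher = riskier)."""
    # L: length of stay, mapped to points per the published scheme
    if length_of_stay_days < 1:
        los = 0
    elif length_of_stay_days <= 3:
        los = length_of_stay_days  # 1, 2 or 3 points
    elif length_of_stay_days <= 6:
        los = 4
    elif length_of_stay_days <= 13:
        los = 5
    else:
        los = 7
    # A: acuity – emergent/acute admissions score 3 points
    a = 3 if acute_admission else 0
    # C: Charlson comorbidity index, with scores of 4+ capped at 5 points
    c = charlson_index if charlson_index < 4 else 5
    # E: emergency department visits in the prior six months, capped at 4
    e = min(ed_visits_6mo, 4)
    return los + a + c + e

# A 5-day acute admission with a Charlson index of 2 and one recent ED visit
print(lace_score(5, True, 2, 1))  # → 10, typically flagged as high risk
```

A score like this is exactly the kind of "static risk score" that becomes far more valuable when surfaced in the clinician's workflow at the moment of triage rather than in a retrospective report.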

It's encouraging to see health care executives acknowledging the need for analytical competency in their organizations; to see the US Congress acknowledging the need for data transparency and interoperability; to hear clinicians asking for analytically derived decision support tools; to watch prestigious academic organizations expanding advanced degree programs in health informatics and biostatistics; and to hear health IT organizations demanding interoperability of data between EMRs (electronic medical records) and their other systems.

I'm delighted by the article above, and by the early wins and impressive results generated in these six areas by friends and colleagues who are using advanced analytics to surface insights in their organizations across the globe. For example, our friends at the UNC School of Medicine are exploring the utility of big data for predicting exacerbation in diabetic patients, an innovation with the potential to simultaneously tackle the items in the list above: 1 (high-cost patients), 2 (tailor interventions) and 5 (predict to prevent).

Another example is the work being done at the Department of Orthopedic Surgery at Denmark's Lillebælt Hospital to use text analytics in automated clinical audits to detect and correct errors. The Lillebælt innovation demonstrates the efficiency gains made possible only through automation and the power to prevent patient injury at a scale which would otherwise be cost-prohibitive.
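The Lillebælt approach pairs free-text notes with structured records to catch discrepancies automatically. The sketch below is a deliberately toy version of that idea: the audit rules, note formats and procedure codes here are invented for illustration, not drawn from the Lillebælt system.

```python
# Toy automated audit: flag clinical notes that mention a procedure
# whose corresponding structured code was never recorded.
# The term-to-code mapping below is hypothetical.
AUDIT_RULES = {
    "appendectomy": "44950",
    "knee arthroscopy": "29870",
}

def audit_note(note_text, coded_procedures):
    """Return findings: procedures documented in free text but missing
    from the patient's structured procedure codes."""
    findings = []
    lowered = note_text.lower()
    for term, expected_code in AUDIT_RULES.items():
        if term in lowered and expected_code not in coded_procedures:
            findings.append(
                f"'{term}' documented in note but code {expected_code} not recorded"
            )
    return findings

note = "Patient underwent knee arthroscopy without complication."
print(audit_note(note, coded_procedures={"44950"}))
```

Real systems layer statistical text mining on top of rules like these, but even this skeleton shows why automation scales where manual chart review cannot: the same check runs across every record at negligible marginal cost.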

Perhaps the most exciting news of late is the announcement that Dignity Health is partnering with SAS to build a cloud-based big data analytics platform to enable value-based healthcare. In my opinion, this announcement represents a systemwide commitment to adopting health analytics as a core competency and puts Dignity Health on the road to realizing value in all six of the areas mentioned by Bates, and many more too numerous to list.

These are leading indicators that health care is modernizing and, I’m confident, will ultimately showcase the power of analytics to improve health care. The bottom line: Advanced health analytics is gaining ground in the industry and is picking up speed as more and more providers realize The Power to Know®.


^ Bates, D.W., Saria, S., Ohno-Machado, L., Shah, A., and Escobar, G. (July 2014). "Big Data In Health Care: Using Analytics To Identify And Manage High-Risk And High-Cost Patients," Health Affairs, 33, no.7.


The value of big data – Part 3: “Big something”

I seem to write quite a few blogs downplaying the idea of big data; but to be quite honest, buzzwords like that tend to annoy me. They take attention away from the underlying problems and the more we use these terms, the less real meaning they seem to have. Saying “I have big data” seems to be a reflex, much like that embarrassing moment when the person behind the ticket counter tells you to enjoy the movie and you say “Thanks, you too”. You aren’t quite thinking about what you are saying but you mean well when you do it.

While the term big data is traditionally defined in a relative manner (allowing everyone to share in the joy of having it), I think it should be reserved for specific things. If, for instance, you have big data because you bought far more data than you can consume simply because the budget was there, you have a “big spender.” If you have big data but you still don’t know how to analyze and utilize the data you’ve had for the last five years, then you have a “big dreamer.” And finally, if you have big data and you don’t have any idea how you got there, well, then you simply have a “big problem.”

Big data should be reserved for when otherwise carefully curated and managed sources of data and analytics have some kind of fundamental paradigm shift which changes their volume, variety, velocity and/or value. When your previously well-managed EMR analytics group is now able to use text mining and can look at 10 years of unstructured data, you have big data. When your data provider is now able to give you daily feeds of data instead of quarterly, then you have big data. When your genetic assay test drops from $400 to $4 and you can collect 100x as much information as you could before, you have big data.

This doesn’t mean that the people with a “big spender,” “big dreamer,” or “big problem” don’t have real issues that advances in analytics and technology may be able to resolve. But rather, the problems they need to solve are of a different nature. While it may seem like we all are swimming in the same big data pool, it’s important to keep in mind where you dove in from before you start trying to swim.


Desiderata for enterprise health analytics in the 21st century

With apologies and acknowledgments to Dr. James Cimino, whose landmark paper on controlled medical terminologies still sets a challenging bar for vocabulary developers, standards organizations and vendors, I humbly propose a set of new desiderata for analytic systems in health care. These desiderata are, by definition, a list of highly desirable attributes that organizations should consider as a whole as they lay out their health analytics strategy – rather than adopting a piecemeal approach. They form the foundation for the collaboration that we at SAS have underway with Dignity Health.

The problem with today’s business intelligence infrastructure is that it was never conceived of as a true enterprise analytics platform, and it certainly wasn’t architected for the big data needs of today or tomorrow. Many – probably most – health care delivery organizations have allowed their analytic infrastructure to evolve in what a charitable person might describe as controlled anarchy. Steady demand for executive dashboards led to IT investment in home-grown, centralized, monolithic, relational-database-centric enterprise data warehouses (EDWs), with one or more online analytical processing-type systems (such as Crystal Reports, Cognos or BusinessObjects) grafted on top to create the end-user-facing reports. Over time, departmental reporting systems have continued to grow up like weeds, and data integration and data quality have become a mini-village that can never keep up with end-user demands. Something has to change. We’re working with Dignity Health to showcase what an advanced enterprise analytics architecture looks like and the transformations it can enable.

Here are the desiderata that you should consider as you develop your analytic strategy:

  1. Define your analytic core platform and standardize. As organizations mature, they begin to standardize on the suite of enterprise applications they will use. This helps to control processes and reduces the complexity and ambiguity associated with having multiple systems of record. As with other enterprise applications such as the electronic health record (EHR), you need to define which processes require high levels of centralized control and which can be configured locally. For the EHR, it’s important to have a single architecture for enterprise orders management, rules, results reporting and documentation engines, with support for local adaptability. Similarly with enterprise analytics, it’s important to have a single architecture for data integration, data quality, data storage, enterprise dashboards and report generation – as well as forecasting, predictive modeling, machine learning and optimization.
  2. Wrap your EDW with Hadoop. We’re entering an era where it’s easier to store everything than decide which data to throw away. Hadoop is an example of a technology that anticipates and enables this new era of data abundance. Use it as a staging area and ensure that your data quality and data transformation strategy incorporates and leverages Hadoop as a highly cost-effective storage and massively scalable query environment.
  3. Assume mobile and web as primary interaction. Although a small number of folks enjoy being glued to their computer, most don’t. Plan for this by making sure that your enterprise analytic tools are web-based and can be used from anywhere on any device that supports a web browser.
  4. Develop purpose-specific analytic marts. You don’t need all the data all the time. Pick the data you need for specific use cases and pull it into optimized analytic marts. Refresh the marts automatically based on rules, and apply any remaining transformation, cleansing and data augmentation routines on the way inbound to the mart.
  5. Leverage cloud for storage and Analytics as a Service (AaaS). Cloud-based analytic platforms will become more and more pervasive due to the price/performance advantage. There’s a reason that other industries are flocking to cloud-based enterprise storage and computing capacity, and the same dynamics hold true in health care. If your strategy doesn’t include a cloud-based component, you’re going to pay too much and be forced to innovate at a very slow pace.
  6. Adopt emerging standards for data integration. Analytic insights are moving away from purely retrospective dashboards and moving to real-time notification and alerting. Getting data to your analytic engine in a timely fashion becomes essential; therefore, look to emerging standards like FHIR, SPARQL and SMART as ways to provide two-way integration of your analytic engine with workflow-based applications.
  7. Establish a knowledge management architecture. Over time, your enterprise analytic architecture will become full of rules, reports, simulations and predictive models. These all need to be curated in a managed fashion to allow you to inventory and track the lifecycle of your knowledge assets. Ideally, you should be able to include other knowledge assets (such as order sets, rules and documentation templates), as well as your analytic assets.
  8. Support decentralization and democratization. Although you’ll want to control certain aspects of enterprise analytics through some form of Center of Excellence, give regional and point-of-service teams controlled access so they can innovate at the periphery without filing change requests with a centralized team. Centralized models can never scale to meet demand, and local teams need guardrails within which to operate. Make sure those guardrails are defined and managed tightly.
  9. Create a social layer. Analytics aren’t static reports any more. The expectation from your users is that they can interact, comment and share the insights that they develop and that are provided to them. Folks expect a two-way communication with report and predictive model creators and they don’t want to wait to schedule a meeting to discuss it. Overlay a portal layer that encourages and anticipates a community of learning.
  10. Make it easily actionable. If analytics are just static reports, drill-down reports or static risk scores, users will start to ignore them. Analytic insights should be thought of as decision support, and the well-learned rules from EHRs apply to analytics too. Provide insights in the context of the user’s workflow, make it easy to understand what is being communicated, and make it easily actionable – allow users to take recommended actions rather than trying to guess what they might need to do next.
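Desideratum 6 argues for standards like FHIR as the path from retrospective dashboards to real-time alerting. As a minimal sketch of the consuming side, the snippet below inspects a FHIR R4 Observation resource and decides whether to raise an alert. The LOINC code 8310-5 really is "Body temperature," but the resource instance and the fever threshold are hand-built examples, and a production system would receive resources via a FHIR subscription or REST feed rather than an inline dict.

```python
# Hand-built example of a FHIR R4 Observation resource (body temperature).
observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "8310-5",
                         "display": "Body temperature"}]},
    "valueQuantity": {"value": 39.4, "unit": "Cel"},
}

FEVER_THRESHOLD_C = 38.0  # illustrative alert threshold, not a clinical standard

def temperature_alert(obs):
    """Return True if a final body-temperature Observation exceeds the threshold."""
    if obs.get("resourceType") != "Observation" or obs.get("status") != "final":
        return False
    codes = {c.get("code") for c in obs.get("code", {}).get("coding", [])}
    if "8310-5" not in codes:  # LOINC: Body temperature
        return False
    return obs.get("valueQuantity", {}).get("value", 0) > FEVER_THRESHOLD_C

print(temperature_alert(observation))  # → True: a 39.4 C reading triggers an alert
```

The point of the standard is that this consumer never needs to know which EHR produced the resource – any FHIR-conformant feed can drive the same alerting logic.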

Thanks for reading, and please let me know what you think. Do these desiderata resonate with you? Are we missing anything essential? Or is this a reasonable baseline for organizations to get started?

We’ll be sure to update you as our collaboration with Dignity Health progresses.
