Analytics & electronic medical records can support breakthroughs in university medical research

I was recently interviewed for an article about the use of analytics in medical research at colleges and universities. It’s not surprising this topic is gaining attention.  As a result of Meaningful Use, the ongoing digitization of medical records has created unprecedented opportunities for university researchers to make breakthroughs in preventative medicine, drug efficacy and safety, and better patient outcomes.

Research at institutions of higher education is critical to advances in medicine, and SAS has been a valuable tool in those efforts for decades. For example, Duke University’s Cancer Institute is studying patient side effects from colon cancer treatment.  Using mobile device-enabled software to enter data, physicians can ask colon cancer patients about the side effects and symptoms they experience and record their responses. This direct patient interaction provides robust data, which can supplement the information from large patient data registries. With analytics from SAS, they can analyze all their data to better manage and anticipate adverse patient effects caused by different cancer treatments.

Often, academic researchers are not aware they have access to SAS through a university license. Many schools have a large master license that gives access to a large bundle of software. Interested researchers should check with their IT department.

As more and more medical data is digitized, more challenging questions can be asked.  Clinicians and researchers will want to go beyond summary statistics and traditional analytics and use more sophisticated analyses.  As the data volume explodes and the complexity of analyses grows, new methods and architectures will need to be applied to solve complex problems in a timely manner.  SAS is leading the industry with technology for High Performance Analytics (HPA), including cost-effective ways to implement that technology.

HPA is essential for the most data-intensive analyses. For example, true drug safety surveillance involves calculating signals on many drugs and on many adverse events or categories of events. If you need to perform a separate analysis for 100 different drugs (plus combinations of brands and generics) against 100 different adverse event terms (plus high-level classifications), each with multiple covariates, you would need to run the analysis overnight to arrive at reliable, complete results. HPA can reduce that to minutes through approaches such as parallel processing, in-database analytics and in-memory analytics.
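
To make the scale of that drug-by-event grid concrete, here is a minimal sketch in Python rather than SAS. It assumes the signal is a proportional reporting ratio (PRR), one commonly used disproportionality measure; the article does not say which statistic is meant, so treat that choice, the function name `prr_table`, and the sample reports as illustrative assumptions. Covariate adjustment is left out entirely.

```python
from collections import Counter
from itertools import product


def prr_table(reports):
    """Compute a PRR for every (drug, event) pair seen in the reports.

    `reports` is an iterable of (drug, event) tuples, one per adverse
    event report. This is a hypothetical helper, not a SAS procedure.
    """
    reports = list(reports)
    pair_counts = Counter(reports)
    drug_counts = Counter(drug for drug, _ in reports)
    event_counts = Counter(event for _, event in reports)
    total = len(reports)

    results = {}
    # One small 2x2 table per pair: 100 drugs x 100 events = 10,000 tables.
    for drug, event in product(drug_counts, event_counts):
        a = pair_counts[(drug, event)]   # this drug, this event
        b = drug_counts[drug] - a        # this drug, other events
        c = event_counts[event] - a      # other drugs, this event
        d = total - a - b - c            # other drugs, other events
        if a == 0 or c == 0:
            continue  # ratio is zero or undefined; skipped in this sketch
        results[(drug, event)] = (a / (a + b)) / (c / (c + d))
    return results


# Made-up sample data, for illustration only.
sample = (
    [("DrugA", "rash")] * 8 + [("DrugA", "nausea")] * 2
    + [("DrugB", "rash")] * 2 + [("DrugB", "nausea")] * 8
)
for pair, prr in sorted(prr_table(sample).items()):
    print(pair, round(prr, 2))
```

Each pair's calculation is independent of the others, which is why this kind of workload lends itself to the parallel processing mentioned above.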

In addition to the challenges of performance, the growing access to electronic healthcare data reveals a huge data quality challenge. Terminologies are commonly not standardized, and unstructured data (text) appears everywhere and remains a largely untapped, valuable resource. For instance, pathology information typed into free-text fields can be analyzed using text mining and natural language processing to surface more data about the pathologist's observations about a patient. By combining extracted pathology concepts with EMR data, we can have an even more comprehensive view of patterns in disease, treatment and prevention.

Ultimately, we want doctors to know what treatment is likely to have the most benefit, and fewest adverse effects, based on a patient’s symptoms, demographic information, medications, etc. The data can reveal what has been most effective for similar patients, and doctors can have that information at their fingertips. Instead of just relying on memory or published best practices, doctors could access their own data to make more knowledgeable treatment decisions. This would help eliminate more unnecessary treatments, which would reduce not just risk but costs as well.

The growth of EMR and other medical data also benefits pharmaceutical companies. Clinical trials data is very thorough, but cannot approach the scope and volume of observational data in the field. Doctor and hospital reports could reveal that a specific demographic is experiencing adverse effects from a medication, or that a drug is less effective with a certain population.

EMR and unstructured data, analytics and HPA, and dedicated researchers in universities and the private sector hold the key to a new age of preventative medicine and drug efficacy. I’ve presented a couple of examples, but I would love to hear from readers about the possibilities.


  1. Mark Morreale
    Posted June 14, 2012 at 3:21 pm | Permalink

    Eric, I agree that these will be good tools, and I would like to see them used for randomized studies. This would propel these tools to the highest level and maybe eliminate some of the expensive prospective studies. People look at me funny when I suggest this, but I believe it can be done.

  2. charlie ward
    Posted June 15, 2012 at 11:12 am | Permalink

    First of all, thank you for the article, Eric! I found it while searching for physical therapy documentation software. I think that EMR technology is the wave of the future in a technological society. It streamlines everything and makes everything so much easier. It also allows patients to check their records, which I think is great. While there may be some risks, I think overall it's worth it. Thanks again for the read!

  3. Douglas Dame
    Posted June 15, 2012 at 1:58 pm | Permalink

    It should be noted that most research universities and academic medical centers take HIPAA protections VERY seriously. Wide-scale rummaging through patient databases, searching for valuable nuggets, in the way this article suggests, is not allowable under federal privacy regulations.

    Research touching identifiable patient data must be approved by an Institutional [Research] Review Board (IRB) following guidelines for research on human subjects. The IRB is required to carefully balance the possible benefit of the proposed research against the risks to the subjects. Any access to patient data granted by the IRB will specify exactly what kinds of data can be seen, under a "minimum necessary" standard that considers the relevance of requested data fields and the number of patients to be included; exactly who is authorized to see any identifiable data; how long they can use it; and what will happen to any identifiable data extracts when the researchers are done. (And then typically someone other than the researchers themselves will actually extract the approved data and hand it to the researchers.)

    "Free rummaging" will be possible only when the large, production Meaningful Use patient care databases are cloned into a de-identified environment (and think how hard it will be to safely scrub all text documents to 99.999999999% clean of any identifiable patient details), or when query tools are developed that are smart about recognizing protected data elements AND can create throw-away surrogate keys on the fly, shift dates by random amounts, etc. People are definitely working on the first, and probably the second as well.

  4. Eric Brinsfield
    Posted June 19, 2012 at 11:19 am | Permalink

    Thanks for all of the comments. Douglas is, of course, absolutely right. We have a huge responsibility to patient privacy. Whether the data is de-identified or consented, the IRB will make the final call on who gains access to the data and what data is authorized.

    On another note, I just noticed my slip of the keyboard (twice). It is preventive medicine, not preventative.

  5. Posted December 14, 2012 at 11:02 pm | Permalink

    Dear Eric

    I read your article "Automated Drug Safety Signal Detection with Guided Analysis" as I am researching automated vaccine safety signal detection and prediction. Currently, I have 5 years of data on adverse event reporting following immunisation in Victoria, Australia. I am still researching what additional data is available to add value to it.

    I have been using Base SAS (now version 9.3) for the past 7 years, and I recently attended a group presentation of different SAS products.

    I would like your views and recommendations on which SAS products to choose to achieve this.

    I look forward to hearing from you.


