What an IT project manager should know about analytics projects


Imagine this situation: You go to your doctor and tell her that "something hurts in the lower chest area." Now, based on this short description of your pain, you expect her to come up with a precise diagnosis and a specific therapy suggestion, including treatment time and associated cost.

Sound realistic? Would you even trust a doctor who would come up with such a complete diagnosis with so little information?

Well, this is very similar to the situation that an analytic consultant (the "doctor") often faces when talking to someone responsible for an IT project (the "patient").

If you're an IT project manager, you are typically expected to:

  • Define a precise project timeline.
  • Specify the project's objective and develop clear KPIs that define project success.
  • Include milestones at several time points.
  • Plan capacity and allocate resources accordingly.
  • Staff the project team with people of different roles and skills.
  • Break down the project scope into separate, well-defined tasks that can be assigned to individual project team members.

In fact, certain project methodologies like Scrum help you do all this in a very efficient and flexible way and have proven quite successful for agile software development. If your organization has completed IT projects in the past - like setting up a data warehouse or a business intelligence reporting application - and you've recently invested in analytics technology, you might plan to handle analytics projects in the same way, especially when doing a first proof-of-concept or pilot study.

Certainly, there's no reason why IT projects involving analytics should not benefit from project management. However, there are a few things that make analytics projects different. If you are an IT project manager with no prior exposure to analytics, you may have to watch out for these. And you should be aware that an analyst might not be able to answer related questions easily.

What's so special about analytics projects?

Same same but different: four recommendations when it comes to analytics projects

I see four main aspects that make analytics projects differ from other types of IT projects, and I offer these four tips as a result.

1) Allow time for building domain expertise
Translating a particular business question into something that an analytic consultant can use for a modeling approach will require not only involvement from IT, but also from the particular business function that's sponsoring the project.

Unless this very same type of project has been done many times before, expect and allow for some time for the analytic consultants to acquire the necessary domain expertise to tackle the problem. This will involve exchanges with people from the business units, probably more than one single initial kick-off workshop, and a final presentation of the end results.

Plan for several intermediate feedback sessions, as it is unlikely that any analytic consultant - even one with domain expertise - will have a complete picture of a problem after one single exchange with the business.

Also, allocate some time for research activities. Depending on previous project experience, an analytic consultant will need time to dig into the specific industry and business topics. When compared to designing a reporting application or migrating a system to a newer release, for example, analytics frequently requires a much deeper understanding of the business issue at hand. It might also require getting familiar with new modeling techniques that are suitable to solve a particular problem. In a sense, this makes an analytics project similar to academic science.

2) Limit the scope and try to be realistic
Many times, there are very high expectations of the added value of analytics. This leads project sponsors to phrase the project's goal in very high-level terms such as "Find out the most important drivers of my revenue." or "What factors influence customer churn?" Then they expect some hyper-intelligent data mining algorithm to provide a straight answer.

It's not that easy. Typically, the business issue to solve has to be broken down into different questions that are easier to handle and can be tackled by a particular form of algorithm. A better way to frame a business question would go like this: "I want to understand which particular segment of my existing customer base had a high rate of churn during the last 3 months, and I want to use this information to predict who is likely to leave in the future."

This more defined request gives the analytic consultant a better idea of what kind of technique (predictive modeling) to use, even if there are still many things left to be put into concrete terms, including the question of how to define a model's target variable, the particular time period for building a data set for model training, what type of data attributes to use to describe the customer, and so on.
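To make the open points above a bit more tangible, here is a minimal Python sketch of how a target variable and a training time period might be pinned down for a churn model. Everything in it is an assumption for illustration only: the function name `build_training_row`, the two customer attributes, the snapshot date, and the 90-day outcome window are hypothetical choices that would have to be agreed on with the business.

```python
from datetime import date, timedelta


def build_training_row(activity_dates, snapshot, outcome_days=90):
    """Return (features, target) for one customer, or None.

    Hypothetical definitions, to be agreed with the business:
    - attributes are computed only from activity up to the snapshot date
    - target = 1 (churned) if there is no activity in the outcome window
      that starts right after the snapshot date
    """
    history = [d for d in activity_dates if d <= snapshot]
    if not history:
        return None  # not an existing customer at the snapshot date

    outcome_end = snapshot + timedelta(days=outcome_days)
    future = [d for d in activity_dates if snapshot < d <= outcome_end]

    features = {
        "n_events": len(history),
        "days_since_last": (snapshot - max(history)).days,
    }
    target = 0 if future else 1
    return features, target


snapshot = date(2023, 3, 1)
stayed = build_training_row(
    [date(2023, 1, 10), date(2023, 2, 20), date(2023, 4, 15)], snapshot
)
left = build_training_row([date(2023, 1, 10), date(2023, 2, 20)], snapshot)
```

Changing the outcome window or the snapshot date changes who counts as a churner, which is exactly why these definitions need input from the business side and cannot be settled by the analyst alone.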

Formulating the requirements for the necessary data and checking for their availability will also be easier with a more precise question or request. In contrast, just telling the consultant, "I have this data warehouse and I want you to find any interesting patterns in it," would result in a very cumbersome exercise.

Finally, be realistic as to what can be accomplished within the expected time frame of the project, and manage the expectations of the project sponsors. Throwing the toughest challenge of most strategic importance at the analyst might not be the easiest path to success. This predictive analytics blog post from MIT Sloan Management Review has a nice view on this issue.

3) Be aware of the iterative nature of analytic modeling
When it comes to analytic modeling, there are many dependencies that make it hard for an analytic consultant to suggest how long it will take to build a model. First of all, a model will never be final or optimal in the sense that it cannot be improved upon. Keep in mind the quote, "All models are wrong, but some are useful," attributed to the famous statistician George E. P. Box.

Models are abstractions from reality, and the resulting model quality - in terms of accuracy, robustness, and so on - will always be a compromise. So, instead of asking how long it will take to build the model, allocate a certain time budget to develop such a model and accept the resulting quality as the best that you can do within the given time. An analytic consultant might need to go through repeated steps of exploring the data, finding anomalies, deciding on variable transformations, estimating model coefficients and looking at the results (such as fit statistics or residuals in a regression model) to decide if the model is good enough.
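The loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a single input variable, an ordinary least squares line fitted by hand, R² as the only fit statistic, and a hypothetical list of three candidate transformations. A real project would iterate over far more choices and diagnostics.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one predictor.

    Returns (slope, intercept, r_squared, residuals).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot, residuals


# Hypothetical candidate transformations of the input variable
# (log and sqrt assume strictly positive inputs).
CANDIDATES = {"identity": lambda x: x, "log": math.log, "sqrt": math.sqrt}


def compare_transformations(xs, ys):
    """One modeling iteration per candidate: transform, fit, record R^2."""
    scores = {}
    for name, transform in CANDIDATES.items():
        _, _, r_squared, _ = fit_line([transform(x) for x in xs], ys)
        scores[name] = r_squared
    best = max(scores, key=scores.get)
    return best, scores


xs = [1, 2, 4, 8, 16]
ys = [2 * math.log(x) + 1 for x in xs]  # synthetic data for the example
best, scores = compare_transformations(xs, ys)
```

Each pass through the loop is one small "explore, transform, estimate, inspect" cycle. How many such cycles fit into the project is a question of the time budget, not of the model being finished.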

Again, be realistic. For example, you might be able to have a customer segmentation (cluster analysis) or churn prediction model developed in just a few days, but I doubt that the results will be very robust. So, it's not a question of whether you need three or 30 days, but more about what level of model accuracy you are willing to compromise on. Finally, model accuracy will be affected by data availability and data quality, which cannot be known prior to the exploratory analysis phase. This brings me to the last point.

4) Allow time to fix data quality issues and other data-related issues
The requirements for any decent model are quite high when it comes to availability and quality of data. In an analytic project, it is not uncommon to find anomalies and data errors when you start exploring the data. Many times, such issues can be discussed and resolved together with input from business and/or somebody responsible for managing the data - but you have to allow for time and feedback sessions on this topic. In other words, don't expect data to be ready for modeling after an initial data management phase and never to be touched again thereafter. Allow for additional data management resources to fix any pending issues after a first exploratory phase.
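As a small illustration of what a first exploratory check might look like, the following Python sketch counts missing and implausible values in a single column. The function name `profile_column`, the example values, and the valid range of 0 to 120 are made up for this example; in practice, the plausible range is exactly the kind of information that has to come from the business or the data owner.

```python
def profile_column(values, valid_range=None):
    """Summarize missing and out-of-range entries for one numeric column."""
    missing = sum(1 for v in values if v is None)
    present = [v for v in values if v is not None]

    out_of_range = 0
    if valid_range is not None:
        low, high = valid_range
        out_of_range = sum(1 for v in present if not low <= v <= high)

    return {"n": len(values), "missing": missing, "out_of_range": out_of_range}


# Made-up customer ages containing typical problems:
# missing entries, a placeholder value (-1) and an implausible value (230).
ages = [34, None, 51, -1, 27, 230, None]
report = profile_column(ages, valid_range=(0, 120))
```

Whether -1 is a data error or a deliberate code for "unknown" is not something the data itself can answer, which is why these findings usually lead to another feedback session.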

Also, when you assign people with specific skills to given tasks, be aware that terms like data management or transformations can mean different things for different people. For somebody working on a data warehouse project, this might involve things like building ETL (Extract, Transform, Load) jobs to repeatedly populate a given set of tables in a database. These types of activities can be planned in advance. For an analytic person, on the other hand, the type of data aggregation or variable transformation to use will be ad-hoc decisions, depending on the contents of the data, and, again, the process might require iterative steps.

Since these tasks are so specialized, they cannot be transferred easily to an ETL programmer, as that person wouldn't know, for example, when it's better to do a variance stabilizing transformation (like a logarithmic transformation) vs. binning of a continuous variable. Also, standard ETL tools and SQL are typically not best suited to help with this. So, expect that there are some data management tasks that can only be done by the analytic consultant and don't try to allocate the necessary resources to an ETL specialist just because that person is easier to come by.
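For readers who have not seen these two options side by side, here is a minimal Python sketch that uses only the standard library. The helper names `log_transform` and `quantile_bins`, the choice of four bins, and the example values are assumptions for illustration. Which option is appropriate depends on the data and on the model, and that is the judgment call the analyst has to make.

```python
import math
from bisect import bisect_right
from statistics import quantiles


def log_transform(values):
    """Logarithmic transformation; assumes strictly positive values."""
    return [math.log(v) for v in values]


def quantile_bins(values, n_bins=4):
    """Replace each value by the index of its quantile bin (0 .. n_bins - 1)."""
    cut_points = quantiles(values, n=n_bins)  # n_bins - 1 cut points
    return [bisect_right(cut_points, v) for v in values]


# Made-up, heavily skewed values (for example, revenue per customer).
revenue = [1, 10, 100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000]

logged = log_transform(revenue)  # keeps the ordering and relative distances
binned = quantile_bins(revenue)  # keeps only a coarse rank
```

The logarithm compresses the scale but keeps every value distinct, while binning throws away the differences within a bin. Neither is right in general, and the decision is typically revisited once the first model results are in.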

If, after reading this, you get the impression that analytics projects are more or less unmanageable, that's certainly not the case - or my intention. Quite the contrary, I have found that analytics projects need project management more than anything else. So, don't be scared. Just be aware that we (analytic consultants) are a little bit different ...


About Author

Stefan Ahrens

Sr Solutions Architect

Stefan Ahrens studied economics at the Westfälische Wilhelms-Universität Münster, specializing in statistics and econometrics, and has worked since November 2003 as a Solution Architect in the Competence Center Analytics at SAS Institute Deutschland. His current focus areas are statistical data analysis, data mining, forecasting, and fraud detection for various industries. Before joining SAS Institute, he worked at StatSoft, a statistical software vendor, and at Research International, a market research company, in both cases as a statistician and analytic consultant.
