Data visualization can revolutionize the way that statisticians accomplish their modeling work, especially in the early phases of an analytics project. In this post, we'll look at how data visualization can improve the initial stages of the analytical lifecycle – problem identification and data preparation – and how SAS Visual Analytics plays a role.
During the problem identification phase, the business problem is formulated or defined in analytical terms that can be addressed with operational data. After considerable effort, it's not unusual for the modeler to find out that the available data does not support a solution to the given problem. In other words, when an analytics project begins, it's often unclear whether there is enough data available to support the predictive analysis. This often requires statisticians to “go back to the drawing board” to determine if sufficient and appropriate data can come from the Enterprise Data Warehouse (EDW).
Managers typically assume that all the data is in the EDW and ready for use. While this may be true for traditional reports – such as standard, ad-hoc, alarm and pivot reports – advanced analytics consumes data differently, and often the needed data has not been properly staged in the EDW. Determining data suitability requires rapid exploration of large amounts of data to judge whether the problem can be solved with the data at hand. Tools that query and visualize massive amounts of data quickly can improve the process.
Without data visualization, analytic staffers don't have an easy way to visualize and query data. Much effort is spent on passing data requests to IT partners. In addition to the extra cycles and resources spent, traditional attempts at data visualization can cause bottlenecks, especially in the case of big data. Because answering complex questions with big data is an iterative process, statisticians hit “dead ends” on a regular basis, so the initial data request may not be exactly what is needed. You can imagine the frustration and wasted effort that follow from trying to locate the right data.
Fortunately, SAS Visual Analytics gives the statistician more independence by providing access to ALL of the data, so there is no need to go back to IT incrementally for new and more data. This results in a more agile and less complex way to initiate an analytic project.
Once the statistician confirms that the data has merit for solving the problem, the next task is addressing the analytical nature of the problem. At this point, the researcher may propose to solve the problem with segmentation, predictive models or forecasts.
Regardless of the statistical strategy, it is likely that some data transformation will be needed. Specific statistical techniques require special types of data, or data that is organized in a particular manner. For example, time series data needs to have time interval uniformity and often needs time or business aggregations, as well as adequately defined targets.
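To make the time series requirement concrete, here is a minimal Python sketch of that kind of preparation: irregular transaction records are summed per day and missing days are filled in so the intervals are uniform. The function name `to_daily_series`, the sample records, and the choice to fill gaps with zero are all assumptions made for illustration. They are not part of SAS Visual Analytics, and a zero fill is only sensible for measures such as sales totals where "no records" really means "nothing happened."

```python
from collections import defaultdict
from datetime import date, timedelta


def to_daily_series(records):
    """Aggregate (day, amount) records into a gap-free daily series."""
    if not records:
        return []
    totals = defaultdict(float)
    for day, amount in records:
        totals[day] += amount  # time aggregation: sum amounts per day
    start, end = min(totals), max(totals)
    series = []
    current = start
    while current <= end:
        # Days with no records get an explicit 0.0 so intervals stay uniform.
        series.append((current, totals.get(current, 0.0)))
        current += timedelta(days=1)
    return series


# Hypothetical transactions: two on 1 Jan, none on 3 Jan.
records = [
    (date(2024, 1, 1), 120.0),
    (date(2024, 1, 1), 30.0),
    (date(2024, 1, 2), 75.0),
    (date(2024, 1, 4), 50.0),
]
daily = to_daily_series(records)
for day, total in daily:
    print(day.isoformat(), total)
```

The same pattern extends to business aggregations (by store, by product line) by adding the grouping field to the dictionary key.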
The job of the statistician is to find signals from a “sea of data” that help determine an outcome with a reasonable level of certainty. These signals are buried in the data, and they need to be uncovered, sometimes through simple correlations from the raw data, but more often than not from ingenious transformations.
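As a small illustration of why transformations matter, the Python sketch below compares the Pearson correlation of a raw input with that of its log-transformed version. The data values are made up for the example, and the `pearson` helper is written out by hand to keep the sketch self-contained. It shows one simple transformation, not a recommended modeling workflow.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: the outcome grows with the order of magnitude
# of the input rather than with the input itself.
spend = [1, 10, 100, 1000, 10000]
outcome = [2.0, 4.1, 5.9, 8.2, 9.9]

r_raw = pearson(spend, outcome)
r_log = pearson([math.log10(x) for x in spend], outcome)

print(f"raw correlation: {r_raw:.3f}")
print(f"log correlation: {r_log:.3f}")
```

On this made-up data the relationship looks only moderate on the raw scale and nearly linear after the log transformation, which is the kind of buried signal described above.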
Again, the statistician needs to try several paths, some of which may lead to dead ends. So the quicker the data definition task is accomplished, the sooner the statistician can build the proper inputs that contain plausible data elements, which can then be investigated with rigorous statistical methods to determine where the real signals are.
Without data visualization, statisticians are hindered in trying different strategies, and the over-reliance on IT results in delays and unnecessary work cycles. With SAS Visual Analytics, the statistician has the independence to go to the next step of model development and start trying different statistical strategies to solve the problem without waiting on others.
How does a statistician complete these tasks without SAS Visual Analytics? Probably by using samples and several SAS tools such as Base SAS, JMP and SAS Enterprise Guide. With SAS Visual Analytics, there is no need to sample. Working with all the data reduces the risk of discovering data intricacies late in the project that require going back to the drawing board, and thereby expedites the analytical lifecycle to bring better business outcomes faster.
Make the statistician’s job easier. Try a demo of SAS Visual Analytics.