Editor's note: This is the first in a new series of SAS Innovation posts that highlight external technical publications by SAS authors and showcase new innovations from SAS.
Multi-task learning trains models for multiple related tasks simultaneously and leverages the information those tasks share to improve generalization and performance. Unlike conventional machine learning frameworks, which solve a single task with one learner, multi-task learning exploits the relationships between tasks to improve results.
Multi-task learning has shown promising performance in predictive modeling. For example, the technique works well for predicting medical outcomes, even for complex disease progressions.
However, missing data still pose a challenge in many multi-task applications. Conventional methods for handling missing data can reduce the amount of usable data or distort the covariance structure of the data. Either effect can result in inaccurate inference.
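To make those two failure modes concrete, here is a minimal sketch using only the Python standard library. It is not the method from the paper. It simulates two correlated features, hides some values of the second one completely at random, and then compares two conventional fixes: dropping incomplete rows (complete-case analysis) and filling gaps with the observed mean. The function names, sample size, correlation and missing rate are all illustrative assumptions.

```python
import random
from statistics import fmean


def simulate(n=2000, rho=0.8, missing_rate=0.4, seed=0):
    """Return n rows (x1, x2) with correlation about rho; x2 is None when missing."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        if rng.random() < missing_rate:  # values go missing completely at random
            x2 = None
        rows.append((x1, x2))
    return rows


def covariance(xs, ys):
    """Population covariance (divides by the number of rows)."""
    mx, my = fmean(xs), fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def complete_case(rows):
    """Conventional fix 1: drop every row that has a missing value."""
    return [r for r in rows if r[1] is not None]


def mean_impute(rows):
    """Conventional fix 2: replace each missing x2 with the observed mean of x2."""
    observed_mean = fmean(r[1] for r in rows if r[1] is not None)
    return [(x1, observed_mean if x2 is None else x2) for x1, x2 in rows]


rows = simulate()
cc = complete_case(rows)
mi = mean_impute(rows)

cov_cc = covariance([r[0] for r in cc], [r[1] for r in cc])
cov_mi = covariance([r[0] for r in mi], [r[1] for r in mi])
var_cc = covariance([r[1] for r in cc], [r[1] for r in cc])
var_mi = covariance([r[1] for r in mi], [r[1] for r in mi])

print(f"rows kept: complete-case {len(cc)} of {len(rows)}, mean-imputed {len(mi)}")
print(f"cov(x1, x2): complete-case {cov_cc:.3f}, mean-imputed {cov_mi:.3f}")
print(f"var(x2):     complete-case {var_cc:.3f}, mean-imputed {var_mi:.3f}")
```

Complete-case analysis keeps the covariance estimate honest in this simple setting but throws away every incomplete row. Mean imputation keeps every row, yet each imputed value sits exactly at the mean and contributes nothing to the variance or covariance sums, so both estimates shrink by the fraction of rows that were observed.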
The paper, "Multi-Task Learning with Incomplete Data for Healthcare," was originally presented at the Workshop on Machine Learning for Medicine and Healthcare at the 2018 KDD Conference. It proposes a new method to tackle the challenge of missing features under the multi-task learning framework. The proposed method is effective for both prediction and model estimation when data are missing.
The authors of the paper are Xin J. Hunt, Saba Emrani, Ilknur Kabul and Jorge Silva. They use the Alzheimer’s disease progression dataset to illustrate their technique, but this method can be used in any industry.