New method for handling incomplete data in multi-task learning

Editor's note: This is the first in a new series of SAS Innovation posts that highlight external technical publications by SAS authors and showcase new innovations from SAS. 

Multi-task learning trains multiple tasks simultaneously and leverages the shared information between related tasks to improve the generalization and performance of machine learning models. Unlike conventional machine learning frameworks that solve a single task with one learner, multi-task learning uses the relationships between the tasks to improve results.

Multi-task learning has shown promising performance in predictive modeling. For example, the technique works well for predicting medical outcomes, even for complex disease progressions.

However, missing data still pose a challenge in many multi-task applications. Conventional methods for handling missing data can shrink the usable data set or distort the covariance structure of the data. Either effect can lead to inaccurate inference.
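To make these two effects concrete, here is a minimal sketch in plain Python. It is not the method from the paper. It only illustrates two conventional approaches, listwise deletion and mean imputation, on synthetic data. The data, the 30% missing rate, and all function names are assumptions chosen for illustration.

```python
import random


def variance(values):
    """Sample variance with an (n - 1) denominator."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def covariance(xs, ys):
    """Sample covariance with an (n - 1) denominator."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


def mean_impute(column):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in column]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# Synthetic, correlated features (an assumption for illustration only).
x_full = [rng.gauss(0, 1) for _ in range(n)]
y_full = [a + rng.gauss(0, 0.5) for a in x_full]

# Hide roughly 30% of each feature independently, completely at random.
x = [None if rng.random() < 0.3 else v for v in x_full]
y = [None if rng.random() < 0.3 else v for v in y_full]

# Listwise deletion: keep only the rows where every feature is observed.
complete_rows = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]

# Mean imputation: fill the gaps, then compare second-order statistics.
x_observed = [v for v in x if v is not None]
x_imputed = mean_impute(x)
y_imputed = mean_impute(y)

print("rows kept after listwise deletion:", len(complete_rows), "of", n)
print("variance of x, observed values only:", variance(x_observed))
print("variance of x, after mean imputation:", variance(x_imputed))
print("covariance(x, y), full data:", covariance(x_full, y_full))
print("covariance(x, y), after mean imputation:", covariance(x_imputed, y_imputed))
```

Listwise deletion discards every row that has any missing feature, so the usable sample shrinks quickly as the number of features grows. Mean imputation keeps all rows, but each filled-in value sits exactly at the column mean, so it adds nothing to the spread. The variance of an imputed column is therefore always smaller than the variance of its observed values, and covariances with other features tend to shrink as well.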

The paper, "Multi-Task Learning with Incomplete Data for Healthcare," was originally presented at the Workshop on Machine Learning for Medicine and Healthcare at the 2018 KDD Conference. It proposes a new method for handling missing features within the multi-task learning framework. The proposed method is effective for both prediction and model estimation when data are missing.

The authors of the paper are Xin J. Hunt, Saba Emrani, Ilknur Kabul and Jorge Silva. They use the Alzheimer’s disease progression dataset to illustrate their technique, but the method is not specific to healthcare and can be applied in other industries.

About Author

Brett Wujek

Sr. Manager, Product Strategy, Next Generation AI Technologies

Dr. Brett Wujek leads Product Strategy for next-generation AI technologies at SAS. He and his team work to establish the vision and roadmap for products incorporating the latest AI technologies, including generative AI and synthetic data generation. Brett previously worked as a data scientist in the machine learning R&D division at SAS. Prior to that role, he led the development of engineering simulation and design software at Dassault Systèmes. His formal background is in design optimization methodologies. He earned his PhD from the University of Notre Dame for his work developing efficient algorithms for multidisciplinary design optimization.
