Using Ensembles And Distillation To Optimize The Deployment Of Deep Learning Models For The Classification Of Electronic Cancer Pathology Reports

Document Type

Article

Publication Date

10-1-2022

Publication Title

JAMIA Open

Abstract

Objective: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports. Materials and Methods: We consider a text classification problem comprising 5 individual tasks. The baseline model is a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) with soft labels derived by aggregating the ensemble predictions. We evaluated performance in terms of accuracy and abstention rates, using softmax thresholding. Results: The student model outperforms the baseline MtCNN in terms of abstention rates and accuracy, thereby allowing the model to be used with a larger volume of documents when deployed. The largest gains were observed for subsite and histology, for which the student model classified an additional 1.81% and 3.33% of reports, respectively. Discussion: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data and thereby enable the construction of soft labels with estimated probabilities for multiple classes for a given document. Training models with the derived soft labels reduces model confidence on difficult-to-classify documents, thereby reducing the number of highly confident wrong predictions. Conclusions: Ensemble model distillation is a simple tool to reduce model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with low computational resources where minimizing inference time is required.
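
The two mechanisms named in the abstract, soft labels built from ensemble predictions and abstention by softmax thresholding, can be illustrated with a minimal sketch. This is not the authors' code: the abstract says only that ensemble predictions are aggregated, so the plain average used here is an assumption, and the function names, the -1 abstention marker, and the toy numbers are hypothetical.

```python
import numpy as np


def soft_labels_from_ensemble(member_probs):
    """Average per-member softmax outputs into one soft label per document.

    member_probs has shape (n_members, n_docs, n_classes).
    Assumption: aggregation is a simple mean over ensemble members.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    return member_probs.mean(axis=0)


def predict_with_abstention(probs, threshold):
    """Return the top class per document, or -1 (abstain) when the
    highest softmax score falls below the threshold."""
    probs = np.asarray(probs, dtype=float)
    top_class = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= threshold
    return np.where(confident, top_class, -1)


def accuracy_and_abstention_rate(preds, labels):
    """Accuracy over the documents the model kept, plus the abstained fraction."""
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    kept = preds != -1
    abstention_rate = 1.0 - float(kept.mean())
    accuracy = float((preds[kept] == labels[kept]).mean()) if kept.any() else float("nan")
    return accuracy, abstention_rate


# Toy data: 3 ensemble members, 2 documents, 3 classes (hypothetical values).
member_probs = [
    [[0.9, 0.05, 0.05], [0.4, 0.5, 0.1]],
    [[0.8, 0.10, 0.10], [0.6, 0.3, 0.1]],
    [[0.7, 0.20, 0.10], [0.2, 0.7, 0.1]],
]
soft = soft_labels_from_ensemble(member_probs)
preds = predict_with_abstention(soft, threshold=0.6)
accuracy, abstention_rate = accuracy_and_abstention_rate(preds, labels=[0, 1])
```

In this toy case the ensemble members disagree on the second document, so its soft label is spread across classes and the thresholded prediction abstains on it. In a distillation setup, soft labels of this kind would serve as training targets for the student model.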

Volume

5

Issue

3

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Recommended Citation

De Angeli, Kevin; Gao, Shang; Blanchard, Andrew; Durbin, Eric B.; Wu, Xiao Cheng; Stroup, Antoinette; Doherty, Jennifer; Schwartz, Stephen M.; Wiggins, Charles; Coyle, Linda; Penberthy, Lynne; Tourassi, Georgia; and Yoon, Hong Jun, "Using Ensembles And Distillation To Optimize The Deployment Of Deep Learning Models For The Classification Of Electronic Cancer Pathology Reports" (2022). School of Public Health Faculty Publications. 81.
https://digitalscholar.lsuhsc.edu/soph_facpubs/81
DOI: 10.1093/jamiaopen/ooac075
