PhD Candidate, Univ. of Pennsylvania, Philadelphia, Pennsylvania, United States
Disclosure(s):
Thomas M. Westbrook: No financial relationships to disclose
Introduction/Rationale: High-throughput plasma proteomics has significant potential as a tool for monitoring immune checkpoint inhibitor (ICI) therapy and predicting immune-related adverse events (irAEs). However, the high dimensionality of proteomic features relative to the sample sizes of typical clinical cohorts limits robust analysis of the associations between proteins and irAEs. We hypothesized that a deep learning autoencoder trained on population-scale data could learn compressed representations of the plasma proteome, enabling improved sample stratification in smaller cohorts.
Methods: We trained a masked autoencoder (MAE) on Olink proteomic data from over 50,000 UK Biobank participants to learn compressed proteomic embeddings. This pre-trained model was then applied to an independent clinical cohort of ICI-treated patients with longitudinal plasma proteomics samples paired with irAE phenotyping. The resulting embeddings were used to predict the onset of irAEs and to identify proteomic signals of active irAEs. Models built on the embeddings were then compared with models trained on raw proteomic data.
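As an illustration of the masked-autoencoder idea described in the Methods, the following is a minimal sketch and not the model used in this study. The abstract does not specify the architecture, masking rate, or training procedure, so everything here is an assumption: a linear encoder and decoder, a 30% random mask, plain gradient descent, and synthetic low-rank data standing in for an Olink panel. The function names (`make_synthetic_panel`, `train_masked_autoencoder`, `embed`) are hypothetical.

```python
import numpy as np


def make_synthetic_panel(n_samples=300, n_proteins=40, n_factors=5, seed=0):
    """Synthetic stand-in for a protein panel: a few latent factors plus noise."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))
    loadings = rng.normal(size=(n_factors, n_proteins))
    X = factors @ loadings + 0.1 * rng.normal(size=(n_samples, n_proteins))
    # z-score each protein so every column has mean 0 and unit variance
    return (X - X.mean(axis=0)) / X.std(axis=0)


def train_masked_autoencoder(X, latent_dim=8, mask_frac=0.3,
                             epochs=500, lr=0.05, seed=0):
    """Toy linear masked autoencoder (assumed design, for illustration only).

    Each epoch hides a random subset of entries from the encoder and scores
    the reconstruction only on those hidden entries.
    Returns (W_enc, W_dec, losses).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W_enc = rng.normal(scale=0.1, size=(p, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, p))
    losses = []
    for _ in range(epochs):
        mask = rng.random((n, p)) < mask_frac   # True = hidden from the encoder
        X_in = np.where(mask, 0.0, X)           # masked entries are zeroed
        Z = X_in @ W_enc                        # compressed embedding
        X_hat = Z @ W_dec                       # reconstruction of all proteins
        err = (X_hat - X) * mask                # error on masked entries only
        n_masked = max(int(mask.sum()), 1)
        losses.append(float((err ** 2).sum() / n_masked))
        # gradients of the masked mean squared error
        grad_out = 2.0 * err / n_masked
        grad_dec = Z.T @ grad_out
        grad_enc = X_in.T @ (grad_out @ W_dec.T)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses


def embed(X, W_enc):
    """Project unmasked samples into the learned latent space."""
    return X @ W_enc
```

In a transfer-learning setting like the one described, the encoder weights would be fit once on the large reference cohort and then reused unchanged through `embed` on the smaller clinical cohort, with the resulting low-dimensional features passed to a downstream classifier.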
Results: In irAE prediction and identification tasks, models built on the MAE-derived embeddings were significantly more robust than models using raw protein levels. Models trained on pre-treatment samples were predictive of subsequent irAE development, identifying high-risk patients at a potentially clinically useful timepoint.
Conclusion: These results demonstrate that plasma proteomics has the potential to improve our understanding of individuals at risk for irAEs and that transfer learning from population-scale data can overcome sample size limitations in clinical ICI cohorts. Future work will focus on further expanding the biological interpretability of the latent features.