(144) Non-linear Generalizable Machine Learning Re-Analysis of Plasma Proteomics and Single-Cell Immune Network for Classification of COVID-19 Severity
Associate Professor Lynn University Boca Raton, Florida, United States
Disclosure(s):
Felix E. Rivera-Mariani, PhD: No financial relationships to disclose
Introduction/Rationale: Severe respiratory viral infections (e.g., flu, COVID-19) trigger complex cellular and protein-level immune dysregulation. Linear models may overlook non-linear immune patterns important for risk stratification. We re-analyzed plasma proteomics and single-cell immune data to benchmark a non-linear machine-learning pipeline, derive mechanistic markers, and position an approach extensible to environmental-viral immunology cohorts.
Methods: From ImmPort (study ID SDY2129, https://www.immport.org), we merged patient characteristics, plasma proteomics, and CyTOF data for 81 COVID-19 cases (Mild, Moderate, or Severe). After median imputation, scaling, and one-hot encoding, features with high missingness, low variance, or strong collinearity were removed; 150 features were retained. We evaluated Random Forest (RF), Gradient Boosting (GB), and Elastic Net (EN) using a stratified train/test split and nested cross-validation. We validated and interpreted results with PCA, ROC curves, confusion matrices, and RF feature importance.
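The prefiltering step described in Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the feature names, the toy values, and all three thresholds (`max_missing`, `min_variance`, `max_abs_corr`) are assumptions, since the abstract does not report the cutoffs used.

```python
from math import sqrt
from statistics import median, pvariance


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant columns."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def prefilter(features, max_missing=0.3, min_variance=1e-8, max_abs_corr=0.9):
    """Drop features by missingness, then variance, then collinearity.

    All thresholds are illustrative assumptions, not values from the study.
    """
    kept = {}
    for name, column in features.items():
        missing_fraction = sum(v is None for v in column) / len(column)
        if missing_fraction > max_missing:
            continue  # too many missing values
        filled = impute_median(column)
        if pvariance(filled) <= min_variance:
            continue  # (near-)constant feature
        if any(abs(pearson(filled, other)) > max_abs_corr
               for other in kept.values()):
            continue  # strongly collinear with a feature already kept
        kept[name] = filled
    return kept


# Synthetic toy table (hypothetical feature names, made-up values).
toy = {
    "protein_a": [1.0, 2.0, None, 4.0, 5.0, 6.0],
    "protein_a_copy": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "constant": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
    "mostly_missing": [None, None, None, 1.0, None, 2.0],
    "protein_b": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
selected = prefilter(toy)
print(sorted(selected))
```

In this sketch the collinearity filter is greedy: the first feature of a correlated pair is kept, so the outcome depends on column order. In a real analysis the imputation and filtering would typically be fit on training folds only, to avoid leaking test information.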
Results: RF captured non-linear interactions and outperformed GB and EN (accuracy = 83%, AUC = 0.94). Cross-model drivers that increased with severity included LGALS1 (galectin-1), CCL7 (MCP-3), TNFRSF10A, FURIN, and PLAUR (Olink NPX), as well as stimulus-responsive STAT/ERK/NF-κB axes (CyTOF), while pDA resting measures declined. These findings are concordant with the original network biology (dysregulated JAK/STAT, MAPK/mTOR, NF-κB) but were obtained via a compact, prefiltered pipeline. Mechanistically, LGALS1 suggests T-cell dampening and stromal remodeling; CCL7 indicates monocyte recruitment and lung immunopathology.
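The abstract does not state how a single AUC was obtained for the three severity classes. One common convention is a macro-averaged one-vs-rest AUC, sketched below under that assumption. The function names and the predicted probabilities are hypothetical and synthetic; they are not study data.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(probabilities, labels, classes):
    """Unweighted mean of the one-vs-rest AUC over all classes."""
    per_class = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in probabilities]
        per_class.append(binary_auc(scores, [y == cls for y in labels]))
    return sum(per_class) / len(per_class)


classes = ["Mild", "Moderate", "Severe"]
labels = ["Mild", "Mild", "Moderate", "Moderate", "Severe", "Severe"]
# Synthetic class probabilities, one row per case, columns ordered as `classes`.
probabilities = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
auc = macro_ovr_auc(probabilities, labels, classes)
print(f"macro one-vs-rest AUC: {auc:.3f}")
```

Macro averaging weights each severity class equally regardless of its size, which matters when class counts are unbalanced in a cohort of 81 cases.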
Conclusion: This ImmPort re-analysis corroborates published biology and shows that a non-linear ML pipeline enables effective three-class severity classification with interpretable, cross-platform markers, an approach positioned for respiratory viral immunology beyond COVID-19. Future steps involve batch correction, pathway analysis, a small marker panel, external validation, and integration into environmental-viral severity modeling efforts.