Graduate Student, Stanford Univ., Menlo Park, California, United States
Disclosure(s):
Holly McCann: No financial relationships to disclose
Introduction/Rationale: Inferring molecular signatures of disease progression usually requires costly and time-consuming longitudinal samples. Single-cohort transcriptomic studies also suffer from the curse of dimensionality, as the number of genes profiled is much larger than the number of samples. The availability of numerous independent, heterogeneous datasets in public databases presents a unique opportunity to model disease progression trajectories by leveraging cross-sectional data that collectively capture all stages of a disease.
Methods: We introduce TIDEPOOL, a method that uses a disease-defining gene signature to infer consensus disease trajectories within independent, heterogeneous datasets.
Results: To demonstrate the utility of TIDEPOOL, we applied it to 24 bulk transcriptomic datasets comprising 2,222 blood samples from patients with viral or bacterial infections of varying severity. We identified two diverging trajectories that separate patients with severe outcomes from those with non-severe outcomes. We found 54 genes, clustered into four gene modules, that differentiate these two trajectories. We validated that these genes also differentiate patients with severe and non-severe outcomes in 1,913 samples from patients with infection across 10 external datasets (AUROC = 0.80). Through single-cell analysis of an integrated whole blood dataset, we found that three of these gene modules were highly expressed in neutrophils and the fourth in B cells and dendritic cells. In-depth analysis of the three neutrophil-associated modules in the Single-Cell Atlas of Human Neutrophils (SCAHN) showed that the modules originate from different neutrophil subtypes, including both protective and detrimental neutrophils.
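For readers unfamiliar with the validation metric, the following is a minimal sketch of how a gene-signature score can be evaluated by AUROC. It is not the TIDEPOOL implementation: the scoring rule (mean expression of the signature genes), the gene names, and the toy expression values are all hypothetical and chosen only for illustration.

```python
from statistics import mean

def signature_score(sample, genes):
    """Mean expression of the signature genes in one sample.
    Hypothetical scoring rule; the abstract does not specify one."""
    return mean(sample[g] for g in genes)

def auroc(scores, labels):
    """Probability that a randomly chosen positive (label 1) scores higher
    than a randomly chosen negative (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: placeholder gene names and made-up expression values.
genes = ["GENE_A", "GENE_B"]
samples = [
    {"GENE_A": 5.0, "GENE_B": 6.0},  # severe
    {"GENE_A": 4.0, "GENE_B": 3.0},  # severe
    {"GENE_A": 2.0, "GENE_B": 3.0},  # non-severe
    {"GENE_A": 4.0, "GENE_B": 4.0},  # non-severe
]
labels = [1, 1, 0, 0]  # 1 = severe outcome, 0 = non-severe outcome

scores = [signature_score(s, genes) for s in samples]
toy_auroc = auroc(scores, labels)
print(toy_auroc)
```

An AUROC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so the reported 0.80 means a randomly chosen severe patient scores higher than a randomly chosen non-severe patient about 80% of the time.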
Conclusion: Using the novel TIDEPOOL framework, we have identified a 54-gene signature that distinguishes infectious disease patients with severe and non-severe outcomes. Many of these genes originate from neutrophils, including multiple neutrophil subsets, some associated with favorable outcomes and others with adverse outcomes.