Graduate Student Researcher, North Carolina State Univ. Col. of Vet. Med., Raleigh, North Carolina, United States
Disclosure(s):
Ethan Smith: No relevant disclosure to display
Introduction/Rationale: Single-cell RNA-seq (scRNA-seq) analyses rely on accurate gene annotations, a challenge for many species with less completely curated genomes. Rhesus macaque (RM), a widely used model for human biomedical research, is one such case where missing gene annotations have hindered the study of immune responses. We aim to develop a computational framework to identify and reintegrate missing gene features, improving immune response characterization in RM.
Methods: In a preliminary analysis, we computationally searched an RM peripheral blood mononuclear cell (PBMC) scRNA-seq dataset from a kidney allograft study for unannotated but transcriptionally active regions (uTARs). We then clustered the cells twice, once on annotated-gene expression and once on uTAR expression, and assessed uTAR expression for cell-type specificity and association with immune-related pathways.
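The abstract does not name the tool used to detect uTARs, so the following is only a minimal sketch of the underlying idea: given regions already called as transcriptionally active, keep those that overlap no annotated gene. The function names, the half-open (start, end) coordinate convention, and the restriction to a single chromosome with strand ignored are all assumptions made for illustration.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extends (or sits inside) the previous interval.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def unannotated_regions(active, annotated):
    """Return the active regions that overlap no annotated gene interval.

    Hypothetical helper: both inputs are lists of half-open (start, end)
    tuples on one chromosome; strand is ignored.
    """
    genes = merge_intervals(annotated)
    gene_starts = [start for start, _ in genes]
    result = []
    for start, end in active:
        # Index of the last merged gene that starts before this region ends.
        i = bisect_right(gene_starts, end - 1) - 1
        # Merged genes are disjoint and sorted, so only that gene can overlap.
        overlaps = i >= 0 and genes[i][1] > start
        if not overlaps:
            result.append((start, end))
    return result
```

For example, with annotated genes at (100, 500), (450, 900), and (2000, 2500), the active regions (50, 90) and (1000, 1200) would be kept as candidate uTARs, while (480, 520) and (2400, 2600) would be discarded because they overlap annotated genes.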
Results: We identified >5,500 uTARs, indicating that numerous features—e.g., long non-coding RNAs or alternative transcripts of existing genes—are missing from current RM annotations. uTARs exhibit cell-type-specific expression and, when used to group cells, separate major cell types, paralleling cell clustering using annotated rhesus genes. These findings illustrate substantial gaps in the RM genome annotation and highlight the biological relevance of these missing genes or transcripts.
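One way to put a number on how closely the uTAR-based clustering parallels the annotated-gene clustering is the adjusted Rand index (ARI), which is 1 for identical partitions and close to 0 for chance-level agreement. The abstract does not say which agreement measure, if any, was used, so this is an illustrative sketch; the function name and the toy labels in the usage note are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster assignments of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs of cells to disagree on

    # Count pairs of cells placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions trivial and identical
    return (together_both - expected) / (maximum - expected)
```

As a toy usage example, `adjusted_rand_index([0, 0, 0, 1, 1, 1], ["x", "x", "x", "y", "y", "y"])` returns 1.0 because the two label sets define the same partition; cluster names themselves do not need to match.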
Conclusion: uTARs likely harbor many previously unannotated genes or transcripts that are involved in immune regulation in RM. Ongoing work will denoise the signals in scRNA-seq data, prioritize a subset of uTARs as candidate transcriptional regulators, and infer regulatory relationships between these candidate regulators and their downstream targets. Our immediate goal is to identify drivers of transplant rejection, but the framework is broadly applicable to scRNA-seq datasets across species and experimental contexts.