PhD Public Seminar: SHUAI GUO, MS
When & Where
June 19
3:00 PM - 4:00 PM
UTHH MD Anderson Cancer Center, IMC12.3312/3313 and via Zoom
Contact
- Joy Lademora
- 7135009872
- [email protected]
Event Description
Integrating Bulk and Single-Cell Transcriptomics to Investigate Intratumor Heterogeneity
Shuai Guo, MS (Advisor: Wenyi Wang, PhD)
Cancers are heterogeneous mixtures of tumor and surrounding cells, where each component comprises multiple distinct subtypes and/or states. Understanding cell-type-specific contributions is critical for advancing cancer biology, yet high-throughput expression profiles from tumor tissues represent only the combined signal from all cellular sources. Bulk deconvolution with single-cell/nucleus (sc/sn) RNA-seq data has emerged as a powerful approach to dissect both cellular composition and cell-type-specific expression patterns, yet technological discrepancies between sequencing platforms limit its accuracy.
To systematically evaluate the impact of platform discrepancies on bulk deconvolution, we first generated a benchmark dataset of 24 healthy retinas with paired bulk and snRNA-seq on the same sn aliquots, so that technological discrepancy is the main confounding factor. Additionally, we collected the Hippen et al. benchmark data, with matched bulk and sc data from 7 high-grade serous ovarian carcinoma (HGSC) patients. Analysis of the two datasets revealed significant technological effects on expression profiles between paired bulk and sn/sc data, indicating that existing deconvolution methods are compromised because their assumption of representative sc/sn-derived references is violated.
To fill this gap, we introduce DeMixSC, a three-tier framework that leverages well-matched benchmark data to handle technological discrepancy. First, DeMixSC uses benchmark data to identify and adjust genes with high inter-platform discrepancy. Second, to deconvolve large unmatched bulk data, DeMixSC aligns the target bulk cohort with the benchmark bulk data to generalize the detected discrepancy patterns. Last, DeMixSC estimates cell-type proportions using weighted nonnegative least squares with two innovations: a partitioned loss for discrepancy genes and a new weight that considers both expression magnitude and variance.
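The last step can be illustrated with a small, generic sketch of weighted nonnegative least squares. This is not the DeMixSC implementation: the reference matrix, the weight formula `1 / (1 + expression)`, the down-weighting factor for discrepancy genes, and the plain projected-gradient solver are all assumptions chosen only to keep the example self-contained. In practice an established solver such as `scipy.optimize.nnls` would normally be used.

```python
# Illustrative weighted NNLS deconvolution on toy data; NOT the DeMixSC code.

def weighted_nnls(reference, bulk, weights, n_iter=5000):
    """Minimise sum_g w_g * (bulk_g - sum_k reference[g][k] * p_k)^2 subject to
    p >= 0, using projected gradient descent."""
    n_genes, n_types = len(reference), len(reference[0])
    # Normal-equation terms: A = R^T W R and b = R^T W y.
    A = [[sum(weights[g] * reference[g][i] * reference[g][j] for g in range(n_genes))
          for j in range(n_types)] for i in range(n_types)]
    b = [sum(weights[g] * reference[g][i] * bulk[g] for g in range(n_genes))
         for i in range(n_types)]
    # trace(A) bounds the largest eigenvalue of A, so this step size is safe.
    step = 1.0 / (2.0 * sum(A[i][i] for i in range(n_types)))
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        grad = [2.0 * (sum(A[i][j] * p[j] for j in range(n_types)) - b[i])
                for i in range(n_types)]
        # Gradient step, then projection onto the nonnegative orthant.
        p = [max(0.0, p[i] - step * grad[i]) for i in range(n_types)]
    return p


# Toy reference: rows are genes, columns are three hypothetical cell types.
reference = [
    [10.0, 1.0, 1.0],
    [8.0, 2.0, 1.0],
    [1.0, 9.0, 2.0],
    [2.0, 7.0, 1.0],
    [1.0, 1.0, 12.0],
    [1.0, 2.0, 9.0],
]
true_props = [0.5, 0.3, 0.2]
# Noise-free bulk profile simulated as a mixture of the reference columns.
bulk = [sum(r * p for r, p in zip(row, true_props)) for row in reference]

# Assumed weighting scheme: smaller weight for highly expressed genes, and an
# extra down-weighting for genes flagged as having high platform discrepancy.
discrepancy_genes = {1, 4}  # hypothetical flags, for illustration only
weights = []
for g, expression in enumerate(bulk):
    w = 1.0 / (1.0 + expression)
    if g in discrepancy_genes:
        w *= 0.1
    weights.append(w)

raw = weighted_nnls(reference, bulk, weights)
estimated_props = [x / sum(raw) for x in raw]  # rescale to sum to one
print([round(x, 3) for x in estimated_props])
```

Because the toy bulk profile is an exact mixture of the reference columns, the estimate recovers the simulated proportions whatever positive weights are used; the weights only start to matter once the bulk and reference profiles disagree, which is the situation the abstract describes.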
We compared DeMixSC's performance against existing methods using both the retina and HGSC benchmark data. In the retina benchmark data, DeMixSC achieved the lowest root mean squared error (RMSE, 0.03) and the highest Spearman's correlation (0.86), substantially outperforming other methods (RMSE: 0.11-0.25, correlation: 0.31-0.49). For the HGSC data, DeMixSC also showed higher accuracy (RMSE: 0.09, correlation: 0.72) than other methods (RMSE: 0.13-0.19, correlation: 0.27-0.49). When applied to an age-related macular degeneration cohort (n=453), DeMixSC revealed biologically meaningful changes, including decreases in photoreceptors and horizontal cells alongside increases in glial cells. In an HGSC cohort (n=30) treated with neoadjuvant chemotherapy, DeMixSC identified significant differences in epithelial cells across treatment response groups and revealed increased macrophage infiltration in poor responders (validated by immunofluorescence, Spearman's correlation: 0.82). The DeMixSC R package and DeMixSC-on-Web ShinyApp are freely available at https://github.com/wwylab/, with ready-to-use benchmark data for immediate deconvolution of any retina and ovarian cancer bulk samples.
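For readers unfamiliar with the two accuracy metrics quoted above, the following is a minimal sketch of how RMSE and Spearman's correlation can be computed between estimated and reference cell-type proportions. The proportion values are invented for illustration, and whether the reported figures were computed per sample, per cell type, or pooled is not stated in the abstract, so this shows only the generic definitions.

```python
import math


def rmse(estimated, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimated, truth)) / len(truth))


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented proportions for four cell types in one sample (illustration only).
truth = [0.50, 0.30, 0.15, 0.05]
estimated = [0.45, 0.33, 0.10, 0.12]

sample_rmse = rmse(estimated, truth)
sample_rho = spearman(estimated, truth)
print(round(sample_rmse, 4), round(sample_rho, 2))
```

RMSE penalizes the absolute size of the errors, while Spearman's correlation depends only on whether the ordering of cell types is preserved, so the two metrics capture different aspects of deconvolution accuracy.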
Lastly, we analyzed snRNA-seq data from 54 metastatic prostate tumor samples collected from a prospective clinical trial. We highlight the analytical challenges of applying snRNA-seq in complex tumor settings and underscore the complementary strengths and limitations of single-cell and bulk transcriptomics. These findings motivate the need for integrative frameworks that account for both technical artifacts and biological heterogeneity.
Advisory Committee:
- Wenyi Wang, PhD, Chair
- Ana Aparicio, MD
- Jianjun Gao, MD, PhD
- Nicholas Navin, PhD
- John Paul Shen, MD
- Peng Wei, PhD
Join via Zoom (Please contact Mr. Shuai Guo for his Zoom meeting information).