The University of Texas Health Science Center at Houston
School of Public Health
Department of Epidemiology, Human Genetics & Environmental Sciences
I work on statistical genetics, computational biology, bioinformatics, and sequence data analysis. My research focuses on developing computational and statistical methods for analyzing massive data sets to understand the genetics and biology of complex traits. I have been working on the analysis of large-scale next-generation sequencing data, for which I developed statistical models and software pipelines for sample contamination detection, variant discovery, machine-learning-based variant filtering, and genotyping of structural variations. I also work on the genetics of diabetes, obesity, and related traits in Mexican American pedigrees. In ongoing projects, I am working on models that incorporate heterogeneous genetic and epigenetic data (DNA sequences, methylation, expression levels/RNA sequences, Hi-C data) for comprehensive modeling of high-dimensional phenotypes, including metabolomic and microbiome profiles.
During a tutorial, students will have the chance to work on analyses of large-scale sequence data. Students will learn how to process raw sequencing data to extract variant-level information, and how to test those variants for association with common and complex phenotypes. Students will also learn the basics of statistical genetics and population genetics, and how to apply statistical tools to problems in human genetics. I expect students to learn how to use existing tools and software on Linux-based cloud computing systems, and to develop the programming skills needed to implement their own methods.
Education & Training
Ph.D. - The University of Texas at Austin - 2010