The first component of my research focuses on the development of statistical models and tools for the analysis of biological data. More specifically, I focus on genetic and epigenetic data such as DNA mutations, histone modifications, and DNA methylation. I develop visualization, quality control, and data integration pipelines. I look for complex patterns in the data that can serve as biomarkers for the diagnosis, prognosis, or treatment of disease. In other cases, these patterns may reveal novel biological insight into the system we are studying. I have a special interest in understanding the epigenetic landscape of human cells. For this, I work extensively with functional genomics datasets from, for example, ENCODE, GTEx, and the Roadmap Epigenomics Project.

The second component focuses purely on the data science aspect of biological data analysis. Here, I currently work a lot on the privacy of genetic and genomic data. I am interested in developing computational tools and augmenting them with regulatory frameworks. Such tools enable us to extract more, and higher-quality, knowledge from genetic and biological data while respecting the privacy of individuals. I make heavy use of publicly available datasets and databases such as TCGA, GTEx, GEO, and the 1000 Genomes Project.

A new student who joins my lab would run and develop the existing software pipelines, which consist mainly of NGS data analysis tools. If time permits, students will develop new analysis and visualization tools for genetic and epigenetic data. Most current projects relate to understanding aspects of functional genomics data.


