Dr. Elmer V. Bernstam
The University of Texas Health Science Center at Houston
School of Biomedical Informatics and
McGovern Medical School - Department of Internal Medicine
Our laboratory focuses on problems related to meaning in biomedical data. For example, clinical data are increasingly collected via electronic medical records (EMRs) and aggregated into clinical data repositories (or warehouses). These data are potentially useful for research, quality improvement, biosurveillance and other purposes. However, because they were collected for other purposes (e.g., clinical care, billing), repurposing them is not straightforward. In particular, information useful for research is generally found in free-text notes rather than in structured billing codes. Thus, natural language processing and concept extraction are important to our work. We also have projects in consumer informatics (e.g., guiding health care consumers to accurate information online; privacy in the context of personalized medicine) and information retrieval (e.g., MEDLINE searching).
Projects/Techniques: Projects in our lab generally focus on extracting meaning from large data sets. Examples include:
1. High-throughput phenotyping: identifying patients with a particular condition (e.g., breast cancer) within a large clinical database. We have applied a variety of techniques including graph algorithms, vector space models (adapted from information retrieval) and others.
2. Understanding the balance between research subject privacy and utility. A great deal of effort is devoted to maintaining the privacy of subjects in clinical and translational research. We are attempting to understand attitudes and expectations related to privacy using surveys and interviews.
3. Guiding consumers to accurate health information online. How do we help non-clinicians identify accurate information? Ratings from published tools for evaluating online information do not seem to correlate with accuracy. However, we found that inaccurate information posted to online forums is rapidly and reliably identified (and often corrected) by subsequent postings. Thus, information in online forums appears to “self-correct.”