Joan E. Bailey-Wilson, Ph.D.

Co-Chief and Senior Investigator, Computational and Statistical Genomics Branch
Head, Statistical Genetics Section

Scientific Summary

Dr. Bailey-Wilson, Head of the Statistical Genetics Section (SGS), is actively studying a range of diseases, including lung cancer, prostate cancer, myopia and other eye diseases, autism, and cleft lip and palate. Trained in statistical genetics, she is interested in understanding the genetics of complex diseases and developing novel methodologies to disentangle the roles that genes and environment play in disease causation.

She has been particularly interested in lung cancer since the early 1980s, when very few scientists believed there might be a genetic link to the condition. Today, significantly more data support the idea that there are susceptibility alleles for one or more unknown genes that dramatically increase certain smokers' risk of developing lung cancer. In a collaboration called the Genetic Epidemiology of Lung Cancer Consortium (GELCC), Dr. Bailey-Wilson and others recently narrowed down the location of a potential lung-cancer gene to a region of chromosome 6, and showed that RGS17 is a tumor suppressor gene in this region that shows association with lung cancer risk in highly aggregated lung cancer families. With her collaborators, she is using dense genotyping panels and next-generation DNA sequencing in the GELCC's set of highly aggregated lung cancer families, their family-history-positive cases, and age-, gender- and smoking-matched controls to search for causal variants in additional lung cancer susceptibility loci.

Dr. Bailey-Wilson has used similar approaches to locate other cancer-related genes. For example, she and her collaborators published evidence that genes involved in prostate cancer reside on specific regions of chromosomes 1, 8, 17 and X. These findings have been replicated, and three candidate genes with rare variants that appear to increase prostate cancer risk have been cloned: RNASEL (HPC1), which encodes ribonuclease L; MSR1, which encodes the macrophage scavenger receptor 1; and HOXB13, which encodes the homeobox B13 protein. Dr. Bailey-Wilson is focusing on identifying additional susceptibility genes for these and other cancers in ongoing studies. At present, she is collaborating with the International Consortium for Prostate Cancer Genetics (ICPCG) on a whole exome sequencing study of families with a strong family history of prostate cancer and is the lead statistician on the study of the ICPCG's African-American families.

Dr. Bailey-Wilson is also applying these next-generation sequencing tools to her studies of highly aggregated non-syndromic oral cleft families from the Syrian Arab Republic (2 to 17 affected individuals per family) and to a set of multiplex autism spectrum disorder families. In the autism study, a subset of families in which at least one child with autism also has abnormal cholesterol levels is of particular interest.

To keep pace with the analysis of the exponentially increasing number of genetic markers, Dr. Bailey-Wilson also develops and tests novel computational methods. Until relatively recently, fewer than 100 of these "signposts" along the genome had been identified. Now, there are millions of known markers, and genome sequencing has drastically increased the number of variants to be analyzed in a single study. Her group is especially interested in using machine learning methods to detect causal variants that have limited or no marginal effects on risk of a disease but that do show strong interaction effects (with either other genetic variants or environmental risk factors).

She is also working to address the effects of linkage disequilibrium, or the nonrandom association of closely spaced loci, on genetic interaction tests and machine learning methods. Linkage disequilibrium can be caused by a low frequency of recombination between two loci when they are very close together on a chromosome. The closer two loci are, the more likely they are to exhibit linkage disequilibrium. Thus, markers that are only 100 kb apart display significantly greater linkage disequilibrium than markers that are 100-5,000 kb apart. Because standard linkage analysis methods typically assume no linkage disequilibrium exists between loci, Dr. Bailey-Wilson's group has developed approaches to streamline these methods to study sets of dense genetic markers. She is using association methods that take advantage of linkage disequilibrium data, HapMap data, and the sequence of the human genome to determine the location of genetic loci that increase risk for various diseases. She has used these and other analytical methods to determine, for example, whether alleles at specific marker loci are transmitted along with a disease through generations in families with several affected members.
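
As an illustration of the linkage disequilibrium concept described above, the following minimal Python sketch computes three common pairwise measures (D, D' and r-squared) from haplotype counts at two biallelic loci. It is not the Section's own software; the function name `ld_stats`, the allele labels A/a and B/b, and the example counts are hypothetical and chosen only for illustration.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    Inputs are counts of the four possible haplotypes (hypothetical
    labels: alleles A/a at the first locus, B/b at the second).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes supplied")

    p_AB = n_AB / n              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n      # frequency of allele A
    p_B = (n_AB + n_aB) / n      # frequency of allele B

    # D: observed haplotype frequency minus the frequency expected
    # if the two loci were independent (no linkage disequilibrium).
    d = p_AB - p_A * p_B

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("a locus is monomorphic; LD is undefined")
    r_squared = d * d / denom

    # D' rescales D by the largest magnitude it could take given
    # the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max

    return d, d_prime, r_squared


# Hypothetical counts: alleles at the two loci always travel together.
print(ld_stats(50, 0, 0, 50))
# Hypothetical counts: the two loci are independent.
print(ld_stats(25, 25, 25, 25))
```

In the first example the two loci are in complete linkage disequilibrium (r-squared of 1), while in the second the haplotype frequencies equal the product of the allele frequencies, so all three measures are zero.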

Dr. Bailey-Wilson has also used statistical methods to determine the marker alleles that people with a specific disease carry more frequently, and disease-free people carry less frequently, than can be explained by chance. This work has helped to greatly reduce the number of target regions that investigators need to search for potential disease-related genes. Her group is also developing approaches to mitigate the increased false-positive evidence of epistatic interaction that can be observed when strong linkage disequilibrium (LD) exists between variants within a single genetic locus that all have a marginal effect on the trait.
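
The case-control comparison of marker allele frequencies mentioned above can be sketched, in its simplest textbook form, as a 1-degree-of-freedom chi-square test on a 2x2 table of allele counts. This is only an illustrative sketch, not the specific method used by the Section; the function name `allelic_chi_square` and the allele counts are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are the two alleles of a
    biallelic marker. Returns (chi_square, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    row_cases, row_controls = a + b, c + d
    col_alt, col_ref = a + c, b + d
    if min(row_cases, row_controls, col_alt, col_ref) == 0:
        raise ValueError("a row or column total is zero; test undefined")

    # Shortcut formula for the Pearson statistic on a 2x2 table.
    chi2 = n * (a * d - b * c) ** 2 / (
        row_cases * row_controls * col_alt * col_ref
    )

    # For 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: the allele is more common in cases.
chi2, p = allelic_chi_square(60, 40, 40, 60)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

A small p-value suggests the allele frequency difference between cases and controls is unlikely to be due to chance alone. In a real study with millions of markers, the significance threshold would need to account for multiple testing.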

Statistical Genetics Section Members

Emily Holzinger, Ph.D., Postdoctoral Fellow
Deyana Lewis, Ph.D., Postdoctoral Fellow
Qing Li, Ph.D., Research Fellow
Candace Middlebrooks, Ph.D., Postdoctoral Fellow
Anthony Musolf, Ph.D., Postdoctoral Fellow
Brian Perry, M.Phys., Contractor

Last Updated: June 27, 20146