
Data Science

updated: May 10, 2022


Data science involves the study of large, complex data sets that arise from various types of research projects. With respect to genomic studies, such work requires expertise in quantitative scientific disciplines such as bioinformatics, computational biology, and biostatistics.


Over the last two to three decades, the fields of genetics and especially genomics have become incredibly data-rich and data-intensive. In fact, it is nearly impossible to conduct genetics or genomics research today without analyzing very large amounts of data using highly sophisticated computational tools. This is why the phrase ‘data science’ is commonly used in genetics and genomics research. Data science is a very broad area that uses the tools of bioinformatics, computational biology, biostatistics, and other quantitative disciplines to extract knowledge from data. Genetics and genomics are not the only areas of science or society that now use the tools of data science routinely – just think about neuroscience, imaging technologies, astronomy, and electronic medical records, as well as financial information and political polling. In fact, we now live in a very data science-oriented world!

Eric Green, M.D., Ph.D.


National Human Genome Research Institute