Artificial intelligence and machine learning becoming pervasive at NHGRI and in genomics
Happy New Year to you and yours!
We start off the new year with good budgetary news and profound gratitude to the U.S. Congress for providing NIH and NHGRI a healthy Fiscal Year 2023 budget. Specifically, NIH received a roughly 5.6% increase, while NHGRI received a roughly 3.8% increase. The latter will help provide the fuel we need for bringing another year of spectacular NHGRI-supported genomics breakthroughs!
Meanwhile, let’s also start the new year with some simple questions. First: what is 2,023 minus 2,003? Answer: 20! Why this calculation, and what is significant about the number 20? Well, April 2023 will bring the 20th anniversary of the completion of the Human Genome Project. Second: do you think that NHGRI will make a big deal out of this 20th anniversary? Answer: you bet your base pairs we will! Stay tuned and start designing your DNA-themed party hat to wear at the various April 2023 celebratory events. Just imagine that: 20 years since the end of the Human Genome Project!
Finally, I encourage you to register for and watch the upcoming roundtable on the promise and perils of social genomics, which will be held on Wednesday, January 25, 2023. The event will feature academic and scholarly perspectives about multiple issues related to complex areas of social and behavioral genomics.
All the best,
Artificial intelligence and machine learning becoming pervasive at NHGRI and in genomics
These days, artificial intelligence (AI) and machine learning (ML) are everywhere. From self-driving cars to facial recognition to photograph-refinement software—AI and ML are being applied to a variety of fields in myriad ways.
AI and ML are also becoming omnipresent throughout NHGRI and in genomics. Within the NHGRI Intramural Research Program, several researchers have been using AI to characterize genomic disorders and learn how to communicate genomic risk. For example, Benjamin Solomon, M.D., and his colleagues are using these computational tools to recognize and classify rare genetic skin conditions. Oleg Shchelochkov, M.D., recently led a study using ML to find biomarkers associated with mild and severe forms of propionic acidemia, a genetic condition that affects digestion of proteins and fats. Currently, Susan Persky, Ph.D., is using AI tools to understand how different ways of communicating genomic information can influence people’s behavior. These experiences reveal the wide applicability of AI and ML in different areas of contemporary genomics research, including considerations of many ethical, legal, and social implications.
The use of AI and ML is also increasing across the NHGRI portfolio of extramural grants, which fund both basic and clinical genomics research. For example, NHGRI currently supports extramural research that aims to improve the accuracy of DNA sequencing by using AI and ML to identify the correct DNA base using the raw signals generated by the sequencing process. With other grants, researchers are using AI to predict the functional influences of genomic variants, with a focus on gene expression. Lastly, ML is being used to improve the methods for calculating polygenic risk scores, which reflect an individual’s genetic risk for a trait or disease.
ML represents a subcomponent of AI, and beyond its use in genomics research, ML is helping historians probe the history of genomics. Spencer Hong, a graduate student in Luís Nunes Amaral’s research group at Northwestern University, uses ML to categorize and analyze materials in NHGRI’s historical document archive, which contains over two million pages of emails, letters, reports, and other documents pertaining to the history of genomics and NHGRI. ML tools analyze and characterize these diverse documents much faster than humans can manually, making ML valuable for studying the history of genomics and NHGRI’s role in shaping the field.
NHGRI has hosted events to engage researchers and the public about the roles of AI and ML in genomics. For example, in December 2022, NHGRI held a Twitter Q&A with Hong, who answered questions about his research. In 2021, the institute convened a workshop entitled “Machine Learning in Genomics: Tools, Resources, Clinical Applications, and Ethics Workshop” that identified key areas in genomics ripe for ML applications. That event also helped to define NHGRI’s unique role in ML research in both basic genomic science and genomic medicine.
At the NIH level, the Common Fund recently launched the Bridge to Artificial Intelligence (Bridge2AI) Program, which aims to propel biomedical research forward by setting the stage for widespread adoption of AI to tackle complex biomedical challenges beyond human intuition. The goals of Bridge2AI are to generate flagship AI-ready datasets centered around biomedical grand challenge questions; to emphasize ethical AI best practices for biomedicine; and to establish diverse groups who span the boundary between AI and healthcare research. Bridge2AI includes projects with major genomics components.
Ten years ago, it was hard to imagine that digital voice-assistant devices could readily play music or provide the weather forecast on command, without anyone first typing a request on a computer. In the next few years, it will be fascinating to watch how AI and ML reveal new insights about the human genome and advance the use of genomics in medicine.
Global Biodata Coalition designates first set of most critical data resources
The Global Biodata Coalition (GBC) recently announced its first set of designated Global Core Biodata Resources (GCBRs): 37 data resources found by an international peer-review panel to be critical for the worldwide life science and biomedical research community and for the long-term preservation of biological data. NHGRI supports 16 of these resources: the Alliance of Genome Resources, ClinGen, FlyBase, GENCODE, GO, gnomAD, GWAS Catalog, HGNC, MGD, PharmGKB, Reactome, SGD, UCSC Genome Browser, UniProt, WormBase, and ZFIN. GBC was established in 2020 as a group of international research funders working to understand the global biodata resource ecosystem and to move towards more internationally coordinated, sustainable, and streamlined mechanisms that support that ecosystem. The GCBRs provide free and open access to their data; are used extensively by the research community worldwide; are mature, comprehensive, and authoritative in their field; and provide high scientific quality and a professional standard of service. The selected 37 GCBRs will now be studied more thoroughly with the goal of developing a long-term sustainability plan to ensure their continued availability to researchers around the world.
New NIH research funding opportunities to expand the use of All of Us data
NHGRI is working with the All of Us Research Program to support new research using All of Us data. The mission of the All of Us Research Program is to accelerate health research and medical breakthroughs by building the most diverse biomedical data resource of its kind. NIH has released two new funding announcements for applications that make direct use of the rich data that are available through the cloud-based All of Us Researcher Workbench. Small grants to enhance the use of the All of Us Research Program’s data (R03) will fund grants to analyze data in the Researcher Workbench, using standard tools and methods. Additionally, Enhancing the use of the All of Us Research Program’s data (R21) will fund grants to develop new tools for analyzing data in the Researcher Workbench, with the goal of making the new tools broadly available. All of Us intends to commit up to $2 million in fiscal year 2023 (FY23) to fund up to 12 awards, subject to availability of funds and the quality of the applications. Additional NIH institutes and offices, including NHGRI, intend to commit funding from FY23 for applications from either funding opportunity that are within the scope of their missions. Letters of Intent will be accepted until January 30, 2023, and complete applications are due by March 1, 2023.
Genomics powerhouses retire from NHGRI
NHGRI is in the midst of a small surge in recent (or soon-to-happen) retirements of genomics titans. Those retirements include Intramural Research Program Senior Scientist Jim Mullikin, Ph.D., and Extramural Program Directors Elise Feingold, Ph.D., Michael Smith, Ph.D., and Joy Boyer. Jim worked on the Human Genome Project, served as Director of the NIH Intramural Sequencing Center, and contributed to many large-scale genome-sequencing projects in his 25 years at NHGRI. Over her 30 years at NHGRI, Elise initiated and led the Encyclopedia of DNA Elements (ENCODE) project, which sought to decipher the catalog of functional elements in the human genome. During his 10 years at NHGRI, Michael led the Genome Technology Program and coordinated the Institute’s Small Business Program. For over 26 years, Joy provided key leadership for NHGRI’s Ethical, Legal, and Social Implications (ELSI) Research Program. These four individuals leave very large shoes to fill!
New spotlight features investigator-initiated research
The Genomics Landscape has added a new spotlight section featuring investigator-initiated genomics research. NHGRI is well-known for its high-profile consortium-based science, starting with the Human Genome Project. However, NHGRI also has a very robust portfolio of investigator-initiated research, which is funded as part of NHGRI’s Extramural Research Program and Intramural Research Program. Investigator-initiated research reflects ideas from the researchers themselves that emerge through the natural processes of scientific inquiry, as opposed to projects defined by the research funder that aim to achieve an overarching goal, as is often the case with consortium-based efforts. Starting this month (see below), each issue of The Genomics Landscape will highlight a recent, open-access publication reporting the work of an investigator-initiated research project in a new section entitled “Genomics Research Spotlight.”
Rare and common genetic determinants of metabolic individuality and their effects on human health
Surendran et al.
Nat Med 28: 2321–2332, 2022. PMC9671801
Among its many roles, human DNA encodes proteins that are responsible for metabolizing (i.e., chemically processing or breaking down) molecules such as carbohydrates, fats, medications, and even other proteins. Human DNA, however, is not identical among all people, and the small differences among human genomes (i.e., genomic variants) can lead to differences in how people metabolize innate and foreign substances. In some cases, these metabolic changes can impact physical traits such as risk of disease or drug toxicity. In this paper (Surendran et al., 2022), researchers identified variants in over 300 regions of people’s DNA that resulted in measured differences in metabolism. For many of these regions, they were able to identify a likely causal gene that encodes the metabolizing protein. These findings provide new insights with relevance to health, including discovery of proteins that are potentially related to adverse drug effects or other health outcomes.
This research was supported by multiple sources in the UK and the US, including a grant from the NHGRI Extramural Research Program to Eric Gamazon, Ph.D., who is a faculty member in the Vanderbilt Genetics Institute; that grant is part of the Investigator-Initiated Research in Computational Genomics and Data Science Grant Program (PAR-18-844).
About The Genomics Landscape
A monthly update from the NHGRI Director on activities and accomplishments from the institute and the field of genomics.
Last updated: January 5, 2023