Global Biodata Coalition coordinates worldwide funding of data resources
A very important anniversary occurred a couple of weeks ago. July 20, 2022, marked the 200th anniversary of Gregor Mendel’s birth. Gregor Mendel was an Augustinian monk who grew peas in the garden of St. Thomas’s Abbey in Brno, Czech Republic, where he lived. Breeding and studying those peas led him to discover the fundamental laws of inheritance. While the significance of his work was not recognized until the 20th century, it paved the way for future research in genetics, including the discovery of DNA.
Turning our attention to NIH news – last year, Michael Gottesman, M.D., announced his plans to step down as NIH Deputy Director for Intramural Research after ~28 years in the position. As someone who has worked closely with Michael since my arrival at NHGRI in 1994, I would like to give my own shout-out for his contributions to NIH. His leadership and stewardship of NIH’s broader Intramural Research Program has been exemplary, all performed by one of the nicest human beings on this planet! A million thanks to an NIH legend – including for being Acting Director of the then-named National Center for Human Genome Research in the early 1990s! If you would like to learn more about Michael, I would point you to his two-part oral history (part 1 and part 2) produced by the NHGRI History of Genomics Program and a recent fireside chat that I had with him. Recently, Nina Schor, M.D., was appointed the Acting NIH Deputy Director for Intramural Research, taking the leadership baton from Michael. I look forward to working with Nina in this new role.
All the best,
Global Biodata Coalition coordinates worldwide funding of data resources
Starting with the Human Genome Project, the field of genomics has been at the forefront of efforts to develop robust and readily available data resources (e.g., databases and knowledgebases). A number of these resources now provide a fundamental infrastructure for biomedical and life science research. These data resources often connect and integrate the ever-growing amounts of genomic data with other data types to maximize the power of the analyses being performed and the knowledge being extracted. For example, as the genomes of the selected set of classic model organisms (e.g., yeast, fruit fly, nematode, and mouse) were sequenced during the Human Genome Project, an immediate need for accompanying data resources emerged. This gave rise to the well-known “model organism databases,” such as the Saccharomyces Genome Database, FlyBase, WormBase, and Mouse Genome Informatics, respectively. Data resources that enhanced the utility of the generated human genome sequence then followed, such as the UCSC Genome Browser.
Over time, these prototypic data resources and others like them have grown in both size and complexity, requiring substantial ongoing financial support from research funders such as NHGRI and other components of NIH. Meanwhile, as biomedical and life science research increasingly becomes data-intensive, researchers from around the world continue to increase their dependence on these data resources, with some resources accessed by millions of users each month. Unfortunately, the funding for these resources is not unlimited. In many cases, the scientific leaders of these data resources have encountered the realities of finite and highly sought-after research funds, revealing the “fragile nature” of their funding support. This situation is made even more complex by the demographics associated with funding data resources: the majority of the support comes from the United States and European countries, whereas the aggregate use of the data resources is widely distributed worldwide. This striking imbalance co-exists with the ongoing strong consensus to provide free access to all these data resources for researchers around the world to use.
Some have expressed concerns about the sustainability of data resources associated with biomedical and life science research (i.e., biodata resources), often pointing to the fragmented and somewhat haphazard nature of their financial support. For example, a 2015 Nature perspective written by NIH leaders (specifically, Phil Bourne, Ph.D., Jon Lorsch, Ph.D., and Eric Green, M.D., Ph.D.) highlighted the complexities of the big-data ecosystem. The paper specifically pointed to the problem with coordinating the funding of biodata resources, stating that “(c)hanges to funding practices need to extend across both agency and international borders. Data generation and maintenance are typically funded nationally, but the data are used internationally. As a result, we need to develop more equitable funding models. The first step is for funding agencies to communicate more effectively about data science problems and seek collaborative solutions.”
The Nature perspective caught the attention of Warwick Anderson, D.Phil., then the Secretary General of the International Human Frontier Science Program Organization, a well-established agency that supports global research programs. Dr. Anderson appreciated the lack of effective coordination among research funders when it came to the support for biodata resources worldwide. He joined forces with NHGRI Director Eric Green to convene a meeting of experts in 2016 that explored how to improve this problematic situation. One of the major outputs of that meeting was a call for the creation of a coalition of research funders who would work to better coordinate the funding of biodata resources that are vital to biomedical and life science research, so as to ensure their long-term stability and sustainability.
These efforts gave rise to the establishment of the Global Biodata Coalition (GBC) in late 2020, a group of research funders that are now working to understand the global biodata resource infrastructure and to move towards more internationally coordinated, sustainable, and streamlined mechanisms that support the biodata ecosystem. The GBC is currently supported by 11 public and charitable funders (including NHGRI on behalf of NIH) from three continents and has established a formal governance structure, which includes a recently appointed Executive Director (Guy Cochrane, Ph.D.), a Board of Funders, a Scientific Advisory Committee, and a small Secretariat. Funds to support GBC’s activities are managed by the International Human Frontier Science Program Organization.
Like many things, progress with GBC’s efforts was slowed by the COVID-19 pandemic, but there is now significant momentum on several fronts. Specifically, the GBC is actively: (1) conducting an inventory of the world’s biodata resources, which will provide a better understanding of the composition and characteristics of the complete set of such resources; (2) overseeing the identification of an initial set of the most widely used and key biodata resources (likely <50) that will be designated as Global Core Biodata Resources and will become the focus of more intense attention in exploring sustainability options; (3) investigating how open-access data policies could be better coordinated, supported, and implemented at a global level; (4) exploring ways that funders in different countries can better coordinate their efforts, so as to ensure the long-term sustainability of biodata resources; and (5) establishing an international biodata resources forum that will be regularly convened via plenary meetings (once international travel resumes more broadly).
To appreciate what the GBC is trying to accomplish, it is important to recognize the striking lack of communication that previously existed among the research funders that support the various biodata resources. This lack of dialogue and coordination was occurring in the face of all researchers becoming increasingly reliant on the fundamental infrastructure provided by these resources. At its core, the GBC aims to catalyze that coordination on a global level, thereby helping to establish a truly international strategic vision for supporting the broader biodata ecosystem and ensuring its long-term central role in biomedical and life science research. As a strong supporter of GBC and its mission, NHGRI is delighted to be providing both leadership and financial support for this young organization.
Lecture TODAY discusses history of genomics told through machine learning
Today (August 4, 2022) at 1 p.m. ET, the NHGRI History of Genomics Program is holding a virtual lecture entitled "The history of genomics told through machine learning: a celebration of 10 years of the NHGRI history program." Thanks to the meticulous nature of Human Genome Program architects and the researchers involved in major genomics initiatives that have followed, NHGRI has an archive of hundreds of thousands of scanned documents related to the growth of genomics over the last few decades. The depth and breadth of this resource make it unique within NIH and the larger scientific community and provide a wealth of information for history scholars. For the last two years, machine learning experts have been exploring this dynamic archive, revealing new knowledge and insights about the field of genomics. In this lecture, researchers from the Amaral Lab at Northwestern University will present their work using machine learning to analyze the archive in an effort to better understand how a major funding institute like NHGRI has helped to shape genomics. The lecture will showcase the power of these tools and detail some of the group’s discoveries. This event is open to the public and registration is required. The lecture will be recorded and later archived on GenomeTV.
Updated resource toolkit integrates genomics into practice settings
The Method for Introducing a New Competency: Genomics (MINC) Toolkit is the only resource of its kind that targets the largest healthcare community in the world, the nursing community, for guiding the clinical implementation of genomics. But this resource is not just for nurses. All healthcare providers, educators, and administrators interested in integrating genomics into their practice settings are invited to use these free resources. The toolkit compiles information learned and developed in a genomic implementation study aimed at increasing nursing capacity to integrate genomics into patient care. Topics include whom to have on the clinical implementation team, how to measure and self-assess, what resources to use and customize, and how to sustain and expand the genomics implementation program. The MINC resources include an overview guide with foundational genomics information and links to genomic nursing competencies; a measurement tool to evaluate needs and barriers called the “Champion Needs Assessment Survey”; and links to the Genetic/Genomic Nursing Practice Survey, a validated instrument to measure nurse genomic knowledge, attitudes, receptivity, confidence, social system, and adoption. These resources are expected to expand in the future, with a disease-specific implementation study that builds upon the MINC foundation.
Genomic Medicine XIV meeting aims to address genomic medicine implementation challenges
On August 31 and September 1, 2022, the NHGRI Genomic Medicine Working Group will host the Genomic Medicine XIV meeting entitled “Genomic Learning Healthcare Systems.” The meeting’s objectives include the exploration of real-world examples of how genomic learning healthcare systems (gLHS) apply implementation, evaluation, adjustment, and updated implementation practices across delivery systems; the examination of barriers and identification of potential solutions, with a focus on lessons learned from effective gLHS and their potential transportability to other settings; and the determination of ways that solutions can be developed and shared and collaborations can be formed to facilitate research on implementation of gLHS. The meeting is open to the public and registration is required by August 17, 2022.
Genomic Data Sharing Spotlight
Did you know that the goal of researchers contributing to the Human Genome Project (HGP) was to submit their sequence data into a public database within 24 hours of being generated? This target was documented during a strategy meeting in Bermuda in 1996, which is why this expectation is often referred to as the “Bermuda Principles.” The legacy of rapid data sharing within the HGP, as well as other large collaborative genomics projects, remains today. Open, quick, and quality data sharing, though not without its challenges, is an ideal that scientists, funders, journals, and many other stakeholders strive for across the biomedical research landscape. To read more about the origins of the Bermuda Principles, see here.
The NIH, inspired by the success of the HGP and other data-sharing efforts, has been advancing its data-sharing policies over the last two decades. NHGRI, as a leader in genomics, has also developed policies and practices that go beyond the baseline set by NIH. For instance, NHGRI expects researchers to leverage data standards and provide comprehensive metadata and phenotypic data for data shared per NIH data-sharing policies. The upcoming NIH Data Management and Sharing (DMS) Policy makes it more important than ever to ensure that shared data are maximally useful to the broader scientific community. NIH will be holding a webinar series in August and September focused on implementing this policy. To learn more about the upcoming policy, see sharing.nih.gov and register now for the two-part interactive series to learn what the DMS policy means for you!
COVID-19 News and Research
Requests for Information
Extramural Loan Repayment Program for clinical researchers (LRP-CR), pediatric research (LRP-PR), health disparities research (LRP-HDR), individuals from disadvantaged backgrounds (LRP-IDB), contraception and infertility research (LRP-CIR) and research in emerging areas critical to human health (LRP-REACH)
NIH and NHGRI News
About The Genomics Landscape
A monthly update from the NHGRI Director on activities and accomplishments from the institute and the field of genomics.
Last updated: August 4, 2022