By Sheena Faherty
Science Writer, NHGRI
The National Human Genome Research Institute's ENCyclopedia Of DNA Elements (ENCODE) Project has spent 13 years building a catalog of all the functional elements in the human genome sequence, and making it available to scientists worldwide for the study of human health and disease.
On February 9, 2017, to highlight the announcement of new funding awards, ENCODE program directors in the Division of Genome Sciences at NHGRI, Drs. Elise Feingold, Mike Pazin, and Dan Gilchrist, and ENCODE researchers from the University of California, San Francisco, Drs. Nadav Ahituv and Yin Shen, turned to Reddit - a social news website and discussion forum - to answer questions from the Reddit community. Here, we recap the event (or you can check out Reddit Science's page to see the whole "Ask Me Anything!").
Overall, the team answered over fifty questions on topics including the future of genomics, the ENCODE Project, and "dark matter" DNA. For example, the ENCODE team gave their thoughts about how DNA sequencing is a powerful way to answer numerous biological questions.
Check out a few of the questions and answers below, or visit the Reddit AMA on the ENCODE Project, and stay tuned for more NHGRI AMAs in the future.
Have you ever come across some sequencing data that just didn't make any sense? Most likely a contaminant or some other boring explanation, but is there something that just sticks in the back of your head after all these years as something that could be biologically cool?
When I was working in the lab, I encountered this kind of sequencing data every day! But seriously, two things that make genomics so powerful (and fun): First, with one experiment an entire genome's worth of data are collected. There are all kinds of things in the data, just waiting to be found. Second, when researchers make this digital data publicly available, either through projects like ENCODE or resources like GEO, any scientist can access it and use it to address their own research questions. Genomic data are tops for hypothesis generating! -Dan
Not me personally, but I like that some labs (not part of ENCODE) have looked at ENCODE samples from cells immortalized with a virus (EBV) and analyzed the viral DNA in the data (ENCODE analyzed the human DNA) - Mike
On my end, not sequencing data, but specific sequences in the human genome. In the past we have studied sequences called ultraconserved elements, which are among the most evolutionarily conserved sequences in the genome. When we removed four noncoding ultraconserved elements independently in the mouse, we got what looked like viable 'normal' mice. We were expecting the mice to die or have some sort of selective phenotype we could see. So, still a big mystery in my mind as to why these sequences are so conserved. - Nadav
CRISPR-Cas9, a gene editing tool which can be programmed to target specific stretches of genetic code and edit DNA with precision, was also a topic of discussion. The Reddit community was curious as to the impact CRISPR will have on the field of genomics.
Thanks for the AMA, and good luck with ENCODE. I'm interested in various fields of -omics, cancer, embryonic growth factors, and enjoyed the work of Biava. I'm fascinated by the impact CRISPR is having, and am curious what kind of tool/technology is still missing that would improve/accelerate your work?
Good question. The CRISPR tool has really revolutionized the way we study DNA function. With CRISPR we are now pretty good at dissecting DNA function on a large scale using cell lines like embryonic stem cells or immortalized cell lines. However, it is still hard to do gene editing directly in a physiologically relevant way in primary cells - those taken directly from living animals - or in vivo. The major bottleneck is getting access to those cell types and being able to deliver the genome editing tool into those cells efficiently. -Yin
The Reddit community was also interested in knowing about the 98 percent of the genome that doesn't code for proteins and why studying these genomic regions is important for our understanding of how the genome works.
Previously we have held the belief that a lot of this 'dark matter' DNA was useless ("junk DNA") and it's only been more recently in the last five to ten years that we have realised a lot of what we previously thought was junk actually has function. Based on what you are doing how much of our DNA would you reckon is actually junk and how much of our DNA actually has a function? Further to this why do we have junk DNA to begin with, why doesn't our body get rid of DNA we have no use for?
Great question! Only 2% of our genome are genes that code for protein. Around 45% of our genome is actually made of what's called repeats, many of them viruses that were inserted into our genome. Various cool studies show that several of them have adapted new functions that made them 'stay' in our genome - like becoming parts of other genes or adopting a gene regulatory function (instructing genes when, where and at what levels to turn on). As for the remaining 53%, we see that a lot of it has regulatory function and other functions which we still don't know and which are fascinating in my mind to uncover. - Nadav
The ENCODE team also gave advice to ambitious college students who are looking for summer internship opportunities at the National Institutes of Health.
do you guys need interns for summer 2017? i'm a junior studying genomics and molecular genetics at michigan state university.
There are opportunities for summer internships at NIH: https://www.training.nih.gov/programs/sip and more information can be found on NHGRI's training page: https://www.genome.gov/10000212/training-programs/. Also see Elise's response to username "NotAProgramAnalyst" for information about our Program Analyst program at NHGRI for you to consider after you graduate. Good luck! -Team
Posted: February 17, 2017