David Haussler on Surprises in The Human Genome Project
Recorded: 08 May 2012

I think one of the big surprises actually didn’t come about until we had also read the mouse genome. Now, finally, we had two different mammals that we could compare, separated by about seventy-five million years of evolution back to the common ancestor, and over that time we can actually look at the action of evolution on DNA. The study of evolution from a molecular point of view has been a decades-old field, but it never had the data we suddenly had: whole-genome data to systematically look at the differences in the bases. And it turns out that when you have that much data, you can look at the different parts of the genome, in particular the protein-coding genes and all of the other regions outside of the protein-coding genes, and you can recognize which sections of the genome have been under the influence of natural selection. In particular, the dominant signal that you see in natural selection is lack of change, because random mutations would be deleterious to the function and hence the fitness of the organism. So the regions of the genome that don’t change much between human and mouse are the ones that are going to be functionally important. Most people expected those to be essentially the protein-coding regions, but we found that there were three times as many regions outside of the protein-coding regions that were strongly selected and hence functional. So, for as much of the genome as is functional because it codes for protein, there’s three times as much out there that’s functional and not coding for proteins.
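The idea that "lack of change" marks functional sequence can be illustrated with a toy calculation. The sketch below is only an illustration and is not the method used in the work described: it assumes a pre-computed pairwise alignment of a human and a mouse sequence, splits it into fixed-size windows, and flags windows whose fraction of identical bases meets a threshold. The sequences, the window size, the threshold, and the function names `window_identity` and `conserved_windows` are all made up for the example.

```python
def window_identity(human, mouse, window=10):
    """Return (start, identity) for each non-overlapping window of an alignment.

    Identity is the fraction of columns in the window where both sequences
    carry the same base; columns containing a gap ('-') never count as matches.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(0, len(human) - window + 1, window):
        columns = zip(human[start:start + window], mouse[start:start + window])
        matches = sum(1 for h, m in columns if h == m and h != "-")
        scores.append((start, matches / window))
    return scores


def conserved_windows(human, mouse, window=10, threshold=0.9):
    """Return start positions of windows whose identity meets the threshold."""
    return [
        start
        for start, identity in window_identity(human, mouse, window)
        if identity >= threshold
    ]


# Hypothetical toy alignment: three 10-column windows, the middle one diverged.
human = "ACGTACGTAC" + "GGGTTTAAAC" + "TTGACCATGA"
mouse = "ACGTACGTAC" + "GCATT-ACGC" + "TTGACCATGT"

for start, identity in window_identity(human, mouse):
    print(start, identity)
print("conserved windows start at:", conserved_windows(human, mouse))
```

In this toy case the first and last windows would be flagged as conserved and the middle one would not. A real analysis would also have to account for the background mutation rate and for alignment quality, which this sketch ignores.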

And that was exciting! I think this was something that opened up a new field in which you could think about studying these elements, which weren’t genes but were clearly important to the organism. We strongly believe that they regulate the genes, but they are harder to study than the protein-coding regions, so we don’t know as much about how they work. They seem to be packed with transcription factor binding sites, places where proteins interact with the DNA in order to regulate the activities that go on in the cell: the gene regulation. And now it’s become a major area to understand how those regions work, how these little snippets of DNA regulate the genes, where they come from, how they evolve. One of the other surprises, we heard about this earlier on, but I shouldn’t recognize that because it’s going to be out of context. But one of the other surprises we found later on is that some of these regions come from transposons. Transposable elements, elements that jump around the genome, actually carry regulatory elements with them, so that was another area of surprise.

David Haussler (born 1953) is an American bioinformatician known for his work leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome. He is a Howard Hughes Medical Institute Investigator, professor of biomolecular engineering and director of the Center for Biomolecular Science and Engineering at the University of California, Santa Cruz, director of the California Institute for Quantitative Biosciences (QB3) on the UC Santa Cruz campus, and a consulting professor at Stanford University School of Medicine and UC San Francisco Biopharmaceutical Sciences Department.