"And don't criticize
What you can't understand"
Last week, the ENCODE project (ENCyclopedia Of Dna Elements) released a tremendous amount of new information about our genomes. The results of literally hundreds of millions of experiments using the most current "high throughput" technologies provided the data for over a dozen scientific papers in the journals Nature and Genome Research. The conclusions about organization and expression of the human genome were so significant that they were the topic of a front-page story in The New York Times.
The massive collaborative project examined how our genomes are copied into RNA, interact with regulatory proteins, and are compacted in chromatin, which organizes the genome for cellular differentiation. ENCODE examined DNA from dozens of cell types to find out if the results changed in specific ways from one kind of cell to another. Cell type specificity provides a strong indication that the data are biologically relevant.
ENCODE described their most striking finding as follows:
"One of the more remarkable findings described in the consortium's 'entrée' paper is that 80% of the genome contains elements linked to biochemical functions, dispatching the widely held view that the human genome is mostly 'junk DNA'. The authors report that the space between genes is filled with enhancers (regulatory DNA elements), promoters (the sites at which DNA's transcription into RNA is initiated) and numerous previously overlooked regions that encode RNA transcripts that are not translated into proteins but might have regulatory roles. Of note, these results show that many DNA variants previously correlated with certain diseases lie within or very near non-coding functional DNA elements, providing new leads for linking genetic variation and disease."
In other words, the old idea of the genome as a string of genes interspersed with unimportant noncoding DNA is no longer tenable. Many eminent scientists had opined that the noncoding DNA, much of it repeated at many different locations, is nothing more than "junk DNA." ENCODE revealed that most (and probably just about all) of this noncoding and repetitive DNA contained essential regulatory information. Moreover, much of it was also copied into RNA with additional but still unknown functions.
I had a longstanding, personal interest in the repetitive part of our genomes (up to as much as two-thirds of all our DNA) because it is composed of mobile genetic elements. I first discovered these elements in bacteria in my thesis research in 1968. I remember being scientifically offended by a 1980 article from Francis Crick and Leslie Orgel describing this DNA as "selfish" and functionless.
My interest in the roles of repetitive and mobile DNA has continued since my thesis more than four decades ago. The initial sequencing of the human genome in 2001 found over 40% to be mobile repeats spread throughout our genomes, thirty times more than protein-coding DNA.
In 2005, I published two articles on the functional importance of repetitive DNA with Rick von Sternberg. The major article was entitled "Why repetitive DNA is essential to genome function."
These articles with Rick are important to me (and to this blog) for two reasons. The first is that shortly after we submitted them, Rick became a momentary celebrity of the Intelligent Design movement. Critics have taken my co-authorship with Rick as an excuse for "guilt-by-association" claims that I have some ID or Creationist agenda, an allegation with no basis in anything I have written.
The second reason the two articles with Rick are important is that they were, frankly, prescient, anticipating the recent ENCODE results. Our basic idea was that the genome is a highly sophisticated information storage organelle. Just like electronic data storage devices, the genome must be highly formatted by generic (i.e. repeated) signals that make it possible to access the stored information when and where it will be useful.
The abstract of our paper tells the story:
"ABSTRACT: There are clear theoretical reasons and many well-documented examples which show that repetitive DNA is essential for genome function. Generic repeated signals in the DNA are necessary to format expression of unique coding sequence files and to organise additional functions essential for genome replication and accurate transmission to progeny cells. Repetitive DNA sequence elements are also fundamental to the cooperative molecular interactions forming nucleoprotein complexes. Here, we review the surprising abundance of repetitive DNA in many genomes, describe its structural diversity, and discuss dozens of cases where the functional importance of repetitive elements has been studied in molecular detail. In particular, the fact that repeat elements serve either as initiators or boundaries for heterochromatin domains and provide a significant fraction of scaffolding/matrix attachment regions (S/MARs) suggests that the repetitive component of the genome plays a major architectonic role in higher order physical structuring. Employing an information science model, the 'functionalist ' perspective on repetitive DNA leads to new ways of thinking about the systemic organisation of cellular genomes and provides several novel possibilities involving repeat elements in evolutionarily significant genome reorganisation. These ideas may facilitate the interpretation of comparisons between sequenced genomes, where the repetitive DNA component is often greater than the coding sequence component."
Although we could not predict in detail all the ways repeated DNA would serve genome functions, I think our statements stand up well in light of the recent data. Without knowing the specifics, we were correct in asserting that the genome had to be highly formatted to serve as the marvelous information organelle it is in every living cell and organism.
So, while Rick's choice of evolutionary philosophies is different from mine, I am grateful to him for doing so much work on a paper that remains a source of justified scientific pride. Thinking of the genome informatically and of mobile DNA as a potent force for genome organization are central to the arguments presented on this blog and in my book.
When evidence comes to light that portions of DNA, previously considered to be useless leftovers of failed evolutionary experiments, might have some function (biological activity has been detected), scientists keen to uncover yet more information on how life operates will of course eagerly look forward to what possible 'functions' might be discovered. They will obviously NOT say, "This 'activity' does not prove any 'function', so I'll just stick with my earlier view of this as 'junk'." Indeed, had this been the view of all biologists, the biological activity would never have been discovered, and no 'function' would ever be discovered. This is obviously not how science progresses.
Yet, some of the commentators on this blog seem emotionally committed to maintaining the 'junk' paradigm. It's obvious that they will be truly disappointed if 'function' is found in the future for these DNA sequences. It has been said that the concept "God did it" is a science stopper. Certainly the assertion "it's all junk", even after some biological activity has been found, is also a science stopper.
You criticize my statement "it's all junk". This sentence contains a pronoun, "it". If we are going to make sense of any sentence containing a pronoun it is essential that we know what the pronoun is referring to. It is called the 'antecedent' of the pronoun.
If you go back to my comment you will see that the antecedent is: "portions of DNA, previously considered to be useless leftovers of failed evolutionary experiments".
Are there people who assert that portions of DNA, previously considered (before the ENCODE data) to be useless leftovers of failed evolutionary experiments, are "junk"? Yes.
Grossly misreading a comment then sharply criticizing it with your usual pejorative vocabulary (as you have done many times before) does nothing to enhance your credibility.
You gave an important reference:
"If you look at the Abstract of our 2005 paper, you'll notice that we talk about the physical organization of the genome. The enrichment of S/MAR nuclear lamina attachment signals in LINE element repeats (13% of the 2001 draft genome) is particularly striking in this regard. There's definitely a whole lot of folding and unfolding going on."
May I quote a bit from your article on the matter under discussion? People seldom realize how stunning the complexity is in DNA and its environment. Your article could give food for thought.
"Diverse genomic functions associated with retroelements
Table 1 presents over 30 examples where functional activity has been assigned to a particular retroelement. The genome functions range from providing promoter and enhancer activity to modulating transcript elongation, targeting mRNA to specific tissues, stimulating mRNA translation, providing replication origin recognition sequences, contributing to pericentromeric heterochromatin, serving as telomere caps, nucleating heterochromatin in chromosome arms, supplying chromatin boundary signals, and providing S/MAR attachment sites. The list is far from exhaustive."
As an intelligent cell (IC) theorist, I expect that quantum mechanical phenomena govern many biological processes.
Thanks for quoting the details, some of which I had forgotten. Yeah, there's a lot of information about genome function in the repeats, and it grows all the time. That's why the stubbornness of the junk DNA crowd is so perplexing. But refusal to see what's right in front of you is all too human. I just wish they could behave with more civility and less unscientific certainty.
The Human Genome contains 750MB, a similar amount of data as a CD. That small space holds most instructions needed to build a human body, obviously a complex and wonderful machine.
If 90% of our DNA is junk, then the useful instructions only take up 75MB.
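The 750MB and 75MB figures can be checked with a rough back-of-the-envelope sketch. It assumes about 3.0 billion base pairs in one (haploid) copy of the human genome and 2 bits per base, neither of which is stated in the comment; the function name is made up for illustration.

```python
# Assumptions (not from the comment itself): ~3.0 billion base pairs in the
# haploid human genome, and 2 bits to encode each base (A, C, G or T).
HAPLOID_BASE_PAIRS = 3_000_000_000
BITS_PER_BASE = 2


def genome_megabytes(base_pairs: int, bits_per_base: int = BITS_PER_BASE) -> float:
    """Raw storage needed for the sequence, in megabytes (1 MB = 10**6 bytes)."""
    return base_pairs * bits_per_base / 8 / 1_000_000


total_mb = genome_megabytes(HAPLOID_BASE_PAIRS)

# If 90% of the genome were junk, the remaining 10% would be the useful part.
junk_percent = 90
non_junk_mb = total_mb * (100 - junk_percent) / 100

print(f"whole genome: {total_mb:.0f} MB, non-junk share: {non_junk_mb:.0f} MB")
```

This only measures raw sequence length. It says nothing about how much information the sequence actually carries, which is the point in dispute.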
How is that even possible, if Mac OS needs 5,000MB and Windows 8 needs 16,000MB?
Does anybody believe Mac OS or Windows is more capable than the human body?
This one fact makes me suspect the vast majority is used for something, even if for evolutionary purposes. Perhaps some code only gets activated in certain situations.
So until the Junk DNA crowd can explain to us in precise detail the DNA transcription and differentiation of every tissue type in the human body, reproducing the whole process intact, they're in NO position to assure anybody it's junk. They are abdicating their responsibilities as scientists.
Meanwhile the scientists who are doing their jobs are those who assume it's NOT junk, and look for functionality and understand every bit of it. Just like the ENCODE project has done. To do anything else is the genetic equivalent of burning the library of Alexandria.
http://www.ncbi.nlm.nih.gov/pubmed/19016882
As we have stated over and over, the pufferfish has 1/8 the genome size of humans.
At the cellular level, our cells are very similar in biochemical complexity to those of fish, and that is where most of eukaryotic complexity lies.
At the anatomical level, land vertebrates have about twice as many tissue types as fish.
Thus, there is no way we need a genome size even twice that of a pufferfish, that is, 1/4 the size of the human genome. That is not proof that the other 3/4 is necessarily junk, but it does demolish the argument that junk is impossible.
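The arithmetic behind that estimate can be written out as a small sketch. The two input figures are the ones given in this comment; scaling the required genome size linearly with the number of tissue types is a simplifying assumption made here to get a generous upper bound, and the variable names are only illustrative.

```python
from fractions import Fraction

# Figures as stated in the comment; they are the commenter's estimates.
pufferfish_vs_human = Fraction(1, 8)  # pufferfish genome size relative to human
tissue_type_ratio = 2                 # land vertebrates vs. fish tissue types

# Assumption for illustration: required genome size grows linearly with the
# number of tissue types, which gives a generous upper bound.
needed = pufferfish_vs_human * tissue_type_ratio
surplus = 1 - needed

print(needed, surplus)  # prints: 1/4 3/4
```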
"This one fact makes me suspect the vast majority is used for something"
But we know that at most 5-10% of the human genome is conserved; at least 90% accumulates mutations at the rate of neutral mutations-- that is, apparently not favored, nor disfavored, by natural selection. If you compare different humans, in their introns, they differ by major deletions. So most intronic DNA we KNOW is junk; a minority is functional.
Pareto says that you can often get rid of the least effective half of a system and still retain roughly 90% of the performance you had before.
But Pareto also tells us that the least important 10% has some effect. Even if it's just a 1% difference in output. Me, I'll take every 1% advantage I can get. After all, most races are won by seconds or even milliseconds.
So here's my question for you, Diogenes:
May I have permission to delete 50% of YOUR genome?
I promise to only delete the parts that you consider to be junk.
A brief history of the status of transposable elements: from junk DNA to major players in evolution.
Biémont C. Genetics; 2010 Dec;186(4):1085-93.
Here is the video released by Nature Publishing Group (Publisher of the most prestigious scientific journal in the world!): http://www.youtube.com/watch?v=Y3V2thsJ1Wc
Notice the following statement:
"Striking overall result that encode project reports is that they can assign a biochemical function to 80% of human genome. The reason why this is striking is because not such a long time ago we still considered that vast proportion of human genome was simply junk."
Thank you for making this comment. The argument that the disagreement is about "media hype" is a red herring. The ENCODE group itself expressed how unexpected their results were if one subscribed to the "junk DNA" hypothesis. That is what I quoted from NATURE.
ENCODE has shown that the repetitive component of the genome is not inert and displays activity in a cell-type specific fashion. How much of that activity turns out to be "functional" remains to be seen. We already know that functionality is involved in those cases where the repetitive DNA activity is associated with an inherited dysfunction or a compensatory change in a recognized functional region, such as an exon.
All the ENCODE work gives us minimum involvement values for the phenomena they have studied. They have found cell-specific transcriptional or regulatory activity in 80% of the genome. Exactly what most of that activity means remains to be determined, but it was definitely unexpected on the junk DNA hypothesis.
I'm confident in what the future will tell us about repetitive DNA. Note that the ENCODE results do not apply at all to the evolutionary roles of these mobile repeats, which have been documented in other work (http://www.huffingtonpost.com/james-a-shapiro/more-evidence-on-the-real_b_1158228.html).
I just want to thank you again, Dr. Shapiro, for having the courage to follow the evidence wherever it leads. Today, over 300 people attended a seminar in our church and learned about advances in science, which included a brief discussion of Dr. Shapiro and his evidence for N.G.E., in an effort to bring us into the 21st century of evolutionary science and thinking.
Many who attended the seminar (which included some anti-evolutionists) commented to me afterwards that your theory illustrated a magnificence to evolution that they had never heard of before! In fact, the predominant comment was that if the general public was just taught about:
*Horizontal Gene Transfer
*Epigenetics
*Symbiogenesis
*Natural Genetic Engineering
There would be a much greater acceptance of evolution in the U.S. and, more specifically, in the churches!
300 down, 300 million to go! (hehehe)
IF you explain in engineering detail how it actually works... instead of expecting them to throw their brain away and believe that life is just an accumulation of blind chance and random copying errors. That's a proposition which defies all experience, common sense and reason. Hey, some of those folks can be dogmatic but they're not stupid.
300 million to go, indeed.
Thanks for backing up a point I have been trying to make in many of my blogs. The best way to deal with Creationist, ID and other challenges to evolution is not to circle the wagons around conventional wisdom but rather to present all the excitement of current evolution science. That will both remove the stigma of unscientific dogmatism from evolution advocates and pique the curiosity of people with genuine questions about how such a complex process could possibly work.
http://www.huffingtonpost.com/michael-white/media-genome-science_b_1881788.html
Much to his credit, he points out the fundamental flaws in much of the scientific journalism pertaining to it:
"Influenced by misleading press releases and statements by scientists, story after story suggested that debunking junk DNA was the main result of the ENCODE studies. These stories failed us all in three major ways: they distorted the science done before ENCODE, they obscured the real significance of the ENCODE project, and most crucially, they mislead the public on how science really works."
'Thanks. Then it makes a lot of sense to me. They want to stay stuck in the 1970s' you wrote.
You're welcome. It makes a lot of sense to me too, I think: evolution is very much a struggle for junk! ;-)
Are you saying you don't believe in evolution as a real process? The statement "evolution is very much a struggle for junk!" is perplexing. Please clarify.
I meant it jokingly, hence the ;-)
Maybe I should have said: all theories of evolution are very much struggling for junk:
junk or not junk, that's the question!
or
what biological functionality is there in all this biochemical activity?
I wasn't surprised to read the following in his article:
"Scientists also discovered that our genomes contain parasitic, virus-like elements called "transposons" that have the ability copy themselves within our cells. This DNA ecosystem makes our genomes more like a jungle than a precision machine. At the latest count, transposon-derived DNA makes up at least half of our genome. The transposon-derived sequences in our genomes do not have to be explained by invoking some useful function for it. There is no mystery here: this DNA is there because it can replicate."
http://arstechnica.com/staff/2012/09/most-of-what-you-read-was-wrong-how-press-releases-rewrote-scientific-history/
http://arstechnica.com/science/2012/09/cataloging-the-controlled-chaos-of-the-human-genome/
Perhaps their role is not regulatory at all. Their importance to the cell may instead center on their many physical and chemical features; see e.g. "DNA water" or "DNA physics".
If you look at the Abstract of our 2005 paper, you'll notice that we talk about the physical organization of the genome. The enrichment of S/MAR nuclear lamina attachment signals in LINE element repeats (13% of the 2001 draft genome) is particularly striking in this regard. There's definitely a whole lot of folding and unfolding going on.
"When we understand the roles of the repeats in these expanded genomes compared to the smaller related ones, we may have a better idea on how to answer your question."
"we may have a better idea"? May you? And who's "we"? You say you concocted this hypothesis years ago, never came up with a way to test it, and you don't suggest a test now. When will you suggest one? If your hypothesis is not self-contradictory, then it's just question-begging.
The ENCODE consortium actually demonstrated that ONLY 9% of the human genome is functionally constrained as to sequence. That's 9%, not 80%. So your hypothesis was disproven by ENCODE.
Under the most optimistic scenario, multiplying up ENCODE's results, perhaps 20% of the human genome may someday be shown to be functionally constrained as to sequence. So at least 80% of the human genome is not functionally constrained as to sequence. Call it "Junk" or call it whatever you like, it's not going away and your hypothesis is disproven.
"The ENCODE consortium actually demonstrated that ONLY 9% of the human genome is functionally constrained as to sequence. That's 9%, not 80%. So your hypothesis was disproven by ENCODE. "
I'm not sure what you mean by "constrained," but I suspect you have your facts wrong. Please quote the paper where this claim is substantiated. Constraint, whatever that means, is not the only evidence for functionality, and I would like to know more clearly what you have in mind.
It's an essential part of the definition of "function" that is relevant to the Junk DNA hypothesis.
In the abstract of the ENCODE summary paper, where Birney used the 80% figure, note that 76% of that was due to them defining "functional" as including any transcription, even if a sequence is transcribed at low levels, even if the RNA sequence is later degraded, even if the RNA is repetitive, non-conserved, etc.
Birney's inclusion of "transcribed" in his definition of "functional" in the ENCODE abstract is fine for him, but it's irrelevant to Junk DNA as Comings, Ohno and those guys from the 1970s defined it. They defined "Junk" knowing that much of it might be transcribed.
By Birney's super-broad definition, random DNA sequences with a retroviral-originated promoter in front of them would all be considered "functional".
But on his private blog, Birney also considered a traditional definition of "function" and wrote:
"However, on the other end of the scale – using very strict, classical definitions of “functional” like bound motifs and DNaseI footprints; places where we are very confident that there is a specific DNA:protein contact... we see a cumulative occupation of 8% of the genome. With the exons... that number goes up to 9%." [http://genomeinformatician.blogspot.com/2012/09/encode-my-own-thoughts.html] He later speculates that with more experiments, they may get it up to 20%.
Being non-transcribed was NOT an essential aspect of any definition of Junk DNA (there are 12 or more different positive arguments for the existence of Junk). Back in 1972, when David E. Comings used the first published occurrence of the phrase "Junk DNA" (a bit before Susumu Ohno), he knew at least 25% of the mouse genome was transcribed, which was much more than its coding regions. Comings explicitly defined Junk DNA so that much of it could be transcribed. (See T. Ryan Gregory's wonderful blog resource on 1970s and 80s science, e.g.: http://www.genomicron.evolverzone.com/2012/09/encode-2012-vs-comings-1972/.)
So "transcribed" or "not transcribed", by itself, is irrelevant to the Junk DNA hypothesis, but it was included by Birney in the ENCODE abstract to push the number up to 80%.
On his blog Birney explained why he uses the super-broad definition:
"We use the bigger number [80%] because it brings home the impact of this work to a much wider audience. But we are in fact using an accurate, well-defined figure when we say that 80% of the genome has specific biological activity." [http://genomeinformatician.blogspot.com/2012/09/encode-my-own-thoughts.html]
Translation: fool the public, generate buzz. Birney is not accurately describing what he wrote anyway. In the ENCODE summary, he actually wrote 80% had "specific biological function." Now he walked it back, changing it to 80% "specific biological activity."
Are all swans white?
I fear I'm a bear of little brain. I fail to see the point of your question. Maybe you can explain your problem with my sentence more clearly. We provided both "theoretical reasons" and dozens of "well-documented examples" for repeat DNA functionality. Read the paper.
James Damiano, for over sixteen years has been fighting in federal court to retain the rights to his Intellectual Property from Bob Dylan, for the theft of his entire song catalog, which took thirty-seven years of his life to write.
On June 16, 2009 the following letter was written to Bob Dylan's attorney Orin Snyder by James Damiano's attorney Robert Church, regarding boxes of James Damiano's songs produced to Orin Snyder during discovery.
There were approximately fifteen to twenty-five boxes filled with anywhere from 200 to 400 finished and unfinished songs in each box, thirty-seven years of writing, that were never returned.
When listening to Dylan, don't be surprised if you're listening to lyrics and music by James Damiano.
To date, Dylan refuses to return the song catalog. http://www.jamesdamiano.yolasite.com/
This blog is not the right place to discuss who wrote the lyrics to the songs Dylan sang. I was just using the title and some of the lines because they fit so well with what's happening in evolution debates. I'd suggest making the comments on a blog about intellectual property rights and how they are often abused.