"If nothing in biology makes sense except in the light of evolution, ...the modern view of disease holds no meaning whatsoever." -Nick Lane

Monday, March 8, 2010

Junk DNA: how much really is junk?

Junk DNA:

The question of how much of our junk DNA may turn out to be functional is a fascinating topic.  I wrote a paper on it about 5 years ago, based largely on the work of John Mattick.  Mattick argues that because most of the genome is actually transcribed, and because it seems to be transcribed in a programmatic way (i.e., some noncoding areas are transcribed in a pattern dissimilar to that of adjacent genes), most of the genome is probably functional.  When I wrote the paper I agreed with this idea, but I think I have a more nuanced position today.  A few weeks before I finished my report, a paper was published in Nature showing that a mouse that had many noncoding regions deleted from its genome was clinically equivalent to other mice.  Among the noncoding regions that were deleted were areas referred to as "ultraconserved regions" (UCRs).  Some UCRs are more conserved across vertebrates than some of the most constant proteins.  If these sequences were conserved so pristinely by natural selection, one would expect them to have a crucial function.  This experiment seemed to show that was not the case.  There is still a mystery here, though.  If these sequences were not crucial for the survival of the mouse, why were they conserved over the course of 100 million years of evolution?

Going back to my notes on Power, Sex, Suicide by Nick Lane: on page 187 he talks about how the extra DNA in eukaryotes may function as raw material for new genes.  Bacteria were under selection for trim genomes because fast division was important.  It's not that eukaryotes are under selection for bigger genomes so that they can later evolve new genes.  Natural selection cannot and does not plan ahead.  The tendency toward bigger genomes is more likely a side effect: mutations that add something, anything, to the genome are more likely to be neutral than ones that delete something.  So if eukaryotes are not under selection for small genomes, they will tend to collect junk over time.
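To convince myself that this bias alone is enough to bloat a genome, here is a toy sketch in Python.  Everything in it is my own assumption and not something from Lane's book: the function name `simulate_genome_size`, the idea that insertions and deletions arise equally often, that every insertion is neutral, and that a deletion is neutral only with probability `deletion_neutral_prob` (otherwise it is harmful and lost).  There is no selection for size at all.

```python
import random


def simulate_genome_size(start_size, steps, deletion_neutral_prob, seed=0):
    """Toy random walk in genome size (arbitrary units).

    Assumptions (illustrative only):
    - insertions and deletions arise with equal probability,
    - every insertion is neutral and therefore kept,
    - a deletion is kept only if it happens to be neutral.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    size = start_size
    for _ in range(steps):
        if rng.random() < 0.5:
            # Insertion: assumed harmless, so it always stays.
            size += 1
        elif size > 0 and rng.random() < deletion_neutral_prob:
            # Deletion: stays only when it removed nothing important.
            size -= 1
        # Otherwise the deletion was harmful and is lost with its carrier.
    return size
```

Calling `simulate_genome_size(1000, 10_000, deletion_neutral_prob=0.5)` gives a genome that ends up larger than it started, even though nothing in the model favors large genomes.  Setting `deletion_neutral_prob=1.0` removes the bias and turns it into an unbiased random walk.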

Why our 'junk DNA' may be useful after all:
Pearson, Aria   New Scientist; 7/14/2007, Vol. 195 Issue 2612, p42-45, 4p

Finally, a balanced and reasonable article on the subject of junk DNA.  Pearson points out that we have actually known about functional noncoding DNA since the early 1970s.  In Mattick's writings, he accuses the scientific establishment of completely ignoring noncoding regions for decades.  From what I've been reading, I don't think that this is actually the case.  The reason science has focused on coding regions is that they are better understood and mutations in them are more easily pinpointed.

The article compares junk DNA to bloatware on new computers.  Some of it may appear to be doing something and appear to be functional but whether or not it is doing us any good is debatable.  This is an interesting analogy and I can't help but think it must be true in many cases.

Many researchers believe that most noncoding RNAs that are transcribed are really just noise generated by the transcription of nearby genes.  However, Gingeras argues that this can't explain a lot of noncoding transcription because many noncoding RNAs are transcribed in areas where there are no genes nearby.  I wonder if a lot of this is like the bloatware analogy.  Perhaps something is happening, but that doesn't mean that whatever is happening is necessarily benefiting the organism.

One interesting observation that Mattick makes is that long noncoding RNAs transcribed in the mouse brain are transcribed differently than the genes that they are closest to.  This suggests that their transcription is not an accident but must be controlled programmatically. 

What this article shows is that the debate is not about whether or not any sequences outside of protein coding sequences are functional.  We have known that many are for decades. The debate is about how much will end up being functional.  Mattick argues that more than 50% of the genome is functional and perhaps up to 80% or 90% based on the amount of the genome that is transcribed.  Other researchers put the estimate at below 5% of the genome and there are plenty in between.

This is a fascinating debate.  To me it seems unlikely that Mattick is right.  He cannot explain why some species, such as the pufferfish, can get away with such a trim genome, or why some relatively simple organisms, such as some species of amoeba, have so much junk DNA that their genomes are actually larger than the human genome.

But at the same time, even if the most conservative biologists are right and less than 5% of the genome is functional, that is still a lot of functional noncoding DNA that we don't understand yet!
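To put rough numbers on "a lot," here is a back-of-the-envelope calculation in Python.  The figures are my own assumptions and not from the article: a human genome of roughly 3.2 billion base pairs, of which about 1.5% codes for protein, and the simplification that all of the protein-coding DNA counts as functional.  The function name `functional_noncoding_bp` is just something I made up for this sketch.

```python
GENOME_BP = 3_200_000_000  # assumed approximate size of the human genome
CODING_PERCENT = 1.5       # assumed rough share that codes for protein


def functional_noncoding_bp(functional_percent,
                            genome_bp=GENOME_BP,
                            coding_percent=CODING_PERCENT):
    """Base pairs that would be functional but NOT protein coding,
    assuming all protein-coding DNA is part of the functional share."""
    functional_bp = round(genome_bp * functional_percent / 100)
    coding_bp = round(genome_bp * coding_percent / 100)
    return max(functional_bp - coding_bp, 0)


# Compare the conservative estimate with Mattick's range.
for pct in (5, 50, 80, 90):
    millions = functional_noncoding_bp(pct) / 1e6
    print(f"{pct}% functional -> about {millions:.0f} million noncoding bp")
```

Under these assumptions, even the conservative 5% figure leaves about 112 million base pairs of functional noncoding DNA, more than twice the amount that codes for protein.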

The article also talks about the experiment that deleted ultraconserved regions from the mouse.  One explanation that was given was that perhaps the effect of these regions is subtle.  If the sequence made the mouse just 1% more likely to survive, then it would be preserved.  I don't like this explanation.  If the effect were really that subtle, then the sequence would be more likely to be able to change over time.  If these sequences are more conserved than protein-coding genes, then subtle effects do not explain their preservation.

Another explanation given by Kelly Frazer is that redundancy could be built in and there were other regions that compensated for the ones that were deleted.  I don't like this explanation either.  If there really is this much redundancy then how could the region be more conserved than protein coding genes?  I don't see how natural selection could keep these regions so pristine if there is redundancy in the system.

This is a great mystery to me because I don't agree with any of the explanations put forward.  There are many potential angles for my thesis here.


Sean Carroll:
Scientific American May 2008: Regulating Evolution

This article in Scientific American by one of my favorite authors, Sean Carroll (Endless Forms Most Beautiful), explores the evolution of "enhancers," which he refers to as "switches."  Eukaryotes uniquely promote transcription of their genes through these switches, which can appear long before the gene, long after it, or even within introns!  They are hard to detect experimentally, which is why many genetic mutations have been determined to be regulatory in nature even though the exact mutation remains elusive.
The article explores some examples of phenotypes that can be modified by these regulatory sequences without affecting how the gene is expressed in other parts of the body or in other life stages.
One of the main source articles that Carroll references is Wray's paper in Nature Reviews Genetics, which I review below:

Gregory Wray:  March 2007 Nature Reviews Genetics: The evolutionary significance of cis-regulatory mutations
Wray argues that cis-regulatory and protein-coding mutations may be phenotypically distinct.  He gives two reasons for this.
1. Each allele in a diploid organism is transcribed independently.  Therefore, mutations in regulatory regions tend to be co-dominant, whereas structural mutations are often recessive and may not be visible to natural selection until genetic drift has produced a significant number of organisms homozygous for the mutation.
2. Cis-regulatory mutations may affect the organism only in a particular tissue or life stage, whereas a structural mutation will affect the organism everywhere the protein is expressed.  There is potential for alternative splicing to cushion this effect, but clear examples of this are rare.
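The first point can be made concrete with a little Hardy-Weinberg arithmetic.  This is my own illustration and not taken from Wray's paper: it assumes random mating, no selection yet, and that a structural mutation is fully recessive while a regulatory one is fully co-dominant, which is a simplification.  The function name `visible_fraction` is hypothetical.

```python
def visible_fraction(allele_freq, dominance):
    """Fraction of individuals whose phenotype is affected by a new allele.

    Assumes Hardy-Weinberg genotype proportions:
    heterozygotes = 2pq, homozygotes for the new allele = q^2.
    """
    q = allele_freq
    p = 1.0 - q
    heterozygotes = 2 * p * q
    homozygotes = q * q
    if dominance == "codominant":
        # Visible in anyone carrying at least one copy.
        return heterozygotes + homozygotes
    if dominance == "recessive":
        # Visible only in individuals carrying two copies.
        return homozygotes
    raise ValueError("dominance must be 'codominant' or 'recessive'")


# A new allele that has drifted to 1% frequency.
print(visible_fraction(0.01, "codominant"))
print(visible_fraction(0.01, "recessive"))
```

At an allele frequency of 1%, the co-dominant allele shows up in about 2% of individuals, while the recessive one shows up in only about 1 in 10,000.  That gap is why a recessive mutation can sit in a population for a long time before natural selection has anything to act on.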

One example he discusses is lactose tolerance in adults in northern Europeans.  The switch that enabled this was actually located inside an intron in the gene that was affected.  This is interesting because Mattick talks a lot about the potential for introns to evolve function.  This is a great example of one that did.

Another example comes from comparisons of microarrays of gene expression in the brains of humans and chimps.  The levels of expression are different for more than 10% of the genes that are expressed.  The author points out that this is probably an underestimate.  It is unknown where in the genome the regulatory sequences that affect this expression are encoded.  Mattick has argued that trans-regulation may be at work, that is, regulatory sequences nowhere near the genes being expressed.

Another example is the increased expression of prodynorphin in humans relative to other primates.  This gene is involved in emotional status and perception of pain.  A mutation in the regulatory sequence of this gene is responsible for the increased expression in humans.


I have a lot of material to digest here.  There are a lot of angles I could take on this.  Mattick may be wrong that most of the genome is functional, but from what I am reading that may not matter.  Even if only a small percentage is functional, that doesn't change the fact that noncoding sequences may be the main drivers of eukaryotic evolution.
