DNA methylation
DNA methylation is a biological process by which methyl groups are added to the DNA molecule. Methylation can change the activity of a DNA segment without changing the sequence. When located in a gene promoter, DNA methylation typically acts to repress gene transcription. In mammals, DNA methylation is essential for normal development and is associated with a number of key processes including genomic imprinting, X-chromosome inactivation, repression of transposable elements, aging, and carcinogenesis.
As of 2016, two nucleobases have been found on which natural, enzymatic DNA methylation takes place: adenine and cytosine. The modified bases are N6-methyladenine, 5-methylcytosine and N4-methylcytosine.
Unmodified base | Modified forms
Adenine, A | N6-Methyladenine, 6mA
Cytosine, C | 5-Methylcytosine, 5mC; N4-Methylcytosine, 4mC
Two of DNA's four bases, cytosine and adenine, can be methylated. Cytosine methylation is widespread in both eukaryotes and prokaryotes, although the proportion of methylated cytosines differs greatly between species: 14% of cytosines are methylated in Arabidopsis thaliana, 4% to 8% in Physarum, 7.6% in Mus musculus, 2.3% in Escherichia coli, 0.03% in Drosophila, 0.006% in Dictyostelium and virtually none (0.0002 to 0.0003%) in Caenorhabditis or fungi such as Saccharomyces cerevisiae and S. pombe (but not N. crassa). Adenine methylation has been observed in bacterial, plant, and, recently, mammalian DNA, but has received considerably less attention.
Methylation of cytosine to form 5-methylcytosine occurs at the same 5 position on the pyrimidine ring where the DNA base thymine's methyl group is located; the same position distinguishes thymine from the analogous RNA base uracil, which has no methyl group. Spontaneous deamination of 5-methylcytosine converts it to thymine, which results in a T:G mismatch. Repair mechanisms may then correct the mismatch back to the original C:G pair; alternatively, they may substitute A for G, turning the original C:G pair into a T:A pair, effectively changing a base and introducing a mutation. Because thymine is a normal DNA base, this misincorporated base will not be corrected during DNA replication. If the mismatch is not repaired and the cell enters the cell cycle, the strand carrying the T will be complemented by an A in one of the daughter cells, such that the mutation becomes permanent. The near-universal use of thymine exclusively in DNA and uracil exclusively in RNA may have evolved as an error-control mechanism, to facilitate the removal of uracils generated by the spontaneous deamination of cytosine. DNA methylation, as well as many of its contemporary DNA methyltransferases, is thought to have evolved from primitive, early-world RNA methylation activity, a view supported by several lines of evidence.
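The fate of a deaminated 5-methylcytosine described above can be traced with a small Python sketch. This is only an illustrative toy model: base pairs are represented as (top strand, bottom strand) tuples, methylation state is not tracked, and the function names `deaminate`, `repair` and `replicate` are hypothetical helpers introduced here, not part of any library.

```python
# Toy model (assumption): a base pair is a (top, bottom) tuple of single letters.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def deaminate(pair):
    """Deaminate the 5-methylcytosine of a C:G pair, yielding a T:G mismatch."""
    if pair != ("C", "G"):
        raise ValueError("expected a C:G pair")
    return ("T", "G")

def repair(mismatch, template):
    """Resolve a mismatch using the 'top' or 'bottom' strand as the template."""
    top, bottom = mismatch
    if template == "bottom":
        # G is trusted: T is replaced by C, restoring the original C:G pair.
        return (COMPLEMENT[bottom], bottom)
    # T is trusted: G is replaced by A, fixing a T:A mutation.
    return (top, COMPLEMENT[top])

def replicate(pair):
    """Semi-conservative replication: each strand templates a new partner."""
    top, bottom = pair
    return [(top, COMPLEMENT[top]), (COMPLEMENT[bottom], bottom)]

mismatch = deaminate(("C", "G"))
print(repair(mismatch, "bottom"))  # correct repair
print(repair(mismatch, "top"))     # mutagenic repair
print(replicate(mismatch))         # unrepaired mismatch after one division
```

In this sketch, replicating the unrepaired T:G mismatch gives one daughter with a T:A pair and one with the original C:G pair, matching the outcome described in the text.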
In plants and other organisms, DNA methylation is found in three different sequence contexts: CG (or CpG), CHG or CHH (where H corresponds to A, T or C). In mammals, however, DNA methylation is found almost exclusively in CpG dinucleotides, with the cytosines on both strands usually being methylated. Non-CpG methylation can nevertheless be observed in embryonic stem cells, and has also been indicated in neural development. Non-CpG methylation has also been observed in hematopoietic progenitor cells, where it occurred mainly in a CpApC sequence context.
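The three sequence contexts can be made concrete with a short Python sketch that classifies each cytosine by the two bases following it. It is a minimal illustration under stated assumptions: only the given (forward) strand is examined, the sequence is a plain string of A, C, G and T, and the function names `cytosine_context` and `contexts` are hypothetical.

```python
# H is the IUPAC code for "any base except G".
H = set("ACT")

def cytosine_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at index i, else None.

    None is returned if seq[i] is not C, or if the sequence ends before
    the context can be determined.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    following = seq[i + 1 : i + 3]  # up to two bases after the cytosine
    if len(following) >= 1 and following[0] == "G":
        return "CG"
    if len(following) == 2 and following[0] in H:
        if following[1] == "G":
            return "CHG"
        if following[1] in H:
            return "CHH"
    return None

def contexts(seq):
    """List (index, context) for every cytosine on the given strand."""
    return [(i, cytosine_context(seq, i))
            for i, base in enumerate(seq.upper()) if base == "C"]

print(contexts("ACGTCAGCTTC"))
```

For the example sequence, the cytosines at indices 1, 4 and 7 fall in CG, CHG and CHH contexts respectively, while the final cytosine has no following bases and is left unclassified. Cytosines on the opposite strand would need the reverse complement to be scanned in the same way.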