# Central Dogma

The Central Dogma of Molecular Biology states that DNA is copied by replication, read out into RNA by transcription, and translated into protein by translation. The overall process of producing the protein encoded by a gene is called gene expression: a gene is said to be expressed when its protein is produced.

## DNA Replication

The central dogma postulated by Francis Crick was first verified by Arthur Kornberg. He purified the enzyme that was thought to carry out replication and tried to replicate DNA in vitro. To this end, he mixed

• deoxyribonucleotides (dATP, dCTP, dGTP, dTTP)
• template strand
• primer strand (a short piece of single-stranded DNA that provides a free 3'-$OH$ group at which DNA polymerase can start)

\begin{align}
\text{Primer }5'-pT-pA-pC-pG-pT-pA-\text{added nucleotides by DNA polymerase}\hspace{0.6cm}\\
\text{Template }3'-pA-pT-pG-pC-pA-pT-pT-pA-pG-pG-pC-pT-\dots-5'
\end{align}

and could show that the enzyme he had found, DNA polymerase I, adds the subsequent nucleotides according to the template, extending the new strand in the 5' to 3' direction. Later it was shown that this is not the only DNA polymerase: DNA polymerases II and III also exist, and the latter is the one used in replication.
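The primer extension in this experiment can be sketched in a few lines of Python. This is only an illustration of template-directed synthesis, not a model of the enzyme. The function name `extend_primer` and the representation of the strands as strings are assumptions made for this example, and only the template bases written out in the scheme above are used.

```python
# Watson-Crick base pairing for DNA
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer along a template (hypothetical helper).

    The template is written 3'->5' and the primer 5'->3', so that
    position i of the primer pairs with position i of the template.
    """
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    # The primer must already be base-paired with the template.
    for p, t in zip(primer_5to3, template_3to5):
        if DNA_COMPLEMENT[t] != p:
            raise ValueError("primer does not pair with the template")
    # Add one complementary nucleotide per remaining template base,
    # growing the new strand at its 3' end.
    rest = template_3to5[len(primer_5to3):]
    added = "".join(DNA_COMPLEMENT[t] for t in rest)
    return primer_5to3 + added


template = "ATGCATTAGGCT"  # 3'->5', the bases shown in the scheme
primer = "TACGTA"          # 5'->3'
new_strand = extend_primer(template, primer)
print(new_strand)  # TACGTAATCCGA
```

The check on the primer reflects the requirement that the polymerase needs an already paired 3' end to start from; a primer that does not match the template raises an error instead of being extended.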

In an organism, a lot more is going on. First, the double-stranded DNA has to be separated into single strands, which is done by the enzyme helicase. The two strands are then stabilized by single-strand binding proteins, which prevent them from reannealing. On the so-called leading strand, DNA polymerase can then polymerize the second strand continuously from the template according to the rules of base pairing. On the lagging strand, new primers have to be added again and again, since replication always proceeds from 5' to 3'. This is done by DNA primase. The short DNA fragments synthesised from these primers are called Okazaki fragments, after Okazaki, who first verified this mechanism.

The Okazaki fragments create another problem: where the polymerase finishes polymerization at the next primer, there is no bond between the adjacent nucleotides. An enzyme called DNA ligase joins these nucleotides by catalyzing the formation of a covalent bond.

Since DNA has a helical structure, the replicated DNA strands are twisted around each other. They have to be untwisted by the enzyme topoisomerase, which cuts the DNA, unwinds it and ligates it together again. Some antibiotics, the quinolones, hinder DNA replication by binding to topoisomerase and preventing its ligase domain from working; this leads to breaks in the DNA and finally to cell death. The same mechanism is used to prevent rapidly dividing cancer cells from replicating. Watch this video to get a vivid picture of the replication process.

The primer regions that enabled replication by providing a 3'-$OH$ group to the DNA polymerase on the lagging strand have to be replaced after the polymerization step. This creates the so-called end replication problem at the 5' end of the newly synthesised lagging strand: once the last primer is removed, there is no $OH$ group left from which the gap could be filled.

The resulting single-stranded DNA overhang at the 3' end is part of the telomere, a G-rich region at the end of the chromosome. The DNA would be shortened with each replication if it were not for the enzyme telomerase, which extends the telomere by adding nucleotides according to a template contained in the telomerase itself. The second strand can in turn be synthesised by a DNA polymerase that carries a primase as one of its subunits. In this way the DNA is copied completely and no information is lost.

This video gives a nice illustration of the telomerase machinery, which was discovered by Carol W. Greider and Elizabeth Blackburn in 1984; they received the Nobel Prize in 2009.

DNA polymerase not only adds nucleotides, it also has a built-in proofreading mechanism. If a wrong nucleotide is incorporated by accident, the polymerase removes it. In this manner only about one error occurs per $10^8$ nucleotides. Without proofreading there would be an error about every $10^3$ nucleotides, so the proofreading machinery increases the accuracy by a factor of $10^5$.
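To get a feeling for these numbers, the following Python sketch computes the improvement factor and the expected number of errors per round of replication. The genome size of about 4.6 million base pairs (roughly that of *E. coli*) is an assumption chosen for illustration and does not come from the text above; the helper `expected_errors` is likewise hypothetical.

```python
# Error frequencies quoted in the text: one error per N nucleotides.
NUCLEOTIDES_PER_ERROR_NO_PROOFREADING = 10**3
NUCLEOTIDES_PER_ERROR_WITH_PROOFREADING = 10**8

# Assumption for illustration: a genome of about 4.6 million base pairs.
GENOME_SIZE = 4_600_000


def expected_errors(genome_size: int, nucleotides_per_error: int) -> float:
    """Expected number of errors when copying the genome once."""
    return genome_size / nucleotides_per_error


# Factor by which proofreading improves the accuracy.
improvement = (NUCLEOTIDES_PER_ERROR_WITH_PROOFREADING
               // NUCLEOTIDES_PER_ERROR_NO_PROOFREADING)

errors_without = expected_errors(GENOME_SIZE, NUCLEOTIDES_PER_ERROR_NO_PROOFREADING)
errors_with = expected_errors(GENOME_SIZE, NUCLEOTIDES_PER_ERROR_WITH_PROOFREADING)

print(improvement)     # 100000
print(errors_without)  # thousands of errors per copy
print(errors_with)     # far fewer than one error per copy
```

Under this assumption, a genome copied without proofreading would carry thousands of errors, while with proofreading most copies would contain none.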

## Transcription

Transcription is the process of making RNA from a DNA template. The DNA polymer contains a specific sequence of letters that is called the promoter, followed by another sequence, the gene.

The promoter is the region where a molecule called a transcription factor can bind. Binding of the transcription factor increases the affinity for RNA polymerase (RNAP) and initiates transcription; this step is therefore called transcription initiation.

As in the replication process, the DNA is locally separated into two single strands, here by the transcription machinery itself rather than by the replicative helicase. One distinguishes between the template strand, which is the non-coding strand, and the coding strand.

RNA polymerase synthesises the messenger RNA (mRNA) from the template strand in a process called elongation. Since the mRNA is complementary to the template strand, its sequence is the same as that of the coding strand of the DNA, except that the base thymine is replaced by uracil. Videos [1][2].
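The relationship between template strand, coding strand and mRNA can be checked with a short Python sketch. The function names and the example sequence are assumptions made for this illustration; the sketch ignores promoters, termination and RNA processing.

```python
# Base pairing between a DNA template base and the RNA base added opposite it
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Build the mRNA (5'->3') from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3to5)


def coding_to_mrna(coding_5to3: str) -> str:
    """Shortcut: the mRNA equals the coding strand with T replaced by U."""
    return coding_5to3.replace("T", "U")


# Hypothetical example sequence; both strands describe the same double helix.
coding_strand = "ATGTTTGGCTAA"    # 5'->3'
template_strand = "TACAAACCGATT"  # 3'->5'

mrna = transcribe(template_strand)
print(mrna)  # AUGUUUGGCUAA
```

Both routes give the same result: transcribing the template strand yields the same sequence as replacing thymine by uracil in the coding strand.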

## Translation

Translation is the process of making a protein from an mRNA template. The information on the mRNA is arranged in so-called codons, each of which is a triplet of bases. Each codon of the mRNA sequence is translated into an amino acid. This is done by an adapter called transfer RNA (tRNA), which has an anticodon that binds the matching codon on the mRNA and carries the specific amino acid for that codon.

The machinery that catalyses this reaction is the ribosome. The ribosome slides along the mRNA while tRNAs enter one after another, each adding an amino acid. In this manner the digital code of the mRNA is translated into an amino acid sequence. This chain folds into a specific form determined by its sequence, thus forming a protein that can serve a specific function.

The following table summarizes the translation from codons to amino acids:

| 1st base | 2nd base U | 2nd base C | 2nd base A | 2nd base G | 3rd base |
|---|---|---|---|---|---|
| U | UUU Phe F Phenylalanine | UCU Ser S Serine | UAU Tyr Y Tyrosine | UGU Cys C Cysteine | U |
| U | UUC Phe F Phenylalanine | UCC Ser S Serine | UAC Tyr Y Tyrosine | UGC Cys C Cysteine | C |
| U | UUA Leu L Leucine | UCA Ser S Serine | UAA Ochre (Stop) | UGA Opal (Stop) | A |
| U | UUG Leu L Leucine | UCG Ser S Serine | UAG Amber (Stop) | UGG Trp W Tryptophan | G |
| C | CUU Leu L Leucine | CCU Pro P Proline | CAU His H Histidine | CGU Arg R Arginine | U |
| C | CUC Leu L Leucine | CCC Pro P Proline | CAC His H Histidine | CGC Arg R Arginine | C |
| C | CUA Leu L Leucine | CCA Pro P Proline | CAA Gln Q Glutamine | CGA Arg R Arginine | A |
| C | CUG Leu L Leucine | CCG Pro P Proline | CAG Gln Q Glutamine | CGG Arg R Arginine | G |
| A | AUU Ile I Isoleucine | ACU Thr T Threonine | AAU Asn N Asparagine | AGU Ser S Serine | U |
| A | AUC Ile I Isoleucine | ACC Thr T Threonine | AAC Asn N Asparagine | AGC Ser S Serine | C |
| A | AUA Ile I Isoleucine | ACA Thr T Threonine | AAA Lys K Lysine | AGA Arg R Arginine | A |
| A | AUG Met M Methionine, Start | ACG Thr T Threonine | AAG Lys K Lysine | AGG Arg R Arginine | G |
| G | GUU Val V Valine | GCU Ala A Alanine | GAU Asp D Aspartic acid | GGU Gly G Glycine | U |
| G | GUC Val V Valine | GCC Ala A Alanine | GAC Asp D Aspartic acid | GGC Gly G Glycine | C |
| G | GUA Val V Valine | GCA Ala A Alanine | GAA Glu E Glutamic acid | GGA Gly G Glycine | A |
| G | GUG Val V Valine | GCG Ala A Alanine | GAG Glu E Glutamic acid | GGG Gly G Glycine | G |
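The codon table can be used directly to translate an mRNA sequence. The following Python sketch is a minimal illustration, assuming that translation starts at the first AUG and stops at the first stop codon in the same reading frame. The names `CODON_TABLE` and `translate` and the example sequence are chosen for this example; the sketch does not model the ribosome or tRNAs.

```python
import itertools

# Codons in the order UUU, UUC, UUA, UUG, UCU, ... (first base varies slowest),
# paired with one-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') into a one-letter amino acid sequence."""
    start = mrna.find("AUG")  # start codon, also codes for methionine
    if start == -1:
        return ""
    protein = []
    # Read non-overlapping triplets in the frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGUUUGGCUAA"))  # MFG
```

In the example, AUG gives methionine (M), UUU phenylalanine (F), GGC glycine (G), and UAA terminates translation.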

Video Lectures:

• MIT: Eric Lander, Robert Weinberg - Introduction to biology Lecture 11