How many phosphates are in the 5' end of a DNA strand?


I know that the pyrophosphate of an NTP is hydrolysed to release enough energy for DNA synthesis. To be specific, we can be sure that the nucleotide at the 3' end of a DNA strand carries a single phosphate, because that phosphate is bonded to the neighbouring nucleotide. But why would the 5' end have only one phosphate rather than three? Is it to do with providing energy, with keeping the structure consistent, or with something else that I can't think of?

The DNA polymerase can only extend a primer and therefore almost all lifeforms have a primase (which is a type of RNA polymerase) that synthesizes RNA primers at the replication origins that the DNA polymerase can extend.

As you guessed, the 5' end would indeed have a triphosphate.

Organism | Template Sequence | Primer Synthesized
Herpes simplex-1 | 3'-GPyPy | pppPuPu
T4 | 3'-TTG | pppAC
T7 | 3'-CTG | pppAC
E. coli | 3'-GTC | pppAG
S. aureus | 3'-ATPy | pppAPu
A. aeolicus | 3'-CCC, CCG, CGC | pppGG, pppGC, pppCG
B. stearothermophilus | 3'-ATPy | pppAPu
Human | 3'-PyNN | pppPuNN
Calf | 3'-PyNN | pppPuNN

Initiation sites used by primase from different sources. Pu = purine, Py = pyrimidine, N = any base, ppp = the triphosphate on the 5'-end of the primer. Those bases that are required but do not code for a base in the primer (“cryptic” nucleotides) are underlined and in bold in the original table.

Kuchta and Stengel (2010)

In the final stages of replication in bacteria, these RNA primers are removed by RNase H and DNA polymerase I, and the latter also fills the gaps (Cooper (2000) The Cell). In eukaryotes, the primer is removed by the action of the Fen1 and Dna2 nucleases and the gaps are filled by DNA polymerase δ (Burgers 2009).

In PCR, it is quite straightforward: the product will have whatever 5' modifications the primers have (usually none).

DNA Replication

In order to determine which of these models was true, the following experiment was performed: The original DNA strand was labelled with the heavy isotope of nitrogen, N-15. This DNA was allowed to go through one round of replication with N-14, and then the mixture was centrifuged so that the heavier DNA would form a band lower in the tube, and the intermediate (one N-15 strand and one N-14 strand) and light DNA (all N-14) would appear as a band higher in the tube. The expected results for each model were:

The actual results were as expected for the semiconservative model and thus Watson and Crick's suspicion was borne out.
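The expected band patterns can be worked out with a small simulation. The sketch below is only an illustration: it assumes the three candidate models usually considered for this experiment (semiconservative, conservative, and dispersive), and the names `replicate` and `band_pattern` are made up for this example. Each strand is represented by its fraction of N-15, and a duplex's density is the average of its two strands.

```python
from collections import Counter
from fractions import Fraction

HEAVY, LIGHT = Fraction(1), Fraction(0)  # fraction of N-15 in a single strand


def replicate(duplex, model):
    """Return the two daughter duplexes produced from one (strand, strand) pair."""
    a, b = duplex
    if model == "semiconservative":
        # Each parental strand pairs with one newly made light strand.
        return [(a, LIGHT), (b, LIGHT)]
    if model == "conservative":
        # The parental duplex stays intact; the copy is entirely new.
        return [(a, b), (LIGHT, LIGHT)]
    if model == "dispersive":
        # Old material is spread evenly over all four daughter strands.
        mixed = (a + b) / 4
        return [(mixed, mixed), (mixed, mixed)]
    raise ValueError(f"unknown model: {model}")


def band_pattern(model, generations):
    """Map duplex density (N-15 fraction) to the share of DNA found in that band."""
    population = [(HEAVY, HEAVY)]  # start with fully N-15 labelled DNA
    for _ in range(generations):
        population = [d for duplex in population for d in replicate(duplex, model)]
    counts = Counter((a + b) / 2 for a, b in population)
    return {density: Fraction(n, len(population)) for density, n in counts.items()}


for model in ("semiconservative", "conservative", "dispersive"):
    print(model, band_pattern(model, 1), band_pattern(model, 2))
```

Under these assumptions, one round of replication separates the conservative model (one heavy and one light band) from the other two (a single intermediate band). A second round is needed to tell the semiconservative model from the dispersive one.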

Biochemical Mechanism of DNA Replication

The Enzymes of DNA Replication

  1. Topoisomerase is responsible for initiation of the unwinding of the DNA. The tension holding the helix in its coiled and supercoiled structure can be broken by nicking a single strand of DNA. Try this with string. Twist two strings together, holding both the top and the bottom. If you cut only one of the two strings, the tension of the twisting is released and the strings untwist.

The Replication Fork

Why can DNA polymerase only act from 5' to 3'? The reason is the relative stability of each end of DNA. A triphosphate is required to provide energy for the bond between a newly attached nucleotide and the growing DNA strand. However, this triphosphate is very unstable and can easily break into a monophosphate and an inorganic pyrophosphate, which floats away into the cell. If synthesis ran 3' to 5', the triphosphate would have to sit on the 5' end of the growing strand, where it could easily break; a strand that had been sitting in the cell for a while could then no longer attach new nucleotides to its 5' end. The 3' end, on the other hand, carries only a hydroxyl group, so as long as DNA polymerase keeps bringing in new nucleoside triphosphates, synthesis of a new strand can continue no matter how long the 3' end has remained free.

This presents a problem, since one strand of the double helix runs 5' to 3' and the other runs 3' to 5'. How can DNA polymerase copy both strands if it can only travel in one direction? One strand is called the lagging strand, and DNA polymerase copies it in short spurts, called Okazaki fragments, as shown in the diagram. Synthesis on the other strand can proceed directly, from 5' to 3', as the helix unwinds. This is the leading strand.


By the end of this section, you will be able to do the following:

  • Describe the structure of DNA
  • Explain the Sanger method of DNA sequencing
  • Discuss the similarities and differences between eukaryotic and prokaryotic DNA

The building blocks of DNA are nucleotides. The important components of the nucleotide are a nitrogenous (nitrogen-bearing) base, a 5-carbon sugar (pentose), and a phosphate group (Figure 1). The nucleotide is named depending on the nitrogenous base. The nitrogenous base can be a purine such as adenine (A) and guanine (G), or a pyrimidine such as cytosine (C) and thymine (T).

Art Connection

Figure 1. The purines have a double ring structure with a six-membered ring fused to a five-membered ring. Pyrimidines are smaller in size; they have a single six-membered ring structure.

The images above illustrate the five bases of DNA and RNA. Examine the images and explain why these are called “nitrogenous bases.” How are the purines different from the pyrimidines? How is one purine or pyrimidine different from another, e.g., adenine from guanine? How is a nucleoside different from a nucleotide?


The sugar is deoxyribose in DNA and ribose in RNA. The carbon atoms of the five-carbon sugar are numbered 1′, 2′, 3′, 4′, and 5′ (1′ is read as “one prime”). The phosphate, which makes DNA and RNA acidic, is connected to the 5′ carbon of the sugar by the formation of an ester linkage between phosphoric acid and the 5′-OH group (an ester forms from an acid and an alcohol). In DNA nucleotides, the 3′ carbon of the sugar deoxyribose is attached to a hydroxyl (OH) group. In RNA nucleotides, the 2′ carbon of the sugar ribose also contains a hydroxyl group. The base is attached to the 1′ carbon of the sugar.

The nucleotides combine with each other to produce phosphodiester bonds. The phosphate residue attached to the 5′ carbon of the sugar of one nucleotide forms a second ester linkage with the hydroxyl group of the 3′ carbon of the sugar of the next nucleotide, thereby forming a 5′-3′ phosphodiester bond. In a polynucleotide, one end of the chain has a free 5′ phosphate, and the other end has a free 3′-OH. These are called the 5′ and 3′ ends of the chain.

In the 1950s, Francis Crick and James Watson worked together to determine the structure of DNA at the University of Cambridge, England. Other scientists like Linus Pauling and Maurice Wilkins were also actively exploring this field. Pauling previously had discovered the secondary structure of proteins using X-ray crystallography. In Wilkins’ lab, researcher Rosalind Franklin was using X-ray diffraction methods to understand the structure of DNA. Watson and Crick were able to piece together the puzzle of the DNA molecule on the basis of Franklin’s data because Crick had also studied X-ray diffraction (Figure 2). In 1962, James Watson, Francis Crick, and Maurice Wilkins were awarded the Nobel Prize in Physiology or Medicine. Unfortunately, by then Franklin had died, and Nobel prizes are not awarded posthumously.

Figure 2. The work of pioneering scientists (a) James Watson, Francis Crick, and Maclyn McCarty led to our present day understanding of DNA. Scientist Rosalind Franklin discovered (b) the X-ray diffraction pattern of DNA, which helped to elucidate its double-helix structure. (credit a: modification of work by Marjorie McCarty, Public Library of Science)

Watson and Crick proposed that DNA is made up of two strands that are twisted around each other to form a right-handed helix. Base pairing takes place between a purine and pyrimidine on opposite strands, so that A pairs with T, and G pairs with C (suggested by Chargaff’s Rules). Thus, adenine and thymine are complementary base pairs, and cytosine and guanine are also complementary base pairs. The base pairs are stabilized by hydrogen bonds: adenine and thymine form two hydrogen bonds and cytosine and guanine form three hydrogen bonds. The two strands are anti-parallel in nature; that is, the 3′ end of one strand faces the 5′ end of the other strand. The sugar and phosphate of the nucleotides form the backbone of the structure, whereas the nitrogenous bases are stacked inside, like the rungs of a ladder. Each base pair is separated from the next base pair by a distance of 0.34 nm, and each turn of the helix measures 3.4 nm. Therefore, 10 base pairs are present per turn of the helix. The diameter of the DNA double-helix is 2 nm, and it is uniform throughout. Only the pairing between a purine and pyrimidine and the antiparallel orientation of the two DNA strands can explain the uniform diameter. The twisting of the two strands around each other results in the formation of uniformly spaced major and minor grooves (Figure 3).
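The numbers in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch using only the values quoted above (0.34 nm per base pair, 3.4 nm per turn, two hydrogen bonds per A-T pair and three per G-C pair); the function name `helix_summary` and the example sequence are chosen here for illustration.

```python
RISE_NM = 0.34   # distance between adjacent base pairs, from the text
PITCH_NM = 3.4   # length of one full turn of the helix, from the text

# Hydrogen bonds per base pair, looked up by the base on one strand.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def helix_summary(strand):
    """Summarize the duplex formed by `strand` and its complementary strand."""
    strand = strand.upper()
    n = len(strand)
    return {
        "base_pairs": n,
        "length_nm": round(n * RISE_NM, 2),
        "turns": round(n * RISE_NM / PITCH_NM, 2),
        "hydrogen_bonds": sum(H_BONDS[base] for base in strand),
    }


print(round(PITCH_NM / RISE_NM))  # base pairs per turn
print(helix_summary("AATTGGCC"))
```

For the eight-base example, four A-T pairs and four G-C pairs give 20 hydrogen bonds in total, spread over slightly less than one helical turn.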

Figure 3. DNA has (a) a double helix structure and (b) phosphodiester bonds; the dotted lines between Thymine and Adenine and Guanine and Cytosine represent hydrogen bonds. The (c) major and minor grooves are binding sites for DNA binding proteins during processes such as transcription (the copying of RNA from DNA) and replication.

DNA Sequencing Techniques

Until the 1990s, the sequencing of DNA (reading the sequence of DNA) was a relatively expensive and long process. Using radiolabeled nucleotides also compounded the problem through safety concerns. With currently available technology and automated machines, the process is cheaper, safer, and can be completed in a matter of hours. Fred Sanger developed the sequencing method used for the human genome sequencing project, a method that is still widely used today (Figure 4).


The sequencing method is known as the dideoxy chain termination method. The method is based on the use of chain terminators, the dideoxynucleotides (ddNTPs). The ddNTPs differ from the deoxynucleotides by the lack of a free 3′ OH group on the five-carbon sugar. If a ddNTP is added to a growing DNA strand, the chain cannot be extended any further because the free 3′ OH group needed to add another nucleotide is not available. By using a predetermined ratio of deoxyribonucleotides to dideoxynucleotides, it is possible to generate DNA fragments of different sizes.

Figure 4. In Frederick Sanger’s dideoxy chain termination method, dye-labeled dideoxynucleotides are used to generate DNA fragments that terminate at different points. The DNA is separated by capillary electrophoresis on the basis of size, and from the order of fragments formed, the DNA sequence can be read. The DNA sequence readout is shown on an electropherogram that is generated by a laser scanner.

The DNA sample to be sequenced is denatured (separated into two strands by heating it to high temperatures). The DNA is divided into four tubes in which a primer, DNA polymerase, and all four nucleoside triphosphates (A, T, G, and C) are added. In addition, limited quantities of one of the four dideoxynucleoside triphosphates (ddCTP, ddATP, ddGTP, and ddTTP) are added to each tube respectively. The tubes are labeled as A, T, G, and C according to the ddNTP added. For detection purposes, each of the four dideoxynucleotides carries a different fluorescent label. Chain elongation continues until a fluorescent dideoxy nucleotide is incorporated, after which no further elongation takes place. After the reaction is over, electrophoresis is performed. Even a difference in length of a single base can be detected. The sequence is read from a laser scanner that detects the fluorescent marker of each fragment. For his work on DNA sequencing, Sanger received a Nobel Prize in Chemistry in 1980.
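The logic of reading a sequence from terminated fragments can be illustrated in code. This is a simplified sketch, not a model of the real chemistry: it assumes that termination happens exactly once at every position, and the template sequence and the function names `sanger_fragments` and `read_sequence` are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5):
    """List every ddNTP-terminated product as (length, labelled terminal base)."""
    # The new strand grows 5'->3' and is complementary to the template.
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments):
    """Order fragments by size, shortest first, and read off the end labels."""
    return "".join(base for _, base in sorted(fragments))


template = "TACGGATC"                       # hypothetical template, written 3'->5'
fragments = sanger_fragments(template)
print(read_sequence(reversed(fragments)))   # input order does not matter
```

Sorting by length plays the role of electrophoresis here: the shortest fragment carries the label of the first base added, the next shortest carries the second, and so on.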

Link to Learning

Sanger’s genome sequencing has led to a race to sequence human genomes at rapid speed and low cost, often referred to as the $1000-in-one-day sequence. Learn more by selecting the Sequencing at Speed animation here.

Gel electrophoresis is a technique used to separate DNA fragments of different sizes. Usually the gel is made of a chemical called agarose (a polysaccharide polymer extracted from seaweed that is high in galactose residues). Agarose powder is added to a buffer and heated. After cooling, the gel solution is poured into a casting tray. Once the gel has solidified, the DNA is loaded on the gel and electric current is applied. The DNA has a net negative charge and moves from the negative electrode toward the positive electrode. The electric current is applied for sufficient time to let the DNA separate according to size; the smallest fragments will be farthest from the well (where the DNA was loaded), and the larger fragments will be closest to the well. Once the DNA is separated, the gel is stained with a DNA-specific dye for viewing (Figure 5).

Figure 5. DNA can be separated on the basis of size using gel electrophoresis. (credit: James Jacob, Tompkins Cortland Community College)

Evolution Connection

Neanderthal Genome: How Are We Related?

The first draft sequence of the Neanderthal genome was published by Richard E. Green et al. in 2010. [1] Neanderthals are the closest extinct relatives of present-day humans. They were known to have lived in Europe and Western Asia (and now, perhaps, in Northern Africa) before they disappeared from fossil records approximately 30,000 years ago. Green’s team studied almost 40,000-year-old fossil remains that were selected from sites across the world. Extremely sophisticated means of sample preparation and DNA sequencing were employed because of the fragile nature of the bones and heavy microbial contamination. In their study, the scientists were able to sequence some four billion base pairs. The Neanderthal sequence was compared with that of present-day humans from across the world. After comparing the sequences, the researchers found that the Neanderthal genome had 2 to 3 percent greater similarity to people living outside Africa than to people in Africa. While current theories have suggested that all present-day humans can be traced to a small ancestral population in Africa, the data from the Neanderthal genome suggest some interbreeding between Neanderthals and early modern humans.

Green and his colleagues also discovered DNA segments among people in Europe and Asia that are more similar to Neanderthal sequences than to other contemporary human sequences. Another interesting observation was that Neanderthals are as closely related to people from Papua New Guinea as to those from China or France. This is surprising because Neanderthal fossil remains have been located only in Europe and West Asia. Most likely, genetic exchange took place between Neanderthals and modern humans as modern humans emerged out of Africa, before the divergence of Europeans, East Asians, and Papua New Guineans.

Several genes seem to have undergone changes from Neanderthals during the evolution of present-day humans. These genes are involved in cranial structure, metabolism, skin morphology, and cognitive development. One of the genes that is of particular interest is RUNX2, which is different in modern day humans and Neanderthals. This gene is responsible for the prominent frontal bone, bell-shaped rib cage, and dental differences seen in Neanderthals. It is speculated that an evolutionary change in RUNX2 was important in the origin of modern-day humans, and this affected the cranium and the upper body.

Link to Learning

Watch Svante Pääbo’s talk explaining the Neanderthal genome research at the 2011 annual TED (Technology, Entertainment, Design) conference.

DNA Packaging in Cells

Prokaryotes are much simpler than eukaryotes in many of their features (Figure 6). Most prokaryotes contain a single, circular chromosome that is found in an area of the cytoplasm called the nucleoid region.

Art Connection

Figure 6. A eukaryote contains a well-defined nucleus, whereas in prokaryotes, the chromosome lies in the cytoplasm in an area called the nucleoid.

In eukaryotic cells, DNA and RNA synthesis occur in a separate compartment from protein synthesis. In prokaryotic cells, both processes occur together. What advantages might there be to separating the processes? What advantages might there be to having them occur together?

The size of the genome in one of the most well-studied prokaryotes, E. coli, is 4.6 million base pairs (approximately 1.6 mm, if cut and stretched out). So how does this fit inside a small bacterial cell? The DNA is twisted by what is known as supercoiling. Supercoiling means that DNA is either “under-wound” (less than one turn of the helix per 10 base pairs) or “over-wound” (more than 1 turn per 10 base pairs) relative to its normal relaxed state. Some proteins are known to be involved in the supercoiling; other proteins and enzymes such as DNA gyrase help in maintaining the supercoiled structure.
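The stretched-out length of a genome follows from its base-pair count and the 0.34 nm spacing between base pairs given earlier. The sketch below is a rough estimate under that single assumption (relaxed DNA, 10 base pairs per turn); the function names are made up for this example, and the simple calculation gives about 1.56 mm for 4.6 million base pairs.

```python
RISE_NM_PER_BP = 0.34   # spacing between base pairs, as quoted earlier
BP_PER_TURN = 10        # relaxed helix, as quoted earlier


def contour_length_mm(base_pairs, rise_nm=RISE_NM_PER_BP):
    """Length of the DNA if cut and stretched out, in millimetres."""
    return base_pairs * rise_nm * 1e-6   # 1 nm = 1e-6 mm


def relaxed_turns(base_pairs, bp_per_turn=BP_PER_TURN):
    """Number of helical turns in the relaxed (not supercoiled) state."""
    return base_pairs / bp_per_turn


genome_bp = 4_600_000   # E. coli genome size from the text
print(round(contour_length_mm(genome_bp), 2), "mm")
print(int(relaxed_turns(genome_bp)), "turns when relaxed")
```

Under-wound DNA would have fewer turns than the relaxed count for the same number of base pairs, and over-wound DNA would have more.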

Eukaryotes, whose chromosomes each consist of a linear DNA molecule, employ a different type of packing strategy to fit their DNA inside the nucleus (Figure 7). At the most basic level, DNA is wrapped around proteins known as histones to form structures called nucleosomes. The histones are evolutionarily conserved proteins that are rich in basic amino acids and form an octamer composed of two molecules of each of four different histones. The DNA (remember, it is negatively charged because of the phosphate groups) is wrapped tightly around the histone core. This nucleosome is linked to the next one with the help of a linker DNA. This is also known as the “beads on a string” structure. With the help of a fifth histone, a string of nucleosomes is further compacted into a fiber about 30 nm in diameter. Metaphase chromosomes are even further condensed by association with scaffolding proteins. At the metaphase stage, the chromosomes are at their most compact, approximately 700 nm in width.

In interphase, eukaryotic chromosomes have two distinct regions that can be distinguished by staining. The tightly packaged region is known as heterochromatin, and the less dense region is known as euchromatin. Heterochromatin usually contains genes that are not expressed, and is found in the regions of the centromere and telomeres. The euchromatin usually contains genes that are transcribed, with DNA packaged around nucleosomes but not further compacted.

Figure 7. These figures illustrate the compaction of the eukaryotic chromosome.

Section Summary

The currently accepted model of the double-helix structure of DNA was proposed by Watson and Crick. Some of the salient features are that the two strands that make up the double helix have complementary base sequences and anti-parallel orientations. Alternating deoxyribose sugars and phosphates form the backbone of the structure, and the nitrogenous bases are stacked like rungs inside. The diameter of the double helix, 2 nm, is uniform throughout. A purine always pairs with a pyrimidine: A pairs with T, and G pairs with C. One turn of the helix has 10 base pairs. Prokaryotes are much simpler than eukaryotes in many of their features. Most prokaryotes contain a single, circular chromosome. In general, eukaryotic chromosomes contain a linear DNA molecule packaged into nucleosomes, and have two distinct regions that can be distinguished by staining, reflecting different states of packaging and compaction.

Art Connections

Figure 6. In eukaryotic cells, DNA and RNA synthesis occur in a separate compartment from protein synthesis. In prokaryotic cells, both processes occur together. What advantages might there be to separating the processes? What advantages might there be to having them occur together?

Figure 6 answer. Compartmentalization enables a eukaryotic cell to divide processes into discrete steps so it can build more complex protein and RNA products. But there is an advantage to having a single compartment as well: RNA and protein synthesis occurs much more quickly in a prokaryotic cell.

Review Questions

DNA double helix does not have which of the following?

  1. antiparallel configuration
  2. complementary base pairing
  3. major and minor grooves
  4. uracil

In eukaryotes, what is the DNA wrapped around?

Free Response

Provide a brief summary of the Sanger sequencing method.

The template DNA strand is mixed with a DNA polymerase, a primer, the 4 deoxynucleotides, and a limiting concentration of 4 dideoxynucleotides. DNA polymerase synthesizes a strand complementary to the template. Incorporation of ddNTPs at different locations results in DNA fragments that have terminated at every possible base in the template. These fragments are separated by gel electrophoresis and visualized by a laser detector to determine the sequence of bases.

Describe the structure and complementary base pairing of DNA.

DNA has two strands in anti-parallel orientation. The sugar-phosphate linkages form a backbone on the outside, and the bases are paired on the inside: A with T, and G with C, like rungs on a spiral ladder.

Prokaryotes have a single circular chromosome while eukaryotes have linear chromosomes. Describe one advantage and one disadvantage to the eukaryotic genome packaging compared to the prokaryotes.

Advantage: The linear arrangement of the eukaryotic chromosome allows more DNA to be packed by tightly winding it around histones. More genetic material means that the organism can encode more information into a single cell. This eventually allowed some eukaryotes to develop into multicellular organisms with cell specialization.

3.5 Nucleic Acids

By the end of this section, you will be able to do the following:

  • Describe nucleic acids' structure and define the two types of nucleic acids
  • Explain DNA's structure and role
  • Explain RNA's structure and roles

Nucleic acids are the most important macromolecules for the continuity of life. They carry the cell's genetic blueprint and carry instructions for its functioning.


The two main types of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA is the genetic material in all living organisms, ranging from single-celled bacteria to multicellular mammals. In eukaryotes it is found in the nucleus and in two organelles, chloroplasts and mitochondria. In prokaryotes, the DNA is not enclosed in a membranous envelope.

The cell's entire genetic content is its genome, and the study of genomes is genomics. In eukaryotic cells but not in prokaryotes, DNA forms a complex with histone proteins to form chromatin, the substance of eukaryotic chromosomes. A chromosome may contain tens of thousands of genes. Many genes contain the information to make protein products. Other genes code for RNA products. DNA controls all of the cellular activities by turning the genes “on” or “off.”

The other type of nucleic acid, RNA, is mostly involved in protein synthesis. The DNA molecules never leave the nucleus but instead use an intermediary to communicate with the rest of the cell. This intermediary is the messenger RNA (mRNA). Other types of RNA, like rRNA, tRNA, and microRNA, are involved in protein synthesis and its regulation.

DNA and RNA are composed of monomers that scientists call nucleotides. The nucleotides combine with each other to form a polynucleotide, DNA or RNA. Each nucleotide has three components: a nitrogenous base, a pentose (five-carbon) sugar, and a phosphate group (Figure 3.31). Each nitrogenous base in a nucleotide is attached to a sugar molecule, which is attached to one or more phosphate groups.

The nitrogenous bases, important components of nucleotides, are organic molecules and are so named because they contain carbon and nitrogen. They are bases because they contain an amino group that has the potential of binding an extra hydrogen, thus decreasing the hydrogen ion concentration in its environment and making it more basic. Each nucleotide in DNA contains one of four possible nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T).

Scientists classify adenine and guanine as purines. The purine's primary structure is two carbon-nitrogen rings. Scientists classify cytosine, thymine, and uracil as pyrimidines, which have a single carbon-nitrogen ring as their primary structure (Figure 3.31). Each of these basic carbon-nitrogen rings has different functional groups attached to it. In molecular biology shorthand, we know the nitrogenous bases by their symbols A, T, G, C, and U. DNA contains A, T, G, and C, whereas RNA contains A, U, G, and C.

The pentose sugar in DNA is deoxyribose, and in RNA, the sugar is ribose (Figure 3.31). The difference between the sugars is the presence of a hydroxyl group on the ribose's second carbon and a hydrogen on the deoxyribose's second carbon. The carbon atoms of the sugar molecule are numbered as 1′, 2′, 3′, 4′, and 5′ (1′ is read as “one prime”). The phosphate residue attaches to the hydroxyl group of the 5′ carbon of one sugar and the hydroxyl group of the 3′ carbon of the sugar of the next nucleotide, which forms a 5′–3′ phosphodiester linkage. Unlike the other linkages connecting monomers in macromolecules, the phosphodiester linkage does not form by a simple dehydration reaction. Its formation involves removing two phosphate groups. A polynucleotide may have thousands of such phosphodiester linkages.
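The phosphate bookkeeping behind this paragraph can be made explicit. The sketch below assumes that every nucleotide enters the chain as a nucleoside triphosphate and that the first nucleotide keeps all three of its phosphates at the 5′ end unless something later trims them; the function name `phosphate_budget` and its parameter are hypothetical.

```python
def phosphate_budget(n_nucleotides, five_prime_phosphates=3):
    """Account for the phosphates of a strand built from nucleoside triphosphates.

    `five_prime_phosphates` is the number left on the 5' end: 3 for an
    unprocessed end (an assumption of this sketch), 1 for a 5' monophosphate.
    """
    if n_nucleotides < 1:
        raise ValueError("a strand needs at least one nucleotide")
    linkages = n_nucleotides - 1            # one phosphodiester bond per added unit
    return {
        "phosphates_supplied": 3 * n_nucleotides,
        "phosphodiester_linkages": linkages,
        "phosphates_released": 2 * linkages,   # two phosphate groups per bond formed
        "phosphates_in_strand": linkages + five_prime_phosphates,
    }


print(phosphate_budget(10))
```

With an unprocessed 5′ end, the phosphates released plus those remaining in the strand add up to the phosphates supplied. Setting `five_prime_phosphates=1` describes a strand whose 5′ end has been trimmed to a monophosphate, in which case the two removed phosphates are not counted here.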

DNA Double-Helix Structure

DNA has a double-helix structure (Figure 3.32). The sugar and phosphate lie on the outside of the helix, forming the DNA's backbone. The nitrogenous bases are stacked in the interior in pairs, like the steps of a staircase. Hydrogen bonds bind the pairs to each other. Every base pair in the double helix is separated from the next base pair by 0.34 nm. The helix's two strands run in opposite directions, meaning that the 5′ carbon end of one strand will face the 3′ carbon end of its matching strand. (Scientists call this an antiparallel orientation, and it is important to DNA replication and in many nucleic acid interactions.)

Only certain types of base pairing are allowed. For example, a certain purine can only pair with a certain pyrimidine. This means A can pair with T, and G can pair with C, as Figure 3.33 shows. This is the base complementary rule. In other words, the DNA strands are complementary to each other. If the sequence of one strand is AATTGGCC, the complementary strand would have the sequence TTAACCGG. During DNA replication, each strand serves as a template for a new strand, resulting in a daughter DNA double helix containing one parental DNA strand and a newly synthesized strand.
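The base complementary rule is easy to express as a lookup. The following is a minimal sketch; the function names are chosen for this example. Because the two strands are antiparallel, the complement read in the conventional 5′ to 3′ direction is the reverse of the base-by-base complement.

```python
# A pairs with T and G pairs with C.
PAIRS = str.maketrans("ATGC", "TACG")


def complement(strand):
    """Base-by-base partner strand, written in the same left-to-right order."""
    return strand.upper().translate(PAIRS)


def reverse_complement(strand):
    """Partner strand read in its own 5'->3' direction (antiparallel)."""
    return complement(strand)[::-1]


print(complement("AATTGGCC"))           # TTAACCGG, as in the text
print(reverse_complement("AATTGGCC"))   # GGCCAATT
```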

Visual Connection

A mutation occurs, and adenine replaces cytosine. What impact do you think this will have on the DNA structure?

Ribonucleic acid, or RNA, is mainly involved in the process of protein synthesis under the direction of DNA. RNA is usually single-stranded and is composed of ribonucleotides that are linked by phosphodiester bonds. A ribonucleotide in the RNA chain contains ribose (the pentose sugar), one of the four nitrogenous bases (A, U, G, and C), and the phosphate group.

There are four major types of RNA: messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), and microRNA (miRNA). The first, mRNA, carries the message from DNA, which controls all of the cellular activities in a cell. If a cell needs to synthesize a certain protein, the gene for this product turns “on” and the messenger RNA is synthesized in the nucleus. The RNA base sequence is complementary to the DNA sequence from which it has been copied. However, in RNA, the base T is absent and U is present instead. If the DNA strand has a sequence AATTGCGC, the sequence of the complementary RNA is UUAACGCG. In the cytoplasm, the mRNA interacts with ribosomes and other cellular machinery (Figure 3.34).
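The DNA-to-RNA example in this paragraph can be reproduced with a simple mapping. This is a minimal sketch with a made-up function name; it pairs each DNA base with its RNA partner, using U where DNA would use T.

```python
# DNA template base -> complementary RNA base (U replaces T in RNA).
DNA_TO_RNA = str.maketrans("ATGC", "UACG")


def transcribe(template_dna):
    """Return the RNA sequence complementary to a DNA template strand."""
    return template_dna.upper().translate(DNA_TO_RNA)


print(transcribe("AATTGCGC"))   # UUAACGCG, as in the text
```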

The mRNA is read in sets of three bases known as codons. Each codon codes for a single amino acid. In this way, the mRNA is read and the protein product is made. Ribosomal RNA (rRNA) is a major constituent of ribosomes on which the mRNA binds. The rRNA ensures the proper alignment of the mRNA and the ribosomes. The ribosome's rRNA also has an enzymatic activity (peptidyl transferase) and catalyzes peptide bond formation between two aligned amino acids. Transfer RNA (tRNA) is one of the smallest of the four types of RNA, usually 70–90 nucleotides long. It carries the correct amino acid to the protein synthesis site. It is the base pairing between the tRNA and mRNA that allows for the correct amino acid to insert itself in the polypeptide chain. MicroRNAs are the smallest RNA molecules and their role involves regulating gene expression by interfering with the expression of certain mRNA messages. Table 3.2 summarizes DNA and RNA features.

Feature | DNA | RNA
Function | Carries genetic information | Involved in protein synthesis
Location | Remains in the nucleus | Leaves the nucleus
Structure | Double helix | Usually single-stranded
Pyrimidines | Cytosine, thymine | Cytosine, uracil
Purines | Adenine, guanine | Adenine, guanine

Even though the RNA is single stranded, most RNA types show extensive intramolecular base pairing between complementary sequences, creating a predictable three-dimensional structure essential for their function.

As you have learned, information flow in an organism takes place from DNA to RNA to protein. DNA dictates the structure of mRNA in a process scientists call transcription, and RNA dictates the protein's structure in a process scientists call translation. This is the Central Dogma of Life, which holds true for all organisms; however, exceptions to the rule occur in connection with viral infections.
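The translation step of this flow, in which mRNA is read three bases at a time, can be sketched as follows. The codon table here is deliberately tiny (five entries from the standard genetic code) and the example mRNA is hypothetical; a real table has 64 codons.

```python
# A small excerpt of the standard genetic code, enough for the example below.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "Stop",
}


def split_codons(mrna):
    """Cut an mRNA into consecutive three-base codons, ignoring leftovers."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(mrna):
    """Read codons in order until a stop codon; unknown codons become '?'."""
    peptide = []
    for codon in split_codons(mrna):
        amino_acid = CODON_TABLE.get(codon, "?")
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


print(translate("AUGUUUGGCGCUUAA"))   # ['Met', 'Phe', 'Gly', 'Ala']
```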

Link to Learning

To learn more about DNA, explore the Howard Hughes Medical Institute BioInteractive animations on the topic of DNA.



It is estimated that our bodies consist of over 10 trillion cells. How do all of these cells know what job to do? The cells that make up the human body have nucleic acids that instruct the cell on how to function. The nucleic acid deoxyribonucleic acid (DNA) stores the information in our cells and selectively shares that information when appropriate. DNA is a molecule that can be passed from generation to generation. DNA can be replicated in a carefully regulated process designed to keep the genome safe from degradation and free from errors. Seemingly small changes in the genetic code can result in life-threatening and even life-incompatible alterations to protein structure and function. In this chapter, the unique structure of DNA will be discussed, along with replication and repair processes. The primary focus will be on eukaryotic processes, but there will be some review of prokaryotic genetics to help us better understand the molecular basis of life.

Much of the advancement of medicine in the past two decades has been due to our increased understanding of molecular genetics, which has led to the creation of an entire biotechnology industry centered around genomics and the utilization of nucleic acids for various diagnostic tests and therapeutic interventions. We will also take a look at some of these important principles in this chapter.

6.1 DNA Structure

There are two chemically distinct forms of nucleic acids within eukaryotic cells. Deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are polymers, each with distinct roles, that together create the molecules integral to life in all living organisms. DNA is the focus of this chapter and RNA will be discussed in more detail in Chapter 7 of MCAT Biochemistry Review. The bulk of DNA is found in chromosomes in the nucleus of eukaryotic cells, although some is also present in mitochondria and chloroplasts.


DNA is a macromolecule and it is essential to understand how this molecule is constructed. DNA is a polydeoxyribonucleotide that is composed of many monodeoxyribonucleotides linked together. The nomenclature of nucleic acids can be complicated, so the terms have been defined here:

· Nucleosides are composed of a five-carbon sugar (pentose) bound to a nitrogenous base and are formed by covalently linking the base to C-1′ of the sugar, as shown in Figure 6.1. Note that the carbon atoms in the sugar are labeled with a prime symbol to distinguish them from the carbon atoms in the nitrogenous base.

· Nucleotides are formed when one or more phosphate groups are attached to C-5′ of a nucleoside. Often these molecules are named according to the number of phosphates bound. Adenosine di- and triphosphate (ADP and ATP), for example, gain their name from the number of phosphate groups attached to the nucleoside adenosine. These are high-energy compounds because of the energy associated with the repulsion between closely associated negative charges on the phosphate groups, as shown in Figure 6.2. Nucleotides are the building blocks of DNA.

Figure 6.1. Examples of Nucleosides

Figure 6.2. High-Energy Bonds in Adenosine Triphosphate, a Nucleotide

In Chapter 3 of MCAT General Chemistry Review, we learned that bond breaking is endothermic and bond making is exothermic. ATP hydrolysis is a biologically relevant (and MCAT-tested) case that can look like an exception. Breaking the terminal phosphoanhydride bond still requires energy, but the overall hydrolysis reaction releases energy, which powers our cells, because the products are more stable than ATP, in part because the repulsion between the closely spaced negative charges is relieved.

Nucleic acids are classified according to the pentose they contain, as shown in Figure 6.3. If the pentose is ribose, the nucleic acid is RNA; if the pentose is deoxyribose (ribose with the 2′ –OH group replaced by –H), then it is DNA.

Figure 6.3. Ribose and Deoxyribose. Ribose has an –OH group at C-2′; deoxyribose has –H.

The nomenclature for the common bases, nucleosides, and nucleotides is shown in Table 6.1. Note that there is no thymidine listed (only deoxythymidine) because thymine appears almost exclusively in DNA.


Names of nucleosides and nucleotides attached to deoxyribose are shown in parentheses.

Table 6.1. Nomenclature of Important Bases, Nucleosides, and Nucleotides

The backbone of DNA is composed of alternating sugar and phosphate groups; it determines the directionality of the DNA and is always read from 5′ to 3′. It is formed as nucleotides are joined by 3′–5′ phosphodiester bonds. That is, a phosphate group links the 3′ carbon of one sugar to the 5′ carbon of the next sugar in the chain. Phosphates carry a negative charge; thus, DNA and RNA strands have an overall negative charge.

Figure 6.4. DNA Strand Polarity. DNA strands run antiparallel to one another; enzymes that replicate and transcribe DNA only work in the 5′ to 3′ direction.

Each strand of DNA has distinct 5′ and 3′ ends, creating polarity within the backbone, as shown in Figure 6.4. The 5′ end of DNA, for instance, will have an –OH or phosphate group bound to C-5′ of the sugar, while the 3′ end has a free –OH on C-3′ of the sugar. The base sequence of a nucleic acid strand is both written and read in the 5′ to 3′ direction. Thus, the DNA strand in Figure 6.4 must be written: 5′-ATG-3′ (or simply ATG). DNA sequences can also be written in slightly different ways:

· If written backwards, the ends must be labeled: 3′-GTA-5′

· The position of phosphates may be shown: pApTpG

· “d” may be used as shorthand for deoxyribose: dAdTdG


On the MCAT, always check nucleic acids for polarity. One of the easiest ways to generate incorrect answers is to simply reverse the polarity: 3′-GATTACA-5′ is not the same as 3′-ACATTAG-5′.

DNA is generally double-stranded (dsDNA) and RNA is generally single-stranded (ssRNA). Exceptions to this rule may be seen, especially in viruses, as described in Chapter 1 of MCAT Biology Review.

There are two families of nitrogen-containing bases found in nucleotides: purines and pyrimidines. The bases described below, and shown in Figure 6.5, represent the common bases in eukaryotes; however, it should be noted that exceptions may be seen in tRNA and in some prokaryotes and viruses. Purines contain two rings in their structure. The two purines found in nucleic acids are adenine (A) and guanine (G); both are found in DNA and RNA. Pyrimidines contain only one ring in their structure. The three pyrimidines are cytosine (C), thymine (T), and uracil (U); while cytosine is found in both DNA and RNA, thymine is only found in DNA and uracil is only found in RNA.

Figure 6.5. Bases Commonly Found in Nucleic Acids

To remember the types and structures of these two classes of nitrogenous bases, remember to CUT the PYe (as C, U, and T are pyrimidines). You can also note that pie has one ring of crust, and pyrimidines have only one ring in their structure. You can also remember PURe As Gold (as A and G are purines); think of gold wedding rings. It takes two gold rings at a wedding, just like purines have two rings in their structure.

Purines and pyrimidines are examples of biological aromatic heterocycles. In chemistry, the term aromatic describes any unusually stable ring system that adheres to the following four specific rules:

1. The compound is cyclic

2. The compound is planar

3. The compound is conjugated (has alternating single and multiple bonds, or lone pairs, creating at least one unhybridized p-orbital for each atom in the ring)

4. The compound has 4n + 2 (where n is any non-negative integer) π electrons. This is called Hückel's rule
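As a quick illustration of rule 4, the following Python sketch checks whether a π-electron count fits the 4n + 2 pattern. The function name `satisfies_huckel` is hypothetical, and the check covers only the electron count, not the other criteria for aromaticity.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the count equals 4n + 2 for some non-negative integer n."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Benzene has 6 pi electrons (n = 1), so it passes the electron-count test.
for count in (4, 6, 8, 10):
    print(count, satisfies_huckel(count))
```

A count that passes this test is necessary but not sufficient for aromaticity; the ring must meet the remaining rules as well.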

The most common example of an aromatic compound is benzene, but many different structures obey these rules. In an aromatic compound, the extra stability is due to the delocalized π electrons, which can travel throughout the entire compound using available p-orbitals. All six of the carbon atoms in benzene are sp²-hybridized, and each of the six orbitals overlaps equally with its two neighbors. As a result, the delocalized electrons form two π electron clouds (one above and one below the plane of the ring), as shown in Figure 6.6. This delocalization is characteristic of all aromatic molecules, and because of this, aromatic molecules are fairly unreactive.

Figure 6.6. Delocalization of π Electrons in Benzene

Heterocycles are ring structures that contain at least two different elements in the ring. As shown in Figure 6.5 earlier, both purines and pyrimidines contain nitrogen in their aromatic rings. Nucleic acids are thus imbued with exceptional stability. This helps to explain the utility of nucleotides as the molecule for storing genetic information.

Putting this information together, we can start looking at the Watson–Crick model of DNA structure. In 1953, James Watson and Francis Crick presented one of the landmark findings of modern biology and medicine: the three-dimensional structure of DNA. They were able to deduce the double-helical nature of DNA and propose specific base-pairing that would be the basis of a copying mechanism. In the double helix, two linear polynucleotide chains of DNA are wound together in a spiral orientation along a common axis. The key features of the model, some of which have already been mentioned, are:

· The two strands of DNA are antiparallel; that is, the strands are oriented in opposite directions. When one strand has polarity 5′ to 3′ down the page, the other strand has 5′ to 3′ polarity up the page.

· The sugar–phosphate backbone is on the outside of the helix with the nitrogenous bases on the inside.

· There are specific base-pairing rules, often referred to as complementary base-pairing, as shown in Figure 6.7. An adenine (A) is always base-paired with a thymine (T) via two hydrogen bonds. A guanine (G) always pairs with cytosine (C) via three hydrogen bonds. The three hydrogen bonds make the G–C base pair interaction stronger. These hydrogen bonds, and the hydrophobic interactions between bases, provide stability to the double helix structure. Thus, the base sequence on one strand defines the base sequence on the other strand.

· Because of the specific base-pairing, the amount of A equals the amount of T, and the amount of G equals the amount of C. Thus, total purines will be equal to total pyrimidines overall. These properties are known as Chargaff's rules.

Figure 6.7. Base-Pairing in DNA


When writing a complementary strand of DNA, it is important to not only remember the base-pairing rules but to also keep track of the 5′ and 3′ ends. Remember that the sequences need to be both complementary and antiparallel. For example, 5′-ATCG-3′ will be complementary to 5′-CGAT-3′.
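The complementary-and-antiparallel rule can be sketched in a few lines of Python. This is a minimal illustration, not a full sequence library: the function name `reverse_complement` is an assumption, and the input is assumed to contain only the letters A, T, C, and G.

```python
# Base-pairing lookup table: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ATCG", "TAGC")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    paired = seq.upper().translate(COMPLEMENT)  # complementary bases, 3' to 5'
    return paired[::-1]                         # flip so it reads 5' to 3'


print(reverse_complement("ATCG"))  # CGAT
```

Reversing after complementing is what accounts for the antiparallel orientation; skipping that step would give the partner strand written 3′ to 5′.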


In double-stranded DNA, purines = pyrimidines: %A = %T and %G = %C, so %A + %G = %T + %C.

If a sample of DNA has 10% G, what is the % of T?
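One way to work through this kind of question is sketched below in Python, assuming double-stranded DNA that obeys Chargaff's rules. The function name `base_percentages_from_g` is hypothetical.

```python
def base_percentages_from_g(percent_g: float) -> dict:
    """Given %G in double-stranded DNA, return the percentage of each base."""
    if not 0 <= percent_g <= 50:
        raise ValueError("%G must be between 0 and 50 in double-stranded DNA")
    percent_c = percent_g                      # G pairs with C, so %C = %G
    remainder = 100 - percent_g - percent_c    # what is left is A plus T
    percent_a = percent_t = remainder / 2      # A pairs with T, so %A = %T
    return {"A": percent_a, "T": percent_t, "G": percent_g, "C": percent_c}


print(base_percentages_from_g(10))
# {'A': 40.0, 'T': 40.0, 'G': 10, 'C': 10}
```

With 10% G there must be 10% C, which leaves 80% to be split evenly between A and T, so the sample is 40% T.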

The double helix of most DNA is a right-handed helix, forming what is called B-DNA, as shown in Figure 6.8. The helix in B-DNA makes a turn every 3.4 nm and contains about 10 base pairs within that span. Major and minor grooves can be identified between the interlocking strands and are often the site of protein binding. Another form of DNA is called Z-DNA for its zigzag appearance; it is a left-handed helix that has a turn every 4.6 nm and contains 12 base pairs within each turn. A high GC-content or a high salt concentration may contribute to the formation of this form of DNA. No biological activity has been attributed to Z-DNA, partly because it is unstable and difficult to research.
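The helix dimensions quoted above imply a rise per base pair, which the following Python sketch computes. It treats the figures in the text (3.4 nm and about 10 base pairs per turn for B-DNA, 4.6 nm and 12 per turn for Z-DNA) as exact, which is an assumption; the dictionary and function names are hypothetical.

```python
# Helix parameters taken from the text; treated as exact for illustration.
B_DNA = {"turn_nm": 3.4, "bp_per_turn": 10}
Z_DNA = {"turn_nm": 4.6, "bp_per_turn": 12}


def rise_per_base_pair(form: dict) -> float:
    """Distance along the helix axis between adjacent base pairs, in nm."""
    return form["turn_nm"] / form["bp_per_turn"]


def helix_length_nm(n_base_pairs: int, form: dict) -> float:
    """Approximate end-to-end length of a straight helix, in nm."""
    return n_base_pairs * rise_per_base_pair(form)


print(f"B-DNA rise: {rise_per_base_pair(B_DNA):.2f} nm per base pair")
print(f"Z-DNA rise: {rise_per_base_pair(Z_DNA):.2f} nm per base pair")
print(f"1000 bp of B-DNA: {helix_length_nm(1000, B_DNA):.0f} nm")
```

Under these assumptions, B-DNA rises about 0.34 nm per base pair, so a 1000-base-pair stretch is roughly 340 nm long.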

Figure 6.8. The B-DNA Double Helix


During processes such as replication and transcription, it is necessary to gain access to the DNA. The double helical nature of DNA can be denatured by conditions that disrupt hydrogen bonding and base-pairing, resulting in the “melting” of the double helix into two single strands that have separated from each other. Notably, none of the covalent links between the nucleotides in the backbone of the DNA break during this process. Heat, alkaline pH, and chemicals like formaldehyde and urea are commonly used to denature DNA.

Denatured, single-stranded DNA can be reannealed (brought back together) if the denaturing condition is slowly removed. If a solution of heat-denatured DNA is slowly cooled, for example, then the two complementary strands can become paired again, as shown in Figure 6.9.

Figure 6.9. Denaturation and Reannealing of DNA

Such annealing of complementary DNA strands is also an important step in many laboratory processes, such as polymerase chain reactions (PCR) and in the detection of specific DNA sequences. In these techniques, a well-characterized probe DNA (DNA with known sequence) is added to a mixture of target DNA sequences. When probe DNA binds to target DNA sequences, this may provide evidence of the presence of a gene of interest. This binding process is called hybridization and is described in further detail later in this chapter.

MCAT Concept Check 6.1:

Before you move on, assess your understanding of the material with these questions.

1. What is the difference between a nucleoside and a nucleotide?

2. What are the base-pairing rules according to the Watson–Crick model?

3. What are the three major structural differences between DNA and RNA?

4. How does the aromaticity of purines and pyrimidines underscore their genetic function?

5. If a strand of RNA contained 15% cytosine, 15% adenine, 35% guanine, and 35% uracil, would this violate Chargaff's rules? Why or why not?


DNA Double-Helix Structure

Figure 2. DNA is an antiparallel double helix. The phosphate backbone (the curvy lines) is on the outside, and the bases are on the inside. Each base interacts with a base from the opposing strand. (credit: Jerome Walker/Dennis Myts)

DNA has a double-helix structure (Figure 2). The sugar and phosphate lie on the outside of the helix, forming the backbone of the DNA. The nitrogenous bases are stacked in the interior, like the steps of a staircase, in pairs; the pairs are bound to each other by hydrogen bonds. Every base pair in the double helix is separated from the next base pair by 0.34 nm.

The two strands of the helix run in opposite directions, meaning that the 5′ carbon end of one strand will face the 3′ carbon end of its matching strand. (This is referred to as antiparallel orientation and is important to DNA replication and in many nucleic acid interactions.)

Only certain types of base pairing are allowed. For example, a certain purine can only pair with a certain pyrimidine. This means A can pair with T, and G can pair with C, as shown in Figure 3. This is known as the base complementary rule. In other words, the DNA strands are complementary to each other. If the sequence of one strand is AATTGGCC, the complementary strand would have the sequence TTAACCGG. During DNA replication, each strand is copied, resulting in a daughter DNA double helix containing one parental DNA strand and a newly synthesized strand.


Figure 3. In a double stranded DNA molecule, the two strands run antiparallel to one another so that one strand runs 5′ to 3′ and the other 3′ to 5′. The phosphate backbone is located on the outside, and the bases are in the middle. Adenine forms hydrogen bonds (or base pairs) with thymine, and guanine base pairs with cytosine.

A mutation occurs, and cytosine is replaced with adenine. What impact do you think this will have on the DNA structure?

Ribonucleic acid, or RNA, is mainly involved in the process of protein synthesis under the direction of DNA. RNA is usually single-stranded and is made of ribonucleotides that are linked by phosphodiester bonds. A ribonucleotide in the RNA chain contains ribose (the pentose sugar), one of the four nitrogenous bases (A, U, G, and C), and the phosphate group.

There are four major types of RNA: messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), and microRNA (miRNA). The first, mRNA, carries the message from DNA, which controls all of the cellular activities in a cell. If a cell requires a certain protein to be synthesized, the gene for this product is turned “on” and the messenger RNA is synthesized in the nucleus. The RNA base sequence is complementary to the coding sequence of the DNA from which it has been copied. However, in RNA, the base T is absent and U is present instead. If the DNA strand has a sequence AATTGCGC, the sequence of the complementary RNA is UUAACGCG. In the cytoplasm, the mRNA interacts with ribosomes and other cellular machinery (Figure 4).

Figure 4. A ribosome has two parts: a large subunit and a small subunit. The mRNA sits in between the two subunits. A tRNA molecule recognizes a codon on the mRNA, binds to it by complementary base pairing, and adds the correct amino acid to the growing peptide chain.

The mRNA is read in sets of three bases known as codons. Each codon codes for a single amino acid. In this way, the mRNA is read and the protein product is made. Ribosomal RNA (rRNA) is a major constituent of ribosomes, on which the mRNA binds. The rRNA ensures the proper alignment of the mRNA and the ribosomes; the rRNA of the ribosome also has an enzymatic activity (peptidyl transferase) and catalyzes the formation of the peptide bonds between two aligned amino acids. Transfer RNA (tRNA) is one of the smallest of the four types of RNA, usually 70–90 nucleotides long. It carries the correct amino acid to the site of protein synthesis. It is the base pairing between the tRNA and mRNA that allows for the correct amino acid to be inserted in the polypeptide chain. MicroRNAs are the smallest RNA molecules, and their role involves the regulation of gene expression by interfering with the expression of certain mRNA messages.
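Reading mRNA in sets of three bases can be illustrated with a short Python sketch. The function name `split_into_codons` and the example sequence are hypothetical, and the sketch assumes reading starts at the first base and that leftover bases at the end are ignored.

```python
def split_into_codons(mrna: str) -> list:
    """Split an mRNA sequence into consecutive three-base codons.

    Trailing bases that do not fill a complete codon are dropped.
    """
    mrna = mrna.upper()
    usable = len(mrna) - len(mrna) % 3   # longest prefix divisible by three
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


print(split_into_codons("AUGGCCUAAGG"))
# ['AUG', 'GCC', 'UAA']
```

Starting one base later would produce an entirely different set of codons, which is why the position where reading begins matters.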

DNA replication and the cell cycle

DNA is copied during interphase of the cell cycle. Cells spend 90% of the time in this stage which includes three events, gap 1 (G1), synthesis (S), and gap 2 (G2). The DNA replication process occurs during the synthesis stage of interphase.

The DNA molecule is checked at various checkpoints during interphase to make sure that there are no errors. If errors are found then either the problem is fixed, often by enzymes, or the cell may be ordered to self-destruct (undergo apoptosis).

The replication process is known as semiconservative because it uses both the original strands to make copies and in the end each new DNA that is produced consists of one old strand and one new strand.

There are three main stages of the DNA replication process: initiation, elongation, and termination. Initiation is the start of the process, elongation is when the new strand is formed and extended, and termination is the end of the replication process.

The process can involve several different proteins and, in fact, research on the bacterium E. coli has found that there are 20 different molecules involved in the replication of the genetic material.

In eukaryotic cells, there are also many molecules that work together in the process but there are three main enzymes that are involved, namely helicase, primase, and polymerase.

Helicase, primase, and polymerase

The DNA replication process involves various enzymes that catalyze reactions. The first enzyme involved is helicase which helps the DNA helix to unwind and the hydrogen bonds to break. These bonds are found between the corresponding bases of the two polynucleotide strands.

The bonds break and the two strands separate so that the next stage of replication can take place. A primer is put in place by the enzyme called primase. This primer is a short sequence of RNA nucleotides, complementary to the template, that gives polymerase a starting point to extend.

The next step of the replication process involves the activity of a third protein called polymerase. This enzyme adds nucleotides one at a time to the new strand, pairing each with the complementary base on the old strand. Both the original strands function as templates dictating the sequence by which the new bases will be added.

It is important to understand that the 3’ end of the new strand is the growing end: polymerase attaches each incoming nucleotide to the free 3’ –OH, so replication proceeds in the 5’ to 3’ direction. In other words, bases are added in this direction.

How the two new strands are created

There are two new strands created based on the original DNA polynucleotides. However, the formation of these two new strands takes place in a slightly different way.

A leading and a lagging strand are formed. The leading strand is made by bases being added opposite the template strand in a continual manner from 5’ to 3’.

The second strand is made in a different way and there are actually multiple starting points or initiation sites resulting in bases being added in a discontinuous way to produce what is called a lagging strand.

The lagging strand is synthesized as short sections called Okazaki fragments. The enzyme DNA ligase then joins these fragments together in order to form a complete strand.

Transcription blockage by bulky end termini at single-strand breaks in the DNA template: differential effects of 5' and 3' adducts

RNA polymerases from phage-infected bacteria and mammalian cells have been shown to bypass single-strand breaks (SSBs) with a single-nucleotide gap in the template DNA strand during transcription elongation; however, the SSB bypass efficiency varies significantly depending upon the backbone end chemistries at the break. Using a reconstituted T7 phage transcription system (T7 RNAP) and RNA polymerase II (RNAPII) in HeLa cell nuclear extracts, we observe a slight reduction in the level of transcription arrest at SSBs with no gap as compared to those with a single-nucleotide gap. We have shown that biotin and carbon-chain moieties linked to the 3' side, and in select cases the 5' side, of an SSB in the template strand strongly increase the level of transcription arrest when compared to unmodified SSBs. We also find that a small carbon-chain moiety linked to the upstream side of an SSB aids transcriptional bypass of SSBs for both T7 RNAP and RNAP II. Analysis of transcription across SSBs flanked by bulky 3' adducts reveals the ability of 3' end chemistries to arrest T7 RNAP in a size-dependent manner. T7 RNAP is also completely arrested when 3' adducts or 3'-phosphate groups are placed opposite 5'-phosphate groups at an SSB. We have also observed that a biotinylated thymine in the template strand (without a break) does not pose a strong block to transcription. Taken together, these results emphasize the importance of the size of 3', but usually not 5', end chemistries in arresting transcription at SSBs, substantiating the notion that bulky 3' lesions (e.g., topoisomerase cleavable complexes, 3'-phosphoglycolates, and 3'-unsaturated aldehydes) pose very strong blocks to transcribing RNA polymerases. These findings have implications for the processing of DNA damage through SSB intermediates and the mechanism of SSB bypass by T7 RNAP and mammalian RNAPII.


Protocol for making transcription substrates, which consisted of two parts: a promoter-containing fragment,…

T7 RNAP transcription arrest at SSBs with naturally occurring end-chemistries. Runoff products (324nt)…

3′, but not 5′, moieties increase T7 transcription arrest at SSBs. A) Radiolabeled…

3′-C3, but not 5′-C3, moieties increase RNAP II transcription arrest at SSBs. A)…

Transcription arrest increases with increasing size of bulky 3′ groups at SSBs with…


A DNA transcription unit encoding for a protein may contain both a coding sequence, which will be translated into the protein, and regulatory sequences, which direct and regulate the synthesis of that protein. The regulatory sequence before ("upstream" from) the coding sequence is called the five prime untranslated region (5'UTR); the sequence after ("downstream" from) the coding sequence is called the three prime untranslated region (3'UTR). [3]

As opposed to DNA replication, transcription results in an RNA complement that includes the nucleotide uracil (U) in all instances where thymine (T) would have occurred in a DNA complement.

Only one of the two DNA strands serves as a template for transcription. The antisense strand of DNA is read by RNA polymerase from the 3' end to the 5' end during transcription (3' → 5'). The complementary RNA is created in the opposite direction, in the 5' → 3' direction, matching the sequence of the sense strand with the exception of switching uracil for thymine. This directionality is because RNA polymerase can only add nucleotides to the 3' end of the growing mRNA chain. This use of only the 3' → 5' DNA strand eliminates the need for the Okazaki fragments that are seen in DNA replication. [3] This also removes the need for an RNA primer to initiate RNA synthesis, as is the case in DNA replication.

The non-template (sense) strand of DNA is called the coding strand, because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine). This is the strand that is used by convention when presenting a DNA sequence. [5]
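The relationship between the template strand, the coding strand, and the transcript can be checked with a small Python sketch. The function names and the six-base example are hypothetical, and the sequences are assumed to contain only A, T, C, and G.

```python
# Pairing rules for transcription: template A -> U, T -> A, C -> G, G -> C.
TEMPLATE_TO_RNA = str.maketrans("ATCG", "UAGC")


def transcribe_template(template_3_to_5: str) -> str:
    """Take the template strand written 3' to 5'; return RNA written 5' to 3'."""
    return template_3_to_5.upper().translate(TEMPLATE_TO_RNA)


def coding_strand_to_rna(coding_5_to_3: str) -> str:
    """Take the coding strand written 5' to 3'; swap T for U to get the RNA."""
    return coding_5_to_3.upper().replace("T", "U")


coding = "ATGGCA"     # coding (sense) strand, 5' to 3'
template = "TACCGT"   # template (antisense) strand, 3' to 5'

# Both routes give the same transcript.
print(transcribe_template(template))   # AUGGCA
print(coding_strand_to_rna(coding))    # AUGGCA
```

Because both routes agree, the coding strand can stand in for the transcript when a gene sequence is presented.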

Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA. As a result, transcription has a lower copying fidelity than DNA replication. [6]

Transcription is divided into initiation, promoter escape, elongation, and termination. [7]

Setting up for transcription

Enhancers, transcription factors, Mediator complex and DNA loops in mammalian transcription

Setting up for transcription in mammals is regulated by many cis-regulatory elements, including core promoter and promoter-proximal elements that are located near the transcription start sites of genes. Core promoters combined with general transcription factors are sufficient to direct transcription initiation, but generally have low basal activity. [8] Other important cis-regulatory modules are localized in DNA regions that are distant from the transcription start sites. These include enhancers, silencers, insulators and tethering elements. [9] Among this constellation of elements, enhancers and their associated transcription factors have a leading role in the initiation of gene transcription. [10] An enhancer localized in a DNA region distant from the promoter of a gene can have a very large effect on gene transcription, with some genes undergoing up to 100-fold increased transcription due to an activated enhancer. [11]

Enhancers are regions of the genome that are major gene-regulatory elements. Enhancers control cell-type-specific gene transcription programs, most often by looping through long distances to come in physical proximity with the promoters of their target genes. [12] While there are hundreds of thousands of enhancer DNA regions, [13] for a particular type of tissue only specific enhancers are brought into proximity with the promoters that they regulate. In a study of brain cortical neurons, 24,937 loops were found, bringing enhancers to their target promoters. [11] Multiple enhancers, each often at tens or hundreds of thousands of nucleotides distant from their target genes, loop to their target gene promoters and can coordinate with each other to control transcription of their common target gene. [12]

The schematic illustration in this section shows an enhancer looping around to come into close physical proximity with the promoter of a target gene. The loop is stabilized by a dimer of a connector protein (e.g. dimer of CTCF or YY1), with one member of the dimer anchored to its binding motif on the enhancer and the other member anchored to its binding motif on the promoter (represented by the red zigzags in the illustration). [14] Several cell function specific transcription factors (there are about 1,600 transcription factors in a human cell [15] ) generally bind to specific motifs on an enhancer [16] and a small combination of these enhancer-bound transcription factors, when brought close to a promoter by a DNA loop, govern the level of transcription of the target gene. Mediator (a complex usually consisting of about 26 proteins in an interacting structure) communicates regulatory signals from enhancer DNA-bound transcription factors directly to the RNA polymerase II (pol II) enzyme bound to the promoter. [17]

Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two enhancer RNAs (eRNAs) as illustrated in the Figure. [18] An inactive enhancer may be bound by an inactive transcription factor. Phosphorylation of the transcription factor may activate it and that activated transcription factor may then activate the enhancer to which it is bound (see small red star representing phosphorylation of transcription factor bound to enhancer in the illustration). [19] An activated enhancer begins transcription of its RNA before activating transcription of messenger RNA from its target gene. [20]

CpG island methylation and demethylation

Transcription regulation at about 60% of promoters is also controlled by methylation of cytosines within CpG dinucleotides (where 5’ cytosine is followed by 3’ guanine or CpG sites). 5-methylcytosine (5-mC) is a methylated form of the DNA base cytosine (see Figure). 5-mC is an epigenetic marker found predominantly within CpG sites. About 28 million CpG dinucleotides occur in the human genome. [21] In most tissues of mammals, on average, 70% to 80% of CpG cytosines are methylated (forming 5-methylCpG or 5-mCpG). [22] Methylated cytosines within 5’cytosine-guanine 3’ sequences often occur in groups, called CpG islands. About 60% of promoter sequences have a CpG island while only about 6% of enhancer sequences have a CpG island. [23] CpG islands constitute regulatory sequences, since if CpG islands are methylated in the promoter of a gene this can reduce or silence gene transcription. [24]
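To make the idea of a CpG site concrete, the Python sketch below locates every position in a sequence where a cytosine is followed directly by a guanine. The function name `find_cpg_sites` and the example sequence are hypothetical, and the sequence is assumed to be written 5' to 3'. The sketch only counts sites; it does not apply any formal definition of a CpG island.

```python
def find_cpg_sites(seq: str) -> list:
    """Return the 0-based positions of each C that is followed by a G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


sequence = "ACGTCGCGAT"   # written 5' to 3'
sites = find_cpg_sites(sequence)
print(sites)        # [1, 4, 6]
print(len(sites))   # 3
```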

DNA methylation regulates gene transcription through interaction with methyl binding domain (MBD) proteins, such as MeCP2, MBD1 and MBD2. These MBD proteins bind most strongly to highly methylated CpG islands. [25] These MBD proteins have both a methyl-CpG-binding domain as well as a transcription repression domain. [25] They bind to methylated DNA and guide or direct protein complexes with chromatin remodeling and/or histone modifying activity to methylated CpG islands. MBD proteins generally repress local chromatin such as by catalyzing the introduction of repressive histone marks, or creating an overall repressive chromatin environment through nucleosome remodeling and chromatin reorganization. [25]

As noted in the previous section, transcription factors are proteins that bind to specific DNA sequences in order to regulate the expression of a gene. The binding sequence for a transcription factor in DNA is usually about 10 or 11 nucleotides long. As summarized in 2009, Vaquerizas et al. indicated there are approximately 1,400 different transcription factors encoded in the human genome by genes that constitute about 6% of all human protein encoding genes. [26] About 94% of transcription factor binding sites (TFBSs) that are associated with signal-responsive genes occur in enhancers while only about 6% of such TFBSs occur in promoters. [16]

EGR1 protein is a particular transcription factor that is important for regulation of methylation of CpG islands. An EGR1 transcription factor binding site is frequently located in enhancer or promoter sequences. [27] There are about 12,000 binding sites for EGR1 in the mammalian genome and about half of EGR1 binding sites are located in promoters and half in enhancers. [27] The binding of EGR1 to its target DNA binding site is insensitive to cytosine methylation in the DNA. [27]

While only small amounts of EGR1 transcription factor protein are detectable in cells that are un-stimulated, translation of the EGR1 gene into protein at one hour after stimulation is drastically elevated. [28] Expression of EGR1 transcription factor proteins, in various types of cells, can be stimulated by growth factors, neurotransmitters, hormones, stress and injury. [28] In the brain, when neurons are activated, EGR1 proteins are up-regulated and they bind to (recruit) the pre-existing TET1 enzymes which are highly expressed in neurons. TET enzymes can catalyse demethylation of 5-methylcytosine. When EGR1 transcription factors bring TET1 enzymes to EGR1 binding sites in promoters, the TET enzymes can demethylate the methylated CpG islands at those promoters. Upon demethylation, these promoters can then initiate transcription of their target genes. Hundreds of genes in neurons are differentially expressed after neuron activation through EGR1 recruitment of TET1 to methylated regulatory sequences in their promoters. [27]

The methylation of promoters is also altered in response to signals. The three mammalian DNA methyltransferases (DNMT1, DNMT3A, and DNMT3B) catalyze the addition of methyl groups to cytosines in DNA. While DNMT1 is a “maintenance” methyltransferase, DNMT3A and DNMT3B can carry out new methylations. There are also two splice protein isoforms produced from the DNMT3A gene: DNA methyltransferase proteins DNMT3A1 and DNMT3A2. [29]

The splice isoform DNMT3A2 behaves like the product of a classical immediate-early gene and, for instance, it is robustly and transiently produced after neuronal activation. [30] Where the DNA methyltransferase isoform DNMT3A2 binds and adds methyl groups to cytosines appears to be determined by histone post translational modifications. [31] [32] [33]

On the other hand, neural activation causes degradation of DNMT3A1 accompanied by reduced methylation of at least one evaluated targeted promoter. [34]

Initiation

Transcription begins with the binding of RNA polymerase, together with one or more general transcription factors, to a specific DNA sequence referred to as a "promoter" to form an RNA polymerase-promoter "closed complex". In the "closed complex" the promoter DNA is still fully double-stranded. [7]

RNA polymerase, assisted by one or more general transcription factors, then unwinds approximately 14 base pairs of DNA to form an RNA polymerase-promoter "open complex". In the "open complex" the promoter DNA is partly unwound and single-stranded. The exposed, single-stranded DNA is referred to as the "transcription bubble." [7]

RNA polymerase, assisted by one or more general transcription factors, then selects a transcription start site in the transcription bubble, binds to an initiating NTP and an extending NTP (or a short RNA primer and an extending NTP) complementary to the transcription start site sequence, and catalyzes bond formation to yield an initial RNA product. [7]

In bacteria, the RNA polymerase core enzyme consists of five subunits: 2 α subunits, 1 β subunit, 1 β' subunit, and 1 ω subunit. Bacteria have one general transcription factor, known as a sigma factor. The core enzyme binds to the sigma factor to form the RNA polymerase holoenzyme, which then binds to a promoter. [7] (RNA polymerase is called a holoenzyme only when the sigma subunit is attached to the core enzyme.) Unlike in eukaryotes, the initiating nucleotide of nascent bacterial mRNA is not capped with a modified guanine nucleotide. The initiating nucleotide of bacterial transcripts bears a 5′ triphosphate (5′-PPP), which can be used for genome-wide mapping of transcription initiation sites. [35]

In archaea and eukaryotes, RNA polymerase contains subunits homologous to each of the five RNA polymerase subunits in bacteria and also contains additional subunits. In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together. [7] In archaea, there are three general transcription factors: TBP, TFB, and TFE. In eukaryotes, in RNA polymerase II-dependent transcription, there are six general transcription factors: TFIIA, TFIIB (an ortholog of archaeal TFB), TFIID (a multisubunit factor in which the key subunit, TBP, is an ortholog of archaeal TBP), TFIIE (an ortholog of archaeal TFE), TFIIF, and TFIIH. TFIID is the first component to bind to DNA, through the binding of TBP, while TFIIH is the last component to be recruited. In archaea and eukaryotes, the RNA polymerase-promoter closed complex is usually referred to as the "preinitiation complex." [36]

Transcription initiation is regulated by additional proteins, known as activators and repressors, and, in some cases, associated coactivators or corepressors, which modulate formation and function of the transcription initiation complex. [7]

Promoter escape

After the first bond is synthesized, the RNA polymerase must escape the promoter. During this time there is a tendency to release the RNA transcript and produce truncated transcripts. This is called abortive initiation, and is common for both eukaryotes and prokaryotes. [37] Abortive initiation continues to occur until an RNA product of a threshold length of approximately 10 nucleotides is synthesized, at which point promoter escape occurs and a transcription elongation complex is formed.

Mechanistically, promoter escape occurs through DNA scrunching, providing the energy needed to break interactions between RNA polymerase holoenzyme and the promoter. [38]

In bacteria, it was historically thought that the sigma factor is always released once promoter clearance occurs. This view was known as the obligate release model. However, later data showed that, upon and following promoter clearance, the sigma factor is released stochastically, a view known as the stochastic release model. [39]

In eukaryotes, at an RNA polymerase II-dependent promoter, upon promoter clearance, TFIIH phosphorylates serine 5 on the carboxy terminal domain of RNA polymerase II, leading to the recruitment of capping enzyme (CE). [40] [41] The exact mechanism of how CE induces promoter clearance in eukaryotes is not yet known.

Elongation

One strand of the DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand and uses base pairing complementarity with the DNA template to create an RNA copy (which elongates during the traversal). Although RNA polymerase traverses the template strand from 3' → 5', the coding (non-template) strand and newly formed RNA can also be used as reference points, so transcription can be described as occurring 5' → 3'. This produces an RNA molecule from 5' → 3', an exact copy of the coding strand (except that thymines are replaced with uracils, and the nucleotides are composed of a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone). [ citation needed ]
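The relationship between the template strand, the coding strand, and the RNA product can be sketched in a few lines of Python. This is only an illustration of the base-pairing rules described above; the function names and the example sequence are hypothetical, and the template is assumed to be written 3' → 5' so that the output reads 5' → 3'.

```python
# Watson-Crick pairing used when RNA is built on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Pairing between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def coding_strand(template_3to5: str) -> str:
    """Return the coding (non-template) DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5.upper())


template = "TACGGCATT"            # hypothetical template, written 3'->5'
rna = transcribe(template)        # "AUGCCGUAA"
coding = coding_strand(template)  # "ATGCCGTAA"

# The RNA matches the coding strand except that T is replaced by U.
print(rna == coding.replace("T", "U"))  # True
```

The sketch ignores the sugar difference (ribose versus deoxyribose), which a plain string of letters cannot represent.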

mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA), so many mRNA molecules can be rapidly produced from a single copy of a gene. [ citation needed ] The characteristic elongation rates in prokaryotes and eukaryotes are about 10-100 nts/sec. [42] In eukaryotes, however, nucleosomes act as major barriers to transcribing polymerases during transcription elongation. [43] [44] In these organisms, the pausing induced by nucleosomes can be regulated by transcription elongation factors such as TFIIS. [44]
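As a rough worked example of what the quoted elongation rates imply, the snippet below estimates how long one polymerase would need to traverse a gene. The 3,000-nucleotide gene length is a hypothetical value chosen for illustration, and the estimate assumes a constant rate with no pausing, which the nucleosome barriers mentioned above would violate in eukaryotes.

```python
def elongation_time_seconds(length_nt: int, rate_nt_per_s: float) -> float:
    """Time for one polymerase to transcribe length_nt at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return length_nt / rate_nt_per_s


gene_length = 3000  # hypothetical gene length in nucleotides

slow = elongation_time_seconds(gene_length, 10)   # 300.0 s at 10 nt/s
fast = elongation_time_seconds(gene_length, 100)  # 30.0 s at 100 nt/s
print(f"{fast:.0f} to {slow:.0f} seconds per transcript")
```

Because several polymerases can transcribe the same template at once, the time between finished transcripts can be much shorter than the time needed to make any single one.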

Elongation also involves a proofreading mechanism that can replace incorrectly incorporated bases. In eukaryotes, this may correspond with short pauses during transcription that allow appropriate RNA editing factors to bind. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure. [ citation needed ]

Termination

Bacteria use two different strategies for transcription termination – Rho-independent termination and Rho-dependent termination. In Rho-independent transcription termination, RNA transcription stops when the newly synthesized RNA molecule forms a G-C-rich hairpin loop followed by a run of Us. When the hairpin forms, the mechanical stress breaks the weak rU-dA bonds, now filling the DNA–RNA hybrid. This pulls the poly-U transcript out of the active site of the RNA polymerase, terminating transcription. In the "Rho-dependent" type of termination, a protein factor called "Rho" destabilizes the interaction between the template and the mRNA, thus releasing the newly synthesized mRNA from the elongation complex. [45]
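The sequence signature of a Rho-independent terminator (a G-C-rich hairpin followed by a run of Us) can be illustrated with a toy scanner. This is a minimal sketch under strong assumptions, not how real terminator predictors work: it demands a perfectly paired stem of fixed length, ignores G-U wobble pairs and folding energies, and every threshold (`stem_len`, `loop_range`, `min_gc`, `min_u_run`) is an arbitrary illustrative choice.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_terminator_candidates(rna, stem_len=5, loop_range=(3, 8),
                               min_gc=0.6, min_u_run=4):
    """Return (start, end) index pairs for stem-loop-stem-U-run matches."""
    rna = rna.upper()
    n = len(rna)
    hits = []
    for start in range(n - stem_len + 1):
        left = rna[start:start + stem_len]
        # Require a G-C-rich stem.
        if sum(base in "GC" for base in left) / stem_len < min_gc:
            continue
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop_len
            right_end = right_start + stem_len
            u_end = right_end + min_u_run
            if u_end > n:
                break  # longer loops would run past the end as well
            stem_pairs = rna[right_start:right_end] == reverse_complement(left)
            u_run = rna[right_end:u_end] == "U" * min_u_run
            if stem_pairs and u_run:
                hits.append((start, u_end))
    return hits


# Hypothetical transcript: stem GCCGC, loop UUCG, stem GCGGC, then UUUU.
example = "AAGCCGCUUCGGCGGCUUUUAA"
print(find_terminator_candidates(example))  # [(2, 20)]
```

Each returned pair gives the start of the left stem and the index just past the U run, so `example[2:20]` is the candidate terminator.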

Transcription termination in eukaryotes is less well understood than in bacteria, but involves cleavage of the new transcript followed by template-independent addition of adenines at its new 3' end, in a process called polyadenylation. [46]

Free Response

What are the structural differences between RNA and DNA?

DNA has a double-helix structure. The sugar and the phosphate are on the outside of the helix and the nitrogenous bases are in the interior. The monomers of DNA are nucleotides containing deoxyribose, one of the four nitrogenous bases (A, T, G, and C), and a phosphate group. RNA is usually single-stranded and is made of ribonucleotides that are linked by phosphodiester linkages. A ribonucleotide contains ribose (the pentose sugar), one of the four nitrogenous bases (A, U, G, and C), and a phosphate group.

What are the four types of RNA and how do they function?

The four types of RNA are messenger RNA, ribosomal RNA, transfer RNA, and microRNA. Messenger RNA carries the information from the DNA that controls all cellular activities. The mRNA binds to the ribosomes that are constructed of proteins and rRNA, and tRNA transfers the correct amino acid to the site of protein synthesis. microRNA regulates the availability of mRNA for translation.

