Does DNA being circular or linear directly affect the speed of DNA replication?

Let's say we have two DNA molecules of equal length, one belonging to a prokaryote and the other to a eukaryote. It's known that replication of the eukaryotic DNA is faster in this case. One clear reason for this is that linear DNA has multiple origins of replication whereas circular DNA only has one.

Now back to the real question: Does it matter for rate of replication whether the DNA is circular or linear? Does it contribute to eukaryotic DNA replication being faster in our specific case?

One thing I found in an old revision of this Wikipedia article is the following:

One reason that many organisms have evolved to having linear chromosomes is due to the size of their genome. Linear chromosomes make it easier for transcription and replication of large genomes. If an organism had a very large genome arranged in a circular chromosome, it would have the potential problems when unwinding due to torsional strain.

What I make out from this is that the rate of replication isn't directly affected. It's more closely related to avoiding other issues that arise from eukaryotic DNA molecules being typically longer than prokaryotic ones. But in our case, where we declare both DNA molecules to be of equal length, I'm assuming circular vs linear has no bearing on the rate of DNA replication.

So am I correct in my assumptions?

Edit in response to the answer below:

It's known that replication of the eukaryotic DNA is faster in this case (where DNA molecules are of equal length) because eukaryotes have linear chromosomes whereas prokaryotes have circular ones

If it makes sense to form a sentence like this, presenting linearity as the cause, then it's enough to satisfy my definition of directly affected in this case. For example, it's easy to make this claim if you present the number of origins of replication as the cause. I looked at the textbook you mentioned and the quantity of origins of replication is in fact mentioned this way.


I went ahead and did a bit more reading in that textbook. What I've understood is: a eukaryotic linear chromosome has multiple replication origins rather than a single one in order to compensate for its much larger size. So while the shape might be a factor (possibly, not certainly), the primary and "direct" reason is the difference in size, not the shape.

I am not sure I fully understand what you mean by directly affected.

I will list some possibilities below. (For reference you can see any genetics textbook. I use Genetics: A Conceptual Approach by Pierce, but I guess any textbook would do.)

  1. Prokaryotic polymerases (usually processing circular DNA) have a higher nucleotides-per-second speed than eukaryotic polymerases (processing linear DNA). Does this mean that shape directly affects speed? In this case I would say no. It's just a difference between prokaryotic and eukaryotic polymerases. Some prokaryotes have linear chromosomes and their polymerases will still be faster than eukaryotic polymerases.

  2. One prokaryotic chromosome has only one origin of replication, while one eukaryotic chromosome has several of them, so that eukaryotes parallelize even in a single chromosome. Does this mean that shape directly affects speed? In this case, I would say yes. Due to steric effects, the shape of the chromosome has to do with the ability to have several origins of replication, even if the lengths are the same. Edit: After Ved's comment, I realized that this is also not necessarily true. Not all circular chromosomes have one single origin of replication; archaea have circular chromosomes with more than one origin of replication.

  3. Linear chromosomes are easier to unwind. Does this mean that shape directly affects speed? I would say yes for large chromosomes and no for small chromosomes. Edit: After David's comment, I realized that I am not really sure that linear chromosomes are easier to unwind. A Wikipedia entry states this, but I had a look at the paper they referenced and found no real statement about this.

Importantly, if the DNA molecules you compare are one prokaryotic and one eukaryotic and they are very short, I would bet my 5 bucks on the prokaryotic (circular) one being faster, because in the absence of all the parallelization the eukaryotic DNA polymerases are slower than prokaryotic ones.
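
To make the trade-off between fork speed and number of origins concrete, here is a rough back-of-the-envelope sketch. It assumes an idealized model that is not part of the discussion above: all origins fire at the same time, they are optimally spaced, and each origin launches two forks. The function name and the two fork speeds are illustrative placeholders, not measured values.

```python
def replication_time_s(length_bp, n_origins, fork_rate_bp_per_s):
    """Idealized time to copy a molecule when every origin fires at t=0
    and launches two forks (bidirectional replication)."""
    forks = 2 * n_origins
    return length_bp / (forks * fork_rate_bp_per_s)


# Assumed, illustrative fork speeds (placeholders, not from the text above).
PROKARYOTIC_RATE = 1000  # bp per second
EUKARYOTIC_RATE = 50     # bp per second

short_dna = 100_000  # bp, the same length for both molecules

prok_one_origin = replication_time_s(short_dna, 1, PROKARYOTIC_RATE)
euk_one_origin = replication_time_s(short_dna, 1, EUKARYOTIC_RATE)
euk_ten_origins = replication_time_s(short_dna, 10, EUKARYOTIC_RATE)

print(prok_one_origin, euk_one_origin, euk_ten_origins)
```

Note that shape never enters this formula: only length, number of origins, and fork speed do. Under these assumptions, the eukaryotic molecule only wins once it has enough origins to offset its slower polymerase.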

Effects of Circular DNA Length on Transfection Efficiency by Electroporation into HeLa Cells

Affiliations Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, United States of America, Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, United States of America, Department of Pharmacology, Baylor College of Medicine, Houston, TX, United States of America

Affiliation Department of Pathology, Texas Children’s Hospital, Houston, TX, United States of America

1. Sequence and conformation of nucleic acid samples

The basic principles of electrophoresis imply that nucleic acid samples have different rates of mobility when they are of different sizes. However, nucleic acids with the same number of nucleotides but different sequence composition and conformation may have different mobilities during electrophoresis (Figure 1).

  • Sequence: AT-rich DNA may migrate more slowly than GC-rich DNA of the same size, especially in high-resolution electrophoresis. Similarly, DNA molecules with 4–6 adenosine repeats at approximately every 10 bp (called curved DNA) will migrate irregularly, especially in polyacrylamide gels [1,2]. Their anomalous migration is likely due to sequence composition affecting their molecular conformation.
  • Conformation: The migration of DNA molecules of the same sequence but differing conformations, such as circular and linearized plasmids, is affected by the compactness of each conformation as they move through the gel pores. Highly compact supercoiled molecules migrate the fastest, followed by flexible linear and open circular molecules (Figure 1). This differential migration may be exploited to examine the integrity of plasmid DNA after isolation, since intact plasmid DNA is desirable in applications like transfection of mammalian cells for gene overexpression.

Figure 1. Electrophoretic migration of the same DNA in various conformations. (A) Electrophoresis of nicked circular, linear, and supercoiled plasmid DNA. (B) Conformation of relaxed circular, linear, and supercoiled plasmid DNA. Nicked plasmids assume a relaxed, open circular conformation and take up the most volume, migrating most slowly through the gel; linearized plasmids move through the gel at a slightly higher rate; intact, supercoiled plasmids, being the most compact, migrate the fastest.

DNA Repair

Himasha M. Perera, ... Michael A. Trakselis, in The Enzymes, 2019

2.4 Bacteria

The loading of the bacterial replicative helicase (DnaB) is catalyzed by both the replication initiator protein (DnaA) and the helicase loader (DnaC) (Table 1 and Fig. 1A) [20–22] at the origin of replication (oriC) [23]. DnaA is a highly conserved protein among bacteria, which binds discrete regions of oriC through a helix-turn-helix motif within the CTD to form oligomers [24]. Active DnaA oligomers regulate bacterial DNA replication initiation primarily through binding ATP, changing conformation of the oligomer to open a ssDNA DUE bubble. DnaC is a monomeric protein that binds each subunit of the DnaB protein assembly in a 1:1 ratio to conform the ring into an open lock-washer shape before loading it onto the open DUE [25–27]. Two DnaB-DnaC complexes are recruited and loaded sequentially onto opposite strands, such that each helicase encircles one strand and excludes the other [28]. Post-loading, DnaC is displaced by the primase (DnaG), which stabilizes the N-terminal DnaB collar and stimulates helicase activity [29,30]. Once the active DnaB is released by DnaC, it begins translocating CTD first in the 5′-3′ direction on what will become the lagging strand at a rate of approximately 35 base pairs per second [31]. Once a suitable region of ssDNA has been exposed, DnaG synthesizes an RNA primer de novo, recruiting the replicative polymerase complex including the Pol III holoenzyme (HE), and DNA replication begins in earnest.

Fig. 1. Consensus DNA replication initiation and hexameric helicase loading steps in bacteria and eukaryotes. The Initiated state defines the origin through binding of initiation proteins. The Loaded state involves accessory proteins used to load hexameric helicases onto exposed ssDNA (bacteria) or dsDNA (eukaryotes). The Activated state recruits interacting proteins that enhance the enzymatic ability for DNA unwinding. Finally, the Established replisome coordinates DNA unwinding with synthesis.

Genomic methods for measuring DNA replication dynamics

Genomic DNA replicates according to a defined temporal program in which early-replicating loci are associated with open chromatin, higher gene density, and increased gene expression levels, while late-replicating loci tend to be heterochromatic and show higher rates of genomic instability. The ability to measure DNA replication dynamics at genome scale has proven crucial for understanding the mechanisms and cellular consequences of DNA replication timing. Several methods, such as quantification of nucleotide analog incorporation and DNA copy number analyses, can accurately reconstruct the genomic replication timing profiles of various species and cell types. More recent developments have expanded the DNA replication genomic toolkit to assays that directly measure the activity of replication origins, while single-cell replication timing assays are beginning to reveal a new level of replication timing regulation. The combination of these methods, applied on a genomic scale and in multiple biological systems, promises to resolve many open questions and lead to a holistic understanding of how eukaryotic cells replicate their genomes accurately and efficiently.

Past Systems of Classification

Viruses contain only a few elements by which they can be classified: the viral genome, the type of capsid, and the envelope structure for the enveloped viruses. All of these elements have been used in the past for viral classification (Table 1 and Figure 1). Viral genomes may vary in the type of genetic material (DNA or RNA) and its organization (single- or double-stranded, linear or circular, and segmented or non-segmented). In some viruses, additional proteins needed for replication are associated directly with the genome or contained within the viral capsid.

Table 1. Virus Classification by Genome Structure and Core
Core Classifications | Examples
RNA | Rabies virus, retroviruses
DNA | Herpesviruses, smallpox virus
Single-stranded | Rabies virus, retroviruses
Double-stranded | Herpesviruses, smallpox virus
Linear | Rabies virus, retroviruses, herpesviruses, smallpox virus
Circular | Papillomaviruses, many bacteriophages
Non-segmented: genome consists of a single segment of genetic material | Parainfluenza viruses
Segmented: genome is divided into multiple segments | Influenza viruses

Figure 1. Viruses are classified based on their core genetic material and capsid design. (a) Rabies virus has a single-stranded RNA (ssRNA) core and an enveloped helical capsid, whereas (b) variola virus, the causative agent of smallpox, has a double-stranded DNA (dsDNA) core and a complex capsid. (credit “rabies diagram”: modification of work by CDC; credit “rabies micrograph”: modification of work by Dr. Fred Murphy, CDC; credit “small pox micrograph”: modification of work by Dr. Fred Murphy, Sylvia Whitfield, CDC; credit “smallpox photo”: modification of work by CDC; scale-bar data from Matt Russell)

Viruses can also be classified by the design of their capsids (Table 2 and Figure 2). Capsids are classified as naked icosahedral, enveloped icosahedral, enveloped helical, naked helical, and complex. The type of genetic material (DNA or RNA) and its structure (single- or double-stranded, linear or circular, and segmented or non-segmented) are used to classify the virus core structures (Table 2).

Table 2. Virus Classification by Capsid Structure
Capsid Classification | Examples
Naked icosahedral | Hepatitis A virus, polioviruses
Enveloped icosahedral | Epstein-Barr virus, herpes simplex virus, rubella virus, yellow fever virus, HIV-1
Enveloped helical | Influenza viruses, mumps virus, measles virus, rabies virus
Naked helical | Tobacco mosaic virus
Complex with many proteins; some have combinations of icosahedral and helical capsid structures | Herpesviruses, smallpox virus, hepatitis B virus, T4 bacteriophage

Figure 2. Transmission electron micrographs of various viruses show their structures. The capsid of the (a) polio virus is naked icosahedral; (b) the Epstein-Barr virus capsid is enveloped icosahedral; (c) the mumps virus capsid is an enveloped helix; (d) the tobacco mosaic virus capsid is naked helical; and (e) the herpesvirus capsid is complex. (credit a: modification of work by Dr. Fred Murphy, Sylvia Whitfield; credit b: modification of work by Liza Gross; credit c: modification of work by Dr. F. A. Murphy, CDC; credit d: modification of work by USDA ARS; credit e: modification of work by Linda Stannard, Department of Medical Microbiology, University of Cape Town, South Africa, NASA; scale-bar data from Matt Russell)

A primer on DNA sequencing for the practicing urologist

Have you ever wondered what exactly happens to a patient sample when it disappears into a laboratory’s ether? Suddenly, a report filled with results magically shows up in your patient’s file, but what happens during that unknown period? The answer is: a lot. That sample goes through a complex molecular journey.

This article will walk you through the history of genetic sequencing and polymerase chain reaction (PCR), where they are today, touch on microarrays, explain some standard terminology and the questions those terms are asking, plus describe the clinical laboratory’s workflow.

DNA and Sanger sequencing

In today’s molecular testing world, there are 2 very common applications: sequencing and quantitative polymerase chain reaction (qPCR). The first method of sequencing, known as Sanger sequencing, was developed by Frederick Sanger, PhD, one of only a few people to win a Nobel Prize twice in the same category. He is considered a pioneer of sequencing DNA for his work with Walter Gilbert, PhD. 1 Prior to his work, most research was done on RNA, which is single-stranded and was easily manipulated with RNase enzymes that cut at very specific nucleotide sequences.

With their discovery of the structure of DNA, Watson and Crick noted that nucleotides form the building blocks and that adenine binds with thymine and cytosine binds with guanine to form a base pair (bp). These nucleotides are called deoxynucleotides, meaning they are missing a hydroxyl group. In sequencing, in order to interrogate and “read” the genes and DNA of interest, the base pairs are read and identified. Sanger decided to throw a wrench into a small portion of those nucleotides and make them dideoxynucleotides, removing another hydroxyl group. When one of those dideoxynucleotides gets added, extension of that strand immediately stops. Imagine you’re building a Lego tower out of regular-sized 2 by 4 pieces but mixed in about 10% of irregular, flat pieces where there aren’t bumps on top to add any more pieces. That’s how dideoxynucleotides work. They stop your Lego tower from growing.

Sanger sequencing was revolutionary at the time and, for the next 20 years, remained important enough to be used in the Human Genome Project. 2 By overlapping the sequences, the human genome was built about 500 to 600 bp at a time, with Sanger sequencing being a critical aspect of the entire project. The method is still used to this day for very fast and inexpensive sequencing results and difficult-to-sequence portions of the genome.

Next-generation sequencing

As with any technology, companies are looking to improve the speed, accuracy, and cost of an assay. There have been a few iterations of the next step beyond Sanger, but the one that has become dominant is called next-generation sequencing (NGS), formerly known as massively parallel sequencing. Nowadays, there are 2 versions of sequencing: short reads (Illumina) and long reads (Pacific Biosciences and Oxford Nanopore). Short reads go up to 600 bp together in 1 run, whereas long reads can go beyond 10,000 bp at once, with some reads in the millions of bp. To put this in perspective, the BRCA2 protein is approximately 3000 amino acids in length. By definition, an amino acid is coded for by 3 base pairs, and each base pair consists of 2 nucleotides. Thus, there are over 9000 bp (18,000 nucleotides) in the coding sequence of the BRCA2 gene.
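
The arithmetic behind that BRCA2 estimate can be written out as a short sketch. The function name is illustrative, and the 3000-residue figure is the rounded value used in the paragraph above, not the exact protein length.

```python
BASE_PAIRS_PER_AMINO_ACID = 3  # one codon
NUCLEOTIDES_PER_BASE_PAIR = 2  # one nucleotide on each strand


def coding_length(amino_acids):
    """Return (base pairs, nucleotides) needed to encode a protein."""
    base_pairs = amino_acids * BASE_PAIRS_PER_AMINO_ACID
    nucleotides = base_pairs * NUCLEOTIDES_PER_BASE_PAIR
    return base_pairs, nucleotides


bp, nt = coding_length(3000)  # rounded BRCA2 protein length from the text
print(bp, nt)
```

Since a short read tops out at about 600 bp, a coding sequence of this size has to be assembled from many overlapping short reads, whereas a single long read could in principle span it.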

Short-read NGS uses a flow cell to hybridize short pieces of DNA to it, replicate that DNA, and then copy it over and over, sometimes hundreds of times. Each nucleotide is fluorescent and will activate upon reading, allowing that nucleotide to be added to the sequence. Remember Lite-Brites when you were a kid? You’d put little pieces on a black board with holes, and the pieces would subsequently glow. Imagine having 1 Lite-Brite as a template, and trying to copy the same image hundreds of times, and each time you add a piece, it glows a specific color assigned to that light, or in this case, nucleotide. Along the way, you consistently make an error in the exact same spot. Because of the consistency of that light being incorrect, that’s not just a mistake. Instead, that becomes an interesting diagnostic possibility because that patient sample has a mutation.

There are 2 common ways to use short-read NGS: whole-genome sequencing (WGS) and targeted sequencing, also known as amplicon sequencing or panel sequencing. WGS refers to just that: sequence the whole genome at a certain level of coverage, which is how often you read a base pair compared with the reference genome. Most of the time, 30 times coverage, meaning each base pair on average was read 30 times, is sufficient for nondisease applications. Gene panel sequencing looks at a specific subset of known disease genes, at a much greater coverage, up to 1000 times but mostly 500 to 600 times. For example, a provider may want to run a gene panel on a patient with a history of colorectal cancer to determine whether there is a hereditary component. Genes included in this panel would include MLH1, MSH2, MSH6, PMS2, and EPCAM, all of which are associated with Lynch syndrome, as well as APC and MUTYH, which are associated with other syndromic patterns where colorectal cancers are common. 3 Prostate cancer–specific panels will often include BRCA1, BRCA2, ATM, CHEK2, PALB2, HOXB13, and others.
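
As a rough illustration of what "30 times coverage" implies, the sketch below relates read count, read length, and genome size. The genome size (3.1 billion bp) and read length (150 bp) are assumed round numbers chosen for the example, not values from this article, and the function names are hypothetical.

```python
def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average number of times each base pair is read."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, genome_size_bp, read_length_bp):
    """Smallest whole number of reads that reaches the target coverage."""
    total_bases = target_coverage * genome_size_bp
    return -(-total_bases // read_length_bp)  # integer ceiling division


GENOME_SIZE = 3_100_000_000  # bp, assumed approximate human genome size
READ_LENGTH = 150            # bp, assumed short-read length

n = reads_needed(30, GENOME_SIZE, READ_LENGTH)
print(n, mean_coverage(n, READ_LENGTH, GENOME_SIZE))
```

The same formula shows why panels can afford much deeper coverage: shrinking the target from a whole genome to a handful of genes reduces the denominator by several orders of magnitude.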

Long reads act a little differently from short reads. Instead of creating many short copies on a chip, long reads use a very large circular sequence of DNA and continuously run it through a mechanism, such as a protein pore, to consistently read the same DNA over and over. Comparatively, this is like copying a Lego tower repeatedly using the same colors in the same order vs riding a Ferris wheel and having the operator check each cab every time it passes the bottom. Long reads will catch the same errors as short reads, but also provide some structural variant support and help getting through more difficult areas to read. This allows for some deeper understanding of possible disease states and their proximity to other possible issues.

Quantitative PCR

Kary Mullis, PhD, was a chemist at Cetus Corporation. One night while driving around Mendocino County with his girlfriend, also a chemist at Cetus, he recognized that DNA base pairs were constant in their pairing, and had a random thought to match/hybridize short pieces of DNA that were complementary to long pieces of DNA plus DNA polymerase. This matching added nucleotides to a piece of DNA. 4 This allowed a short piece of DNA to be amplified repeatedly using different temperatures, creating billions of copies over many cycles, which would then be studied on an agarose gel (Figure 1). Hence, PCR was invented. Mullis was awarded the Nobel Prize in 1993 for this groundbreaking invention, which led to so many discoveries in science. Around the same time, Higuchi et al discovered that increasing amounts of DNA could be directly studied using a fluorescent marker without the need for agarose gel. 5 And voilà! qPCR was invented.
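
The "billions of copies" figure follows from simple doubling. The sketch below assumes ideal amplification, where every template is copied in every cycle. The `efficiency` parameter is an illustrative addition: real reactions fall short of perfect doubling.

```python
def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Copies present after a number of PCR cycles.

    efficiency=1.0 means every template is duplicated each cycle.
    """
    return starting_copies * (1 + efficiency) ** cycles


def cycles_to_reach(target_copies, starting_copies=1, efficiency=1.0):
    """Smallest number of cycles that yields at least target_copies."""
    cycles = 0
    while copies_after(cycles, starting_copies, efficiency) < target_copies:
        cycles += 1
    return cycles


print(copies_after(30))                # copies from one template after 30 cycles
print(cycles_to_reach(1_000_000_000))  # cycles needed to pass one billion
```

This doubling is also what qPCR measures, cycle by cycle: the more template a sample starts with, the fewer cycles it takes for the fluorescent signal to cross a given level.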

In order to use qPCR in diagnostics, the clinic has to know which specific gene is of interest, as the primers have to flank the target sequence to be amplified. The high specificity of this type of assay is both a blessing and a hindrance. It’s a blessing because the clinic can answer a diagnostic question with high confidence but a hindrance because it may miss other possible disease states that are outside the targeted region. Most of the time, a sample will be split up to run multiple different assays at the same time to cover a wider array of diseases. qPCR is also very commonly used to get to the heart of urinary tract infections (UTIs) and their persistent nature. Many companies are offering qPCR diagnostics for UTIs, prostatitis, and more. 6 Most of the time, those companies will also offer an NGS panel in addition to cover all diagnostic bases.


Microarrays are small chips with imprinted specific DNA targets of interest (Figure 2). The test is run with a reference sample, often labeled with a green fluorescent dye, and the targeted DNA sample, labeled with a red fluorescent dye. Both are then hybridized to the chip, and a comparative analysis is done. If the targeted DNA is expressed at a higher rate, that small area will glow red. If the control DNA is expressed at a higher rate (that is, expression is decreased in the target DNA), it will glow green. Finally, if both are expressed in equal amounts, essentially no mutation, the square will glow yellow. 8 The sensitivity is generally low, but the ability to study many targets at once is a highlight. Microarrays are a common technique of many companies that offer testing for determining cultural heritage. Those companies will study up to 700,000 targets at 1 time. However, microarrays are slowly declining because of the greater adoption of NGS assays. These direct-to-consumer companies are really fun and interesting for those who are seeking information about their ancestry but should not be used for cancer risk assessment.
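
The red/green/yellow readout described above amounts to comparing two intensities per spot. Here is a minimal sketch of that comparison. The 2-fold cutoff and the function name are assumptions made for illustration, not something a particular microarray platform prescribes.

```python
def classify_spot(red_intensity, green_intensity, fold_change_cutoff=2.0):
    """Classify one spot by the ratio of target (red) to reference (green).

    Both intensities must be positive. The 2-fold cutoff is an assumed,
    illustrative threshold.
    """
    ratio = red_intensity / green_intensity
    if ratio >= fold_change_cutoff:
        return "red"     # target expressed at a higher rate
    if ratio <= 1 / fold_change_cutoff:
        return "green"   # reference expressed at a higher rate
    return "yellow"      # roughly equal expression


for red, green in [(800, 200), (200, 800), (500, 480)]:
    print(red, green, classify_spot(red, green))
```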

What happens to a sample in the clinical lab?

When a patient has a sample sent for molecular testing, whether it’s tissue, urine, blood, or saliva, that specimen is immediately tagged with a number specific to that patient. The sample is transported to the lab with the appropriate storage. The lab receives the sample and inputs it into their system, also called accessioning. Then, the molecular journey is as follows:

Nucleic acid isolation: convert RNA to DNA if needed

a. Sonication or enzyme digestion to create uniform DNA segments

b. Enrich the target DNA if needed, as in the target enrichment and amplicon generation workflows used in gene panels

c. Barcode ligation: also known as multiplexing, which is adding unique markers per patient sample so they can be mixed together, then parsed upon software analysis

d. Adapter ligation: allows the DNA to bind to the flow cell

Sequence the DNA: 0.5 days to several days, depending on sequencing type and instrument used

Bioinformatics: Results are parsed and analyzed.

Report generated: Details around the disease state are provided, and sometimes potential treatment scenarios depending on the software’s FDA approvals
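
The barcode step in the workflow above (mixing samples together, then parsing them apart in software) can be sketched as follows. This is a simplified illustration: it assumes each read begins with an exact, fixed-length barcode, and the barcodes, sample labels, and function name are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group pooled reads by the sample barcode at the start of each read.

    Assumes all barcodes have the same length and match exactly.
    """
    barcode_length = len(next(iter(barcode_to_sample)))
    reads_by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        reads_by_sample[sample].append(insert)
    return dict(reads_by_sample)


# Hypothetical barcodes and pooled reads, for illustration only.
barcodes = {"ACGT": "patient_1", "TGCA": "patient_2"}
pooled_reads = ["ACGTGGATCC", "TGCATTAGGC", "ACGTCCTAGG", "GGGGAAAAAA"]

print(demultiplex(pooled_reads, barcodes))
```

Reads whose barcode is not recognized are kept under "unassigned" rather than silently dropped, so that a mislabeled or contaminated pool is visible in the output.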

What does a clinical laboratory look for in their tests?

Although you’ve learned about generic methods and workflows, labs look for specific issues using molecular methods that cause various disease states. In this section, I’ll lay out a few of the more common terms in the lexicon of genomic testing.

Single-nucleotide polymorphism (SNP). As the name implies and by definition, this occurs when a single nucleotide change is identified in a particular gene and is present in at least 1% of the population. An SNP changes the gene sequence but may or may not alter downstream protein function, depending on whether the change affects the amino acid that the codon encodes. There is significant interest in looking at a panel of SNPs to determine risk assessment for breast and prostate cancer.
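
Whether a single-nucleotide change alters the encoded amino acid can be checked against the standard genetic code. The sketch below builds the standard codon table and classifies a change within one codon. The function name and the example codons are illustrative, and the sketch ignores everything outside coding sequence.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snp_effect(codon, position, new_base):
    """Classify a single-base change at a 0-based position within a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"  # same amino acid, protein unchanged
    if after == "*":
        return "nonsense"    # new stop codon, protein truncated
    return "missense"        # different amino acid


print(snp_effect("GAA", 2, "G"))  # GAA -> GAG
print(snp_effect("GAG", 1, "T"))  # GAG -> GTG
print(snp_effect("TGG", 2, "A"))  # TGG -> TGA
```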

Copy number variation. This is a duplication or deletion of a sequence of nucleotides, not just a single nucleotide like an SNP. Most genes in the human genome have 2 alleles, 1 each inherited from your mother and father. In rare instances, short sequences can be replicated many times. For example, the HTT gene codes for the protein huntingtin. In this case, the trinucleotide CAG can be repeated 36 times or more. The result is abnormal protein production, which can then lead to Huntington disease. 7
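
Counting a trinucleotide repeat like the CAG example is straightforward once the sequence is in hand. The sketch below finds the longest uninterrupted run of CAG. The flanking bases in the example are made up, and the 36-repeat threshold is the one quoted in the paragraph above.

```python
import re


def longest_cag_run(sequence):
    """Length, in repeats, of the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


EXPANSION_THRESHOLD = 36  # repeats, as quoted in the text above

# Made-up flanking bases around a tract of 40 repeats, for illustration.
example = "ATG" + "CAG" * 40 + "CCGCCA"
repeats = longest_cag_run(example)
print(repeats, repeats >= EXPANSION_THRESHOLD)
```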

In other cases, entire genes can be repeated or deleted, causing overexpression or underexpression, as in the case of α-amylase 1 and its overexpression because of dietary differences. 8 The largest example of this is trisomy, in which an entire extra chromosome is present, as in the trisomy 21 that causes Down syndrome.

Gene fusions. Fusions occur when 2 genes fuse, creating a chimeric gene that causes expression issues. One of the earliest discovered examples of this is a reciprocal translocation where the ABL1 gene of chromosome 9 is translocated and fuses to the BCR gene on chromosome 22, causing a BCR-ABL1 gene (the Philadelphia chromosome), which induces chronic myeloid leukemia. 9 This is difficult to detect using molecular testing because there are various fusion loci on each gene, but it can be done with proper techniques, such as digital PCR and NGS.

This partial list of 3 common issues is just a sample of what a molecular lab can discover. Some tests are more involved than others from a workflow and difficulty perspective, whereas others are fairly straightforward. The most challenging part for a lab is to discern the ability of a specific assay type to get the proper answer because some answers are much more difficult to come by.


The world of clinical diagnostics is changing. Targeted therapies are becoming increasingly specific to the various molecular changes that drive therapeutic resistance. Developing companion diagnostics so that patients can receive these new agents is mandatory. In addition, the ability to detect and potentially mitigate disease at a much earlier stage, before systemic/metastatic disease, has clear upside potential. Therefore, embracing and understanding these new and emerging molecular techniques will improve your patient outcomes and enhance your practice.

Wright has been involved in biotech and clinical sales and marketing for nearly 20 years. He has a Master’s in molecular biology from Washington University in St. Louis and an MBA in strategy and operations from Boston University. He has worked for companies such as Thermo Fisher and Illumina and has started multiple companies outside of the biotech world.

Which Came First?

There is some evidence DNA may have occurred first, but most scientists believe RNA evolved before DNA. RNA has a simpler structure and is needed in order for DNA to function. Also, RNA is found in prokaryotes, which are believed to precede eukaryotes. RNA on its own can act as a catalyst for certain chemical reactions.

The real question is why DNA evolved if RNA existed. The most likely answer for this is that having a double-stranded molecule helps protect the genetic code from damage. If one strand is broken, the other strand can serve as a template for repair. Proteins surrounding DNA also confer additional protection against enzymatic attack.


Furuya, N., and Komano, T. (2003). NikAB- or NikB dependent intracellular recombination between tandemly repeated oriT sequences of plasmid R64 in plasmid or single-stranded phage vectors. J. Bacteriol. 185, 3871�. doi: 10.1128/JB.185.13.3871-3877.2003

Garcillán-Barcia, M. P., Bernales, I., Mendiola, M. V., and de la Cruz, F. (2001). Single-stranded DNA intermediates in IS91 rolling-circle transposition. Mol. Microbiol. 39, 494�. doi: 10.1046/j.1365-2958.2001.02261.x

Garcillán-Barcia, M. P., and de la Cruz, F. (2002). Distribution of IS91 family insertion sequences in bacterial genomes: evolutionary implications. FEMS Microbiol. Ecol. 42, 303�. doi: 10.1111/j.1574-6941.2002.tb01020.x

Garcillán-Barcia, M. P., Francia, M. V., and de la Cruz, F. (2009). The diversity of conjugative relaxases and its application in plasmid classification. FEMS Microbiol. Rev. 33, 657�. doi: 10.1111/j.1574-6976.2009.00168.x

Gennaro, M. L., Kornblum, J., and Novick, R. P. (1987). A site-specific recombination function in Staphylococcus aureus plasmids. J. Bacteriol. 169, 2601�. doi: 10.1128/jb.169.6.2601-2610.1987

Gilbert, W., and Dressler, D. (1968). DNA replication: the rolling circle model. Cold Spring Harb. Symp. Quant. Biol. 33, 473�. doi: 10.1101/SQB.1968.033.01.055

Gonzalez-Perez, B., Lucas, M., Cooke, L. A., Vyle, J. S., de la Cruz, F., and Moncalián, G. (2007). Analysis of DNA processing reactions in bacterial conjugation by using suicide oligonucleotides. EMBO J. 26, 3847�. doi: 10.1038/sj.emboj.7601806

González-Prieto, C., Agúndez, L., Linden, R. M., and Llosa, M. (2013). HUH site-specific recombinases for targeted modification of the human genome. Trends Biotechnol. 31, 305�. doi: 10.1016/j.tibtech.2013.02.002

González-Prieto, C., Gabriel, R., Dehio, C., Schmidt, M., and Llosa, M. (2017). The conjugative relaxase TrwC promotes integration of foreign DNA in the human genome. Appl. Environ. Microbiol. 83, e207�. doi: 10.1128/AEM.00207-17

Grabundzija, I., Messing, S. A., Thomas, J., Cosby, R. L., Bilic, I., Miskey, C., et al. (2016). A Helitron transposon reconstructed from bats reveals a novel mechanism of genome shuffling in eukaryotes. Nat. Commun. 7:10716. doi: 10.1038/ncomms10716

Griffiths, A. J. F., Miller, J. H., Suzuki, D. T., Lewontin, R. C., and Gelbart, W. M. (1999). An Introduction to Genetic Analysis, 7th Edn. San Francisco, CA: W.H. Freeman.

Grohmann, E. (2010). Autonomous plasmid-like replication of Bacillus ICEBs1: a general feature of integrative conjugative elements? Mol. Microbiol. 75, 261�. doi: 10.1111/j.1365-2958.2009.06978.x

Guglielmini, J., Néron, B., Abby, S. S., Garcillán-Barcia, M. P., de la Cruz, F., and Rocha, E. P. (2014). Key components of the eight classes of type IV secretion systems involved in bacterial conjugation or protein secretion. Nucleic Acids Res. 42, 5715�. doi: 10.1093/nar/gku194

Guglielmini, J., Quintais, L., Garcillán-Barcia, M. P., de la Cruz, F., and Rocha, E. P. (2011). The repertoire of ICE in prokaryotes underscores the unity, diversity, and ubiquity of conjugation. PLOS Genet. 7:e1002222. doi: 10.1371/journal.pgen.1002222

He, S., Corneloup, A., Guynet, C., Lavatine, L., Caumont-Sarcos, A., Siguier, P., et al. (2015). The IS200/IS605family and “peel and paste” single-strand transposition mechanism. Microbiol. Spectrum 3:MDNA3-0039-2014. doi: 10.1128/microbiolspec.MDNA3-0039-2014

Henckaerts, E., Dutheil, N., Zeltner, N., Kattman, S., Kohlbrenner, E., Ward, P., et al. (2009). Site-specific integration of adeno-associated virus involves partial duplication of the target locus. Proc. Natl. Acad. Sci. U.S.A. 106, 7571�. doi: 10.1073/pnas.0806821106

Henderson, D., and Meyer, R. (1999). The MobA-linked primase is the only replication protein of R1162 required for conjugal mobilization. J. Bacteriol. 181, 2973�.

Henry, T. J., and Knippers, R. (1974). Isolation and function of the gene A initiator of bacteriophage phiX174, a highly specific DNA endonuclease. Proc. Natl. Acad. Sci. U.S.A. 71, 1549�. doi: 10.1073/pnas.71.4.1549

Ikeda, E. J., Yudelevich, A., and Hurwitz, J. (1976). Isolation and characterization of the protein coded by gene A of bacteriophage phiX174 DNA. Proc. Natl. Acad. Sci. U.S.A. 73, 2669�. doi: 10.1073/pnas.73.8.2669

Ilyina, T. V., and Koonin, E. V. (1992). Conserved sequence motifs in the initiator proteins for rolling circle DNA replication encoded by diverse replicons from eubacteria, eucaryotes and archaebacteria. Nucleic Acids Res. 20, 3279�. doi: 10.1093/nar/20.13.3279

Janovitz, T., Klein, I. A., Oliveira, T., Mukherjee, P., Nussenzweig, M. C., Sadelain, M., et al. (2013). High-throughput sequencing reveals principles of adeno-associated virus serotype 2 integration. J. Virol. 87, 8559�. doi: 10.1128/JVI.01135-13

Khan, S. A. (1997). Rolling-circle replication of bacterial plasmids. Microbiol. Mol. Biol. Rev. 61, 442�.

Khan, S. A. (2003). DNA-protein interactions during the initiation and termination of plasmid pT181 rolling-circle replication. Prog. Nucleic Acid Res. Mol. Biol. 75, 113�. doi: 10.1016/S0079-6603(03)75004-1

Koepsel, R. R., Murray, R. W., Rosenblum, W. D., and Khan, S. A. (1985). The replication initiator protein of plasmid pT181 has sequence-specific endonuclease and topoisomerase-like activities. Proc. Natl. Acad. Sci. U.S.A. 82, 6845�. doi: 10.1073/pnas.82.20.6845

Koonin, E. V., and Ilyina, T. V. (1992). Geminivirus replication proteins are related to prokaryotic plasmid rolling circle DNA replication initiator proteins. J. Gen. Virol. 73, 2763�. doi: 10.1099/0022-1317-73-10-2763

Krupovic, M., and Forterre, P. (2015). Single-stranded DNA viruses employ a variety of mechanisms for integration into host genomes. Ann. N.Y. Acad. Sci. 1341, 41�. doi: 10.1111/nyas.12675

Lang, S., Gruber, K., Mihajlovic, S., Arnold, R., Gruber, C. J., Steinlechner, S., et al. (2010). Molecular recognition determinants for type IV secretion of diverse families of conjugative relaxases. Mol. Microbiol. 78, 1539�. doi: 10.1111/j.1365-2958.2010.07423.x

Lee, C. A., Babic, A., and Grossman, A. D. (2009). Autonomous plasmid-like replication of a conjugative transposon. Mol. Microbiol. 75, 268�. doi: 10.1111/j.1365-2958.2009.06985.x

Lee, C. A., and Grossman, A. D. (2007). Identification of the origin of transfer (oriT) and DNA relaxase required for conjugation of the integrative and conjugative element ICEBs1 of Bacillus subtilis. J. Bacteriol. 189, 7254�. doi: 10.1128/JB.00932-07

Lee, C. A., Thomas, J., and Grossman, A. D. (2012). The Bacillus subtilis conjugative transposon ICEBs1 mobilizes plasmids lacking dedicated mobilization functions. J. Bacteriol. 194, 3165�. doi: 10.1128/JB.00301-12

Llosa, M., Gomis-Rüth, F. X., Coll, M., and de la Cruz, F. (2002). Bacterial conjugation: a two-step mechanism for DNA transport. Mol. Microbiol. 45, 1𠄸. doi: 10.1046/j.1365-2958.2002.03014.x

López-Aguilar, C., Romero-López, C., Espinosa, M., Berzal-Herranz, A., and Del Solar, G. (2015). The 5’-tail of antisense RNAII of pMV158 plays a critical role in binding to the target mRNA and in translation inhibition of repB. Front. Genet. 6:225. doi: 10.3389/fgene.2015.00225

Lorenzo-D໚z, F., Dostál, L., Coll, M., Schildbach, J. F., Menéndez, M., and Espinosa, M. (2011). The MobM relaxase domain of plasmid pMV158: thermal stability and activity upon Mn 2+ and specific DNA binding. Nucleic Acids Res. 39, 4315�. doi: 10.1093/nar/gkr049

Lorenzo-D໚z, F., Fernández-López, C., Garcillán-Barcia, M. P., and Espinosa, M. (2014). Bringing them together: plasmid pMV158 rolling circle replication and conjugation under an evolutionary perspective. Plasmid 74, 15�. doi: 10.1016/j.plasmid.2014.05.004

Lorenzo-D໚z, F., Fernández-López, C., Lurz, R., Bravo, A., and Espinosa, M. (2017). Crosstalk between vertical and horizontal gene transfer: plasmid replication control by a conjugative relaxase. Nucleic Acids Res. 45, 7774�. doi: 10.1093/nar/gkx450

Mansfeld, A. D., van Teeffelen, H. A., Baas, P. D., and Jansz, H. S. (1986). Two juxtaposed tyrosyl-OH groups participate in phiX174 gene A protein catalysed cleavage and ligation of DNA. Nucleic Acids Res. 14, 4229�. doi: 10.1093/nar/14.10.4229

Masai, H., Nomura, N., Kubota, Y., and Arai, K. (1990). Roles of phi X174 type primosome- and G4 type primase-dependent primings in initiation of lagging and leading strand syntheses of DNA replication. J. Biol. Chem. 265, 15124�.

Mendiola, M. V., Bernales, I., and de la Cruz, F. (1994). Differential roles of the transposon termini in IS91 transposition. Proc. Natl. Acad. Sci. U.S.A. 91, 1922�. doi: 10.1073/pnas.91.5.1922

Mendiola, M. V., and de la Cruz, F. (1992). IS91 transposase is related to the rolling-circle-type replication proteins of the pUB110 family of plasmids. Nucleic Acids Res. 20, 3521. doi: 10.1093/nar/20.13.3521

Mendiola, M. V., Jubete, Y., and de la Cruz, F. (1992). DNA sequence of IS91 and identification of the transposase gene. J. Bacteriol. 174, 1345�. doi: 10.1128/jb.174.4.1345-1351.1992

Miyazaki, R., and van der Meer, J. R. (2011). A dual functional origin of transfer in the ICEclc genomic island of Pseudomonas knackmussii B13. Mol. Microbiol. 79, 743�. doi: 10.1111/j.1365-2958.2010.07484.x

Nash, K., Chen, W., Salganik, M., and Muzyczka, N. (2009). Identification of cellular proteins that interact with the adeno-associated virus rep protein. J. Virol. 83, 454�. doi: 10.1128/JVI.01939-08

Novick, R. P. (1998). Contrasting lifestyles of rolling-circle phages and plasmids. Trends Biochem. Sci. 23, 434�. doi: 10.1016/S0968-0004(98)01302-4

Pastrana, C. L., Carrasco, C., Akhtar, P., Leuba, S. H., Khan, S. A., and Moreno-Herrero, F. (2016). Force and twist dependence of RepC nicking activity on torsionally-constrained DNA molecules. Nucleic Acids Res. 44, 8885�. doi: 10.1093/nar/gkw689

Priebe, S. D., and Lacks, S. A. (1989). Region of the streptococcal plasmid pMV158 required for conjugative mobilization. J. Bacteriol. 171, 4778�. doi: 10.1128/jb.171.9.4778-4784.1989

Projan, S. J., and Novick, R. (1988). Comparative analysis of five related staphylococcal plasmids. Plasmid 19, 203�. doi: 10.1016/0147-619X(88)90039-X

Punta, M., Coggill, P. C., Eberhardt, R. Y., Mistry, J., Tate, J., Boursnell, C., et al. (2012). The Pfam protein families database. Nucleic Acids Res. 40, D290�. doi: 10.1093/nar/gkr1065

Rasooly, A., and Novick, R. P. (1993). Replication-specific inactivation of the pT181 plasmid initiator protein. Science 262, 1048�. doi: 10.1126/science.8235621

Rocco, J. M., and Churchward, G. (2006). The integrase of the conjugative transposon Tn916 directs strand- and sequence-specific cleavage of the origin of conjugal transfer, oriT, by the endonuclease Orf20. J. Bacteriol. 188, 2207�. doi: 10.1128/JB.188.6.2207-2213.2006

Ruiz-Masó, J. A., Bordanaba-Ruiseco, L., Sanz, M., Menéndez, M., and Del Solar, G. (2016). Metal-induced stabilization and activation of plasmid replication initiator RepB. Front. Mol. Biosci. 3:56. doi: 10.3389/fmolb.2016.00056

Ruiz-Masó, J. A., Machó, N. C., Bordanaba-Ruiseco, L., Espinosa, M., Coll, M., and Del Solar, G. (2015). Plasmid rolling-circle replication. Microbiol. Spectr. 3:PLAS-0035-2014. doi: 10.1128/microbiolspec.PLAS-0035-2014

Smillie, C., Garcillan-Barcia, M. P., Francia, M. V., Rocha, E. P., and de La Cruz, F. (2010). Mobility of plasmids. Microbiol. Mol. Biol. Rev. 74, 434�. doi: 10.1128/MMBR.00020-10

Smith, R. H. (2008). Adeno-associated virus integration: virus versus vector. Gene Ther. 15, 817�. doi: 10.1038/gt.2008.55

Surosky, R. T., Urabe, M., Godwin, S. G., McQuiston, S. A., Kurtzman, G. J., Ozawa, K., et al. (1997). Adeno-associated virus Rep proteins target DNA sequences to a unique locus in the human genome. J. Virol. 71, 7951�.

Tattersall, P., and Ward, D. C. (1976). Rolling hairpin model for replication of parvovirus and linear chromosomal DNA. Nature 263, 106�. doi: 10.1038/263106a0

Thomas, J., and Pritham, E. J. (2015). Helitrons, the eukaryotic rolling-circle transposable elements. Microbiol. Spectr. 3:MDNA3-0049-2014. doi: 10.1128/microbiolspec.MDNA3-0049-2014

Toleman, M. A., Bennett, P. M., and Walsh, T. R. (2006). ISCR elements: novel gene-capturing systems of the 21st century? Microbiol. Mol. Biol. Rev. 70, 296�. doi: 10.1128/MMBR.00048-05

Ton-Hoang, B., Siguier, P., Quentin, Y., Onillon, S., Marty, B., Fichant, G., et al. (2012). Structuring the bacterial genome: Y1-transposases associated with REP-BIME sequences. Nucleic Acids Res. 40, 3596�. doi: 10.1093/nar/gkr1198

Tourasse, N. J., Stabell, F. B., and Kolstø, A. B. (2014). Survey of chimeric IStron elements in bacterial genomes: multiple molecular symbioses between group I intron ribozymes and DNA transposons. Nucleic Acids Res. 42, 12333�. doi: 10.1093/nar/gku939

Waldor, M. K., and Mekalanos, J. J. (1996). Lysogenic conversion by a filamentous phage encoding cholera toxin. Science 272, 1910�. doi: 10.1126/science.272.5270.1910

Wang, P., Zhang, C., Zhu, Y., Deng, Y., Guo, S., Peng, D., et al. (2013). The resolution and regeneration of a cointegrate plasmid reveals a model for plasmid evolution mediated by conjugation and oriT site-specific recombination. Environ. Microbiol. 15, 3305�. doi: 10.1111/1462-2920.12177

Wright, L. D., and Grossman, A. D. (2016). Autonomous replication of the conjugative transposon Tn916. J. Bacteriol. 198, 3355�. doi: 10.1128/JB.00639-16

Yang, W. (2008). An equivalent metal ion in one- and two-metal ion catalysis. Nat. Struct. Mol. Biol. 15, 1228�. doi: 10.1038/nsmb.1502

Young, S. M. Jr., McCarty, D. M., Degtyareva, N., and Samulski, R. J. (2000). Roles of adeno-associated virus Rep protein and human chromosome 19 in site-specific recombination. J. Virol. 74, 3953�. doi: 10.1128/JVI.74.9.3953-3966.2000

Young, S. M. Jr., and Samulski, R. J. (2001). Adeno-associated virus (AAV) site-specific recombination does not require a Rep-dependent origin of replication within the AAV terminal repeat. Proc. Natl. Acad. Sci. U.S.A. 98, 13525�. doi: 10.1073/pnas.241508998

Zabala, J. C., de la Cruz, F., and Ortiz, J. M. (1982). Several copies of the same insertion sequence are present in alpha-hemolytic plasmids belonging to four different incompatibility groups. J. Bacteriol. 151, 472�.

Zhao, A. C., and Khan, S. A. (1997). Sequence requirements for the termination of rolling-circle replication of plasmid pT181. Mol. Microbiol. 24, 535�. doi: 10.1046/j.1365-2958.1997.3641730.x

Zupan, J. R., and Zambryski, P. (1995). Transfer of T-DNA from Agrobacterium to the plant cell. Plant Physiol. 107, 1041�. doi: 10.1104/pp.107.4.1041

Keywords: rolling-circle replication, transposition, conjugal transfer, multifunctional protein, mobile genetic elements, horizontal gene transfer

Citation: Wawrzyniak P, Płucienniczak G and Bartosik D (2017) The Different Faces of Rolling-Circle Replication and Its Multifunctional Initiator Proteins. Front. Microbiol. 8:2353. doi: 10.3389/fmicb.2017.02353

Received: 19 September 2017 Accepted: 15 November 2017
Published: 30 November 2017.

Gregory Marczynski, McGill University, Canada
Hao Luo, Tianjin University, China
Alan Leonard, Florida Institute of Technology, United States

Copyright © 2017 Wawrzyniak, Płucienniczak and Bartosik. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Positive and Negative Stranded RNA Viral Genomes

The ultimate size of RNA viral genomes is limited by the fragility of RNA and the tendency of long strands to break. In addition, RNA genomes have higher mutation rates than DNA genomes because they are copied less accurately, which may have driven RNA viruses towards smaller genomes. Genomes of RNA viruses therefore encode a limited number of proteins. RNA viral genomes are broadly divided into double-stranded RNA and positive- and negative-strand single-stranded RNA, and each type may be monopartite or multipartite. One of the primary proteins encoded by all of these genomes is the RNA-dependent RNA polymerase, which is essential for their replication. A major difference between (+) and (−) strand ssRNA viruses is that the polymerase can be translated directly from ss(+) genomic RNA, whereas ss(−) viruses must carry the polymerase protein inside the virion. In monopartite ssRNA viruses, the genome often encodes a single polyprotein, which is then processed into a number of smaller proteins critical for completing the viral life cycle. In multipartite ssRNA genomes, each segment typically encodes a single gene.

Positive Stranded RNA Viral Genomes

Among the viruses with RNA genomes, those with single-stranded, plus-sense (+) RNA genomes form an important subgroup that includes many pathogenic plant, animal and human viruses. These genomes contain cis-acting RNA elements that direct different viral processes, such as protein translation, genome replication, and transcription of subgenomic mRNAs. Single-stranded (+) RNA genomes vary in size from those of coronaviruses (about 30 kb) to those of phages such as MS2 and Qβ (about 3.5 kb). Although they belong to distinct families, most (+) sense RNA viruses share common genomic features. Importantly, purified (+) sense viral RNA is capable of infecting host cells in the absence of any viral proteins.

Picornavirus Genome

Picornaviruses are the etiologic agents of numerous diseases of medical and veterinary importance, such as poliomyelitis, the common cold, hepatitis A, and foot-and-mouth disease. These viruses have a single-stranded RNA genome of positive polarity, ranging in size from about 7200 nucleotides in human rhinoviruses to about 8500 nucleotides in foot-and-mouth disease virus, with a number of features conserved in all picornaviruses. There is a long 5′ untranslated region (UTR) of 600 to 1200 nucleotides, which is important for translation, virulence, and possibly encapsidation, as well as a shorter 3′ UTR of 50 to 100 nucleotides, which is necessary for (−) strand synthesis during replication. The 5′ UTR contains a ‘clover-leaf’ secondary structure and the internal ribosome entry site (IRES). The rest of the genome encodes a single polyprotein of between 2100 and 2400 amino acids. Both ends of the genome are modified: the 5′ end carries a covalently attached small, basic protein, VPg, about 23 amino acids long, and the 3′ end carries a poly(A) tail. Genome replication uses the (+) strand as a template for (−) strand synthesis, and the (−) strand in turn serves as a template for the production of an excess of (+) strands. Initiation of both (+) and (−) strand RNA synthesis is thought to be primed by a uridylylated form of VPg, VPg-pUpU.
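
The genome and polyprotein sizes quoted above can be checked against each other with simple arithmetic. The sketch below assumes three nucleotides per amino acid plus one stop codon, and it pairs the smallest genome with the smallest polyprotein and the largest with the largest, which is an illustrative assumption rather than a statement about any particular virus.

```python
def coding_nt(amino_acids):
    """Nucleotides needed to encode a protein: 3 per residue plus a stop codon."""
    return 3 * amino_acids + 3

# (genome length in nt, polyprotein length in aa); the pairing is an assumption.
examples = [(7200, 2100), (8500, 2400)]

# For each pair: (genome length, ORF length, nucleotides left for non-coding regions).
results = []
for genome_nt, polyprotein_aa in examples:
    orf_nt = coding_nt(polyprotein_aa)
    results.append((genome_nt, orf_nt, genome_nt - orf_nt))

for genome_nt, orf_nt, noncoding_nt in results:
    print(genome_nt, orf_nt, noncoding_nt)
# 7200 6303 897
# 8500 7203 1297
```

Under these assumptions, roughly 900 to 1300 nucleotides remain outside the open reading frame, which is in line with a long 5′ UTR plus a short 3′ UTR.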

Togavirus Genome

The family Togaviridae consists of two genera, the alphaviruses and the rubiviruses. The Alphavirus genus has 27 members, many of which can be transmitted by insect vectors. Rubella virus is the only known member of the Rubivirus genus. The togavirus genome consists of a single-stranded, non-segmented, (+) sense RNA of about 11.7 kb. Capsids of these viruses are composed of 240 copies of a single capsid protein. The envelope contains two virus-encoded glycoproteins, E1 and E2. The genome resembles an mRNA, with a cap at the 5′ end and a poly(A) tail at the 3′ end. The 5′ region of the genome encodes the non-structural proteins required for transcription and replication, and the 3′ region encodes the structural proteins (capsid, E1, and E2). The non-structural proteins are translated from genomic RNA as a polyprotein, which is cleaved into the mature proteins. A subgenomic mRNA corresponding to the 3′ end of the genome is synthesized from an internal promoter on the (−) strand. Translation of the subgenomic mRNA gives the three structural proteins, capsid, E1, and E2, in that order (Fig. 1.5). The capsid protein is synthesized on free cytoplasmic ribosomes. It is cleaved off co-translationally, exposing a signal sequence that directs the ribosome to the ER membrane, where translation of the remaining E1 and E2 proteins is completed. E1 and E2 are thus synthesized in association with the rough ER membrane and are then processed through the Golgi apparatus before being transported to the plasma membrane.

[Figure 1.5: Translation of togavirus subgenomic RNA]

Flavivirus Genome

The family Flaviviridae consists of three genera: (1) flaviviruses (e.g., yellow fever virus), (2) pestiviruses (e.g., bovine viral diarrhea virus) and (3) hepatitis C viruses. Many of the flaviviruses are transmitted by insects. These viruses are enveloped and have a (+) strand genome of about 10.5 kb that is capped at the 5′ end and not polyadenylated at the 3′ end. The 5′ end of the genome encodes the structural proteins and the 3′ end encodes the non-structural proteins. The entire viral genome is translated as a single polyprotein, which is cleaved into the mature proteins. No subgenomic RNA is formed.

The genome of Coronavirus

The family Coronaviridae comprises two genera, the coronaviruses and the toroviruses. The inclusion of arteriviruses in this family is recent and not yet widely accepted. Coronaviruses are enveloped, with a large (+) strand RNA genome of roughly 27 to 32 kb, which is capped at the 5′ end and polyadenylated at the 3′ end. The 5′ end of the genome carries a “leader” RNA of 60 or more nucleotides, followed by an untranslated region of at least 200 nucleotides. Approximately the first 60% of the genome from the 5′ end consists of two overlapping open reading frames (ORF1a and ORF1b) that encode the viral RNA-dependent RNA polymerase, proteases, and other non-structural proteins. ORF1b is translated via ribosomal frameshifting. Gene expression and replication proceed in three steps: translation of ORF1a and ORF1b to make the viral polymerase and other non-structural proteins; transcription of (−) strand RNA by the viral polymerase; and synthesis of both full-length genomic RNA and subgenomic mRNAs, using the (−) strand RNA as template.

All subgenomic mRNAs contain the same leader sequence, derived from the 5′ end of the genome, followed by different regions of the genome, so that they form a nested set of transcripts, each polyadenylated at the same site. Each of these transcripts is functionally monocistronic: it contains a single translation unit, which starts at the first AUG after the leader. The mechanism of nested transcript synthesis is not completely clear, but it cannot be conventional RNA splicing, since splicing occurs in the nucleus and coronaviruses replicate in the cytoplasm. Translation of the subgenomic mRNAs yields the structural proteins along with some additional non-structural proteins. The new full-length (+) strands can either be translated or packaged into new virions. These viruses have a characteristic crown-like appearance in the electron microscope. The envelope contains two viral glycoproteins: the M (membrane) protein, which binds the viral nucleocapsid to the viral envelope during budding, and the S (spike) protein, which mediates receptor binding and cell fusion. The entire (+) strand is coated with nucleocapsid protein and directed to intracellular membranes bearing the M protein. The new virions bud through these membranes and are transported to the cell surface in smooth-walled vesicles derived from the Golgi, which then fuse with the plasma membrane, releasing the virus from the cell without lysis.
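
The nested, 3′ co-terminal structure of the subgenomic mRNAs can be illustrated with a toy model. In the sketch below, the genome string, the leader length and the start positions are all hypothetical; each letter simply marks a region (L for leader, R for replicase, then S, M and N for three downstream genes). The sketch only models the end products, not the mechanism that joins leader and body, which the text notes is not fully understood.

```python
from typing import List

def nested_transcripts(genome: str, leader_len: int, body_starts: List[int]) -> List[str]:
    """Build subgenomic mRNAs as leader + everything from a body start to the 3' end."""
    leader = genome[:leader_len]
    return [leader + genome[start:] for start in sorted(body_starts)]

# Hypothetical toy genome: 2-nt leader, replicase region, then genes S, M, N.
genome = "LL" + "RRRRRR" + "SSS" + "MMM" + "NNN"
transcripts = nested_transcripts(genome, leader_len=2, body_starts=[8, 11, 14])

for t in transcripts:
    print(t)
# LLSSSMMMNNN
# LLMMMNNN
# LLNNN
```

Every transcript starts with the same leader and ends at the same 3′ end, and only the first region after the leader (S, M or N here) would be translated from each one.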

Negative Stranded RNA Viral Genomes

The life cycle of negative-strand RNA viruses differs from that of other RNA viruses in several ways. Most importantly, the purified genome of a (−) RNA virus is not infectious: infectious virus particles must deliver their own RNA-dependent RNA polymerase into the infected cell to start the first round of virus-specific mRNA synthesis. These viral genomes are more diverse than those of the (+) stranded viruses, possibly because of the difficulties of expression. These viruses tend to have larger genomes encoding more genetic information, and many of them are segmented. Although genes encoding RNA-dependent RNA polymerases have been found in some eukaryotic cells, most uninfected cells do not contain enough RNA-dependent RNA polymerase activity to support virus replication. Because the (−) stranded RNA genome cannot be converted into mRNA without the viral polymerase packaged in each particle, the naked genome is effectively inert.
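
The reason a naked (−) strand genome is inert can be shown with a minimal sketch: the (−) strand is the reverse complement of the message, so it has to be copied by the polymerase before ribosomes can read it. The nine-base sequence below is a hypothetical toy example, and the function names are illustrative only.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def can_be_translated_directly(polarity):
    """Only (+) sense RNA has the same sense as mRNA."""
    return polarity == "+"

# Hypothetical toy genome: the (-) strand of a tiny message.
minus_genome = "UUAAGCCAU"

# Step performed by the packaged polymerase: copy (-) into (+) sense mRNA.
plus_mrna = reverse_complement(minus_genome)

print(can_be_translated_directly("-"))  # False
print(plus_mrna)                        # AUGGCUUAA
```

Only after the copying step does the start codon AUG appear at the 5′ end of the message, which is why the polymerase must arrive with the virion.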

The non-segmented (−) stranded RNA viruses belong to the order Mononegavirales and have linear, single-stranded, (−) sense RNA as their genetic material. This order includes four families, Rhabdoviridae, Paramyxoviridae, Filoviridae and Bornaviridae, comprising a wide variety of human, animal and plant pathogens such as rabies virus, measles virus, canine distemper virus, rinderpest virus, and human and bovine respiratory syncytial viruses, as well as the lethal Ebola and Marburg viruses and the recently described Nipah and Hendra viruses. These four families share one feature: the genetic information in their (−) sense RNA genomes is expressed via transcription of a series of discrete monocistronic mRNAs. Transcription originates from a single polymerase entry site near the 3′ end of the genome and is obligatorily sequential, but attenuation occurs at each gene junction, resulting in a progressive reduction in the transcription of genes located further from the promoter. This simple method of regulation explains the order of the core genes in the Mononegavirales, which is highly conserved among the four families: genes whose products are required in large amounts are located close to the promoter, whereas those needed in catalytic amounts are located distally. The complete genome sequences of almost all genera in the four families have been determined. They range in size from about 8.9 kb for Borna disease virus to about 18.9 kb for Ebola virus, roughly twice the size, and contain between 5 and 10 genes. The genomes of all four families contain four core genes, encoding the nucleocapsid protein (N), the phosphoprotein (P), the matrix protein (M) and the RNA-dependent RNA polymerase (L). Rabies virus has a single additional gene, G, encoding the transmembrane glycoprotein responsible for attachment and entry.

In a few members of this order, three additional genes encoding transmembrane glycoproteins have also been identified. Additionally, a small hydrophobic gene encoding a protein of unknown function has been found in the genus Rubulavirus. Many members of this order encode non-structural proteins, either individually as separate genes or through multiple ORFs within a single gene. The pneumovirus genomes have separate genes encoding two non-structural proteins involved in evading the host response; these genes are not found in the genomes of avian pneumoviruses. Some rhabdoviruses have a gene between the G and L genes encoding a small non-structural protein. In the Paramyxovirinae, the coding capacity of the P gene is extended to give rise to a surprising number of polypeptides, either by using multiple overlapping open reading frames on a single transcript or by co-transcriptional editing. In the Pneumovirinae, an M2 gene has been reported that encodes an additional transcription factor, the M2-1 protein, with a second, overlapping ORF encoding the M2-2 protein. Some viruses are not strictly (−) sense but ambisense, i.e., they are partly (−) sense and partly (+) sense. Ambisense coding strategies occur in both plant viruses (the Tospovirus genus of the bunyaviruses) and animal viruses (the Phlebovirus genus of the bunyaviruses, and the arenaviruses).
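
The polar transcription gradient described for the Mononegavirales (sequential transcription from a single 3′ entry site, with attenuation at each gene junction) can be put into numbers with a small sketch. The attenuation fraction of 0.3 per junction is an assumed, purely illustrative value, and the gene order N, P, M, G, L is the rhabdovirus order read from the 3′ end.

```python
def relative_mrna_levels(genes, attenuation):
    """Relative transcript level per gene when a fixed fraction of
    polymerases stops at every gene junction."""
    levels = {}
    level = 1.0
    for gene in genes:
        levels[gene] = level
        level *= 1.0 - attenuation  # survivors that cross the next junction
    return levels

gene_order = ["N", "P", "M", "G", "L"]                     # 3' -> 5' on the genome
levels = relative_mrna_levels(gene_order, attenuation=0.3)  # 0.3 is an assumption

for gene in gene_order:
    print(gene, round(levels[gene], 4))
# N 1.0
# P 0.7
# M 0.49
# G 0.343
# L 0.2401
```

With these assumed numbers, the promoter-proximal N gene is transcribed about four times more often than the distal L gene, matching the idea that proteins needed in bulk are encoded near the 3′ end and the catalytic polymerase at the far end.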

Genome of Bunyavirus

Viruses in the family Bunyaviridae possess a genome of three distinct linear, single-stranded, negative or ambisense RNA segments, named small (S), medium (M) and large (L). The S segment RNA is 0.9 kb and encodes the nucleocapsid protein and a non-structural protein, NSs, which interferes with innate immunity. The M segment RNA is 5.7 kb and encodes a polyprotein that eventually gives rise to two glycoproteins, Gn and Gc. The L segment RNA is 8.5 kb and encodes the L protein, the large RNA-dependent RNA polymerase that acts as both transcriptase and replicase. In addition to polymerase activity, bunyaviral L proteins have an endonuclease activity that cleaves cellular messenger RNAs to produce the capped primers used to initiate transcription of viral messenger RNAs, a process known as cap snatching. In members of the Phlebovirus and Tospovirus genera, the S segment is rather larger than in the other genera. All three segments of the genome have the same basic structure, with the coding region flanked by untranslated regions (UTRs) at the 5′ and 3′ ends. In common with all (−) sense RNA genomes, the 5′ ends are not capped and the 3′ ends are not polyadenylated. Even though bunyaviruses are categorized as (−) strand viruses, some members, such as Phlebovirus and Tospovirus, have genome segments with an ambisense coding strategy, in which the 5′ end of the segment is (+) sense but the 3′ end is (−) sense.

Bunyaviral genomes are remarkably flexible. The UTRs of the three segments can be exchanged or drastically shortened, the ORFs within an ambisense segment can be swapped around, the genomes can be lengthened through insertions of epitope tags and additional ORFs, and the tripartite genome can even be converted into two-segmented or four-segmented versions. The promoters for replication and transcription of the genome segments are located in the terminal sequences of the UTRs, which are largely complementary and form panhandle-like structures.
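
The panhandle formed by the complementary segment termini can be illustrated by counting how many bases at the 5′ end pair with the bases at the 3′ end. The sketch below uses a hypothetical toy segment and considers only strict Watson-Crick pairs (no G:U wobble pairs), which is a simplification.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def panhandle_length(segment):
    """Count consecutive base pairs between the 5' and 3' termini,
    walking inwards from both ends until the first mismatch."""
    half = len(segment) // 2
    paired = 0
    for five_base, three_base in zip(segment[:half], reversed(segment)):
        if COMPLEMENT[five_base] != three_base:
            break
        paired += 1
    return paired

# Hypothetical toy segment: complementary 8-nt termini around an unpaired core.
segment = "AGUAGUGU" + "CCCCAAAA" + "ACACUACU"

print(panhandle_length(segment))  # 8
```

Here the eight terminal bases on each side pair with one another while the internal region does not, mimicking a short terminal stem closing a large loop.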

Genome of Rhabdovirus

Rhabdoviruses are members of a class of viruses with a linear, single-stranded (−) RNA genome, which is non-infectious on its own and is complementary to the functional, virus-specific, (+) sense messenger RNAs (mRNAs). These viruses are bullet-shaped and ubiquitous in nature, with a uniquely broad and diverse host range comprising vertebrates, invertebrates, and plants. The two most frequently studied rhabdoviruses are the animal pathogen vesicular stomatitis virus (VSV) and the human pathogen rabies virus. Their genomes are non-segmented and up to 11 kb long. There is a UTR of about 60 nucleotides at the 5′ end and a leader region of approximately 50 nucleotides at the 3′ end of the genome. The genome contains five genes, encoding the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and polymerase (L). Each gene ends with a conserved polyadenylation signal, and short intergenic regions separate the five genes. The rhabdovirus particle has two structural components: a helical ribonucleoprotein (RNP) core and a surrounding envelope. In the ribonucleoprotein, genomic RNA is tightly encased by the nucleoprotein. Two other viral proteins, the phosphoprotein and the large protein (L protein), are associated with the ribonucleoprotein. The glycoprotein forms approximately 400 trimeric spikes, which are tightly arranged on the viral surface. The matrix protein is associated with both the envelope and the ribonucleoprotein and may function as the central protein during rhabdovirus assembly.

Rabies virus, a member of the genus Lyssavirus in the family Rhabdoviridae, is a neurotropic virus that causes fatal encephalitis in warm-blooded animals. This virus has a non-segmented, single-stranded, negative-sense RNA genome of approximately 12 kb, comprising the same five genes, nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and polymerase (L), which encode five proteins (Fig. 1.6). The gene encoding the G protein is known to play a predominant role in the pathogenesis of rabies virus. The genome of this virus is tightly encapsidated by the viral nucleoprotein. The RNA-N complex acts as the template for transcription and replication, which are performed by the virus-specific RNA-dependent RNA polymerase and its cofactor, the phosphoprotein. The domestic dog is the primary reservoir and vector of rabies transmission, along with other animal species such as cat, ferret-badger, fox, pig, cattle and donkey. An outbreak of pig rabies emerged on a rural pig farm in Sihui, in southern China’s Guangdong Province, in March 2011 and resulted in the death of 14 pigs. A virulent wild RABV strain, GD-SH-01, was isolated from the brain tissue of a rabid pig, and its complete genomic nucleotide sequence has recently been determined.