Subcellular Life Forms

John Baez

August 12, 2017

Also available in PDF, Postscript and LaTeX, thanks to Stephen Mulraney. Alas, I have drastically updated the 1998 version of this webpage, but these revisions are not yet included in the PDF, Postscript and LaTeX files.

I like biology, but as a mathematician I am drawn to the elegance of the very simplest forms of life: the subcellular life forms. They are so simple, in fact, that even calling them "alive" can be controversial. They lack many of the usual features of life. They don't have cell walls, most of them don't metabolize, and they are all parasitic, depending on other organisms for their ability to reproduce! Some of them even have no genetic code! Many of them cause diseases, but others are crucial to the well-being of their host, and many are so well integrated with their host that it becomes difficult to decide whether they are part of the host or a separate entity.

Indeed, besides my love of elegance and my morbid fascination with parasites, the main reason subcellular life forms appeal to me is that they challenge our naive notion of organisms as entities with clear, well-defined boundaries. It's clear by now that life doesn't respect this simple picture. Whenever a pattern of any sort, however abstract, is able to replicate itself, it does! Typically these patterns overlap and interact in subtle ways, so one can't easily say where one ends and the other begins.

These are the main kinds of subcellular life forms that I know about so far:

I'll say a little about each kind.

I am just beginning to learn about

and which are not exactly "subcellular", but still interesting - and controversial!

It's hard to classify these life forms, since they don't all have "species" in the usual sense, and they don't fit into the standard "kingdoms" of cellular life - a classification scheme which is itself somewhat outdated. In my first attempts to understand the taxonomy of subcellular life, I was greatly aided by Diener and Prusiner's 1987 article The recognition of subviral pathogens [MM]. But in subsequent decades knowledge of these creatures has grown vastly, thanks to work on biotechnology. Now there's even a special committee devoted to their classification, called the International Committee on Taxonomy of Viruses (though they also tackle some other subcellular life forms). They even have a nice online database explaining their classification scheme.

But beware! People still argue about the correct classification of subcellular life forms. That's part of what's interesting about them: they really stretch our ideas in biology to the breaking point.

One thing to keep in mind: these life forms are small. Remember that DNA is a double helix containing information in the form of AT and CG "base pairs" - that is, paired molecules of adenine and thymine, or cytosine and guanine. Single-stranded RNA is a single helix containing information in the form of A, U, C, and G "bases" - molecules of adenine, uracil, cytosine and guanine. The human genome is made of DNA and contains about 3 billion base pairs. The genome of a bacterium is also made of DNA but has less than 10 million bases. The potato spindle tuber viroid, on the other hand, is nothing but a circular loop of RNA consisting of 359 bases! Small, simple - but effective!
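To make the bookkeeping concrete, here is a minimal Python sketch of the base-pairing rule and of how little information a viroid carries. The function names are mine, invented for illustration. The only numbers used are the 359 bases of the potato spindle tuber viroid and the 10 million upper bound for a bacterium quoted above, and counting 2 bits per base is a rough information measure, not a biological claim.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, C pairs with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the partner strand of a DNA double helix.

    The two strands of the helix run in opposite directions, so the
    partner is built from the reversed input.
    """
    return "".join(DNA_PAIRS[base] for base in reversed(strand))


def information_bits(length: int) -> int:
    """Each base (or base pair) is one of 4 choices, which is 2 bits."""
    return 2 * length


# Sizes quoted in the text (hypothetical constant names).
PSTV_BASES = 359
BACTERIUM_UPPER_BOUND = 10_000_000

if __name__ == "__main__":
    print(complementary_strand("GATTACA"))      # TGTAATC
    print(information_bits(PSTV_BASES))         # 718
    print(BACTERIUM_UPPER_BOUND // PSTV_BASES)  # how many viroid genomes fit in the bound
```

By this crude count the viroid's entire genome holds 718 bits, a bit under 90 bytes.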

The potato spindle tuber viroid is the smallest naturally self-replicating bit of RNA. "Spiegelman's Monster" is even smaller - it consists of just 220 bases! But this entity survives only under artificial conditions. The story of this monster is fascinating. Here is Paul Davies' [Da] account of it:

The Qb virus doesn't need anything as complicated as a cell in order to replicate: a test tube full of suitable chemicals is enough. The experiment, conducted by Sol Spiegelman of the University of Illinois, consisted of introducing the viral RNA into a medium containing the RNA's own replication enzyme, plus a supply of raw materials and some salts, and incubating the mixture. When Spiegelman did this, the system obligingly replicated the strands of naked RNA. Spiegelman then extracted some of the freshly synthesized RNA, put it in a separate nutrient solution, and let it multiply. He then decanted some of that RNA into yet another solution, and so on, in a series of steps.

The effect of allowing unrestricted replication was that the RNA that multiplied fastest won out, and got passed on to the "next generation" in the series. The decanting operation therefore replicated, in a highly accelerated way, the basic competition process of Darwinian evolution, acting directly on the RNA. In this respect it resembled an RNA world.

Spiegelman's results were spectacular. As anticipated, copying errors occurred during replication. Relieved of the responsibility of working for a living and the need to manufacture protein coats, the spoon-fed RNA strands began to slim down, shedding parts of the genome that were no longer required and merely proved to be an encumbrance. The RNA molecules that could replicate the fastest simply out-multiplied the competition. After seventy-four generations, what started out as an RNA strand with 4,500 nucleotide bases ended up as a dwarf genome with only 220 bases. This raw replicator with no frills attached could replicate very fast. It was dubbed Spiegelman's monster.

Incredible though Spiegelman's results were, an even bigger surprise lay in store. In 1974, Manfred Eigen and his colleagues also experimented with a chemical broth containing Qb replication enzyme and salts, and an energized form of the four bases that make up the building blocks of RNA. They tried varying the quantity of viral RNA initially added to the mixture. As the amount of input RNA was progressively reduced, the experimenters found that, with little competition, it enjoyed untrammeled exponential growth. Even a single RNA molecule added to the broth was enough to trigger a population explosion. But then something truly amazing was discovered. Replicating strands of RNA were still produced even when not a single molecule of viral RNA was added! To return to my architectural analogy, it was rather like throwing a pile of bricks into a giant mixer and producing, if not a house, then at least a garage. At first Eigen found the results hard to believe, and checked to see whether accidental contamination had occurred. Soon the experimenters convinced themselves that they were witnessing for the first time the spontaneous synthesis of RNA strands from their basic building blocks. Analysis revealed that under some experimental conditions the created RNA resembled Spiegelman's monster.

In 1997, further experiments by Eigen and Oehlenschlager [EO] showed that Spiegelman's monster eventually evolves (under the same unnatural conditions) to two kinds of RNA, one consisting of 54 bases and one consisting of only 48!

One thing this shows is that viruses tend to evolve towards becoming smaller if it helps them reproduce faster. Experiments by Turner [T] have shown that viruses will gladly "cheat" and drop some of their genes if there are enough other viruses of the same sort around doing the work these genes allow. This leads to some interesting problems in game theory, related to the so-called Prisoner's Dilemma. This tendency of viruses to shrink to the bare minimum may explain how satellites came into existence.
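The logic of serial transfer can be sketched as a toy calculation. This is only an illustration under strong assumptions of my own, not a model of the actual experiments: there are just two RNA species, there is no mutation, and the time needed to copy a strand is taken to be proportional to its length. The function name `serial_transfer` and the parameter `bases_per_unit_time` are hypothetical.

```python
import math


def serial_transfer(lengths, fractions, rounds,
                    incubation=1.0, bases_per_unit_time=1000.0):
    """Track the population fractions of RNA species across serial transfers.

    Assumption: a strand of L bases takes time L / bases_per_unit_time to
    copy, so each species grows exponentially at rate bases_per_unit_time / L.
    Decanting a sample into fresh medium keeps the fractions unchanged,
    so only the fractions need to be tracked.
    """
    fractions = list(fractions)
    history = [fractions[:]]
    for _ in range(rounds):
        # Let every species grow for one incubation period.
        grown = [f * math.exp(incubation * bases_per_unit_time / length)
                 for f, length in zip(fractions, lengths)]
        # Decant: renormalize so the fractions sum to 1 again.
        total = sum(grown)
        fractions = [g / total for g in grown]
        history.append(fractions[:])
    return history


if __name__ == "__main__":
    # A 4,500-base strand competing with a rare 220-base strand.
    history = serial_transfer([4500, 220], [1 - 1e-6, 1e-6], rounds=10)
    for round_number, (long_fraction, short_fraction) in enumerate(history):
        print(round_number, f"{short_fraction:.6f}")
```

Even when the short strand starts as one molecule in a million, in this sketch it takes over within a handful of transfers, simply because it is copied faster.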

By the way: please email me if you find mistakes in this webpage, or if you know any more fascinating facts about subcellular life forms - especially if you know kinds that aren't on this list! It will take me a long time to reply, but I eventually will. I would like to thank Daniele Focosi for doing this, and urge everyone to look at his webpage on the physiology of subcellular life forms. I also thank Axel Boldt.

Viruses


Diener and Prusiner define a virus to be a "small infectious pathogen composed of one or more nucleic acid molecules usually surrounded by a protein coat." They typically reproduce by latching onto the wall of a cell and inserting their genetic material - i.e., the nucleic acids - into the cell. This genetic material then uses the cell's machinery to make more copies of the virus. Typically, these copies overrun the cell until it bursts. However, the actual life cycle of a virus is often more complicated than this thumbnail sketch! Viruses use a large number of sneaky tricks to overcome the defense mechanisms of the cell.

Apart from their intrinsic interest, viruses are important because they cause many diseases among humans, such as:

as well as diseases of domesticated animals and plants. For a detailed tour, try The Big Picture Book of Viruses, available online. For even more information, try the chapter on virology prepared by Margaret and Richard Hunt as part of a wonderful online textbook called Microbiology and Immunology.

There is by now a standard taxonomy of viruses [F], [CT], [Ma2], [Re]. Here, however, I will content myself with a rough classification of viruses into the following 3 sorts: DNA viruses, RNA viruses, and reverse transcribing viruses.


DNA viruses

The genome of a DNA virus is a single molecule of DNA, either linear or circular. Outside the host cell, this DNA is usually surrounded by a protein coat. There are 5 known families of DNA viruses affecting humans. The size and structure of the DNA viruses vary widely, from small ones with only 5,000 base pairs to the large brick-shaped or ovoid pox viruses, which have a lipid coating and whose DNA has between 120,000 and 360,000 base pairs.

One can broadly classify the DNA viruses into two kinds: single-stranded DNA viruses and double-stranded DNA viruses.

If you click on the options above you'll see that most known DNA viruses are double-stranded, including all the DNA viruses that affect humans.

RNA viruses

The genome of an RNA virus is usually a single molecule of RNA, either linear or circular, but some contain up to a dozen molecules of RNA. Outside the host cell, this RNA is protected by a protein coat. Most viruses are RNA viruses. There are 13 known families of RNA viruses affecting humans. RNA viruses range widely in morphology and size, with their genome containing anywhere from 1,700 to 60,000 nucleotides.

The smallest RNA virus, the hepatitis delta agent (HDV), is quite different from all the rest. With only about 1,700 nucleotides, its genome is much smaller than that of any other virus. Like a virusoid, it's a circular loop of RNA that can only reproduce in cells infected by a helper virus, the hepatitis B virus. But unlike a virusoid, it codes for its own protein coat. Its genome is also much bigger than those of virusoids, which have only about 350 nucleotides. In these ways it's more like a viroid. However, viroids only infect plants!

One can broadly classify RNA viruses into positive-sense single-stranded, negative-sense single-stranded, and double-stranded RNA viruses.

A "positive-sense" RNA virus consists of single-stranded RNA that functions directly as messenger RNA in the host cell, so that ribosomes in the host cell synthesize various proteins needed by the virus when encountering this RNA. A "negative-sense" RNA virus consists of single-stranded RNA that does not function as messenger RNA, since it contains the complementary base pairs. Negative-sense RNA viruses carry enzymes with them into the host cell to synthesize messenger RNA from the RNA in the virus. "Double-stranded" RNA viruses have both positive-sense and negative-sense strands. For some reason these are more likely to consist of several separate pieces of RNA. If you click on the options listed above, you'll see examples of these different kinds of RNA viruses.
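The difference between the two senses can be sketched in a few lines of Python. This is a simplified illustration, assuming strands are written as plain strings in the 5' to 3' direction and ignoring everything about the enzymes involved. The function names are made up for the example.

```python
# Pairing rules for RNA: A pairs with U, C pairs with G.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def complement_rna(strand: str) -> str:
    """Return the complementary RNA strand, written 5' to 3'."""
    return "".join(RNA_PAIRS[base] for base in reversed(strand))


def messenger_rna(genome: str, sense: str) -> str:
    """Return the strand that ribosomes can read, given a viral RNA genome.

    A positive-sense genome already is the message. A negative-sense genome
    must first be copied into its complement, which is the job of the
    enzymes the virus carries into the cell.
    """
    if sense == "positive":
        return genome
    if sense == "negative":
        return complement_rna(genome)
    raise ValueError("sense must be 'positive' or 'negative'")


if __name__ == "__main__":
    message = "AUGGCC"
    negative_genome = complement_rna(message)
    print(negative_genome)                             # GGCCAU
    print(messenger_rna(negative_genome, "negative"))  # AUGGCC
```

A double-stranded RNA virus would, in this picture, carry both `message` and `negative_genome` paired together.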

Reverse transcribing viruses

Reverse transcribing viruses behave quite differently from the other viruses described above. There are two main kinds of reverse transcribing viruses: RNA reverse transcribing viruses and DNA reverse transcribing viruses.

If you click on these options you'll see examples of both kinds.

RNA reverse transcribing viruses are usually called "retroviruses". They have a single-stranded RNA genome. They infect animals, and when they get inside the cell's nucleus, they copy themselves into the DNA of the host cell using reverse transcriptase. In the process they often cause tumors, presumably by damaging the host's DNA.

Retroviruses are important in genetic engineering because they raised for the first time the possibility that RNA could be transcribed into DNA, rather than the reverse. In fact, some of them are currently being deliberately used by scientists to add new genes to mammalian cells.

Retroviruses are also important because AIDS is caused by a retrovirus: the human immunodeficiency virus (HIV). This is part of why AIDS is so difficult to treat. Most usual ways of killing viruses have no effect on retroviruses when they are latent in the DNA of the host cell.

From an evolutionary viewpoint, retroviruses are fascinating because they blur the very distinction between host and parasite. Their genome often contains genetic information derived from the host DNA. And once they are integrated into the DNA of the host cell, they may take a long time to reemerge. In fact, so-called "endogenous retroviruses" can be passed down from generation to generation, indistinguishable from any other cellular gene, and evolving along with their hosts, perhaps even from species to species! It has been estimated that up to 1% of the human genome consists of endogenous retroviruses! Furthermore, not every endogenous retrovirus causes a noticeable disease. Some may even help their hosts [LA], [V].

It gets even spookier when we notice that once an endogenous retrovirus lost the genes that code for its protein coat, it would become indistinguishable from an LTR retrotransposon - one of the many kinds of "junk DNA" cluttering up our chromosomes. Just how much of us is made of retroviruses? It's hard to be sure.

So much for retroviruses... what about DNA reverse transcribing viruses? These have a DNA genome, and instead of reverse transcriptase copying the RNA of the free-floating virus to the DNA of the host, they work the other way around. When they've infected the host cell, there is a lot of RNA floating around; they use reverse transcriptase to package themselves as DNA when they leave the cell!

There aren't many DNA reverse transcribing viruses. Most of them are relatives of the hepatitis B virus (HBV). This virus attacks liver cells, and can cause tumors. Unlike a typical DNA virus, the hepatitis B virus consists of both single-stranded and double-stranded DNA. It's also smaller than any DNA virus: its genome consists of only about 3,200 base pairs.

I wonder why the viruses affecting the liver are so strange and diverse. The hepatitis B virus is quite unusual, and I've already mentioned its even more bizarre symbiote (or parasite): the hepatitis delta agent (HDV), which is the smallest RNA virus, and different from all the rest. There are also four other forms of hepatitis: positive-sense RNA viruses called HAV, HCV, HEV, and HGV. None of these are in the same family! Apparently all they have in common is that they attack the liver.

Viroids


A viroid is defined to be a "small infectious pathogen composed entirely of a low molecular weight RNA molecule". Thus, unlike a virus, a viroid has no protein coat. It is nothing but a single-stranded circular loop of RNA! Most viroids consist of about 250 to 375 nucleotides, much smaller than a typical virus. Also, viroids don't function as messenger RNAs, so they don't make the cell synthesize enzymes: they rely completely on pre-existing enzymes in the host for their reproduction.

Most known viroids cause diseases in plants. The first viroid was discovered in 1971, by Diener. It's called the potato spindle tuber viroid (PSTV), since it causes a disease that makes potatoes abnormally long and sometimes cracked. At the time, Diener's isolation of the viroid causing this disease met with some skepticism, since it was so much smaller than any known virus. By 1991, however, at least 15 plant diseases had been traced to viroids. There are also 2 viroids known, the hop latent viroid (HLV) and a viroid living in grapevines, that cause no known symptoms! This raises the fascinating possibility that there could be more such viroids lurking around.

The complete molecular structure of many viroids has been worked out, which has allowed a classification of viroids on the basis of their RNA sequences. Roughly speaking, there is a large family of viroids that share many features with PSTV, together with one viroid that seems very different: the avocado sunblotch viroid (ASBV). McInnes and Simons have proposed a further classification of the PSTV-type viroids into three kinds [Ma1].

It is clear from these RNA sequences that viroids are not "degenerate viruses", as had once been thought. They are quite different from any known viruses. One interesting theory is that they arose from RNA that escaped from cell nuclei.

It's also interesting that all viroid diseases have been detected in the 20th century, some quite recently - in contrast to diseases caused by viruses. Also, many viroid diseases have been spreading after their discovery, often due to human activity. A fascinating example is the coconut cadang-cadang viroid (CCCV), a disease of coconuts which has been spreading throughout the Philippines. On the island of Luzon, a puzzling feature of this disease was that it only affected crops owned by speakers of Bicalano, while adjacent crops owned by speakers of Tagalog went unharmed! Eventually people realized that the viroids were spread by workers cutting the palms. Tagalog owners prefer to hire Tagalog workers, while Bicalanos hire Bicalanos, some of whom came from an area where the disease was prevalent. (See the article by Maramarosch entitled The cadang-cadang viroid disease of palms [Di].)

Because of this sort of epidemiology, Diener has suggested that viroids may be latent to their native host plants (like HLV), becoming pathogenic only when transferred to other species thanks to agriculture. Indeed, the viroid causing tomato "planta macho" disease in Mexico, TPMV, has also been found in wild plants there. Also, an avocado plant will sometimes seem to "recover" from ASBV by sending up a new shoot. This new shoot is still infected with the viroid, but it shows no symptoms other than reduced fruit yield. Descendants of such a "recovered" tree are also infected with the viroid, and also symptomless, except for reduced fruit yield. Thus the avocado appears able to "come to terms" with the viroid in some way. Personally, I'd like to raise this possibility: that some viroids actually play a beneficial role in their native host plants! This may seem surprising, but when we compare the behavior of plasmids, it may seem less so.

Satellites


A satellite is a "sub-viral agent composed of nucleic acid molecules that depends for its reproduction on co-infection of a host cell with a helper virus". In other words, just as a planet can have a moon orbiting it, a virus can have a satellite orbiting it!

There are various kinds of satellites. The main division is between satellite viruses and satellite nucleic acids.

A "satellite virus" is a satellite whose genome codes for the protein coat in which it is encapsidated. All known satellite viruses have a genome made of single-stranded RNA. There are only five known kinds. A good example is the tobacco mosaic satellite virus, which goes along with the well-known tobacco mosaic virus (TMV).

A "satellite nucleic acid" is a satellite whose genome does not code for a protein coat; instead, it hides in the protein coat of its virus helper! A satellite nucleic acid can consist of either DNA or RNA. All known satellite DNAs are single-stranded, but satellite RNAs can be single- or double-stranded.

Single-stranded satellite RNAs are fascinating because they include the very smallest forms of life known - the virusoids! These are circular loops of single-stranded RNA containing about 350 nucleotides. They can only reproduce in cells that have been infected by their helper virus, because they use some of the RNA of the helper virus to reproduce. The helper virus is typically an RNA virus which causes a disease of plants and consists of about 4500 nucleotides.

One might be tempted to say that a virusoid is a parasite of its helper virus. But it's not always so simple. Sometimes the helper virus is unable to reproduce unless the virusoid is present! Then we have symbiosis rather than parasitism.

The first virusoids were discovered in the early 1980s in Australia, associated with viruses causing diseases such as velvet tobacco mottle (VTMoV), solanum nodiflorum mottle (SNMV), lucerne transient streak (LTSV), and subterranean clover mottle (SCMoV).

An interesting theory about the origin of virusoids is that in plants infected with both viruses and viroids, the viroids got encapsidated in the viruses and later lost their ability to reproduce independently.

An easy way to learn more about viroids and satellites is to read the online course notes by the plant pathologist Zhongguo Xiong. In fact, if you like subcellular life forms, his whole course on plant virology is worth reading!

Plasmids


A plasmid is a "small autonomously replicating circular molecule of DNA that is devoid of protein and not essential for the survival of its host". Plasmids range in size greatly, from about 4350 to 240,000 base pairs. Most known plasmids infect bacteria, but some infect plant and animal cells. They often copy themselves into the DNA of the host cell, and many carry genetic traits from one cell to another. Most plasmids limit the number of copies of themselves in each host - the so-called "copy number", which ranges from 1 to about 40. Many plasmids are "conjugative". This means they can transfer copies of themselves from one host to another by forcing the host to undergo "conjugation" - a form of sex in which genetic material is exchanged between bacteria.

People tend not to speak of plasmids as "life forms" quite as often as they do with viruses. In part this may be because plasmids are sometimes beneficial to their host cells, rather than pathogenic.

However, it is difficult for me to resist the impression that plasmids are just as "alive" as viruses. Indeed, some viruses become plasmids when parts of them are missing! For example, the "lambda bacteriophage" is a virus that infects the intestinal bacterium E. coli, but "lambda dv particles", which arise from the lambda phage simply by deleting some DNA, are plasmids. The lambda phage multiplies inside its host and then kills it by "lysis", which destroys the cell membrane and releases lots of copies. The lambda dv particles, on the other hand, stay in the cell in a fairly stable number of copies and do not kill their host. The difference is that while the lambda dv particles contain genes for replication, they lack genes for lysis and the protein coat.

If we think of plasmids as life forms, we must admit that they are very successful. Many plasmids spread so thoroughly in cultures of bacteria that fewer than one cell in 100,000 lacks a copy! Some kinds of plasmids contain genes that help make sure copies are efficiently passed on to both daughter cells when the host cell divides. F plasmids have a particularly clever mechanism - they temporarily inhibit cell division when they have not yet replicated inside the host!

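Plasmids are diverse and very interesting. Some important kinds are R plasmids, F plasmids, colicin plasmids, virulence plasmids, metabolic plasmids, tumor-causing plasmids, and cryptic plasmids.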


While they don't quite fit under this heading, I can't resist also mentioning cosmids and phasmids.


These are man-made entities based on plasmids, used in biotechnology. Are they alive? You judge.

Some good books on plasmids include Plasmids by Paul Broda [B], Bacterial Plasmids by Kimber Hardy [H], and Plasmids of Eukaryotes: Fundamentals and Applications by K. Esser et al [E].

R Plasmids

R plasmids were first discovered in Japan in 1957. In Japan, dysentery was treated with sulphonamide until about 1950. Then, more and more strains of the bacteria causing dysentery became resistant to this antibiotic, rapidly rendering it ineffective. Doctors then began using tetracycline, streptomycin and chloramphenicol. By 1957, 2% of the bacteria causing dysentery were resistant to one or more of these drugs, and by 1960, 13% were resistant. It turned out that R plasmids were the culprit!

R plasmids contain genes that give their bacterial hosts resistance to antibiotics as well as to poisonous metal ions such as arsenic, silver, copper, mercury, lead, zinc and so on. Because many R plasmids are conjugative, this resistance can spread from one bacterium to another. Because they can live in more than one species of bacteria, R plasmids can also spread resistance between bacteria of different species!

Spread of resistance to antibiotics is now a major problem in medicine. Drugs which were used for many years to control bacterial diseases are now becoming ineffective against new resistant strains. The problem has been made worse by the tendency for doctors and veterinarians to use antibiotics when they aren't strictly necessary, for example as part of livestock food. As a result an environment is created where bacteria with resistance have a great competitive advantage, so they spread rapidly.

It has also recently been found that weeds growing near crops that were genetically engineered to resist herbicides can acquire this trait. I'm not sure, but I suspect that this happens via plasmids as well.

R plasmids make it clear that the idea of evolution as a battle between species with separately evolving genomes is a great oversimplification. Instead, genetic communication and cooperation between different species can be very important.

F Plasmids

F plasmids live in the bacterium E. coli and were discovered in the 1950s. An F plasmid contains genes that make the cell membrane of its host form long tubes. These tubes, called "sex pili", attach themselves to other E. coli and puncture their cell membranes. The F plasmid then duplicates and a copy passes from the original host to the new host. A clever system has evolved to ensure that the sex pili of a given bacterium never attach to itself.

F plasmids give their hosts no known traits besides these sex pili. The evolutionary origins of sex are much debated these days; we see here the fascinating possibility that sex can originate as a kind of disease whose sole function is to spread a parasite!

Colicin Plasmids

Colicin plasmids contain genes that give their host bacterium a certain small probability of bursting open and releasing chemicals called "colicins". These chemicals kill other bacteria by rendering their cell membranes permeable to important ions. There are many strains of colicin plasmid. Each one confers immunity only to the particular sort of colicin it produces. Different strains of colicin plasmid are "incompatible", meaning that a given bacterium cannot stably contain both.

In short, different strains of colicin plasmid compete with each other using the resources of their hosts. A colicin plasmid will confer an advantage to its host bacteria if the other strains of bacteria nearby do not have a colicin plasmid. However, when there are many different strains of colicin plasmid present, all strains of host bacteria suffer. Thus there is a certain similarity between colicin plasmids and "protection rackets" run by Mafia-like gangs.

Colicin plasmids are not the only sort of plasmids that exhibit incompatibility. Similar plasmids tend to be incompatible with each other, while drastically different plasmids are usually compatible. One theory is that incompatible plasmids use the same mechanisms to maintain their copy number. In a cell containing two incompatible sorts of plasmid, their reproduction is blocked until the total number of copies of the two together drops to the copy number of each one. This is an unstable situation, especially for plasmids with a low copy number, so eventually descendants of the host cell contain only one or the other plasmid.
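This drift toward a single plasmid type can be illustrated with a toy simulation. It is a sketch under assumptions of my own, not a description of any real plasmid: the two plasmids share a fixed total copy number, every copy is duplicated once before division, and the daughter cell we follow receives a random half of the copies. The function name and its parameters are hypothetical.

```python
import random


def generations_until_one_plasmid(copy_number, start_a, rng,
                                  max_generations=100_000):
    """Follow one cell lineage holding two incompatible plasmids, A and B.

    The cell always holds `copy_number` plasmids in total, `start_a` of them
    of type A at the outset. Returns (generations, survivor), where survivor
    is "A" or "B", or None if both types are still present when the
    generation cap is reached.
    """
    a = start_a
    generation = 0
    while 0 < a < copy_number and generation < max_generations:
        # Every copy is duplicated once before the cell divides.
        pool = ["A"] * (2 * a) + ["B"] * (2 * (copy_number - a))
        # The daughter we follow gets a random half of the doubled pool.
        a = rng.sample(pool, copy_number).count("A")
        generation += 1
    if a == copy_number:
        survivor = "A"
    elif a == 0:
        survivor = "B"
    else:
        survivor = None
    return generation, survivor


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for copy_number in (2, 4, 16, 40):
        generations, survivor = generations_until_one_plasmid(
            copy_number, copy_number // 2, rng)
        print(copy_number, generations, survivor)
```

In this sketch, once a lineage has lost one type it can never regain it, which is why the mixed state cannot last. Averaged over many runs, lineages with a low copy number tend to reach that point sooner.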

Virulence Plasmids

Virulence plasmids contain genes that make their bacterial hosts more virulent to the organisms those bacteria infect. A familiar example involves the bacterium E. coli, which inhabits the human large intestine. Certain strains of E. coli contain plasmids whose genes make the E. coli synthesize toxins that cause diarrhea. These "enterotoxigenic strains" of E. coli are probably an important cause of diarrhea among travellers. More seriously, in developing countries, diarrhea is one of the principal causes of death among those under five.

"Vibrio cholerae", the cause of cholera, is a bacterium whose genes code for a diarrhea-causing toxin. The DNA of these genes is closely related to the DNA of certain virulence plasmids infecting E. coli - so closely that there is almost certainly a common ancestor. For example, Vibrio cholerae could have evolved from an earlier bacterium by permanently integrating the DNA from a virulence plasmid into its genome.

Strains of bacteria and viruses often become less virulent as they coevolve with their hosts. Thus one may wonder what evolutionary advantage a virulence plasmid could confer to the bacteria containing it. In the case of bacteria causing diarrhea, there is an obvious possibility: diarrhea can serve as a mechanism for spreading the bacteria - and their plasmids - that cause it!

Metabolic Plasmids

Metabolic plasmids contain genes that let their bacterial hosts metabolize or degrade otherwise indigestible or toxic chemicals. For example, the bacterium Pseudomonas putida is able to grow on a wide range of organic compounds that are toxic to most bacteria, including toluene, octane, camphor, naphthalene and nicotinic acid! It does this with the help of genes contained by metabolic plasmids called TOL, OCT, CAM, NAH and NIC plasmids.

It's worth noting that some of these chemicals are secreted by plants as part of a defense against bacteria. Thus we probably have a kind of natural chemical arms race going on here. Other metabolic plasmids allow bacteria to degrade herbicides like 2,4-D, as well as certain detergents! People are investigating the use of such plasmids to help biodegrade pollution.

Tumor-Causing Plasmids

"Crown gall" is a cancer of plants caused by a bacterium known as Agrobacterium tumefaciens. But actually, the disease is caused by a plasmid having this bacterium as its host! When the plasmid passes from the bacterium to the cells of infected fruit trees, some of the genes contained in the plasmid cause tumors. Do these tumors help spread the bacteria to other trees?

Cryptic Plasmids

Cryptic plasmids are plasmids that have no known effect on their hosts. How much of this is our ignorance, and to what extent is being truly "cryptic" a successful strategy?

Cosmids


Cosmids are man-made circular loops of DNA containing plasmid DNA together with an arbitrary sequence of up to 45,000 base pairs of DNA. They are constructed by recombinant DNA techniques and then packaged in lambda phage protein coats. They are used to transfer genes to bacteria.

The lambda phage is a virus that specializes in invading bacteria such as E. coli. In nature, its protein coat latches onto the bacterial cell membrane and injects the phage DNA into the bacterium. Biotechnologists have taken advantage of this by using the lambda phage protein coat to inject a cosmid into the bacterium! Once inside, the cosmid replicates like a plasmid and, like a plasmid, integrates its DNA into the genome of the bacterium.

Phasmids


Phasmids are man-made linear DNA molecules whose ends are sequences taken from the lambda phage, while the middle is a sequence taken from a plasmid, together with a sequence of whatever DNA one wants. Like cosmids, they are constructed by recombinant DNA techniques and packaged in lambda phage protein coats, and used to transfer genes to bacteria. However, both the lambda phage and plasmid replication functions are intact. In particular, they contain the lambda phage genes for "lysis", the process whereby a virus dissolves the cell membrane of its host. Depending on the conditions, the phasmid can act either like a phage or a plasmid - hence its name.

Transposons


Transposons, or "transposable elements", are sequences of DNA that move within their host's genome from one position to another. They were first discovered in the 1940s by Barbara McClintock, who later won the Nobel prize for this work. They exist in all known organisms, often in large quantities. Their main "function" appears to be simply their own self-replication, rather than any benefit to the host, or even any direct effect whatsoever on the host phenotype. For this reason, people sometimes refer to transposons as "selfish DNA".

In addition to transposons, there is plenty of other DNA in our chromosomes that doesn't seem to code for proteins. This is sometimes called "junk DNA". It comes in various distinct forms, such as "introns", "satellite DNA", and "pseudogenes". In fact, junk DNA makes up about 97% of the human genome! Clearly, despite its derogatory name, it's worth understanding and potentially very important. However, since transposons are the most "organism-like" kind of junk DNA, I will only talk about them here.

There is a fair amount of genetic evidence that transposons spread "horizontally" between sexually isolated species in addition to being "vertically" passed down the evolutionary tree. However, the mechanisms of this horizontal transmission are poorly understood. One interesting fact is that certain viruses, the baculoviruses, can pick up and accommodate transposons from their hosts. These viruses have been proposed as a possible mechanism for horizontal transmission of transposons.

The two main classes of transposons are retrotransposons and DNA transposons.

The best book on transposons seems to be Dynamics and Evolution of Transposable Elements, by Pierre Capy, Claude Bazin, Dominique Higuet, and Thierry Langin [CBHL]. In this book, retrotransposons are called "Class I elements", while DNA transposons are called "Class II elements". They also discuss "Class III elements". This seems to be a grab-bag consisting of transposons that don't clearly fit into the other two categories. Examples include the "Foldback" elements in fruit flies, the "Tu" elements in sea urchins, and "MITEs", or "miniature inverted repeat transposable elements", which are found mainly in plants and fungi.


Retrotransposons copy themselves from one location in the host genome to another using an RNA intermediate, with the help of reverse transcription from RNA to DNA. This process is called "transposition".

A rough classification of retrotransposons divides them into LTR retrotransposons and non-LTR retrotransposons.

LTR retrotransposons are 5000-9000 base pairs long and have "long terminal repeats" - repeating sequences of base pairs at both ends. Between these are the genes needed for transposition. These code for enzymes like reverse transcriptase (which copies RNA into DNA), integrase (which integrates the DNA into the host chromosome), and so on. In all these respects, LTR retrotransposons are very similar to retroviruses. The most important difference is that retrotransposons do not code for the proteins forming the viral protein coat. There seems to be some debate as to whether retrotransposons are retroviruses that have somehow lost their ability to code for a protein coat, or whether retroviruses are retrotransposons that have somehow gained this ability. Of course, the two possibilities aren't mutually exclusive!
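To make "repeating sequences of base pairs at both ends" concrete, here is a toy sketch in Python. It assumes the two repeats are exact copies in the same orientation, and the sequence, the helper name `terminal_repeat` and the `min_len` cutoff are all made up for illustration; real long terminal repeats are far longer and need not match perfectly.

```python
def terminal_repeat(seq, min_len=3):
    """Return the longest string that is both a prefix and a suffix of seq
    without the two copies overlapping, or "" if none has min_len bases."""
    for n in range(len(seq) // 2, min_len - 1, -1):
        if seq[:n] == seq[-n:]:
            return seq[:n]
    return ""

# Hypothetical element: repeat + internal "genes" + the same repeat again.
ltr = "TGTTGGA"
body = "ACGTACGTAAC"
element = ltr + body + ltr

print(terminal_repeat(element))  # prints TGTTGGA
```

The search starts from the longest possible repeat and works downward, so the first match it finds is the longest one.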

Why do LTR retrotransposons and retroviruses have long terminal repeats? People know the answer for retroviruses. If one of these rascals creates an exact DNA copy of its RNA and sticks it into the host genome, the host won't return the favor and create an exact RNA copy of that DNA, because it won't copy various bits that don't code for proteins, like the "promoter" - the bit of DNA that tells the host to create an RNA copy! To make up for this, the virus has to do some complicated tricks. Among other things, it creates duplicate copies of some of its RNA in the host DNA. As this happens over and over, long repetitive sequences build up in the host DNA: the long terminal repeats.

The precise process is pretty complicated - too complicated for me to explain here. If you're interested, you can watch a movie of how long terminal repeats get formed by retroviruses! I guess it works similarly for LTR retrotransposons, but I'm not really sure.

As the name suggests, non-LTR retrotransposons lack terminal repeats. They have been divided into LINEs and SINEs. LINEs have a characteristic adenosine-rich sequence at one end, and are generally 5000-8000 base pairs long, though truncated versions are common. They code for various enzymes such as reverse transcriptase and RNase. The genomes of higher animals and plants may have over 10,000 copies of LINEs. In fact, about 21% of the human genome consists of LINEs!

SINEs are usually shorter than 500 base pairs. The source of the enzymes needed for the mobility of SINEs is not yet known - but perhaps it is LINEs! Higher animals and plants may have over 100,000 copies of SINEs.

DNA transposons

DNA transposons mainly move using a cut-and-paste mechanism: they code for an enzyme called a "transposase" that catalyzes a process in which the transposon DNA is excised and reinserted elsewhere in the host genome. Thus RNA and reverse transcriptase play no role in their life cycle. As far as I can tell they don't reproduce themselves, only move around. Given this, they probably don't deserve to be called "alive"!
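The difference between moving and reproducing can be seen in a toy sketch. Here the genome is just a Python list of labels, "TE" stands for the transposon, and both function names are invented for this illustration. Nothing here models the real enzymes; it only counts copies.

```python
import random

def cut_and_paste(genome, element, rng):
    """Toy DNA transposon move: excise one copy, reinsert it elsewhere."""
    genome = list(genome)
    genome.remove(element)  # excision of the first copy found
    genome.insert(rng.randrange(len(genome) + 1), element)  # reinsertion
    return genome

def copy_and_paste(genome, element, rng):
    """Toy retrotransposon move: the original stays put and a new copy
    (made via the RNA intermediate in real life) is inserted elsewhere."""
    genome = list(genome)
    genome.insert(rng.randrange(len(genome) + 1), element)
    return genome

rng = random.Random(0)  # fixed seed so the positions are repeatable
genome = ["geneA", "TE", "geneB", "geneC"]
moved, copied = genome, genome
for _ in range(5):
    moved = cut_and_paste(moved, "TE", rng)
    copied = copy_and_paste(copied, "TE", rng)

print(moved.count("TE"), copied.count("TE"))  # prints 1 6
```

After five rounds the cut-and-paste element has wandered around but there is still only one of it, while the copy-and-paste element has gone from one copy to six.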


Prions are small, proteinaceous infectious particles that contain no detectable nucleic acid of any form, but are transmissible among certain animals, where they cause fatal brain diseases. These particles are rod-shaped, about 165 nanometers long and about 11 nanometers in diameter, and they consist largely of a protein called PrPSc, having molecular weight 33,000-35,000. They are able to resist inactivation by boiling, acid (pH 3-7), ultraviolet radiation (254 nm), formaldehyde, and nucleases! They can be inactivated by boiling in detergents, alkali (pH > 10), autoclaving at 132 degrees centigrade for over 2 hours, and denaturing organic solvents such as phenol.

Stanley Prusiner won the Nobel prize for medicine in 1997 for his work on prions. His theory is that prions are a modified form of a protein naturally occurring in the brain (PrP), and that this modified form can arise from a cell mutation, but then spread by means of a kind of autocatalyzed chain reaction. This theory was initially very controversial, because all other self-reproducing biological entities appear to contain RNA or DNA. There are still many doubters. In the earlier literature prions are sometimes called "slow viruses", because of their slow effect. However, no virus has ever been associated with prion diseases.

Prions have recently received a lot of publicity as the cause of "mad cow disease", technically known as bovine spongiform encephalopathy. Starting in the mid-1980s, this disease infected thousands of cattle in England, in part because they were being fed offal containing nerve tissue from sheep infected with a prion-caused disease called "scrapie". People got worried that eating meat from cows with bovine spongiform encephalopathy could cause a prion-induced brain disease in people. This caused an enormous uproar.

There are already a number of prion-induced brain diseases in people, such as Creutzfeldt-Jakob disease (which occurs spontaneously in about one in a million people) and kuru (transmitted by means of cannibalism among the Fore tribe in New Guinea). There are also prion-induced brain diseases in mink, cats, deer and moose.

You can get a lot of information about prions from the prion science archives. There are also a couple of online courses on virology at All the Virology on the WWW, and these include a nice lecture on prions.


Mycoplasmas are bacteria with a very small genome and no cell wall! They are smaller than 450 nanometers in diameter. One of them, Mycoplasma pneumoniae, has been completely sequenced: it has a genome with about 800,000 base pairs, as compared to 10 million for a typical bacterium. It is thought to be responsible for both tracheobronchitis and primary atypical pneumonia, but there is a lot of controversy surrounding it. In particular, some people argue that it has connections with Gulf War Syndrome and AIDS, while others (who sound a bit nutty to me) claim it has been genetically engineered to become more virulent.

Another distinctive feature of mycoplasmas is their response to the codon UGA - that is, uracil/guanine/adenine, a triplet of bases in the RNA copied from their DNA. When they read this they make the amino acid tryptophan. All other bacteria use UGA as a "stop codon" - that is, a signal that a given gene has ended. In this respect mycoplasmas are like mitochondria (the "powerhouses of the cell"), which also produce tryptophan when they read UGA. Since mitochondria were once organisms on their own, which became symbiotes and eventually merged with the cells of animals and plants, could this be a clue that mycoplasmas are more closely related to mitochondria than to other bacteria?
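Here is a small Python sketch of what this difference does to a message. The codon tables are cut down to just the five codons that appear in the example, and the message itself is made up; only the readings of UGA (stop versus tryptophan) come from the discussion above.

```python
# Tiny excerpt of the genetic code. None marks a stop codon.
STANDARD = {"AUG": "Met", "AAA": "Lys", "UGG": "Trp", "UGA": None, "UAA": None}
# Same table, except that UGA is read as tryptophan.
MYCOPLASMA = dict(STANDARD, UGA="Trp")

def translate(mrna, table):
    """Read the message three bases at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon: the gene has ended
            break
        protein.append(amino_acid)
    return protein

mrna = "AUGAAAUGAUGGUAA"  # hypothetical message: AUG AAA UGA UGG UAA

print(translate(mrna, STANDARD))    # prints ['Met', 'Lys']
print(translate(mrna, MYCOPLASMA))  # prints ['Met', 'Lys', 'Trp', 'Trp']
```

The same message gives a protein that is cut short under the usual reading and a longer one under the mycoplasma reading, which is why a gene moved from one kind of organism to the other may not work as is.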

Two good sources of information on mycoplasmas are Katherine Howard's webpage and Joel Baseman and Joseph Tully's article in the journal Emerging Infectious Diseases, entitled Mycoplasmas: sophisticated, reemerging, and burdened by their notoriety. A lot of articles on mycoplasmas are available from the Rain-Tree website.


Nanobacteria are the most mysterious of all the things described on this webpage. They might not exist at all! The idea is that they are single-celled organisms that are much smaller than other bacteria: between 50 and 200 nanometers in size.

The idea of nanobacteria appears to have been conceived by the geologist Robert Folk in 1988. Using a scanning electron microscope to examine mineral deposits in hot springs, he saw "hordes of tiny bumps and balls" and theorized that they were very small single-celled organisms. Eventually he decided that nanobacteria were very important for geology and may even form most of the earth's biomass! For more on this, read his paper Nanobacteria: surely not figments, but what under heaven are they?

In 1998, two Finnish scientists claimed to have grown nanobacteria in a culture, but others have argued that their results are flawed and that these small balls are formed by purely chemical (i.e. nonliving!) processes. So, as far as I can tell, the existence of nanobacteria remains hotly disputed. A great source of information on this puzzle is the nanobiology webpage of the journal naturalSCIENCE.


Here are some good books and articles to read about this stuff:

[B] Plasmids, by Paul Broda, W. H. Freeman, San Francisco, 1979.

[CBHL] Dynamics and Evolution of Transposable Elements, by Pierre Capy, Claude Bazin, Dominique Higuet, and Thierry Langin, Landes Bioscience, 1998.

[CHV] Retroviruses, John M. Coffin, Stephen H. Hughes, and Harold E. Varmus, Cold Spring Harbor Laboratory Press, Plainview, New York, 1997.

[CT] Principles of Bacteriology, Virology and Immunity, vol. 4: Virology, edited by L. H. Collier and M. C. Timbury, 8th edition, Decker, 1990.

[Da] The Fifth Miracle, Paul Davies, Simon and Schuster, New York, 1999, pp. 127-128.

[Di] The Viroids, edited by Theodore Otto Diener, Academic Press, 1985.

[EO] 30 years later - a new approach to Sol Spiegelman's and Leslie Orgel's in vitro evolutionary studies: dedicated to Leslie Orgel on the occasion of his 70th birthday, M. Eigen and F. Oehlenschlager, Orig. Life Evol. Biosph. 5-6 (1997), 437-457.

[E] Plasmids of Eukaryotes: Fundamentals and Applications, K. Esser et al, Springer-Verlag, New York, 1986.

[F] Virology, 2 volumes, edited by Bernard N. Fields, David M. Knipe and Peter M. Howley, Lippincott-Raven Publishers, 3rd edition, 1996.

[H] Bacterial Plasmids, Kimber Hardy, American Society for Microbiology, Washington D.C., 1986.

[LA] Beneficial role of human endogenous retroviruses: facts and hypotheses, E. Larsson and G. Andersson, Scand. J. Immunol. 48 (1998), 329-338.

[MM] Subviral Pathogens of Plants and Animals: Viroids and Prions, edited by K. Maramorosch and J. J. McKelvey, Jr., Plenum Press, 1987.

[Ma1] Viroids and Satellites: Molecular Parasites at the Frontier of Life, edited by Karl Maramorosch, CRC Press, 1991.

[Ma2] The Atlas of Insect and Plant Viruses: Including Mycoplasmaviruses and Viroids, edited by Karl Maramorosch, Academic Press, 1977.

[Mo] The Evolutionary Biology of Viruses, edited by Stephen S. Morse, Raven Press, 1994.

[Re] Virus Taxonomy: the Classification and Nomenclature of Viruses, Seventh Report of the International Committee on Taxonomy of Viruses, edited by M. H. V. van Regenmortel et al., Academic Press, 2000.

[Ro] Plant Infectious Agents: Viruses, Viroids, Virusoids, and Satellites, edited by Hugh D. Robertson et al., Cold Spring Harbor Laboratory, Plainview, New York, 1983.

[T] Escape from prisoner's dilemma in RNA phage φ6, Paul E. Turner and L. Chao, American Naturalist 161 (2003), 497-505.

[V] On viruses, sex, and motherhood, Luis P. Villarreal, J. Virology 71 (1997), 859-865.

© 2005 John Baez