Synthetic Life Part I – Making Genomes

May 30, 2011

This post considers the rapidly developing field of synthetic biology, which, if taken to its logical limits, will ‘create life’. And if this is possible at all, it will allow much more than merely copying what natural evolution has already provided us. In effect, it will foster an experimental means for testing some of the biological ‘dark matter’ questions raised in a previous post. But before we get there, this post will be devoted to considering some recent developments in the field….

Synthetic Genomes à la Mode

For tens of thousands of years, humans viewed the creation of life as the preserve of gods or (sometimes) dark sorcerers. More commonly, any notion of humans themselves being capable of life creation was regarded as the height of hubris, the ultimate in human folly. But at the present time, it is not hubris, and research in this area is certainly not folly. This kind of achievement is very probably quite close at hand.

In fact, some people might ask, aren’t we already there? Wasn’t there a big flap last year about a ‘synthetic cell’ made by J. Craig Venter and his colleagues? (It was in all the newspapers, and many a TV news service as well). Well, no, not really, at least not by a stringent definition of ‘synthesis’. But what has indisputably been accomplished by any definition is the synthesis and functional deployment of a genome, which is not the same thing as a completely synthetic cell.

Consider the following three sentences as a logical argument: (1) The genomes of all existing free-living organisms, including the simplest bacterial cells, are composed of double-stranded DNA; (2) It has long been possible to chemically synthesize DNA single strands, convert them to duplex form, and join them together in the laboratory; (3) Therefore, it is possible to make a chemical copy of a bacterial genome by artificial synthesis, and transfer it into a recipient host cell. Is this then a simple syllogism, an A and B with an obvious consequence of C? In principle, perhaps, but in this case it’s a long and arduous way from theory to practice – and that statement is a salute to the very real achievement of making an artificial bacterial genome.

A Little History

Only a little over three decades ago, DNA synthesis of short sequences (oligonucleotides) was a novelty. A major pioneer in this area was Har Gobind Khorana, whose determined efforts led to the first synthesis of an artificial gene (a tyrosine suppressor tRNA gene), and its successful functional expression in E. coli in 1979. The total length of the artificial segment (including its control regions) was 207 base pairs, and the synthesis and functionality of this gene were quite impressive at the time. Take chemical building blocks and make them into strings of single-stranded DNA. Convert them into duplexes, and join them together in the appropriate manner. Create the desired gene sequence, and its linked promoter for expression by RNA polymerase. Place the final DNA into a suitable cloning vector, and transfer to the intended bacterial host. And….it worked! A synthetic gene was shown to function, and with it came a convincing demonstration of the chemical nature of life itself. Just as a synthetic drug with exactly the same molecular structure as a drug from natural sources is identical in all aspects and behaves identically in human patients, so too does a chemically synthesized gene behave exactly like its natural counterpart of identical sequence.

Despite the significance of this triumph, at that time DNA synthesis for molecular biological applications was mostly restricted to specialist laboratories. But not for long. By the early 1980s, custom synthesis of short oligodeoxynucleotides (where the sequence is specified on demand) was offered commercially. Technical developments, especially the phosphoramidite chemistry, greatly streamlined the synthetic process and reduced the cycle times between successive base additions. Before much more time had elapsed, ‘gene machines’ (automated oligonucleotide synthesizers) were on the market which were affordable as a service facility for most academic and research institutions. At the same time (and as a continuing trend until the present), many commercial enterprises offered their services for custom ‘oligo’ synthesis, and as the technology became more streamlined, prices fell dramatically. Although there is a remarkably wide range of distinct applications for oligonucleotides (both with normal structures, and with a wide variety of chemical modifications), the advent of the polymerase chain reaction and the constant need for oligonucleotide primers was a major source of increased demand for synthetic DNA.

At the dawn of the synthetic oligonucleotide era, the specific application which a user had in mind was a potential issue, since making the short DNA strand was one thing, but cloning it in bacterial cells was another. This was simply due to imperfections in the synthetic process. No chemical reaction scheme can provide perfection, and within a synthesized oligo population, molecules with unwanted chemical alterations may pose problems for the natural biological enzymes which handle the replication of DNA molecules. Such ‘unnatural’ pieces of DNA may accordingly fail to allow cloning within a bacterial host cell. If the load of ‘dud’ molecules is high, obtaining a viable clone using the original population may be quite difficult. But all technologies undergo a period of teething problems and fine-tuning, and for many years now synthetic DNA preparations have been routinely ‘clonable’ without causing undue stress to molecular biologists.

As always, what was once ground-breaking and highly innovative soon becomes routine and taken for granted. And the synthetic trend has continued. It is now very often more economical in both time and money to synthesize gene segments (or whole genes) of many hundreds of base pairs, rather than use more traditional ‘cut-and-paste’ assembly methods for generating DNA plasmid constructs.

Genomic synthesis and expression

So, this brief history of oligonucleotide synthesis takes us back to the Venter group and their generation of an artificial genome. This undertaking necessitated chemical DNA synthesis on a different scale: the chosen genome (from the bacterium Mycoplasma mycoides) was over a megabase (Mb) in size (1.077 Mb, or 1,077,947 base pairs to be precise). Although a big challenge, this genome is small by bacterial standards, where the familiar E. coli has a genome of 4.6 Mb. (Obviously, the relatively small size of the M. mycoides genome was a factor in designing the synthetic strategy). The actual synthesis involved three key steps (and three corresponding publications): establishing the means for transfer of genome-sized DNAs to recipient mycoplasma cells; establishing a viable synthetic and fragment assembly strategy; and combining these methods for the final approach. Oligonucleotides were assembled into segments of approximately 1 kb (kilobase) and then successively built up into larger segments through a recombinational process in yeast. After the final assembly of 100 kb segments into the full-length genomic sequence, the product was extracted from yeast and transferred to recipient cells of a different Mycoplasma species (M. capricolum). As for standard transfers of much smaller plasmids, the yeast-assembled genome bore antibiotic resistance and an enzymatic marker (β-galactosidase), in order to select, and help confirm, recipient cells containing the newly introduced genome.
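
To get a feel for the scale of this step-wise build-up, the arithmetic can be sketched roughly as follows. This is only a back-of-envelope illustration: the function name is made up for the purpose, the tier sizes are simply the round figures quoted above (about 1 kb and 100 kb) treated as if exact, and the overlaps between neighboring segments that recombinational assembly relies on are ignored, so the real counts would differ somewhat.

```python
import math

GENOME_BP = 1_077_947  # size of the synthetic M. mycoides genome quoted above


def segments_needed(genome_bp, segment_bp):
    """Minimum number of non-overlapping segments of one size needed to span the genome."""
    return math.ceil(genome_bp / segment_bp)


# Round tier sizes from the text, assumed exact here purely for illustration.
for label, size_bp in [("~1 kb", 1_000), ("~100 kb", 100_000)]:
    print(f"{label} segments: at least {segments_needed(GENOME_BP, size_bp)}")
```

On these simplified assumptions, something over a thousand starting segments funnel down into about eleven large pieces before the final join, which gives some idea of why sequence checks at each level of assembly mattered so much.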

One of the difficulties encountered during large-scale DNA synthesis echoes back to cloning problems often experienced during the early days of oligo synthesis (as noted above), but at the level of misincorporation of incorrect bases, rather than other chemical aberrations. In a very long sequence created for functional utility, even a single base alteration can prove fatal if it occurs at crucial sites. So an important aspect of this project involved quality-control checks on products obtained at different levels of assembly, and this was done by DNA sequencing. And of course, the entire synthetic genome needed to be confirmed by sequencing once placed in its cellular host. In terms of enabling technologies which allowed this work to succeed, it is thus important to note that high-throughput sequencing was of great importance, as well as chemical synthesis itself.
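
The reason why sequencing checks become so important as lengths increase can be illustrated with a simple probability sketch. It assumes that errors arise independently at each base with a fixed per-base rate; the rate used below is a hypothetical round number chosen for illustration, not a figure from the Venter group's work, and the function names are likewise invented here.

```python
def error_free_fraction(length_bp, per_base_error):
    """Expected fraction of molecules with no errors, assuming independent per-base errors."""
    return (1.0 - per_base_error) ** length_bp


def expected_errors(length_bp, per_base_error):
    """Average number of errors per molecule under the same assumption."""
    return length_bp * per_base_error


ASSUMED_ERROR_RATE = 1e-3  # hypothetical: one error per 1,000 bases synthesized

for length_bp in (60, 1_000, 10_000):
    fraction = error_free_fraction(length_bp, ASSUMED_ERROR_RATE)
    average = expected_errors(length_bp, ASSUMED_ERROR_RATE)
    print(f"{length_bp:>6} bp: error-free fraction {fraction:.4f}, mean errors {average:.2f}")
```

Whatever the true error rate, the error-free fraction falls off exponentially with length, so building a long sequence from short, individually verified pieces is far more practical than hoping for a perfect long product in one pass.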

One aspect is common to the syntheses performed by the laboratories of both Khorana and Venter. Any segment of DNA in isolation, whether of natural origin or arising from artificial chemical synthesis, is inert in the absence of the molecular machinery which will potentially enable its expression – or ‘bring it to life’, in an operational sense, at least. (Clearly, for this to occur, the said piece of DNA must encode signals which the molecular machines can recognize – only very specific sequences will suffice). The necessary proteins and RNA molecules for such enablement are of course encoded by the genome, but must be present initially to allow the genome to continue transcription of protein-coding and functional RNAs, and to permit its own replication. The genome specifies the proteins and RNAs which control the normal operation of a cell, but the genome needs these same tools in order for the tools themselves to be made. This flow-back loop between a genome and its encoded products is sometimes referred to as the ‘entanglement’ of nucleic acids and proteins, a universal feature of modern biosystems, and an early but dramatic event in molecular evolution which moved beyond the pre-existing RNA World.

Another way of looking at this is to consider it as a software (genome) / hardware (cellular machinery) division, but where the software itself specifies its own hardware, and where the hardware enables the software to be read out. Comparable analogies have been made between genomic information as a tape and its cellular environment as the tape-player. So the artificially-derived genome software or tape cannot ‘boot-up’ in isolation, and requires the machinery (enzymes, ribosomes, and so on) of the host cell to make the proteins and RNAs which it encodes. Of course, one has to use a tape-player which is tailored to the particular DNA segment or genome of interest, since it is essential that the provided molecular machines can interpret the DNA sequences which act as regulatory signals for gene expression, replication, and other functions. As an example, and as noted above, the synthetic M. mycoides genome was assembled progressively in eukaryotic yeast cells, which cannot ‘read’ the bacterial signals. Consequently, the bacterial sequences were simply faithfully replicated (without expression of bacterial proteins) by virtue of the appended yeast recognition segments, until finally assembled for transfer to their new mycoplasmal host. By the same token, since the beginning of the recombinant DNA era, foreign DNA sequences have been replicated in bacterial cells as passive sequences carried by bacterial plasmids.

This software / hardware issue is clearly relevant to the cross-species (M. mycoides to M. capricolum) nature of the synthetic genomic transplant achieved by the Venter group. In their words, this was equivalent to ‘turning a Mac into a PC’. For many years, Macs have been functionally ‘turned into PCs’ by means of emulation software design, but in a literal analogy of the entanglement of biological software and hardware involved in genome transfer, a software package introduced into (say) a Macbook Pro would have to convert it physically into some type of PC laptop. In fact, the mycoplasmal bacteria involved were closely related (in the same genus), and thus able to read each other’s genomic regulatory sequences. Among many bacteria, important genetic controls are widely similar, but it is far from certain how well the synthetic M. mycoides genome could operate if cross-species transfers were attempted across wider evolutionary gaps, even within the broader group of mycoplasma-like organisms. And if a standard mycoplasmal genome (synthetic or otherwise) were placed in an Archaeal host (prokaryotic, but belonging to a separate domain of life), it is very unlikely that it could function.

Come the Revolution?

How revolutionary is the synthesis of a genome and its transfer to a new host cell? In some ways, not as much as certain news sources have trumpeted. It depends on the particular aim of a molecular biological project involving prokaryotic genomic engineering. For many years, it has been possible to easily carry and express foreign sequences in bacterial host cells as extra-genomic replicators (plasmids), or, if appropriate, inserted into the bacterial genome itself. Relatively recent technological developments have taken this to new levels of proficiency, and are applicable to many bacterial species beyond the traditional work-horse of E. coli. Much interest is currently focused on the conversion of suitable bacteria into useful biochemical factories, performing tasks such as the removal of pollutants, the cost-effective synthesis of chemicals, or the generation of energy sources. Usually, these goals require whole sets of enzymes, transporter molecules, and other protein factors working in concert. But even the most complex exemplars of such metabolic pathways can be encoded on DNA segments of substantially sub-genomic size. Advances in synthesis may allow entire pathway segments to be produced chemically rather than through assembling bits and pieces from biological sources, but this would not necessitate full-scale genomic synthesis. (A large DNA segment with the appropriate information could be transferred to the desired host and propagated as either an extra-genomic plasmid, or chromosomally inserted). By the same token, ‘watermarks’ (encoded messages) of any description can be easily inserted into a desired bacterial genome, without having to make the whole thing.

Of course, there are undoubtedly commercial / intellectual property reasons which make whole-genome syntheses attractive for patenting purposes, but this is not a purely scientific criterion. On the other hand, just as advances in technology have taken once basic oligo syntheses to new levels, so too may whole genome synthetic assemblies become a favored and cost-effective alternative to general microbial engineering, as the field progresses.

A Small Nomenclature Issue, and the Future

As a way of considering ongoing development of this area, let’s take a closer look at the term ‘synthetic cell’, which Gibson et al. used to describe the historic first synthetic functional genome. They justified this by (quite correctly) pointing out that after many generations of growth, the daughter cells will be entirely composed of molecules encoded (directly or indirectly) by the new genome. To use the above ‘tape’ analogy, by such a time, the old tape-player will have been completely replaced by a successor specified by the introduced new genome. However, as they also noted, it is the genome which does this work, and not the direct intervention of the inventors themselves.

Well, that’s fine as is. But let’s then consider it in the context of what else may arise in synthetic biology and ‘synthetic life’. Some of these trends are noted in the Table below:

Level 1 here is clearly that which has been recently achieved by Venter and colleagues. But the rest of the ‘progress levels’ refer to increasing levels of synthetic sophistication. A major premise is that a truly synthetic cell will employ an artificial membrane encapsulation procedure, and provision of the necessary materials to ‘start up’ the newly synthesized genome. (The generation of artificial cell membranes is being actively researched at present). Levels 2-4 only differ in the nature of this protein and RNA-based initialization package. At first (Level 2), along with the synthetic genome, a natural ‘soup’ of bacterial cytoplasm containing ribosomes and all other needed factors is artificially co-encapsulated by a bounded membrane to provide the necessary compartmentalization of the nascent biosystem. But clearly not all of these proteins and RNAs are needed from the very beginning. Many will only have roles later in a replicative cycle, or are only required under certain specific conditions. Thus, a minimal core of ‘start-up’ proteins and RNAs should be definable (including ribosomes, as complex ribonucleoproteins essential for protein synthesis), and could be added either from natural sources (Level 3) or fully created synthetically (Level 4). Beyond these levels lies huge potential, which is likely to involve altered genetic codes and radically altered proteins (referred to as foldamers in Level 6). This takes us into the area to be followed up in the next post.

Before moving on, it should be noted that this ‘Synthetic Life Levels’ Table is intended to show progressive steps which lead to increasing levels of artificial input in the synthetic process; it does not contend that synthetic biology must necessarily move in the order as shown. For example, even at the highest level shown (6), it might be more cost-effective to use a pre-existing membrane-bound compartment (whether artificial or natural) than prepare such a compartment de novo. The development of artificial synthetic life will itself be an evolutionary process, reflecting the way technology grows by accretion and combinatorial building upon previous achievements. In certain ways, synthetic organisms are likely to reflect proposals about early molecular evolution itself – as, for example, the fundamental importance of compartmentalization.

But to return to the naming of the current ‘synthetic cell’ of Venter and colleagues, I suggest that this is jumping the gun, and is technically inaccurate. If theirs is a synthetic cell, what does one call a development produced at Level 4 of the above Table? So, using their terminology at this time is potentially confusing, especially among the general public for whom these matters may seem complex. By way of comparison, consider the term ‘artificial life’, which has a respectable history in referring to computational studies of self-replicating and self-interactive model systems. As such, attempts to generate real artificial cells generally avoid this term, in favor of ‘synthetic life’. So words and meanings do matter, if clarity is to be aimed at.

A case can therefore be made that by the very nature of biological entanglement, to create a truly synthetic cell, one needs a synthetic genome AND a minimal cohort of synthetic ‘start-up’ proteins and RNA (including the ribosome as a protein-RNA complex) AND a synthetic compartmentalization system. So what to call a single-celled organism bearing a functional synthetic genome (based directly upon a natural genome), which has been introduced by artificial means? One possibility would be a CRAIG (Cellular Replicator; Artificial Inserted Genome), or more simply, just an Engineered-Genome Organism (EGO).

And a little sign-off, for better or (biopoly)verse:

Information on matters genetic

Accelerates at a pace frenetic

Genomes à la Venter

A door there to enter

A universe of life that’s synthetic

References & Details

(In order of citation)

General references for ‘Synthetic Genomes à la Mode’: see Gibson et al. 2008; Lartigue et al. 2009; Gibson et al. 2010.

‘……an artificial bacterial genome.’    Alternatively, this could be called an artificially synthesized bacterial chromosome. Prokaryotic genomes are typically composed of a single circular DNA molecule, or a single chromosome – as opposed to multicellular eukaryotes like us, with multiple separate linear chromosomal DNA molecules.

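‘… in this area was Har Khorana…’   Khorana shared the Nobel Prize in Physiology or Medicine in 1968 (with Robert Holley and Marshall Nirenberg) for the interpretation of the genetic code and its function in protein synthesis, work in which his chemical synthesis of nucleic acids of defined sequence played a central part.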

‘….a tyrosine suppressor tRNA gene.’    See Khorana 1979.  A bit of background is necessary to explain the function of this E. coli gene, which encodes a mutant transfer RNA (tRNA) for the amino acid tyrosine. As translational adaptors between amino acids and the genetic code, tRNAs recognize triplet codons in mRNA molecules via a complementary anticodon triplet in a specific loop of the folded tRNA molecule. Sometimes a mutation occurs in a protein coding sequence within an mRNA molecule which changes a codon for an amino acid into a ‘stop’ codon, which terminates translation at that point. Obviously, such a mutation will cause truncation of normal protein expression, and (if the protein is essential) may be lethal to the host cell. For example, if the mRNA codon UAC, which encodes tyrosine, undergoes a single-base C-to-G point mutation to UAG, a stop codon results (known for historical reasons as an ‘amber’ codon). A second-site mutation which can overcome the deleterious effects of a primary mutation is known as a suppressor mutation, and these can occur through changes in tRNA molecules. If a specific tyrosine tRNA molecule (which, in E. coli, exists in several copies) itself acquires a mutant anticodon complementary to the amber mutation, then it can in effect suppress the termination event caused by the amber codon, and allow protein translation to continue. Should the original codon triplet have specified tyrosine, then suppression of its amber mutant derivative by a tyrosine suppressor tRNA will restore a normal protein sequence. However, other amino acid codons can also give rise to amber codons through single-base changes. (For example, UCG (a serine codon) and GAG (a glutamic acid codon) can form UAG amber codons via C-to-A and G-to-U mutations, respectively). In these cases, a tyrosine suppressor tRNA will overcome protein truncation, but with a resulting protein bearing a tyrosine substitution in place of the normal amino acids. 
Often, however, proteins can retain some function after a single amino acid substitution, albeit at reduced efficiency. Even in such cases, suppression of the amber mutation will be detectable.
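
The codon changes described in this note can be checked with a short sketch, which lists every amino acid codon lying one base change away from the amber codon UAG in the standard genetic code, together with the matching anticodon. The function and variable names are invented for illustration, and the anticodon is taken as a plain reverse complement, ignoring wobble pairing and the modified bases found in real tRNAs.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acids ('*' = stop), codons ordered UUU, UUC, UUA, UUG, UCU ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("UCAG", "AGUC")


def single_step_precursors(stop_codon="UAG"):
    """Map each sense codon one base away from stop_codon to (amino acid, mutation)."""
    precursors = {}
    for position in range(3):
        for base in BASES:
            if base == stop_codon[position]:
                continue
            codon = stop_codon[:position] + base + stop_codon[position + 1:]
            amino_acid = CODON_TABLE[codon]
            if amino_acid != "*":  # skip other stop codons such as UAA
                precursors[codon] = (amino_acid, f"{base}-to-{stop_codon[position]}")
    return precursors


def anticodon_for(codon):
    """Anticodon (written 5' to 3') that is the exact reverse complement of a codon."""
    return codon.translate(COMPLEMENT)[::-1]


for codon, (amino_acid, mutation) in sorted(single_step_precursors().items()):
    print(codon, amino_acid, mutation)
print("anticodon reading UAG:", anticodon_for("UAG"))
```

Only two of the listed codons (UAC and UAU) encode tyrosine, so for all the others suppression by a tyrosine suppressor tRNA necessarily leaves a tyrosine substitution in the protein.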

The particular tyrosine suppressor tRNA chosen (termed supF) was a good choice, given the small size of the functional gene (including its promoter sequence), and also the fact that it is easy to select or screen for. (A host cell with an amber mutation in an antibiotic resistance gene cannot grow on the selective antibiotic, but will survive if the supF tRNA is expressed). As with the Venter group genomic synthesis, Khorana and colleagues made their tRNA gene from single-stranded bits, rendered them as duplexes, and joined them together appropriately. The difference, of course, is the advance of technology which enabled Venter et al. to make much longer initial strands, and perform the operation on a much faster time scale.

‘….just as a synthetic drug with exactly the same molecular structure as a drug from natural sources…’   Unfortunately, this simple statement of fact is still unclear to a large number of people.

‘….especially the phosphoramidite chemistry…’   As historical examples, see Vu et al. 1990; Brown 1993.

‘…..the advent of the polymerase chain reaction….’   For some background on this, see the specific article on PCR in the Searching for Molecular Solutions ftp site.

‘…..the familiar E. coli has a genome of 4.6 Mb….’   Sequenced in 1997 by Blattner et al. for strain K-12; varies between different strains.

‘…..the relatively small size of the M. mycoides genome…’   This is by no means the smallest known microbial genome, and in fact the Venter team had previously synthesized (but not functionally deployed) the Mycoplasma genitalium genome, the smallest known from an organism capable of independent growth in defined laboratory media (only 0.58 Mb; Gibson et al. 2008). However, M. mycoides was chosen as the basis of the later functional synthetic genome design because of its better growth properties.

‘…….three corresponding publications.’   As above; Gibson et al. 2008; Lartigue et al. 2009; Gibson et al. 2010.

‘……the pre-existing RNA World.’   The RNA World as a stage of early molecular evolution has been referred to in previous posts (5th April;  3rd May).

‘….analogies have been made between genomic information as a tape…’   Jack Cohen has made this analogy with respect to the difficulties of performing a ‘Jurassic Park’ scenario with amplified ancient DNA, when the cellular environment for such DNAs (eggs) is lost. (published in Brockman, J. & Matson, K. (eds.) How Things Are: A Science Tool-Kit for the Mind. [William Morrow, New York, 1995].)

‘…..passive sequences carried by bacterial plasmids…’   A foreign sequence can of course be expressed in a bacterial host if a bacterial promoter for RNA polymerase recognition is provided.

‘…..Relatively recent technological developments have taken this to new levels of proficiency.’    In diverse organisms, homologous recombination has been used to effect custom-designed genomic alterations. In E. coli, the use of certain phage proteins has led to the development of in vivo ‘recombineering’ and considerably facilitated DNA construction methods (see Sharan et al. 2009). Oligonucleotide-driven recombineering is applicable in diverse bacterial species (Swingle et al. 2010).

‘…..the term ‘synthetic cell’, which Gibson et al. used to describe…’   See Gibson et al. 2010.

‘…..generation of artificial cell membranes is being actively researched….’   For example, see the recent paper of Noireaux et al. 2011.

‘…..a minimal core ….. should be definable…’    In the general context of bacterial cytoplasmic contents, it is interesting to reflect on a previous post, which discussed the notion of Cell-Harbored Autonomous RNA Molecules, or CHARMs. If any such self-replicating entities were present in bacteria, and completely independent of genomic controls, then they would persist in the genome-transplanted bacteria. (Of course, this is only a formal statement of possibility, should such presumed relics of the RNA World exist at all. And no current evidence supports this).

‘…… technology grows by accretion and combinatorial building upon previous achievements.’   This is further explored in W. Brian Arthur’s book, The Nature of Technology: What It Is and How It Evolves; Free Press, 2009.

‘…… as, for example, the fundamental importance of compartmentalization.’   See Szathmary & Maynard Smith (1995) for their work on the major evolutionary transitions, which include compartmentalization.
