
Speed Matters: Biological Synthetic Rates and Their Significance

September 21, 2014

Previous posts from biopolyverse have grappled with the question of biological complexity (for example, see the post of January 2014). In addition, the immediate predecessor to the current post (April 2014) discussed the essential role of molecular alphabets in allowing the evolution of macromolecules, themselves a necessary precondition for the complexity requirements underlying functional biology as we understand it. Yet although molecular alphabets enable very large molecules to become the springboard for biological systems, another often overlooked factor in their synthesis exists, and that is the theme of the present post.

Initially, it will be useful to consider some aspects of the limitations on molecular size in living organisms.

 How Big is Big?

If it is accepted that biological complexity requires molecules of large sizes (as examined in the previous post), what determines the upper limits of such macromolecules? At the most fundamental level of chemistry, ultimately determined by the ability of carbon atoms to form concatenates of indefinite length, no direct constraints on biomolecular size appear to exist. In seeking examples to demonstrate this, we need look no further than the very large single duplex DNA molecules which constitute individual eukaryotic chromosomes. The wheat 3B chromosome is among the largest known of these, with almost a billion base pairs, and a corresponding molecular weight of around 6.6 × 10^11 Daltons.
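As a quick back-of-the-envelope check (assuming the standard approximation of roughly 650 Daltons per base pair of duplex DNA), the quoted molecular weight follows directly from the base-pair count:

```python
# Rough sanity check of the wheat 3B chromosome's molecular weight.
# Assumes an average of ~650 Da per base pair of double-stranded DNA
# (a common textbook approximation covering both strands).
bp = 1.0e9          # ~1 billion base pairs (wheat chromosome 3B)
da_per_bp = 650.0   # approximate mass of one base pair, in Daltons
mw = bp * da_per_bp
print(f"Estimated MW: {mw:.1e} Da")  # ~6.5e11 Da, consistent with the ~6.6e11 figure
```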

But in almost all known eukaryotic cases, an individual chromosome does not equate with genome size. In other words, a general rule is that it takes more than one chromosome to constitute even a haploid (single-copy) genome. Why then should not all genomes be composed of a single (very) long DNA string, rather than being constituted from separate chromosomal segments? And why should separate organisms differ so markedly in their chromosome numbers (karyotypes)? At least a part of an answer to this may come down to contingency, where alternative chromosomal arrangements may have been equally effective, but one specific configuration has become arbitrarily fixed during evolution of a given species. But certainly other factors must exist which are connected ultimately to molecular size. A DNA molecule of even ‘average’ chromosomal size in free solution would be an impractical prospect for containment within a cell nucleus of eukaryotic dimensions, unless it was ‘packaged’ in a manner such that its average molecular volume was significantly curtailed. And of course the DNA in natural chromosomes is indeed packaged into specific complexes with various proteins (particularly histones), and to a lesser extent RNA, termed chromatin.

Yet even a good packaging system must have its limits, and in this respect it is likely that selective pressures exist that act as restrictions on the largest chromosomal sizes. An extremely long chromosomal length may eventually reach a point where its functional efficiency is reduced, and organisms bearing such karyotypic configurations would be at a selective disadvantage.

No biological proteins can begin to rival the sheer molecular weights of chromosomal DNA molecules, but once again there is no fundamental law that prevents polypeptide chains from attaining an immense length, purely from a chemical point of view. Of course, proteins (in common with functional single-stranded RNA molecules) have a very significant constraint placed upon them relative to linear DNA duplexes. Biological proteins must fold into specific three-dimensional shapes even to attain solubility, let alone exhibit the astonishing range of functions which they can manifest. This folding is directed by primary amino acid sequence, and this dictate dramatically reduces the number of potentially useful forms which could arise from a polypeptide of even modest length. Yet since the largest proteins (such as titin, considered in the previous post) are composed of a series of joined modules, the ‘module-joining’ could in principle be extended indefinitely to produce proteins of gargantuan size.

So why not? Why aren’t proteins on average even bigger? Here one might recall a saying attributed to Einstein, “Keep things as simple as possible, but no simpler”, and repackage it into an evolutionary context. Although many caveats can be introduced, it is valid to note that evolutionary selection will tend to drive towards the most parsimonious ‘solutions’ to biological imperatives. Thus, the functions performed by proteins are usually satisfied by molecules which are large by the standards of small-molecule organic chemistry, but much smaller than titin-sized giants of nearly 30,000 amino acid residues. A larger version of an existing protein will require an increased energy expenditure for its synthesis, and therefore will be selected against unless it offers a counter-balancing significant advantage over the existing wild-type form.

So selective pressures ultimately deriving from the cellular energy balance-sheet will often favor smaller molecules, if they can successfully compete against larger alternatives. But another factor to note in this context – and this brings us to the major theme of this post – is the sheer time it takes to synthesize an exceedingly large molecule. Clearly, this synthetic time is itself determined by the maximal production rates which can be achieved by biochemical mechanisms available to an organism. Yet even with the most efficient systems, it is inevitable that eventually a molecular size threshold will be crossed where the synthetic time requirement becomes a negative fitness factor. In this logical scenario, a ‘megamolecule’ might provide a real fitness benefit, but lose competitiveness through the time lag required for its synthetic production relative to alternative smaller molecular forms.

These ‘drag’ effects of biosynthetic time requirements are not merely hypothetical, and can be relevant for chromosomal DNA replication, to briefly return to the same example as used above. Although as we have seen, chromosome length and number do not directly equate with genome size, as far as a cell is concerned, it is the entire genome that must be replicated before cell division can proceed. In this respect, it is notable that certain plants have genomes of such size that their genomic replication becomes a significant rate-limiting step in comparison to other related organisms.

Life in the Fast Lane

Let’s consider primordial replicative biosystems (perhaps pre-dating even the RNA World, and certainly the RNA-DNA-Protein World – see a previous post), where the machinery for replication of informational biomolecules is at a rudimentary stage of evolutionary development. In such a case, it can be proposed that an individual biosystem will selectively benefit from mutations in catalysts directing its own replication, where the mutational changes increase the efficiency and rate of replicative synthesis. This simply follows from the supposition that for biosystems A and B replicating in time t, if for one copy of B, n copies of A are made (where n > 1.0), then A systems will eventually predominate. Even values of n only marginally above 1.0 will still have the same end result. In principle, numerous factors could result in an enhancement of this n value, but here we are assuming that a simple increase in replicative rate would do the trick.
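The compounding advantage of even a slight replicative edge can be illustrated with a minimal toy model (the growth factors here are arbitrary illustrative values, not measured rates):

```python
# Toy model of two competing replicators: per time interval t, each copy
# of B replicates one-for-one, while each copy of A yields n copies
# (n > 1.0 gives A an edge). Even a marginal advantage compounds
# geometrically, so A comes to dominate the population.
def fraction_A(n, generations, a0=1.0, b0=1.0):
    a, b = a0, b0
    for _ in range(generations):
        a *= n      # A multiplies by n each interval
        b *= 1.0    # B merely maintains itself
    return a / (a + b)

for n in (1.01, 1.1, 2.0):
    print(f"n = {n}: A fraction after 1000 intervals = {fraction_A(n, 1000):.6f}")
```

Even n = 1.01 (a one percent edge) leaves A constituting essentially the whole population after enough intervals.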

But improved replicative rates could also have an accelerating effect on early biosystem molecular evolution, by enabling the synthesis of larger molecular forms than were previously feasible. This assumes that a slow replication rate for essential biomolecular components of an early ‘living’ system would mean that its upper molecular size limits were much more constrained than for alternative ‘faster’ variants. Such a scenario could arise for any very long molecular concatenate whose replication rate was too slow to be an effective functional member of a simple co-operative molecular system. Faster replication rates would then be in effect enabling factors for increased molecular size, and in turn increased molecular complexity. Fig. 1 depicts this putative effect in two possible modes of operation.




Fig. 1: Proposed effects of enhancement in synthetic rates as enabling factors for increased molecular size and complexity in early biosystems. Increased rates of biosynthesis leading to increased replicative rates in themselves provide a selective advantage (top panel). Yet it can also be considered that an acceleration of synthetic rate potential could also act as an enabling factor for increased potential molecular size, and in turn increasingly complex molecular structures. This might occur through ‘quantum leaps’ (bottom panel, A), where at certain crucial junctures a small rate increase has a large flow-on effect in terms of size enablement, or via a more continuous process (B), where rate increases are always associated with size and complexity enablement. In both cases, though, such effects could not occur indefinitely, owing to an increasing need for regulation of synthetic rates within complex biosystems.



In a very simple replicative system, a single catalyst might determine the replication rate of all its individual components, and accordingly the replication speed of the system as a whole. But increasing catalytic replicative efficiency could become a victim of its own success as system complexity (associated with enhanced reproductive competitiveness) rises. In such cases, differential replicative rates of different components will determine system efficiency. It is both energetically wasteful and potentially a wrench in the works if system components only needed in several copies are made at the same level as components needed in hundreds of copies. Clearly, system regulation is needed in such circumstances, and without it, molecular replication enhancement is likely to be detrimental beyond a certain point. This eventuality is schematically depicted in Fig. 2.





Fig. 2: Proposed effect of introduced regulatory sub-systems on sustaining enhanced biosystem replicative rates. This suggests that even at the same replicative speed, a regulated system will be better off than an unregulated one; and that higher speeds may be permitted by tight regulation. But limits are placed even here. Absence of controlled regulation would probably apply only in the very earliest of emerging biosystems. In other words, the co-evolved regulation is likely to have been a fundamental feature of biosystem synthetic rates, since an imbalance between rates of production of the components of gene expression would be deleterious even in simple systems.



Until this point, we have been considering replication of biosystem molecules in quite simplistic terms. In real systems of a biological nature, functional molecules undergo several levels of processing beyond their basic replicative synthesis. It is appropriate at this point to take a quick look at some of these.


Processing Levels and Biological Synthetic Speed


In even relatively simple bacterial cells, both RNA and protein molecules typically undergo extensive processing, in a variety of ways. And this trend is considerably more emphasized in complex eukaryotes. Although an in-depth discussion of such effects is beyond the scope of the present post, some of them (but by no means all) are listed in Table 1 below.




Table 1. Levels of processing involving primary transcription or translation. These processes can be considered as secondary steps which are required for the complete maturation of biological macromolecules, varying by type and biological circumstances. Where several processing levels are necessary, any one of them is potentially a rate-limiting step for production of the final mature species. It should be noted that while some of these processes are near-universal (such as accurate protein folding following primary polypeptide chain expression), some are restricted to a relatively small subset of biological systems (such as protein splicing via inteins).


One way of enhancing the overall production rates of biological macromolecules bearing modifications after primary transcription and translation is to couple processes together. For protein expression, transcription and maturation of the mRNA are themselves necessary initial steps, and mRNA and protein synthesis are in fact coupled in prokaryotic cells. Where transcription and translation are so linked, a nascent RNA chain can interact with a ribosome for polypeptide translation initiation before transcription is complete.

In contrast, such transcriptional-translational coupling is not found in eukaryotic cells, where mature mRNAs are exported from the nucleus for translation via cytoplasmic ribosomes. Yet examples of ‘process coupling’ can certainly still be uncovered in complex eukaryotes, with a good example being the coupling of primary transcription with the removal of intervening sequences (introns) via splicing mechanisms mediated by the RNA-protein complexes termed spliceosomes.

The sheer complexity of the diverse processing events for macromolecular maturation in known biological systems serves to emphasize the above-noted point that regulation of the replication of biomolecules in general is far from a luxury, but an absolute pre-requisite. Before complex biosystems had any prospects of emerging in the first place, at least basic regulatory systems for replicative processes would necessarily have already been in place, in order to allow the smooth ‘meshing of parts’ which is part and parcel of life itself.

 Speed Trade-Offs and Regulation

There is certainly more than one way for a replicative system to run off the rails, like a metaphorical speeding locomotive, if increasing replicative rates are not accompanied by regulatory controls. A key factor which will inevitably become highly significant in this context is the replicative error rate, or replicative fidelity. ‘Copying’ at the molecular level would ideally be perfect, but this is no more attainable in an absolute sense than the proverbial perpetual motion machine, and for analogous entropic reasons. Thus, what a biosystem can gain in the roundabouts with an accentuated replication rate, it may lose in the swings with loss of replicative accuracy. The problem of fidelity, particularly with the replication of key informational DNA molecules, has been addressed up to a point by the evolution of proof-reading mechanisms (where DNA polymerases possess additional enzymatic capabilities for excising mismatched base-pairs), and DNA repair systems (where damaged DNA is physically restored to its original state, to avoid damage-related errors being passed on with the next replication round). Although such systems might seem obviously beneficial for an organism, there are trade-offs in such situations. Proof-reading may act as a brake on replicative speeds, and also comes at a significant energetic cost.

The complexities of regulatory needs also dictate that rates at some levels of biological synthesis are less than what could be achieved were the component ‘factories’ to be completely unfettered. A good example of this is the relative rate of translation in prokaryotes vs. eukaryotes, where the latter have a significantly slower rate of protein expression on ribosomes. It is highly likely that a major reason for this is the greater average domain complexity of eukaryotic proteins, which require a concomitantly longer time for correct folding to occur, usually as directed by protein chaperones. A striking confirmation of this, as well as a very useful application, has been to employ mutant ribosomes in E. coli with a slower expression rate. When this was done, significant enhancement of the folding of eukaryotic proteins was observed, to the point where proteins otherwise virtually untranslatable in E. coli could be successfully expressed.

Speed Limits In Force?

How can the rates of biological syntheses be slowed down? In principle, one could envisage a number of ways that this could be achieved. In one such process, the degeneracy of the genetic code (where a single amino acid is specified by more than one codon) has been exploited through evolutionary time as a means of ‘speed control’ in protein synthesis. Degenerate ‘synonymous’ triplet codons differ at the third ‘wobble’ position. For example, the amino acid alanine is specified by four mRNA codons: GCA, GCG, GCC, and GCU. Where synonymous codons in mRNAs are recognized by specific subsets of transfer RNA (tRNA) molecules within the total tRNA group charged with the same amino acid, translational speed can be significantly influenced by the sizes of the relevant intracellular tRNA pools. To illustrate this in simplified form, consider a specific amino acid X with codons A, B, C, and D, read by the corresponding tRNA molecules a, b, c, and d (such that when charged with the correct amino acid, tRNA-aX, tRNA-bX, tRNA-cX and tRNA-dX are formed). Here we arbitrarily assign tRNA-a and tRNA-b as together recognizing the codons A and B, and likewise tRNA-c and tRNA-d as together recognizing the codons C and D. If the tRNA pools for the C and D codons are smaller than those for the A and B codons, then the C / D synonymous codons are ‘slow’ in comparison with A and B. A known determinant of tRNA pool size (and thus in turn codon translational efficiency and speed) is the respective tRNA gene copy number. Thus, in this model, it would be predicted that the gene copy number for (A + B) would be significantly greater than for (C + D). Where there are selectable benefits in slowing down translation rates, the use of ‘slow’ codons is thus a useful strategy known to be pervasively applied in biology.
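A minimal sketch of this ‘slow codon’ idea, with purely hypothetical tRNA pool sizes, might treat the dwell time at each codon as inversely proportional to the abundance of the tRNAs that read it:

```python
# Sketch of the 'slow codon' model: the ribosome's dwell time at a codon
# is taken as inversely proportional to the size of the tRNA pool reading
# that codon. Pool sizes are hypothetical, chosen only for illustration:
# codons A and B are served by abundant tRNAs, C and D by scarce ones.
tRNA_pool = {"A": 800, "B": 800, "C": 100, "D": 100}

def translation_time(codon_sequence, k=1.0):
    """Total dwell time over a codon sequence: k / (pool size), summed."""
    return sum(k / tRNA_pool[c] for c in codon_sequence)

fast = "AABB"   # uses only abundantly-read codons
slow = "CCDD"   # synonymous but read by scarce tRNAs
print(f"fast sequence: {translation_time(fast)}")
print(f"slow sequence: {translation_time(slow)}")
```

The two sequences would encode the same dipeptide stretch of X residues, yet the ‘slow’ synonymous choice takes several-fold longer under this model.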

So, the initial and simplistic picture of ‘more is better’ which is logically applicable in very basic organized biosystems (Fig. 1) is not compatible with more advanced cellular systems. This must be kept in mind if we ask whether current biological synthetic rates could be accelerated across the board, either through natural evolution or artificial synthetic biological intervention. So much interlinking of distinct biological processes exists that it would seem difficult for evolutionary change itself to have much impact on synthetic rates in the most fundamental circumstances. Single mutations that accelerate a synthetic process will almost always fail to accommodate the global biosystem’s optimal requirements, and therefore elicit a fall in fitness. From this stance, fundamental synthetic rates would seem likely to be ‘locked in’ or ‘frozen’ by the need for each component of complex regulatory networks to be compatible with each other. Synthetic biology, on the other hand, is not necessarily limited in this way, but even here the would-be biological tinkerer would have to construct multiple changes in a biosystem at once. So global and fundamental changes in biological synthetic rates are not likely to be on the agenda in the near-term future.

To conclude, a biopoly(verse) appropriate for this post’s theme:


Let’s consider synthetic speed

As a potent driver, indeed

An organism’s fate

May come down to rate

The faster, the more it can breed



But recall the many caveats made above with respect to regulation…..


References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).

‘……..wheat 3B chromosome is among the largest known of these……….’     See Paux et al. 2008.

‘….in almost all known eukaryotic cases, an individual chromosome does not equate with genome size.’     The Australian ant Myrmecia pilosula (the ‘jack jumper’ ant) has been reported to have only a single chromosomal pair, such that somatic cells of haploid males bear only a single chromosome. See Crosland & Crozier 1986.

An extremely long chromosomal length may eventually reach a point where its functional efficiency is reduced, and organisms bearing such karyotypic configurations would be at a selective disadvantage.‘     The evolution of chromosome length cannot be studied without considering the role of non-coding DNA, which composes a large percentage of the total genomes of many organisms. By reducing the amounts of non-coding DNA tracts relative to coding sequences, chromosome number can be reduced without necessitating commensurately extended individual remaining chromosomes.

‘….the number of potentially useful forms which could arise from a polypeptide of even modest length….’     Even a small protein of 100 amino acid residues could in principle be composed of 20^100 different sequences; for a protein of titin size the number is beyond hyper-astronomical (20^26,926).
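The arithmetic behind these numbers is simple to verify; working in log10 keeps the figures legible:

```python
import math

# The number of possible sequences for an L-residue protein over the
# 20-letter amino acid alphabet is 20**L. Reporting log10(20**L) avoids
# generating the astronomically large integers themselves.
def log10_sequences(length, alphabet=20):
    return length * math.log10(alphabet)

print(f"100 residues:    10^{log10_sequences(100):.0f} sequences")
print(f"26,926 residues: 10^{log10_sequences(26926):.0f} sequences")
```

So 20^100 is roughly 10^130, and the titin-sized sequence space is roughly 10^35,000.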

‘….titin-sized giants of nearly 30,000 amino acid residues….’       Titins and other very large proteins are found in muscle tissues, where they have a physical role as molecular ‘springs’ and fibers, or their attendant co-functionary species. It is presumed that in this specialized context, proteins of such extreme size were advantageous over possible alternatives with smaller macromolecules.

‘…..certain plants have genomes of such size that their genomic replication becomes a significant rate-limiting step…’    Here the plant Paris japonica, with 1.3 × 10^11 base pairs, is the current record-holder, and has a correspondingly slow growth rate. See a Science report by Elizabeth Pennisi.

‘….protein splicing via inteins….’     For a recent review and discussion of intein applications, see Volkmann & Mootz 2013.

‘……a good example being the coupling of primary transcription with the removal of intervening sequences (introns) via splicing mechanisms ……. ‘     See Lee & Tam 2013 for a recent review.

‘……such systems [proof-reading and repair] might seem obviously beneficial for an organism, there are trade-offs in such situations….’      It is also interesting to consider that a low but significant level of mutation is ‘good’ in evolutionary terms, in providing (in part, along with other mechanisms such as recombination) the raw material of genetic diversity upon which natural selection can act. But of course, this benefit is not foreseen by selection upon individual organisms: only immediately selectable factors such as metabolic costs are relevant in such contexts.

‘…..proof-reading mechanisms (where DNA polymerases possess additional enzymatic capabilities for excising mismatched base-pairs……’     Proof-reading DNA polymerases possess 3’-exonucleolytic activity that excises base mismatches, allowing corrective re-insertion of the appropriate base.

‘……has been to employ mutant ribosomes in E. coli with a slower expression rate. ….. significant enhancement of the folding of eukaryotic proteins was observed….’      For this work, and a little more background on eukaryotic vs. prokaryotic expression, see Siller et al. 2010.

‘…..the degeneracy of the genetic code (where a single amino acid is specified by more than one codon) has been exploited through evolutionary time as a means for ‘speed control’….’      Different classes of eukaryotic proteins have different requirements for enforced ‘slow-downs’, and secreted and transmembrane proteins are major examples of those which benefit from such imposed rate controls. (See Mahlab & Linial 2014). Additional complications arise from the role of sequence context effects (local mRNA sequence environments), as noted in prokaryotes by Chevance et al. 2014. In E. coli, many specific synonymous codons can be removed and replaced with others with little apparent effect on fitness, but notable exceptions to this have been found. See in this respect the study by Lajoie et al. 2013.


Next post: January 2015.


Alphabetic Life and its Inevitability

April 23, 2014


In the very first post of this series, reference was made to ‘molecular alphabets’, and in a post of last year (8th September) it was briefly proposed that molecular alphabets are so fundamental to life that a ‘Law of Alphabets’ might even be entertained. This theme is further developed in this current post.

 How to Make A Biosystem

The study of natural biology provides us with many lessons concerning the essential properties of life and living systems. A recurring and inescapable theme is complexity, observed across all levels from the molecular to cellular scales, and thence to whole multicellular organisms. While the latter usually have many layers of additional complexity relative to single-celled organisms, even a typical ‘simple’ free-living bacterial cell possesses breathtakingly complex molecular operations which enable its existence.

Why such complexity? In the first case, it is useful to think of the requirements for living systems, as we observe them. While comprehensive definitions of life are surprisingly difficult, the essence of biology is often seen as informational transfer, where the instructions for building an organism (encoded in nucleic acids) are replicated successively through continuing generations. (A crucial accompaniment to this is the ability of living organisms to evolve through Darwinian evolution, since no replicative system can ever be 100% error-free, and reproductive variation provides the raw material for natural selection). But while the replication of genomes may be the key transaction, it is only enabled by a wide array of accompanying functions provided by (largely) proteins. The synthesis of proteins and complex cellular structures requires energy and precursor molecules, so systems for acquiring and transducing these into usable forms must also be present.

Molecular Size and Life

The primal ‘motive’ of biological entities to replicate themselves requires a host of ancillary systems for creating the necessary building blocks and structuring them in the correct manner. All this requires energy, the acquisition and deployment of which in turn is another fundamental life need. Processes for molecular transport and recognition of environmental nutrients and other factors are also essential. And since organisms never exist in isolation, systems for coping with competitors and parasites are not merely an ‘optional extra’. Although all of these activities are associated with functional requirements necessitating certain distinct catalytic tasks, a major driver of complexity is the fundamental need for system regulation. In many cases, the orderly application of a series of catalyses is essential for obtaining an appropriate biological function. But in general, much regulatory need comes down to the question of efficiency.

This has been recognized from the earliest definition of regulatory systems in molecular biology. The lac operon of E. coli regulates the production of enzymes (principally β-galactosidase) involved with the metabolism of the sugar lactose. If no lactose is available in the environment, it is clearly both functionally superfluous and energetically wasteful to synthesize the lactose-processing enzymes. Thus, a regulatory system that responds to the presence of lactose and switches on the relevant enzyme production would be beneficial, and this indeed is what the natural lac operon delivers. In general, an organism that possesses any regulatory system of this type (or various other types of metabolic regulators) will gain a distinct competitive edge over organisms lacking them. And hence this kind of selection drives the acquisition of complex regulatory systems.
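The selective logic of inducible regulation can be caricatured with a toy cost/benefit calculation (all numbers here are illustrative assumptions, not measured values):

```python
# Toy cost/benefit comparison of inducible vs constitutive enzyme
# expression, in the spirit of the lac operon. Expressing the enzyme
# costs 'cost' fitness units; metabolizing available lactose yields
# 'benefit' units. All values are illustrative only.
def fitness(strategy, lactose_present, cost=1.0, benefit=3.0):
    express = lactose_present if strategy == "inducible" else True
    gain = benefit if (express and lactose_present) else 0.0
    return gain - (cost if express else 0.0)

for lactose in (True, False):
    print(f"lactose={lactose}: "
          f"inducible={fitness('inducible', lactose)}, "
          f"constitutive={fitness('constitutive', lactose)}")
```

With lactose present the two strategies tie; with lactose absent the constitutive strain pays the expression cost for nothing, so the regulated strain is never worse off.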

So, if complexity is a given, how can this be obtained in molecular terms? How can the molecular requirements for both high catalytic diversity and intricate system regulation be satisfied? An inherent issue in this respect is molecular size. Biological enzymes are protein molecules that can range in molecular weight from around 10 kilodaltons (kD) to well over an order of magnitude greater. If we look beyond catalysts to include all functional molecules encountered in complex multicellular organisms, we find the huge protein titin, an essential component of muscle. Titin is composed of a staggering 26,920 amino acid residues, clocking up a molecular weight of around 3 megadaltons.

But in terms of catalysis itself, why is size an issue? This is a particularly interesting question in the light of relatively recent findings that small organic biomolecules can be effective in certain catalytic roles. Some of these are amino acids (proline in particular), and have hence been dubbed ‘aminozymes’. While certain catalytic processes in living cells may be mediated by such factors to a greater degree than previously realized, small molecule catalysis alone cannot accommodate the functional demands of complex biosystems.

This assertion is based on several factors, including: (1) Certain enzymatic tasks require stabilization of short-lived transitional states of substrate molecules, accomplished by a binding pocket in a large molecule, but difficult to achieve otherwise; and (2) Some necessary biological reactions require catalytic juxtaposition of participating substrate molecules across relatively large molecular distances, a function which small molecules are unlikely to be able to satisfy. Even apart from these dictates, the necessity of efficient regulation, as considered above, also limits possible roles for small molecules. A fundamental mechanism for biological control at the molecular level is the phenomenon of allostery, where binding of a regulatory molecule to a site in a larger effector molecule causes a conformational change, affecting the function of the effector molecule at a second distant active site. By definition, to be amenable to allosteric regulation, an effector molecule must be sufficiently large to encompass both an effector site for its primary function (catalytic or otherwise) and a second site for regulatory binding.

Since better regulation equates with improved biosystem efficiency and biological fitness, the evolution of large effector molecules should accordingly be a logical advantage:



Fig. 1: Competitive Advantages and complexity


 Small Conundrums

Even if we accept that molecular complexity and associated molecular size is an inexorable requirement of complex life, why should such biosystems use a limited number of building blocks (molecular alphabets) to make large effector molecules? Why not, in the manner of an inspired uber-organic chemist, build large unique effectors from a wide variety of small-molecule precursor components?

Let’s look at this in the following way. Construction of a unique complex molecule from simpler precursors will necessitate not just one, but a whole series of distinct catalytic tasks, usually requiring in turn distinct catalysts applied in a coordinated series of steps. But, as noted above, mediation of most biological catalytic events requires complex molecules themselves. So each catalyst in turn requires catalysts for its own synthesis. And these catalysts in turn need to be synthesized……all leading suspiciously towards an infinite regress of complexity. This situation is depicted in Fig. 2:


Fig. 2. Schematic depiction of synthesis of a complex uniquely-structured (non-alphabetic) molecule. Note in each case that the curved arrows denote the action of catalysts, where (by definition) the catalytic agent promotes a reaction and may be transiently modified, but emerges at the end of the reaction cycle in its original state. A: A series of intermediate compounds are synthesized from various simpler substrates (S1, S2, S3 …), each by means of distinct catalysts (1, 2, 3….). Each intermediate compound must be sequentially incorporated into production of the final product (catalyst 6 …… catalyst i). Yet since each catalytic task demands complex mediators, each catalyst must in turn be synthesized, as depicted in B. Reiteration of this for each of (catalyst a …… catalyst j) leads to an indefinite regress.


These relatively simple considerations might suggest that attempts to make large ‘non-alphabetic’ molecules as functional biological effectors will inevitably suffer from severe limitations. Are things really as straightforward as this?

Autocatalytic Sets and Loops

There is a potential escape route from a linear infinite synthetic regression, and that is in the form of a loop, where the ends of the pathway join up. Consider a scenario where a synthetic chain closes on itself through a synthetic linkage between the first and last members. This is depicted in Fig. 3A below, where product A gives rise to B, B to C, C to D, and finally D back to A. Here the catalytic agents are shown as external factors, and as a result this does not really gain anything on the linear schemes of the above Fig. 2, since by what means are the catalysts themselves made? But what if the members of this loop are endowed with special properties of self-replicative catalysis? In other words, if molecule B acts on A to form B itself, and C on B to form C, and so on. This arrangement is depicted in Fig. 3B.



Fig. 3. Hypothetical molecular synthetic loops, mediated by external catalysts (A), self-replicating molecules (B), or a self-contained autocatalytic set (C). In cases (B) and (C), each member can act as both a substrate and a catalyst. In case (B), each member can directly synthesize a copy of itself through action on one of the other members of the set, whereas in case (C) the replication of each member is indirect, occurring through their coupling as an autocatalytic unit. Note that in case (B) each catalysis creates a new copy of the catalyst itself, as well as preserving the original catalyst. For example, for molecule D acting on molecule C, one could write: C [D-catalyst] → D + [D-catalyst] = 2D. In case (C) it is also notable that the entire cycle can be initiated by 4 of the 6 possible pairs of participants taken from A, B, C and D. In other words, the (C) cycle can be initiated by starting only with pairs AD, AB, CD, and CB – but not with the pairs AC and BD. As an example for a starting population of A and D molecules: D acts on A to produce B; remaining A can act on B to produce C; remaining B can act on C to produce D; remaining C acts on D to produce A, thus completing the cycle. If the reaction rates for each were comparable, a steady-state situation would result, tending to equalize the concentrations of each participant.
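The bootstrap property of the Fig. 3C loop can be checked mechanically. The following is a minimal sketch (the rule encoding and function names are illustrative, not from the post) of the four catalytic rules, testing which starting pairs can regenerate the full set:

```python
from itertools import combinations

# Catalytic rules of the Fig. 3C loop, as catalyst -> (substrate, product):
# D converts A to B, A converts B to C, B converts C to D, C converts D to A.
RULES = {"D": ("A", "B"), "A": ("B", "C"), "B": ("C", "D"), "C": ("D", "A")}

def closes_cycle(start):
    """True if the starting members can regenerate the whole set {A, B, C, D}."""
    present = set(start)
    changed = True
    while changed:
        changed = False
        for catalyst, (substrate, product) in RULES.items():
            if catalyst in present and substrate in present and product not in present:
                present.add(product)
                changed = True
    return present == {"A", "B", "C", "D"}

for pair in combinations("ABCD", 2):
    print("".join(pair), closes_cycle(set(pair)))
# AB, AD, BC and CD close the cycle; AC and BD stall, matching the caption.
```

Running this reproduces the caption's claim: only the pairs lacking any catalyst-substrate adjacency (AC and BD) fail to ignite the loop.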



But the scenarios of Fig. 3 do not yet address the problem of how to attain the increasing molecular size and complexity needed for intricate biosystems in a non-alphabetic manner. This feature can readily be added if we assume a steady increase in complexity / size around a loop cycle, with a final re-production of an original component (Fig. 4). These effects could be described in the terms used for biological metabolism: the first steps in the cycle are anabolic (building up complexity), while the final step is catabolic (breaking down complex molecules into simpler forms).

Fig. 4. A hypothetical autocatalytic loop, where black stars denote rising molecular size and complexity. For simplicity, each component is rendered in blue when acting as a product or substrate, and in red when acting as a catalyst. The additional co-substrates and/or cofactors (assumed here to be simple organics that are environmentally available) are also depicted (S1 – S3) for molecules D, A, and B acting as catalysts. Since C cleaves off an ‘A moiety’ from molecule D, no additional substrate is depicted in this case.


Of course, the schemes of Figs. 3 & 4 are deliberately portrayed in a simple manner for clarity; in principle the loops could be far larger and (as would seem likely) also encompass complex cross-interactions between members of each. Both anabolic and catabolic stages (Fig. 4) could be extended into many individual steps. The overall theme is the self-sustaining propagation of the set as a whole.

So, could autocatalysis allow the production of large, complex and non-alphabetic biomolecules, acting in turn within entire biosystems constituted in such a manner? The hypothetical loop constructs as above are easy to design, but the central question is whether the principles are viable in the real world of chemistry.

In order to address this question, an important point to note is that not just a few such complex syntheses would need to be established for a non-alphabetic biosystem, but very many. And each case would need to serve complex and mutually interacting functional requirements. It is accordingly hard to see how the special demands of self-sustaining autocatalytic loops could be chemically realized on this kind of scale, even if a few specific cases were feasible. The ‘chemical reality’ problem with theoretical autocatalytic systems has been elegantly discussed by the late Leslie Orgel.

Even this consideration does not delve into the heart of the matter, for we must consider how life on Earth – and indeed life anywhere – may attain increasing complexity. This, of course, involves Darwinian evolution via natural selection, which operates on genetic replicators. It is not clear how an autocatalytic set could produce stable variants that could be selected for replicative fitness. Models for replication of such sets as ‘compositional genomes’ have been put forward, but in turn refuted by others. But in any case, there is an elegant natural solution to the question of how to attain increasing complexity, which is inherently compatible with evolvability.

 The Alphabetic Solution

And here we return to the theme of molecular alphabets, generally defined as specific sets of monomeric building blocks from which indefinite numbers of functional macromolecules may be derived, through covalently joined linear strings of monomers (concatemers). But how does the deployment of alphabets accomplish what non-alphabetic molecular systems cannot?

Here we can refer back to the above-noted issue of building complex molecules, and the problem of complexity regression for the necessary catalysts, and building the catalysts themselves. The special feature of alphabets is that, with a suitable suite of monomers, a vast range of functional molecules can be produced by concatenation of specific sequences of alphabetic members. We can be totally confident that this is so, given the lessons of both the protein and nucleic acid alphabets. The versatility of proteins for both catalysis and many other biological functions has long been appreciated, but since 1982 the ability of certain folded RNA single strands to perform many catalytic tasks has also become well-known. And specific folded DNA molecules can likewise perform varied catalyses, even though such effects have not been found in natural circumstances.

So, nature teaches us that functional molecules derived from molecular alphabets can perform essentially all of the tasks required to operate and regulate highly complex biosystems. But how does this stand with synthetic demands, seen to be a crucial problem with complex natural non-alphabetic structures? Two critical issues are pertinent here. Firstly, an alphabetic concatemer can be generated by simply applying the same catalytic ligation process successively, provided the correct sequence of monomers is attained. This is fundamentally unlike a complex non-alphabetic molecule, where sites of chemical modification may vary and thus require quite different catalytic agents. The other major issue addresses the question of how correct sequences of alphabetic concatemers are generated. In this case the elegant solution is template-based copying, enabled through molecular complementarities. This, of course, is the basis of all nucleic acid replication, through Watson-Crick base pairing. Specific RNA molecules can thus act both as replicative templates and folded functional molecules. The power of nucleic acid templating was taken a further evolutionary step through the innovation of adaptors (transfer RNAs), which enabled the nucleic acid-based encoding of the very distinct (and more functionally versatile) protein molecular alphabet.
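The economy of template-based copying can be illustrated in a few lines (a toy sketch: the base-pair mapping is standard Watson-Crick RNA pairing, while the function names are my own). One complementarity rule, applied twice, copies a sequence of any length without introducing any new catalytic task:

```python
# Watson-Crick complementarity for the 4-letter RNA alphabet.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def complementary_strand(template):
    """Synthesize the antiparallel complementary strand of a template."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def copy_via_template(template):
    """Two rounds of complementary synthesis regenerate the original sequence."""
    return complementary_strand(complementary_strand(template))

seq = "GGAUCCAUGC"
print(complementary_strand(seq))
print(copy_via_template(seq) == seq)  # True: one rule suffices for any length
```

The design point is the one made in the text: sequence information is propagated by a single templating principle, not by a molecule-specific suite of catalysts.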

But in order to achieve these molecular feats, a certain number of underlying catalytic tasks clearly must be satisfied in the first place. These are required to create the monomeric building blocks themselves, and all the ‘infrastructure’ needed for template-directed polymerization of specific sequences of new alphabetic concatenates. But once this background requirement is in place, in principle products of any length can be created without the need for new types of catalytic tasks to be introduced. In contrast, for non-alphabetic complex syntheses, the number of tasks required will tend to rise as molecular size increases. In a large series of synthetic steps towards building a very large and complex non-alphabetic molecule, some of the required chemical catalyses may be of the same type (for example, two discrete steps both requiring a transesterification event). But even if so, the specific sites of addition must be controlled in a productive (non-templated) manner. This requires some form of catalytic discrimination, in turn necessitating additional catalytic diversity. Fig. 5 depicts this basic distinction between alphabetic and complex non-alphabetic syntheses.



Fig. 5. Schematic representation of catalytic requirements for alphabetic vs. complex (non-repetitive) non-alphabetic syntheses. For alphabetic macromolecular syntheses, a baseline level of catalytic tasks (here referred to as a ‘complexity investment’ of N tasks) allows the potential generation of alphabetic concatenates of specific sequences and of indefinite length – thus shown by a vertical line against the Y-axis (this line does not intercept the X-axis since a minimal size of a concatenate is determined by the size of the alphabetic monomers). For non-alphabetic complex molecules of diverse structures, as molecular size increases the number of distinct catalysts required will tend to continually rise, to cope with required regiospecific molecular modifications performed with the correct stereochemistry. It should be stressed that the curved ‘non-alphabetic’ line is intended to schematically represent a general trend rather than a specific trajectory. Catalytic requirements could vary considerably subject to the types of large and complex molecules being synthesized, while still exhibiting the same overall increasing demand for catalytic diversity.
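The contrast in Fig. 5 can be rendered as a toy model. The functional forms and parameter values below are assumptions chosen purely for illustration (the post specifies only a fixed investment vs. a rising trend, not any particular curve):

```python
# Toy rendering of the Fig. 5 contrast. Alphabetic synthesis pays a fixed
# 'complexity investment' of N catalytic tasks, after which templated
# polymerization reuses the same machinery for products of any length.
def alphabetic_tasks(size, n_investment=50):
    return n_investment  # constant, independent of product size

# Non-alphabetic synthesis needs roughly one distinct catalytic task per
# unique regiospecific modification, so the task count grows with size.
# (The linear rate here is an arbitrary illustrative choice.)
def non_alphabetic_tasks(size, tasks_per_unit=0.5):
    return int(size * tasks_per_unit)

for size in (100, 1000, 10000):
    print(size, alphabetic_tasks(size), non_alphabetic_tasks(size))
```

Whatever the true shape of the non-alphabetic curve, the crossover is inevitable: a constant eventually falls below any increasing function of size.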


It must be noted that the above concept of a ‘complexity investment’ (Fig. 5) should not be misconstrued as arising evolutionarily prior to the generation of templated alphabetic syntheses. Progenitor systems enabling rudimentary templated syntheses would necessarily have co-evolved with the generation of templated products themselves. Yet once a threshold of efficiency was attained in direct and adapted templated molecular replication, a whole universe of functional sequences became potentially exploitable through molecular evolution.

And herein lies another salient point about molecular alphabets. As noted above, the secret of life’s ascending complexity is Darwinian evolution, and it is difficult to see how this could proceed with autocatalytic non-alphabetic systems. But variants (mutations) in a replicated alphabetic concatemeric string can be replicated themselves, and if functionally superior to competitors, they will prove selectable. Indeed, even for an alphabet with relatively few members (such as the 4-base nucleic acid alphabet), the numbers of alternative sequences for concatenates of even modest length soon become hyper-astronomical. And yet the tiny fraction of the total with some discernible functional improvement above background can potentially be selected and differentially amplified. Successive cumulative improvements can then ensue, eventually producing highly complex and highly ordered biological systems.
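A quick numerical check of the ‘hyper-astronomical’ claim (using the concrete 100-nucleotide example given later in the References & Details):

```python
from math import log10

# Sequence space of a 4-letter alphabet: 4**L distinct concatemers of length L.
L = 100
n_sequences = 4 ** L
print(len(str(n_sequences)))   # 61 digits, i.e. on the order of 10**60
print(round(L * log10(4), 1))  # 60.2: the exact exponent in base 10
```

Even if only one sequence in, say, 10^40 showed any function above background, around 10^20 selectable candidates would remain within this single length class.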

 Metabolic Origins vs. Genetic Origins and Their Alphabetic Convergence

The proposed importance of alphabets leads to considerations of abiogenesis, the question of ultimate biological beginnings. Two major categories of theories for the origin of life exist. The ‘genetic origin’ stance holds that some form of replicable informational molecule must have emerged first, which led to the molecular evolution of complex biological systems. This school of thought points to considerable evidence for an early ‘RNA World’, where RNA molecules fulfilled both informational (replicative) and functional (catalytic) roles. But given difficulties in modeling how RNA molecules could arise de novo non-biologically, many proponents of the RNA World invoke earlier, simpler hypothetical informational molecules which were later superseded by RNA.

An alternative view, referred to as the ‘metabolic origin’ hypothesis, proposes that self-replicating autocatalytic sets of small molecules were the chemical founders of biology, later diversifying into much higher levels of complexity.

Both of these proposals for abiogenesis have strengths and weaknesses, but the essential point to make in the context of the present post is that it is not necessary to take a stand in favor of either hypothesis in order to promote the importance of molecular alphabets for the evolution of complex life. In a nutshell, this issue can be framed in terms of the difference between factors necessary for the origin of a process, and factors essential for its subsequent development. In the ‘alphabetic hypothesis’, molecular alphabets are crucial and inescapable for enabling complex biosystems, but are not necessarily related to the steps at the very beginning of the process from non-biological origins.

If the ‘genetic origin’ camp are correct, then alphabets are implicated at the very beginning of abiogenesis. On the other hand, if the opinions of ‘metabolic origin’ advocates eventually hold sway, molecular alphabets (at least in the sense used for building macromolecules from a limited set of monomers) would seem to be displaced at the point of origin. But the biological organization we see around us on this planet (‘Life 1.0’) is most definitely based on well-defined alphabets. So, both abiogenesis hypotheses necessarily must converge upon alphabets at some juncture in the history of molecular evolution. For genetic origins, a direct progression in the complexity of both alphabets themselves and their derived products would be evident, but a metabolic origin centering on autocatalytic small-molecule sets must subsequently make a transition towards alphabetic systems, in order for it to be consistent with observable extant biology. Thus, stating that alphabets enable the realization of highly complex biological systems refers to all the downstream evolutionary development once alphabetic replicators have emerged. No necessary reference is accordingly made to the role of alphabets at the beginning of the whole process.

 A ‘Law of Alphabets’?

Now, the last issue to look at briefly in this post is the postulated universality of alphabets. It is clear that molecular alphabets are the basis for life on this planet, but need that always be the case? To answer this, we can revisit the above arguments: (1) Complex biosystems of any description must involve complex molecular interactions; (2) The demand for molecular complexity is inevitably associated with requirements for increasing molecular size; (3) Biological synthesis of a wide repertoire of large and complex functional molecules is difficult to achieve by non-alphabetic means; (4) The fundamental requirement for Darwinian evolution for the development of complex life is eminently achievable through alphabetic concatenates, but is difficult to envisage (and certainly unproven) via non-alphabetic means.

It is also important to note that these principles say nothing directly about the chemistry involved, and quite different chemistries could underlie non-terrestrial biologies. Even if so, the needs for molecular complexity and size would still exist, favoring in turn the elegant natural solution of molecular alphabets.

So if this proposal is logically sound, then it would indeed seem reasonable to propose that a ‘Law of Alphabets’ applies universally to biological systems. In a previous post, it was noted that an even more fundamental, but related ‘law’ could be a ‘Law of Molecular Complementarity’, since such complementarities are fundamental to known alphabetic replication. Indeed, it is difficult to conceive of an alphabetic molecular system where complementarity-based replication at some level is absent. Still, while complementarity may be an essential aspect of alphabetic biology, it does not encompass the whole of what alphabets can deliver, and is thus usefully kept in a separate, though intersecting, compartment.

To conclude, a biopoly(verse), delivered in a familiar alphabet:


If high bio-complexity may arise

In accordance with molecular size

Compounds that are small

Are destined to fall

And with alphabets, intricacy flies

References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).


‘…comprehensive definitions of life are surprisingly difficult…..’     For example, see Cleland & Chyba 2002; Benner 2010, Root-Bernstein 2012.

‘……The lac operon of E. coli regulates the production of enzymes……’     The story of the lac operon is a classic in molecular biology, included in most basic textbooks. The French group involved, led by Jacques Monod, won a Nobel prize for this in 1965. For a revisit of an old 1960 paper regarding the operon concept, see Jacob et al. 2005.

An inherent issue in this respect is molecular size.‘     See my recent paper (Dunn 2013) for a more detailed discussion of molecular size in relation to functional demands.

Biological enzymes are protein molecules that can range in molecular weight….’     A case in point for a large enzyme, pertaining to the above lac operon, is the E. coli enzyme ß-galactosidase, which has 1024 amino acid residues and a molecular weight of 116 kilodaltons. For details on the structure of ß-galactosidase, see Juers et al. 2012.

Titin is composed of a staggering 26,920 amino acid residues…….’     See Meyer & Wright 2013.

‘…..small organic biomolecules can be effective in certain catalytic roles. ‘     See Barbas 2008.

‘……small molecule catalysis cannot accommodate the functional demands of complex biosystems. ‘     See again Dunn 2013 for a more detailed discussion of this issue.

‘…..the phenomenon of allostery…’    The lac operon again can be invoked as a good example of the importance of allostery; see Lewis 2013.

‘……a potential escape route from a linear infinite synthetic regression… in the form of a loop….’     A major proponent of autocatalytic loops and self-organization has been Stuart Kauffman, outlined (among with many other themes) in his book, The Origins of Order. (Oxford University Press, 1993).

‘…..The ‘chemical reality’ problem with theoretical autocatalytic systems has been elegantly discussed by the late Leslie Orgel. ‘     See Orgel 2008.

Models for replication of such sets as ‘compositional genomes’ have been put forward, but in turn refuted by others.‘     For the model of autocatalytic set replication, see Segré et al. 2000; for a refutation of it, see Vasas et al. 2010.

‘……molecular alphabets, generally defined as specific sets of monomeric building blocks….’       See Dunn 2013 for a more detailed definition, and discussion of related issues.

‘……since 1982 the ability of certain folded RNA single strands to perform many catalytic tasks….’     The seminal paper on ribozymes came from Tom Cech’s group in 1982 (Kruger et al. 1982).

‘…….specific folded DNA molecules can likewise perform varied catalyses….’     See Breaker & Joyce 1994.

This is fundamentally unlike a complex non-alphabetic molecule, where sites of chemical modification may vary…….’     Note that this statement does not include molecules such as polymeric carbohydrates, where these are composed of repeated monomers and thus relatively simple in their structures.

‘….the numbers of alternative sequences for concatenates of even modest length soon become hyper-astronomical. ‘     For example, in the case of an RNA molecule of 100 nucleotides in length, 4^100 (equivalent to approximately 10^60) sequence combinations are possible.

‘…..quite different chemistries could underlie non-terrestrial biologies. ‘     See Bains 2004 for a detailed discussion of this issue.

Next post: September.

Laws of Biology: A Complexity Law?

January 29, 2014

This post continues from the previous, which discussed the notion of laws in biology, and considered candidates for what might be the major contenders for universal ‘lawful’ status in this domain. Although at the end of that post a series of possible universal biological dictates was briefly listed, the status of evolution as the primary biological law was highlighted. In turn, this places natural selection, the prime mover of evolution, in the spotlight.

But is this really the case? Is there a more fundamental law that underlies (or at least accompanies) all evolutionary processes? These issues are examined further in this post.

 Complexity and Evolution

 When discussing the priority of biological ‘laws’, semantics and definitions will inevitably enter the picture. Thus, while it might be acknowledged that evolution is ‘Law No. 1’, it might be proposed that more fundamental laws operate which enable evolution itself. The ‘law’ of Darwinian natural selection immediately springs to mind, but other processes have been proposed as even more fundamentally significant.

In the previous post, the intriguing role of complexity as a putative ‘arrow of evolution’ was alluded to. Here it was also noted that this apparent ramping of complex functions could arise from ‘locking in’ of systems in increasingly intricate arrangements. In this view, where evolutionary success is associated with a structural or functional innovation which increases the overall complexity of an organism, reversing such a change in later generations may be unfeasible. This will occur when the selected evolutionary change has become enmeshed as a fundamental part of a complex network of interactions, where removal of a key component cannot easily be compensated for. Successive innovations which become ‘locked’ in a comparable manner thus lead to a steady increase in the net complexity of biological systems. Of course, this is not to say that all complex adaptations are irreversible, and a classic example of ‘complexity loss’ is the deletion of functional eyes from cave-dwelling animals living in complete darkness.

An interesting recent publication highlights a possible example of an evolutionary process that may lead to increased complexity. Gene duplication has long been acknowledged as an important driver of evolution, where formation of a duplicate gene copy allows continuation of the original gene function while allowing mutations in the second copy (or ‘paralog’), and accompanying exploration of new evolutionary space. An implicit assumption in such cases is that the expressed paralog does not interfere with the original gene function, but this may not apply where such functions depend on networks of co-operating protein-protein and protein-nucleic acid interactions. Johnson and colleagues have shown that this is indeed the case for certain duplicated transcription factor genes in yeast. As a result, a strong selective pressure exists for resolution of such paralog interference, one solution of which is mutational change which removes the interference effect, while simultaneously allowing the progressive emergence of novel functions. Such effects were noted by this group as being a potential source of increasing complexity, although this is not formally proven.

A complexity gradient is readily apparent between bacterial cells and single-celled eukaryotes, and in turn between the latter and the variety of multicellular organisms based on the eukaryotic cellular design. The key enabling event here was an ancient symbiosis between ancestral bacterial cells, resulting in the eventual evolution of mitochondria and chloroplasts as key eukaryotic organelles for energy production and photosynthesis respectively. The energetic endowment of mitochondria allowed the evolution of large genomes capable of directing the complex cell differentiation required for multicellular life. (And among the mechanisms for such genomic complexity acquisition, we can indeed note duplication events, as mentioned above).

And yet it is important to consider the putative evolutionary ‘drive’ towards complexity in terms of the biosphere as a whole. Whatever the evolutionary origin of biological systems of escalating intricacy, it is clearly not a global phenomenon. Only certain eukaryotic lineages have shown this apparent trend, while the prokaryotic biomass on Earth has existed in a similar state for billions of years, and will no doubt continue to exist as long as suitable habitats exist on this planet. (And here prokaryotes are clear masters at colonizing extreme environments).

Such observations are entirely consistent with the blind forces of natural selection. Change in a lineage will not occur unless variants emerge which are better-suited to existing or changing environments (including evasion or domination of natural predators or competitors). So, the question then becomes: is increased complexity simply a by-product or accompaniment to natural selection itself which may or may not occur, or is it an inevitability? Before continuing any further, it will be useful to briefly look at just how complexity might be assessed and measured.

Defining Biological Complexity

Perhaps consistent with its intuitive properties, a good definition of complexity in itself is not the essence of simplicity. One approach would seem logical: to perform a census of the range of different ‘parts’ that comprise an organism, where the total count provides a direct complexity index. (Obviously, by this rationale, the higher the ‘parts list’ number, the greater the perceived complexity). But a problem emerges in terms of the system level at which such a survey should be performed, since it can in principle span hierarchies ranging from the molecular, organelle, and cell differentiation state, up to macroscopic organs in multicellular organisms. The physical size of organisms has also been noted as a correlate of complexity, but not a completely reliable one.

An additional and very important observation also suggests that a simple parts list of organismal components is at best a considerable under-rating of what may be the true underlying complexity. Biological systems are characteristically highly modular and parsimonious. This bland statement refers to the often incredible economy of informational packing in genomes, such that a basic human protein-encoding gene count of only approximately 20,000 can encode the incredible complexity of a functioning human being. The baseline gene figure is greatly amplified by systems using separate gene parts (exons) in alternative ways, through RNA splicing and editing, and a gamut of post-translational modifications. But beyond this level of parsimonious modularity, the same gene products can perform quite distinct functions through differential associations with alternative expressed products of the same genome, corresponding to distinct cellular differentiation states. A far better account of complexity must therefore cover the entire interactome of an organism, but this is a far more onerous undertaking than a mere parts list.

And the levels of potentially encoded complexity don’t even stop there. Consider a protein A that interacts with proteins B and C in one cell type (α) within an organism, and D and E in another type of cell (β) within the same organism. The differential complexes ABC and ADE result from alternate programs of gene expression (cell type α having an expressed phenotype A+, B+, C+, D-, E-; while the β phenotype is A+, B-, C-, D+, E+), combined with the encoded structural features of each protein which enable their mutual interactions. The interaction of A with its respective partners is thus directly specified by the genome via regulatory control mechanisms. But indirect programming is also possible. There are numerous routes towards such a scenario, but in one such process a genomically-encoded gene A can be randomly assorted with other gene fragments prior to expression, such that a (potentially large) series of products (A*, A**, A***, and so on) is created. If a single cell making a specific randomly-created modification of the A gene (A*, for example) is functionally selected and amplified, then A* is clearly significant for the organism as a whole, yet is not directly specified by the genome. And the creation of A* thus entails a ramping-up of organismal complexity.
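The directly-specified part of this scenario can be sketched in a few lines (protein and cell-type names follow the example in the text; the data structures are illustrative):

```python
# Encoded binding capability of hub protein A: which partners it CAN bind.
INTERACTION_PARTNERS = {"A": {"B", "C", "D", "E"}}

def complexes(expressed, hub="A"):
    """Partners of the hub protein actually present in a given cell type."""
    if hub not in expressed:
        return set()
    return INTERACTION_PARTNERS[hub] & expressed

alpha = {"A", "B", "C"}          # α phenotype: A+, B+, C+, D-, E-
beta = {"A", "D", "E"}           # β phenotype: A+, B-, C-, D+, E+
print(sorted(complexes(alpha)))  # ['B', 'C'] -> the ABC complex
print(sorted(complexes(beta)))   # ['D', 'E'] -> the ADE complex
```

The same encoded interaction potential, filtered through two expression programs, yields two different complexes, which is why an interactome count exceeds a parts count.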

The ‘indirect complexity’ scenario is actively realized within the vertebrate adaptive immune system, where both antibody and T cell receptor genes are diversified by genomic rearrangements, random nucleotide additions (by the enzyme terminal transferase) and somatic hypermutation. And clearly the circuitry of the mammalian nervous system, with its huge number of synaptic linkages, cannot be directly specified by the genome (although here the details of how this wiring is accomplished remain to be sketched in).

These considerations make the point that defining and quantitating complexity in biological systems is not as straightforward as it might initially seem. In principle, a promising approach centers on treating the complexity of a system as a correlate of system information content. While this has been productive in many virtual models, it still remains an elusive goal to use informational measures in accounting for all of the above nuances of how biology has achieved such breathtaking levels of complexity.
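The simplest information-content measure is Shannon entropy, shown below as a minimal sketch. As the text warns, this captures only compositional diversity of a symbol string, and none of the interactome or indirect-encoding nuances discussed above:

```python
from collections import Counter
from math import log2

def shannon_entropy(sequence):
    """Shannon entropy in bits per symbol: a crude compositional measure."""
    counts = Counter(sequence)
    n = len(sequence)
    return sum(-(c / n) * log2(c / n) for c in counts.values())

print(shannon_entropy("AAAAAAAA"))  # 0.0 -- a uniform string carries no surprise
print(shannon_entropy("ACGTACGT"))  # 2.0 -- the maximum for a 4-letter alphabet
```

Note that by this measure a random string scores maximally, which is exactly why entropy alone is a poor proxy for organized biological complexity.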

A ‘Zeroth’ Law for Biology?

Where measures of complexity can be kept within specified bounds, multiple computer simulations and models have suggested that evolving systems do show a trend towards increasingly complex ‘design’. But in real biological systems, what is the source of burgeoning complexity? Is it somehow so inevitable that it needs the status of a ‘law’?

McShea and colleagues have proposed the ‘Zero-Force Evolutionary Law’ (ZFEL), which has been stated as: “In any evolutionary system in which there is variation and heredity, there is a tendency for diversity and complexity to increase, one that is always present but may be opposed or augmented by natural selection, other forces, or constraints acting on diversity or complexity.” This could be seen as a law where complexity is increasingly ratcheted up over evolutionary time, through the acquisition of variations which may have positive, negative, or neutral selectable properties. If subject to negative selection, such variants are deleted, while positively-selected variants are amplified by differential reproductive success. Variants that are completely neutral, however, may be retained, and potentially serve as a future source of evolutionary diversity.

An interesting wrinkle on the notion of neutral mutations is the concept of conditional neutrality, where a mutation may be ‘neutral’ only under certain circumstances. For example, it is known that certain protein chaperones can mask the presence of mutations in their ‘client’ proteins which would be otherwise unveiled in the absence of the chaperone activity. (A chaperone may assist folding of an aberrant protein into a normal structural configuration, whereas with impaired chaperone assistance the protein may assume a partially altered and functionally distinct structural state). Such a masking / unmasking phenomenon has been termed evolutionary capacitance.

But is the ‘Zero-Force’ law truly that, or simply a by-product of the primary effect of Darwinian natural selection? (The latter was discussed in the last post as the real First Law of Biology).  The above ZFEL definition itself would seem to embed the ‘Zero Force’ law as an offshoot of evolution itself, by beginning with ‘In any evolutionary system……’. Certainly ZFEL may correctly embrace at least one means by which complexity is enhanced, but since the adoption or elimination of such candidate complexity is ultimately controlled by natural selection, it would seem (at least to biopolyverse) that it is a subsidiary rule to the overarching theme of evolution itself.

In any case, if a ‘zero-force’ law is operative, why has the huge biomass of prokaryotic organisms persisted within the biosphere for such immense periods of time? An interesting contribution to this question highlights the importance of an organism’s population size for the acquisition of complexity. In comparison with prokaryotes, eukaryotes (from single celled organisms to multicellular states) are typically larger in physical size but with smaller total population numbers. (Recall the above mention of the role of eukaryotic mitochondria in bioenergetically enabling larger genomes, and in turn larger cell sizes). In a large and rapidly replicating population, under specific circumstances a paralog gene copy arising from a duplication event (noted above as an important potential driver of complexity acquisition) has a significantly greater probability of being deleted and lost before it can spread and become fixed. Thus, from this viewpoint, a eukaryotic organism with a substantially reduced population base is more likely to accumulate genomic and ultimately phenotypic complexity than its prokaryotic counterparts. Once again, the origin of eukaryotes through the evolution of symbiotic organelles derived from free-living prokaryotes was an absolutely key event in biological evolution, without which complex multicellular life would never have been possible. And eons of prokaryotic existence on this planet preceded this development, suggesting that it was not a highly probable evolutionary step, perhaps dependent on specific environmental factors combined with elements of chance.
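The population-size argument can be illustrated with a toy Wright-Fisher simulation (a standard population-genetics model; the parameter values here are illustrative, not drawn from the studies cited). A new neutral variant, such as a fresh gene duplicate, drifts to fixation with probability only 1/(2N) in a diploid population of size N, so very large prokaryotic populations lose nearly all such duplicates before they can spread:

```python
import random

def neutral_fate(N, trials=2000, seed=1):
    """Simulate the fate of a single new neutral variant (e.g. a fresh
    gene duplicate) among the 2N gene copies of a Wright-Fisher diploid
    population. Returns the fraction of trials in which the variant
    drifts all the way to fixation rather than being lost."""
    rng = random.Random(seed)
    total = 2 * N
    fixed = 0
    for _ in range(trials):
        copies = 1                       # one new variant among 2N copies
        while 0 < copies < total:
            # binomial sampling of the next generation
            p = copies / total
            copies = sum(1 for _ in range(total) if rng.random() < p)
        if copies == total:
            fixed += 1
    return fixed / trials

# Theory predicts fixation probability 1/(2N) for a neutral variant:
print(neutral_fate(10))    # expected near 1/20
print(neutral_fate(100))   # expected near 1/200
```

The point is not the exact numbers but the scaling: a tenfold larger population retains roughly a tenfold smaller fraction of its neutral duplicates, consistent with the Lynch argument sketched above.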

A complexity-enabling but highly contingent (and evidently rate-limiting) event such as eukaryogenesis does not create confidence in the operation of a regular biological law. And other ‘complexity breakthroughs’ are likely to exist. The ‘Cambrian Explosion’, in which a variety of animal phyla with distinct body plans emerged at the beginning of the Cambrian period about 540 million years ago, may be a case in point. This ‘explosion’ of complexity in a relatively short period of geological time has long been pondered, although molecular phylogenetic data have suggested earlier origins of many phyla. Still, an intriguing suggestion has been that the first evolution of ‘good vision’ was an enabling factor for the rapid evolution (and thus complexification) of marine Cambrian fauna.

So increasing biological complexity seems to have more of a ‘punctuated’ evolutionary history than an inexorable upward trend. Fitting a ‘law’ into what is governed by environmental changes, contingency, and natural selection may be a tall order. But perhaps it is too early to say……

On that note, a non-complex biopoly-verse offering:

Within life, one often detects

A trend towards all things complex

Does biology have laws

That underlie such a cause?

Such questions can sorely perplex….


References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).

 Johnson and colleagues……’  See Baker et al. 2013.

‘….the eventual evolution of mitochondria and chloroplasts….’    See the excellent book by Nick Lane: Power, Sex, Suicide – Mitochondria and the Meaning of Life. Oxford University Press, 2005.

‘…..energetic endowment of mitochondria…..’   See Lane & Martin 2010 ; Lane 2011.

‘…..a good definition of complexity……’ For discussions of the meaning and measurement of complexity see, Adami et al. 2000; Adami 2002; Tenaillon et al. 2007.

‘…..a census of the range of different ‘parts’….’ See McShea 2002. 

‘…..protein-encoding gene count of only on the order of 20,000…..’ Note the ‘protein-encoding’ here; if non-coding RNA genes are added, the count is much higher. See the GENCODE database.

‘The physical size of organisms has also been noted as a correlate of complexity….’ See a relevant article by John Tyler Bonner (2004).

‘…..Biological systems are characteristically highly modular and parsimonious. / A far better account of complexity must therefore cover the entire interactome of an organism…..’ The need to address the modularity of living systems in order to fully apprehend them has been forcefully argued by Christof Koch (2012).

‘…….a promising approach to complexity centers on treating the complexity of a system as a correlate of system information content. ‘   See Ricard 2003; Szostak 2003; Hazen et al. 2007.

‘… simulations and models have suggested that evolving systems do show a trend towards increasingly complex ‘design’ .    See Adami et al 2000; Tenaillon et al. 2007; Joshi et al. 2013

McShea and colleagues have proposed the ‘Zero-Force Evolutionary Law’……’     See Biology’s First Law . Daniel W. McShea and Robert N. Brandon , 2010 , Chicago University Press , Chicago, IL, USA; also Fleming & McShea 2013.

Such a masking / unmasking phenomenon has been termed evolutionary capacitance.’    See a recent interview (Masel 2013) for a background on such capacitance phenomena, and further references.

An interesting contribution to this question highlights the importance of an organism’s population size for the acquisition of complexity.’     See Lynch & Conery 2003; Lynch 2006.

In comparison with prokaryotes, eukaryotes (from single celled organisms to multicellular states) are typically larger in physical size……’     Exceptions exist where very large prokaryotes overlap in size with single-celled eukaryotes. In such cases, the giant prokaryotes are powered by multiple genome copies. On that theme, see Lane and Martin 2010.

In a large and rapidly replicating population, under specific circumstances a paralog gene copy arising from a duplication event ……has a significantly greater probability of being deleted….’     This requires more detail regarding gene duplication outcomes: Following a gene duplication event, a resulting paralog copy can acquire deleterious mutations and be lost, or rarely acquire advantageous mutations providing a positive selection (neofunctionalization). But another possible outcome is where deleterious mutations occur in both gene copies, such that both are required for the continuing fitness of the host organism. In such circumstances of subfunctionalization, the original functions of the single encoded gene product are thus distributed between the two duplicate copies.

Another significant point with respect to the population size argument of M. Lynch and colleagues is that selectable fixation of mutational variants will always take longer in large replicating populations than in small ones. Where subfunctionalization occurs after gene duplication, additional mutational changes can occur which completely inactivate one copy, a terminal loss of fitness. In a large population base, such events will act against the species-wide retention of gene subfunctionalization much more so than in small populations. The latter, therefore, are subject to relatively increased complexification as a result of the preservation of this type of gene duplication.

This ‘explosion’ of complexity in a relatively short period of geological time has long been pondered, although molecular phylogenetic data have suggested earlier origins of many phyla.’    See Jermiin et al. 2005 for some discussion of these themes.

‘….the first evolution of ‘good vision’ was an enabling factor for the rapid evolution…..’     See Zhao et al. 2013 for a recent study, and discussion of this notion.

Fitting a [complexity] ‘law’ into what is governed by environmental changes…..’     See Auerbach & Bongard 2014 for an in silico study of environmental effects on the evolution of complexity. They find environmental complexity and model organismal complexity are correlated, suggesting complexity may only be favored in certain biological niches.

Next Post: April.

Laws of Biology and True Universality

September 8, 2013

This post takes a high-altitude look at biology, with respect to principles which govern biological organization and origins. Some such principles have often been referred to as ‘laws’, but how much can be generalized? Usually the word ‘universality’ is stated with reference to life on Earth, but here the intention is to examine how much it might be possible to claim, with any measure of confidence, that certain principles should be universal in an absolute sense – for life as it might occur anywhere in the universe itself.

A Lawful Natural World

Even the most casual student of the sciences will be familiar with the concept that physical laws govern how things work, how the universe operates.  Consider, for example, the law of gravitation, or the laws of thermodynamics. These have all been derived from empirical observations that ultimately have enabled the formulation of regular mathematical relationships (equations) between key fundamental physical factors that contribute to the observed phenomena. (For example, in the case of gravitation, distance and mass are identifiable as such factors). To qualify as a ‘law’, such a derived relationship should apply universally (literally, anywhere in the universe), and enable predictions to be made, based on inputting specific values into the mathematical terms of interest. At the same time, as formulated by humans, a physical law may not necessarily be an immutable entity, since further information may reveal specialized circumstances where its predictive utility or true universality breaks down. This, of course, has been the experience with the transition from Newtonian to relativistic Einsteinian physics. The latter advance of the early 20th century has been well-validated as a better fit to the universe than its older predecessor, but Newton’s achievements worked very well and very accurately in describing both local and astronomically observable phenomena for centuries. As indeed they still work today, on the ordinary scale familiar to most human activities.
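To make the gravitation example concrete, Newton’s law relates exactly the two factors just identified (plus a universal constant):

```latex
F = G \, \frac{m_1 m_2}{r^2}
```

where $F$ is the attractive force between masses $m_1$ and $m_2$ separated by distance $r$, and $G$ is the gravitational constant. Feeding specific values into these terms yields the kind of quantitative predictions that qualify the relationship as a law.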

Laws are thus derived from basic observations and experimental testing, and could be viewed as a high-level procedure for organizing and making sense of empirical data, and seeking generalizations. Indeed, the process of ‘law-making’ itself may be reducible to definable pathways, as recently shown by the application of algorithms for deriving ‘free-form’ natural laws when experimental input data is provided.

Mathematics is thus an intrinsic and ineluctable part of a formulated physical law, and in a great many cases, the agreement between solved mathematical equations for specific laws and observed physical events is profoundly accurate. Why there should be this ‘unreasonable’ correspondence between mathematical constructs and reality has long been a topic for speculation among both physicists and philosophers.

Physical laws are in the most general sense the domain of the science of physics, but they do not end there, at least in their specific applicability. While chemistry ultimately is a manifestation of physics in terms of the electronic properties of matter, the behavior of matter is subject to many laws defining chemical reactivity. Thus, chemicals combined under the right conditions (solvents, concentrations, temperature, pressure, and catalysts if necessary) will react in highly predictable manners. And this too is expected to be the case anywhere in the universe, provided the starting conditions are the same.

Chemistry is to physics as biology is to chemistry. If biology is a very special manifestation of chemistry, just as chemistry is a particular outcome of the laws of physics, then should there not be a set of fixed biological laws as well? Should not the mathematical precision of the physico-chemical sciences be applicable in all areas of biology as well?

Biological Laws: Local vs. Universal

Yet unlike the simpler foundations of chemistry, biology has some peculiar aspects which can make formulation of truly universal laws much more problematic. The first issue might be called a ‘sample size’ problem, in that all life on Earth has a fundamentally common molecular basis (‘Life 1.0’), rendering universal conclusions drawn from such a single precedent accordingly more difficult. (This general topic has been touched upon in some previous posts, including 21 March 2011 and 28 June 2011). A second related problem, even within the set of all life on Earth, is dissecting ‘chance and necessity’ in evolutionary trajectories. Even where certain biological features are universal on Earth, that may not necessarily imply that no other possible solutions exist towards the most fundamental biosystem requirements, even using the same basic informational and functional molecules of life. The most fundamental features of cellular life, including nucleic acid replication mechanisms and ribosomal protein synthesis, may have become fixed at an early stage of molecular evolution, precluding their replacement by even potentially superior alternatives. (This is the ‘frozen accident’ scenario also referred to in previous posts).

Nevertheless, just as the laws of physics apply everywhere in the universe, a ‘real’ biological law should be applicable to all conceivable forms of life. But since no other ‘life sample’ is currently available, this kind of demand is difficult to fulfill with confidence. In part owing to these kinds of constraints, less restrictive definitions of biological laws have been acknowledged as having a certain practicality. These ‘lesser laws’ might also be referred to as ‘rules’ or ‘patterns’ if desired. But for the purposes of this post, I will use the categories of local vs. universal laws, where a ‘local’ law is strictly applicable only to the biological context of life on this planet ‘as we know it’, and a ‘universal’ law is just as it implies – applicable to any biosystem bound by the higher laws of chemistry and physics. Of course, making such a theoretical distinction in itself is easy; deciding on what should specifically be placed in either category is another matter entirely. And in the absence of at least one other life sample, it could be contended that it is impossible to establish such a dichotomy with complete surety. Yet it is certainly possible to appeal to logical argument, and proceed accordingly.

Locally Limited

Even very successful laws may be perceived to be local in their applicability to our familiar Life 1.0 if they apply to complex multicellular systems that are not universal even on this planet. Consider three such examples found in the field of neurobiology:

(1) The eponymously named Hodgkin-Huxley model of nerve impulse conduction, formulated in the early 1950s, provided a very good mathematical accounting for the behavior of neural ‘action potential’ signaling, even before the discovery of protein-based membrane ion channels.
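For reference (in modern notation, rather than as presented in the original 1952 papers), the core Hodgkin-Huxley membrane equation balances a capacitive current against sodium, potassium, and leak conductances, each scaled by voltage-dependent gating variables:

```latex
C_m \frac{dV}{dt} = -\bar{g}_{Na}\, m^3 h \,(V - E_{Na})
                    - \bar{g}_{K}\, n^4 \,(V - E_{K})
                    - \bar{g}_{L}\,(V - E_{L}) + I_{ext}
```

where the gating variables $m$, $h$, and $n$ each obey first-order kinetics of the form $dx/dt = \alpha_x(V)(1-x) - \beta_x(V)\,x$. The model's predictive success preceded any knowledge of the ion-channel proteins that physically implement these conductances.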

Although an impressive achievement, the Hodgkin-Huxley model does not account for certain thermodynamic features of nerve transmission, and has been challenged by other approaches, notably the soliton model.

(2) Sleep, or at least periods of dormancy with sleep-like characteristics, is very widespread, if not universal, in animals with even simple nervous systems. Allied to this is the ubiquitous existence of neurally-controlled circadian rhythms. It has been suggested that the phenomenon of sleep might ultimately derive from the inherent properties or intrinsic demands of neural networks.

Despite much investigation, the functions of sleep remain enigmatic. But even aside from this difficulty, it is clear that there is a wide range in both the quantity and qualitative nature of sleep periods required even within mammals. (But more will be said in later posts on circadian rhythms as a subset of biological oscillations).

(3) Anesthesia can be induced in animal nervous systems with a disparate array of substances, even including inert gases. A long-standing correlate of anesthetic potency is found with the Meyer-Overton rule, which predicts membrane permeabilities from a compound’s partitioning coefficient between oil and water phases, in turn reflecting the molecule’s degree of hydrophobicity.  It has been observed that the Meyer-Overton rule provides a remarkable linear correlation between partitioning coefficients (irrespective of molecular structures) and minimal concentrations of molecules required for observable anesthetic activity. The range where the correlation seems to hold (over six orders of magnitude) is probably unprecedented in biology.
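The Meyer-Overton correlation is commonly summarized (as an empirical rule of thumb, not a derived law) by stating that the minimal effective anesthetic concentration is inversely proportional to the compound’s partition coefficient:

```latex
C_{min} \times P_{oil/water} \approx \text{constant}
```

so that a log-log plot of potency ($1/C_{min}$) against $P$ is linear across the six orders of magnitude noted above, irrespective of molecular structure.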

The keyword ‘rule’ for the Meyer-Overton correlation (as opposed to ‘law’) should be noted in particular. This is the case since there is still no convincing mechanism by which the differential membrane permeability of diverse compounds would affect their anesthetic properties. In addition, given the molecular diversity it covers, it has been assumed that the Meyer-Overton rule relates to anesthesia as a single phenomenon, but it seems more likely that a group of related pain and consciousness-ablating effects are involved.

Apart from these inherent limitations, all three of these examples are premised on the basic structure of neural axons, membranes, and ion channels. Yet it is not at all clear that neurobiology must necessarily be arranged as it is, or that alternative structural patterns would not be possible. Even more fundamentally, it cannot be claimed with any certainty that neural signaling is the only possible pathway for a biological realization of the kinds of activities permitted by neurobiology in Life 1.0. All of these considerations point to the above examples as specifying, at best and to varying degrees, laws of local applicability.

Another general example highlighting ‘locality’ in this context is found within Mendelian laws of heredity, upon which the concept of genes and genetics is founded. At first glance, these might appear to have much better prospects as true Laws, since genetics is globally applicable, encompassing organisms totally lacking neural systems. Yet notwithstanding the importance and utility of Mendelian genetics, here again any case for universality runs aground on variation even within terrestrial life. Beneath the initial relative simplicity, many and varied instances of non-Mendelian inheritance and genetics are known, and the definition itself of a ‘gene’ has become correspondingly difficult. All of this emerges before it is even necessary to speculate on entirely distinct biological information transmission systems.

By now, it should be apparent that searching for deep and truly universal biological laws has much overlap with attempts to define life itself. So with that in mind, where does that lead us?

A True Universal

Consider a rapidly free-swimming macroscopic organism existing in a liquid medium anywhere in the universe. (It would be unlikely that such a medium would be anything other than water, but in principle other fluids could apply). It would seem entirely logical to suggest that, regardless of the underlying molecular constitution of the organism’s biology, or its higher-level physiological processes, a degree of friction-reducing streamlining would be apparent in its overall morphology. And one could go further, and boldly proclaim that since this effect should logically apply anywhere, it forms the basis of the “Universal Law of Streamlining”, or words to that effect.

But this kind of categorization would be misguided. For one thing, it places faith in the ‘logic’ of the initial premise without it having any explanatory power. And if the streamlining case was accepted, separate ‘laws’ might exist in parallel for other specific environments. All of these instances might indeed be universal, but all of them are subsumed by an encompassing and truly universal process, that of Darwinian evolution. Translating the streamlining example into an evolutionary outcome, we can see that individual organisms with even slightly reduced frictional drag need expend less energy to move at a given speed through their fluid medium, and are therefore at an advantage relative to their ‘normal’ fellows in terms of either getting food, or avoiding becoming food for something else. And this kind of survival advantage tends to also translate into a reproductive advantage, as a consequence of improved energetics and average longevity. The simple yet profound insights of natural selection thus supersede ‘streamlining’ or any analogous ‘laws of environmental shaping’.

So then, can we refer to evolution as a fundamental and universal biological law? Certainly Darwinian evolution has been proposed as an essential component of any comprehensive definition of life. But from first principles, what might lead us to affirm the primacy of evolution, as in effect a universal biological law?

To arise in the first place, any chemically-based life must proceed through stages of increasingly organized sophistication, which amount to a process of molecular evolution. It could be contended that the same processes that allow molecular evolution to occur in the first place do not and cannot simply stop at any particular stage of this progression, and that therefore, by its very origins, any biosystem must be inherently evolutionary. Any form of life must reproduce, and therefore any form of life must be viewable as an informational transmission system. Information, in some chemical form, must be copied and transmitted to each succeeding generation of organisms, of any conceivable description. A replicative system that is free from error in a formal and absolute sense is thermodynamically impossible, akin to a perpetual-motion machine. It is true that known cellular biosystems use sophisticated mechanisms for reducing replicative error rates and genomic damage (such as polymerase proof-reading and DNA repair processes), but there is a trade-off between energy investment in such systems and stability. Even the best of them still have a readily measurable mutational background.

In any case, the essential point to note is that such error-correction is a highly developed mechanism which can only emerge long after far simpler biosystems have originated in the first place. Indeed, it is not the absence of error that bedevils models for the origin of life, but the opposite. Emerging informational replicative schemes must somehow deal with the specter of ‘error catastrophe’, which may seem to doom further molecular evolution before it gets off the ground.
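The ‘error catastrophe’ constraint can be made quantitative via Eigen’s error-threshold relation: a ‘master’ replicator with selective superiority σ over its mutant cloud can only be maintained if its genome length L satisfies roughly L < ln(σ)/u, where u is the per-site copying error rate. A quick sketch (the σ value used here is an illustrative assumption):

```python
import math

def max_genome_length(error_rate, superiority):
    """Eigen error-threshold estimate: the largest genome length L for
    which a 'master' sequence with selective superiority sigma can be
    maintained against per-site copying error rate u  (L < ln(sigma)/u)."""
    return math.log(superiority) / error_rate

# Error-prone early replication (u ~ 1e-2) limits genomes to ~100 subunits,
# whereas proofreading DNA replication (u ~ 1e-8) permits ~10^8 bases.
for u in (1e-2, 1e-4, 1e-8):
    print(f"u = {u:.0e}: L_max ~ {max_genome_length(u, 2.7):,.0f}")
```

This is why error correction matters so much for complexity, yet (as the paragraph above notes) could only evolve after simpler, short-genome replicators were already in place.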

Reproductive errors provide the raw material for evolutionary processes, the stuff upon which differential selection can operate to mold the formation of biological innovations and the acquisition of complexity. Natural selection is a conceptually simple process (albeit with many subtle implications) that occurs whenever variation arises among competing reproductive units. It is logically compelling, and readily modeled with digital replicators in computer simulations as well as in real-life laboratory experiments.
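How readily selection can be modeled is shown by a minimal deterministic sketch (a toy calculation, not any specific published simulation): the discrete replicator equation takes a variant with even a modest fitness advantage from rarity to near-fixation.

```python
def select(s=0.05, p0=0.01, generations=200):
    """Deterministic haploid replicator equation: frequency p of a variant
    with relative fitness 1+s, competing against a background of fitness 1.
    Mean population fitness each generation is 1 + s*p."""
    p, history = p0, [p0]
    for _ in range(generations):
        p = p * (1 + s) / (1 + s * p)
        history.append(p)
    return history

freqs = select()
print(f"start: {freqs[0]:.2f} -> after 200 generations: {freqs[-1]:.2f}")
```

Because the odds p/(1−p) grow by exactly a factor of (1+s) each generation, a 5% advantage multiplies them roughly 17,000-fold over 200 generations, carrying the variant from 1% to over 99% frequency.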

Combined with the vast body of work since Darwin’s The Origin of Species, the case for the universality of both reproductive errors and selective forces defends the status of evolution as a universal bio-law. Yet, having made that point, one is immediately reminded of the role of mathematics in the physical sciences, as the solid underlying description of what a physical law actually specifies. Where is the mathematical insight in evolutionary theory?

In fact, it has been noted more than once that the whole of The Origin of Species is an equation-free zone. Although the process of natural selection certainly has been mathematically modeled by others, an equation for natural selection cannot be used for specific predictive purposes, since the outcome of a selective process is entirely dependent on variable selective pressures, mutational rates, and constraints imposed by pre-existing biosystems and structures. On the other hand, it is quite possible to describe and represent natural selection as an algorithm of a special kind. Given the variables involved, the algorithm for natural selection is most accurately described as ‘incompressible’, in that no shorter operations can suffice to deliver the end result. In other words, the application of the ‘natural selection algorithm’ is already the shortest possible description of itself, and an outcome of evolutionary processes can only be determined by letting the algorithm run its course. No precise mathematical equations can enable calculations of such processes in nature purely from the outset.

This conclusion argues that there is no directionality in evolution, and that chance environmental factors are necessarily significant. In turn, a logical corollary of non-directionality is that if the evolutionary ‘tape’ was replayed, it would be overwhelmingly likely that a different set of results would emerge. Yet this has been a controversial stance among certain evolutionary biologists, through several degrees of divergence.

The most hard-core views of evolution as directional become essentially teleological, presuming inherent evolutionary trajectories (usually) leading to the emergence of human intelligence. Such opinions, with or without overt suggestions of divine guidance (or at least a divine kick-off to the mysterious origins of life) are not in the mainstream.  But it is certainly a legitimate stance to question whether evolution may be forced to take a particular pathway under specific circumstances, or at least to proceed via a limited number of possible routes.  An important keyword in this field of investigation is ‘constraints’.  And this is directly relevant to the theme of this post, since if certain constraints are all-encompassing, they might indeed qualify as biological laws of sorts, although (as extended a little further below), these are in effect sub-laws under the general umbrella of the higher law of evolutionary selection and change.

Before moving on to that area, it should be noted that complexity ramping is possibly one ‘arrow of evolution’, where competition between increasingly complex biosystems continually and cumulatively progresses, and cannot be easily wound back without loss of fitness. But this is only a very broad generality, and far from the presumptions of directionality that have been made regarding higher-order outcomes such as human intelligence. In the present context, arguments against directionality of evolution can be side-stepped in any case, since an erroneous ‘directional evolution’ stance might be viewed as an even more overarching law than a non-directional process, and thus does not in itself dispute the ‘lawful’ status of evolution. Yet the power of evolutionary change is not related to any starting-point teleological influence.

The High Laws

So we can abide by the special status of Darwinian evolution as, in effect, a universal biological law, which might be stated in the following manner:

The Law of Evolution

“All biosystems originate and progress by means of imperfect informational transmission fidelity, selective processes imposed by environmental factors or competition, and differential reproductive success.”

Is there anything else that might stand alongside this law, as opposed to secondary laws that follow as a necessary consequence of evolution? The property of molecular self-organization is often cited as a fundamental requirement for the origin of life on Earth, and, by logical inference, for the origin of life anywhere. Some biological self-organizing systems may become (directly or indirectly) encoded agents of Darwinian selective processes (as considered further below). Yet for this to occur, the Darwinian replicative systems must have become established in the first place. Furthermore, self-organization in the form of molecular complementarities is a fundamentally important facet of genetic copying itself. There is accordingly a case to be made for self-organization / self-assembly as a fundamental requirement for any chemical life, and for it therefore to attain the status of a law.

I suggest that a ‘Law of Alphabets’ can also be proposed, to the effect that autonomously replicating biosystems cannot originate and progress unless their molecular components are composed of subunits of a specific set of smaller compounds, termed ‘molecular alphabets’. Although Darwinian evolution is certainly applicable on a molecular scale, it needs a mechanism for replicative information transmission, which in turn is difficult to conceive without an ‘alphabetic’ system. Since it is argued that the establishment of such a system must be a primary event in the origin of life, alphabets should stand as another fundamental biological law. Moreover, at least some alphabets (as with nucleic acids) participate in self-assembly mediated by molecular complementarities, so the alphabetic and self-assembly ‘laws’ have regions of overlap. In any case, both the themes of self-organizational and alphabetic primacy will be developed further in a later post.

But it is important to note once more that these three high-level laws do not translate into specific systems at lower levels, but only specify how such systems will develop in broad terms. Top level laws direct the formation of bottom level systems and processes, but only impose the broadest of limitations on what particular forms the bottom levels may take (the so-called secondary laws noted above). Expressed another way, by the definition of universality, the higher laws would be applicable to any extrasolar Life 2.0, 3.0 (and so on), but could say nothing about what molecular form these alternatives would take.

Tools of Evolution and Constraints

Natural selection and evolution make use of diversity generation in replicative biosystems that are composed of various subunits. From first principles, if certain subunit combinations inherently possessed self-organizing or self-assembling properties, such a feature might afford a selective advantage for a biosystem possessing them over competitors which did not. It might in turn be inferred that selective forces will inevitably result in a gravitation towards biosystems harboring such self-organizing features. A number of investigators have championed the fundamental importance of self-organizing phenomena, particularly with respect to the origin of life, as noted above. Critics of this stance often suggest that such proponents place too much emphasis on theoretical models, and not enough on real-world chemistry. But regardless of the details of origins, once replicative systems became established for biological information transmission, it can be argued that self-organization and self-assembly are properties of certain molecular entities that are themselves subject to evolutionary processes. Many examples of biological self-assembling systems can be cited, ranging from membrane components (synthesized by evolvable genomically-encoded enzymes) to directly encoded protein complexes. Insofar as self-organizing systems provide selective advantages, biosystems will be ‘taken there’ by selection itself. By this reasoning, some forms of self-organization are then tools of evolutionary selection, and should be distinguished from the more fundamental self-assembly processes (noted above) logically presumed necessary for abiogenesis and genetic replication.

Although the abundant diversity of life on this planet is a testament to the power of Darwinian evolution to shape life, it requires no large leap of intuition to come to the conclusion that evolution is highly likely to have definite limits in terms of what can be achieved. Evolutionary constraints can exist at many levels, from the molecular to macroscopically morphological. An often-cited example of the latter is the absence of macroscopic wheels in any biological context, as opposed to a variety of other organs of locomotion. With respect to constraints in general, it is important to distinguish between processes or structures that are truly biologically impossible (which may be difficult to prove in practice) and those that are evolutionarily inaccessible. Since evolution cannot foresee a future benefit, all intermediate states (between a precursor form and a potentially superior successor arrangement) must themselves confer fitness benefits. If such improved intermediates are not possible, crossing a valley of lowered fitness in order to reach a higher peak on the other side may require a ‘single-step’ genetic jump of such improbability that it is essentially untenable. A biological macro-wheel, for example, is evolutionarily inaccessible for terrestrial mammals, but not formally proven to be biologically impossible per se.
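The fitness-valley argument can be made concrete with a toy simulation. The sketch below is a minimal Wright-Fisher-style model with entirely illustrative parameters (the population size, mutation rate, and fitness values are arbitrary assumptions, not empirical data): when each intermediate step along a two-step path is itself beneficial, the path is crossed readily, but when the intermediate form sits in a fitness valley, crossing within the same number of generations becomes rare.

```python
import random

def crossed(intermediate_fitness, pop_size=200, generations=1500,
            mu=5e-4, seed=0):
    """Wright-Fisher sketch of a two-step adaptive path.

    Genotypes are coded 0 (ancestor, fitness 1.0), 1 (intermediate form,
    fitness `intermediate_fitness`), and 2 (final form, fitness 1.5).
    Returns True if the final form reaches a population majority within
    `generations`. All parameter values are illustrative only.
    """
    rng = random.Random(seed)
    fitness = [1.0, intermediate_fitness, 1.5]
    pop = [0] * pop_size
    for _ in range(generations):
        # selection + drift: resample the next generation weighted by fitness
        pop = rng.choices(pop, weights=[fitness[g] for g in pop], k=pop_size)
        # mutation: each individual may take one irreversible step forward
        pop = [min(g + 1, 2) if rng.random() < mu else g for g in pop]
        if sum(g == 2 for g in pop) > pop_size // 2:
            return True
    return False
```

Running this for an uphill path (e.g. `crossed(1.2)`, intermediate beneficial) versus a deep valley (e.g. `crossed(0.2)`, intermediate strongly deleterious) over a handful of random seeds illustrates the asymmetry: selection purges the valley intermediates before the second mutation can arise on their background, so the superior final form is evolutionarily inaccessible in practice even though it is not biologically impossible.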

It is highly likely that some constraints are so universally applicable that they themselves may seem to be laws of their own. The above-noted ‘streamlining law’ is a case in point, itself of course a subset of a general channeling of evolutionary morphology by environmentally-determined selective pressures. Since in effect these dictates all act as determinants of evolutionary pathways in a universal sense, they may be referred to as ‘sublaws’ or secondary laws to the primacy of evolution. Although this will be the theme of a subsequent post, as a preview the main such secondary laws to discuss are:

Scaling and power laws (system channeling)

Self distinction and self protection

Environmental feedback

Spatial and temporal oscillations

And more will also be said regarding the role of self-organization and molecular alphabets as potentially universal ‘laws of life’.

On that note, this post concludes with a tripartite (bio)polyverse offering of relevance to this theme:

The question:

 Is Life Science governed by law?

By fixed rules, by dictates, or more?

Does life have a trend?

A knowable end –

Through patterns that show us the score?


A cautionary caveat:

Notions of biological laws

May be burdened with inherent flaws

Necessity or chance

May take turns in the dance

While confusing the search for a cause.


Yet one conclusion reached:

Evolution is Law Number One

Under ours, or indeed any sun 

Biosystems abide

This Darwinian guide

Its processes rule, still they run

References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).

‘Why there should be this correspondence between mathematical constructs and reality…..’     A famous disquisition on “the unreasonable effectiveness of mathematics in the natural sciences” was published by the physicist and mathematician Eugene Wigner in 1960, available here. See also a very interesting paper on this subject by R. W. Hamming.

‘…the application of algorithms for deriving ‘free-form’ natural laws when experimental input data is provided….’     See Schmidt & Lipson 2009.

‘…..chance and necessity……’     A famous and influential book of the same title (in its English translation from the original French) was published in 1970 by the Nobel-prize winning biologist Jacques Monod. As an outcome realized within the physical universe, any biology must be underpinned by basic physical law; the deterministic question revolves around the extent to which biological (evolutionary) outcomes are constrained by higher-level biological laws, and whether these are literally universal, or only local for a specific manifestation of biology. Although it is difficult to assert with surety, many biological results are likely to be highly contingent, where the higher-level ‘biological laws’ only serve to place very broad limits on the possible alternative outcomes. These trends are depicted schematically in the Figure below.


Interplay between universal physical law and biosystem evolutionary outcomes. Each bar in the series is dependent on underlying laws represented by the preceding bar to the left. The dotted line within the Biosystems bar depicts a boundary between events where higher-order structure outcomes may be influenced by contingent factors as much as intrinsic laws molding any possible biology.


‘…..less restrictive definitions of biological laws have been acknowledged as having a certain practicality.’     For example, see Dhar & Giuliani 2010, who used the more relaxed law definition of “a frequently observed regularity that allows for a substantial improvement of our prediction ability in well-defined systems”.

‘……Hodgkin-Huxley mathematical model of neurotransmission……’ See their 1952 paper.

‘……the soliton model.’     See Andersen et al. 2009.

‘…..phenomenon of sleep might ultimately derive from some inherent property or intrinsic demands of neural networks. ‘   See Krueger et al. 2008.

‘………a wide range in both the quantity and qualitative nature of sleep periods required….’     For a discussion of the variation in mammalian sleep, see Siegel 2005.

‘……the Meyer-Overton rule provides a remarkable linear correlation between partitioning coefficients and minimal concentrations of anesthetics…..’     For more information on the history of the Meyer-Overton rule, see Heimberg & Jackson 2007.

‘…..anesthesia as a single phenomenon…..’     For background on this (as well as neurotransmission models), see Andersen et al. 2009.

‘…….the definition itself of a ‘gene’ has become correspondingly difficult…..’      See Pearson 2006; Pennisi 2007.

‘…….Darwinian evolution has been proposed as an essential component of any comprehensive definition of life.’      This has been noted in many circumstances, including the so-called NASA definition, as discussed by Benner 2011.

‘…..A replicative system that is free from error in a formal and absolute sense is thermodynamically impossible….’     In this context it is of interest to note that under some circumstances, certain organisms appear to have evolved strategies for actively increasing mutation rates in transient and targeted ways. For more background on such ‘adaptive mutation’ phenomena, see Roth et al. 2006; Galhardo et al. 2007.

Note also that while mutations are an important source of the raw material for selection, genetic diversity can be generated in other ways, particularly by recombination processes.

‘……the specter of ‘error catastrophe’…..’     This concept was introduced by Manfred Eigen, a pioneer in theoretical early molecular evolution; see Eigen 1971.

‘….It is logically compelling, and readily modeled with digital replicators in computer simulations …….. ‘     For background on digital genetics and evolutionary simulations, see Adami 2012. With respect to laboratory studies on experimental evolution, see Elena & Lenski 2003; Kawecki et al. 2012.

‘…..the whole of The Origin of Species is an equation-free zone.’  /  ‘ ……the process of natural selection certainly has been mathematically modeled…..’     See Schuster 2011.

‘…….it is quite possible to describe and represent natural selection as an algorithm……’     The status of the evolutionary process as algorithmically incompressible has been emphasized by Stuart Kauffman in his books, The Origins of Order and At Home in the Universe. See also Searching for Molecular Solutions, Chapter 2. However, here it is also noted that there is no formal proof that a shorter algorithm cannot exist.

‘……No precise mathematical equations can enable calculations of such [evolutionary] processes……’     The fact that the overall natural evolutionary process cannot be reduced to predictive mathematical formulas should not be confused with highly successful mathematical modeling for the effects of fitness advantages, kin selection, and so on. See Schuster 2011.

‘……the evolutionary ‘tape’ was replayed…..’     The late Stephen J. Gould was a noted proponent of the ‘diverging tape’ view, made eloquently in his book Wonderful Life. In an abstract representation of chemical systems, ‘replaying the tape’ was found to reproduce certain self-organizing phenomena, but nothing was specified beyond this most fundamental level (Fontana & Buss 1994). Theoretical arguments favoring inherent limits in evolutionary predictability have also been raised; see Day 2012.

‘……inherent evolutionary trajectories (usually) leading to the emergence of human intelligence. ‘     A noted proponent of teleology in an evolutionary context was the French Jesuit and paleontologist Pierre Teilhard de Chardin. Some of his opinions in this regard are presented in his 1959 book The Phenomenon of Man.

‘…..A number of investigators have championed the fundamental importance of self-organizing phenomena……’     The above-noted Stuart Kauffman is also prominent in this regard, as described in The Origins of Order.

‘……too much emphasis is placed by such proponents on theoretical models…..’      For example, see a refutation by the late Leslie Orgel of models of self-organizing prebiotic metabolic cycles (Orgel 2008).

Next post: January 2014.

1953 – A Real ‘Annus Mirabilis’

April 25, 2013

This post is an unusual one by the standards of biopolyverse. The molecular and cellular science herein is mainly featured from a historical point of view, and other scientific achievements in much broader contexts are noted as well. A motivation for this stems from the significance of 2013 as the 60th anniversary of the discovery of the structure of DNA by Watson and Crick, as published in Nature in 1953. There already has been publicity regarding this, and much more will ensue before this year’s end. So it is not my main purpose to simply jump on this bandwagon, although it is certainly an occasion worth highlighting as much as possible. Rather, it is to make note of how 1953 seems to tower above almost all other years as an epic time for scientific endeavor across many disciplines, not least among them the biomedical sciences.

Years of Wonder or Miracles

Years that ‘stand out’ are often nominated as opinionated interpretations of history, or for purely personal reasons. The 17th century poet John Dryden wrote the poem Annus Mirabilis in reference to the year 1666, in which the Great Fire of London occurred, following closely on a terrible epidemic of bubonic plague in 1665.  The ‘wonder-ful’ (mirabilis) aspect of 1666 may have been either an ironic construct or a portent for a better future, but in any dispassionate view, 1666 was not a particularly auspicious year. This is especially so if one takes a more global stance than would normally be assumed for an Englishman of that era, long before rapid communication technologies. Although over the course of any one year, a blend of varied high and low personal events may result in neutrality, many people (at least those of middle-age or later) could nominate one particular year of their lives as their favorite. This individual perspective would seem to be enshrined within another poem entitled Annus Mirabilis, specifically lauding 1963 for rather personal reasons of the author, Philip Larkin.

But here I contend that a truly wondrous year should be recognizable objectively, and not have to be sifted through the vagaries of fractious historical debate, let alone merely personal factors. Here the deciding factor is long-term influence, and scientific and technological advance is among the most influential molders of human civilizations. It is important to note that true long-term and concrete influence on societies is quite distinct from individual fame, whether transient or long-term. Only a small minority of randomly-chosen people could answer questions like “Who first discovered the structure of DNA?”; “Who first formulated the principles of information theory?” or “Who won a Nobel Prize for the discovery of X-rays?”. Yet the influence of these discoveries and their technological off-shoots has been profound, with global impact reaching indefinitely into the future. Obviously, a very large number of alternatives could have been substituted for the above questions, but in each case, the scientific or technological advances are far more meaningful and long-term than the vast majority of short-lived political developments. Of course, this is not to say that non-scientific aspects of history are insignificant, but typical dissections of the course of history often pay little notice to the relative importance of scientific progress. This has been a pattern even among learned historians, let alone among popular perceptions. From these observations, one can conclude that an ‘Annus mirabilis’ for scientific discovery could easily ‘fly under the radar’ of widespread recognition, and the year 1953 AD is a strong contender in this regard.

Pin-pointing discoveries

Before we can start picking out years of scientific discovery, it is necessary to think a little about what a ‘discovery’ process means, or when to specifically locate a scientific advance in time. It is often not as simple as a straightforward historical event. For example, in 2013 one can say, ‘President Kennedy was assassinated fifty years ago in 1963’, or ‘the structure of DNA was discovered sixty years ago in 1953’. The first of these is an unassailable historical fact – that’s when that specific event happened. But the scientific history of DNA is not quite so precise, if one wants to account for the timelines for the structural discovery itself. Most scientific achievements are based on background work which may have taken many years, and ‘breakthroughs’ may be more a series of incremental steps than a sudden revelation. As a result of this, attempts to dovetail certain major scientific advances into a single year may appear contrived.

A simple defining point is a publication date, where a description of at least the essence of the discovery is first presented in scientific literature. Thus, much of Watson and Crick’s work leading to the DNA double-helical structure was done in 1952, but the crucial publication emerged in Nature in 1953. Likewise, Salk’s work towards a polio vaccine comprised of inactivated viruses was achieved in 1952, but the first preliminary results with a small sample of people were published in 1953 in the Journal of the American Medical Association. (Extensive trials followed in 1954, leading to the public release of the vaccine in 1955).

What happened in 1953?

With this guideline in mind, we can examine scientific events associated with 1953; those that stand out are listed in Table 1 below.

Table 1. The record of 1953 for major scientific accomplishments. All of the above were first published in 1953 except for the bottom-most entry (orange highlight). The latter corresponds to the starting point of long-standing biomedical investigation into the patient ‘HM’ (identified as Henry Molaison after his death), who underwent an operation in 1953 aimed at reducing his intractable epileptic fits, which removed part of the brain region known as the hippocampus. While the anti-epileptic goal was largely successful, the patient suffered a severe inability to form new long-term memories from that point on for the rest of his life. HM cooperated with many studies which showed the critical importance of the hippocampus for the formation of new memories, and other important aspects of memory. Such studies, however, were not published until several years after the initial operation.


Of course, the determination of the double-helical structure of DNA, and the immediate insight it afforded into the mechanisms of DNA replication and genetic encoding, stand out by a wide margin. But the additional discoveries noted above also shine strongly. For example, another very famous experiment was published in 1953, and that is second in the above Table’s list. This work of Stanley Miller (and Harold Urey) in the generation of simple biological building blocks under ‘early earth’ conditions struck a chord at the time, and has been a scientific landmark ever since. Less famous, but also very significant, is the first successful freezing of sperm, a milestone for reproductive biology. And the rationale for classifying the first successful polio vaccine by publication date was outlined above.

In the general field of evolutionary biology, 1953 was very significant as the time-point when a major adjustment to the study of human evolution was made. The prior ‘discovery’ of the Piltdown man ‘missing link’ in 1912 was proven to be a hoax, thus removing a major impediment to the proper understanding of human origins. Although this may seem only the demolition of an artificial barrier rather than a real advance in knowledge, it demonstrated the efficacy of the scientific method in eventually pinning down the truth and allowed other loose ends in the field to be tied together. Moving from human biological evolution to the ‘evolution’ of human cultures, 1953 also saw a major advance in the decipherment of the hitherto uninterpretable script Linear B, samples of which had been discovered in archaeological excavations of Mycenaean Crete (~1600 – 1100 BCE). The revolutionary decipherment was accomplished by Michael Ventris, in collaboration with John Chadwick.

Outside the sciences with direct biological connections, 1953 saw a major advance with the first physical detection of neutrinos (as opposed to their previous theoretical prediction). Also in the same year, the mid-Atlantic rift was detected, of high significance in the ultimate confirmation of the theory of continental drift and plate tectonics.

In modern times, every year without exception has regular science-related events as a matter of course, the most well-known of which is the annual awarding of Nobel prizes. Other events can be noted as scientifically newsworthy, but of a more incidental nature than important scientific advances. Although Table 1 lists the main occurrences relevant to consideration of the scientific productivity of 1953, Table 2 lists some accompanying happenstances of a more secondary nature. This is not to suggest, of course, that the scientific achievements of the 1953 Nobel Prize winners, or those who happened to die in that year, were in any way ‘secondary’ in terms of their long-term impact. It is simply that their major accomplishments were prior to 1953, and thus winning a prize or ending a career through death are secondary to the actual dates of such advances themselves. By way of comparison, Watson and Crick (in conjunction with Maurice Wilkins for his associated DNA X-ray crystallography) were awarded a Nobel Prize for their elucidation of the structure of DNA, but not until 1962.


Table 2. ‘Incidental’ science-related events of note occurring in 1953.  Hermann Staudinger won the Nobel Prize for chemistry for “discoveries in the field of macromolecular chemistry”. Hans Krebs and Fritz Lipmann shared the Nobel Prize for Medicine for the metabolically important discoveries of the citric acid cycle (Krebs Cycle) and coenzyme A, respectively. Fritz Zernike was awarded the corresponding Physics prize for the development of phase contrast technology, and the phase contrast microscope. Edwin Hubble is famed as the astronomer who showed the expansion of the universe, and Robert Millikan was the physicist who measured the charge on the electron via renowned oil-drop experiments, winning a Nobel for this in 1923.


Given the central importance of 1953 in the history of molecular biology, it is fitting that the advances associated with the 1953 Nobel prizes for chemistry and physics have both had direct impact on the life sciences. (Of course, the corresponding prize for medicine is of obvious biological significance). Phase-contrast microscopy has long been an important tool in many areas of biology, of which bacteriology is a key beneficiary. But chemistry advances pioneered by Hermann Staudinger are of particular interest in the context of this blog, since an adaptation of a chemical joining procedure now bearing his name has recently gained prominence in chemical biology. This ‘Staudinger ligation’ can be performed with reagents which do not interact with normal chemical components of living systems, and thus finds application within very recent developments in ‘bio-orthogonal chemistry’, the major preoccupation of the previous post.

Some items from Table 2 may need a little expansion. By 1953, the major technological innovation of the transistor had moved from the laboratory (where it was invented in late 1947) into wide-scale commercial applications, and for this reason that particular year was named ‘The Year of the Transistor’ by Fortune magazine. My nomination of the publication of a book, Removing the Causes of War, by Kathleen Lonsdale may seem a surprising choice. But she was scientifically very notable for her direct physical confirmation of the planar structure of the important benzene molecule by X-ray crystallography. Although this occurred well before 1953 (in 1929), the publication of her peace-related book in 1953 serves as an incidental reminder of her scientific achievements. Also, this dual interest in both physical chemistry and peace activism inevitably brings to mind Linus Pauling, who won a chemistry Nobel the following year, and a second Nobel, the Peace Prize, in 1962. In the latter sphere Pauling published a book called No More War. He is also noteworthy in this context as a competitor with Watson and Crick for the elucidation of the structure of DNA. Although Pauling’s initial models were incorrect, his previous astute insights in structural biology (such as the assignment of the protein α-helical and β-sheet motifs) suggest that he would have soon succeeded with DNA if Watson and Crick had not arrived there first in 1953.

……Amid the Backdrop of the Times

The subject of peace could serve as a reminder that while scientific endeavors might sometimes be represented as ivory-tower enterprises aloof from mundane concerns, that is never truly possible. The general tone of the times must always have an influence, if only at minimum via the extent of public funding made available to science, or the freedom to pursue ‘pure’ research goals without interference from government or other agencies. Table 3 lists several notable events of the year of interest, most of which need no further explanation. The death of Stalin and the execution of the Rosenbergs for ‘atomic espionage’ serve as a reminder of the Cold-War background to the general zeitgeist, with the conquest of Mt Everest one of the more up-lifting events of the year, so to speak. Stalin’s death was a positive development in the Soviet Union not merely in removing a murderous tyrant, but also for science there. He had championed the crackpot ideas of T. D. Lysenko regarding the mechanisms of heredity, and forcibly suppressed the pursuit of proper genetic research. Only with Stalin’s passing could Soviet science begin to recover from this irrational deviation.

It is perhaps regrettable that the reference to the Nobel Peace Prize awarded to George C. Marshall might need a little explanatory amplification, since the level of his renown is disproportionately low compared to his achievements and influence. Among a long list of accomplishments, he was the US Chief of Staff during World War II, and subsequently formulated the Marshall Plan for the reconstruction of Europe. This historically important project (ending in 1951) ensured Western European stability for decades following.


Table 3. Some general historical events of note occurring in 1953.


Of all the above general historical events, the first ascent of Mt Everest can be compared with some types of scientific discovery. After a mountain has been climbed, others can follow, perhaps by alternative and more difficult pathways to the top. But irrespective of how a mountain is climbed, once the summit has been reached for the first time, that singular achievement can never be repeated. When a scientific discovery has been made, unlike a mountain ascent, it must be shown to be repeatable before it becomes consensus knowledge. In some cases, though, a scientific discovery fills in existing gaps in knowledge in such an elegant and convincing fashion that it is rapidly accepted. Irrespective of this, a confirmed scientific discovery is analogous to a ‘climbed mountain’, in that it can never be done again as ‘a first’. The discovery of the structure of DNA in the same year as the first ascent of Mt Everest is an excellent example of this. There remained plenty more to find out about DNA structure in the wake of Watson and Crick’s famous paper, and the topic is certainly not exhausted even today. Yet the most vital and compelling information was ‘climbed’ in 1953, and can never be so ascended again.

Sometimes general history can appear to have a more or less random aspect, epitomized by the famous saying that it is just “one damn thing after another”. (This has been attributed to various people, including Winston Churchill). Although many historical outcomes are clearly predicated on what took place beforehand, other events may indeed seem like ‘wild cards’. The latter could be exemplified by major natural disasters, epidemics, or major political changes caused by single individuals. Certainly scientific advances may come serendipitously and from unexpected quarters, but such progress is always built upon preceding generations of successive refinements. Even when a temporary aberration diverts the accumulation of knowledge (as seen, for example, with the above-mentioned Piltdown Man hoax), it will eventually be rectified by the self-correcting nature of the international scientific enterprise. No scientific discovery is then ever a complete ‘blue sky’ event, in the same sense that some historical occurrences have been.

Is there a real ‘1953 Effect’?

The above material makes a case for the assertion that 1953 ‘stands out’ as an exceptional year with respect to scientific discovery. But is this really the case, or does it come down to some kind of sampling bias? Picking exceptional years for science becomes problematic in the post-war modern era, when the pace of scientific and technical change is so fast that it might be thought a fairly uniform process. As an example of the contrast with previous times, the year 1543 is often cited as the original scientific  ‘Annus Mirabilis’, based on publication of the astronomical findings of Nicholas Copernicus and anatomical studies of Andreas Vesalius. These were indeed spikes of achievement in an otherwise largely flat background, with no comparison to the modern continuous ferment of change and innovation.

One way to attempt to test the alleged pre-eminence of 1953 in an objective manner is to look at the levels of major productivity for specific periods on either side of that year. So, Fig. 1 shows ‘major advances’ for seven years on either side of 1953, including of course the year in question itself. The spike in 1953 is obvious, and supports the contention that its special significance for discovery does not result from cherry-picking or other biases.


Fig. 1. Numbers of major biomedical / social science advances in 1953, and 7 years before and after it. Information for each year is below in References & Details. Here ‘advances’ are restricted to the biomedical field and social sciences for simplicity.


But ‘support’ in this context is not proof, and certainly the data of Fig. 1 could be disputed on the basis of what is included or excluded. For example, the achievements of 1953 include an archaeological triumph (the decipherment of Linear B by Michael Ventris), and this field is sometimes regarded as having a foot in both the social sciences and the humanities. Yet even excluding this event, 1953 still predominates. Since at least two very significant advances within physics, chemistry or geology were also listed in Table 1, it is unlikely too that addition of achievements within these fields would radically change the observed pattern in Fig. 1. But even if it did, 1953 would still shine as an Annus Mirabilis within the biomedical field alone.

Obviously, it is possible to include years beyond 1960, but changes in the total background of research output also need to be considered. In Fig. 2, the general trends of overall publications by year in Medline after 1945 are shown. (Medline is a large component of PubMed, the free publication database from the US National Center for Biotechnology Information. PubMed itself includes a significant number of non-biological journals, but not in a comprehensive manner over the entire time-period of this survey. Medline, as the name implies, focuses on the biomedical sphere, which also includes important generalist scientific journals). A low level of biological scientific endeavor at the end of World War II is expected, followed by an upward curve of productivity. Yet this curve has some distinct aspects which are somewhat surprising. Rather than a steady progression of increase, the publication rate in the 1950s reached a plateau. In fact, the level of 1951 was not exceeded (and then only slightly) until 1960, and 1953 was by no means any high point within this period. This plateau effect has not recurred in any of the succeeding years. A more or less constant rate of increase (with a few wrinkles) is seen for four decades (1960-2000), after which an acceleration of the rate is seen, almost as though the new century and millennium was inspirational for scientific endeavor.


Fig. 2. Distribution of all publications contained within Medline by year 1945-2008, showing that the plateau during the 1950s has not occurred since. Four separate trends (approximate slopes shown with red lines) are apparent: 1945-1950; the 1950s; 1960-2000; >2000.


The 1950s biomedical publication pattern is shown in more detail in Fig. 3 below, both for Medline and the more general PubMed. The ‘plateau’ effect is evident in both cases.


Fig. 3. Distribution of all publications contained within PubMed (top) and Medline (bottom) by year 1945-1965. A curve-fit is used for PubMed data, while the Medline data ‘trends’ of Fig. 2 are again shown with red lines.


So, what can we make of these patterns, especially in the light of the theme of 1953? Various explanations might be offered as to why the biomedical publication rate leveled off in the 1950s after an initial post-war surge, but that is not the main issue. Rather, it simply places 1953 in the context of its times. It is quite clear that the 1950s in general, and certainly the year 1953 itself, were not characterized by an outburst of accelerating productivity in gross terms. Rather, the take-home message is quality over quantity: the outstanding results of 1953 did not occur against a general global back-drop of surging quantities of published scientific data. The innovations of Fig. 1 (between 1946 and 1960) then all share an approximately equal research publication background rate.
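Per-year totals of the kind underlying Figs. 2 and 3 can be retrieved programmatically from the NCBI E-utilities ESearch service. The sketch below only constructs the query URLs (using the standard `[pdat]` publication-date field and, as one option, the `medline[sb]` subset filter to restrict to Medline records); actually fetching and parsing the returned counts is left out, and the precise query syntax should be checked against the current E-utilities documentation.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def count_url(year, medline_only=False):
    """Build an ESearch URL whose response reports only the number of
    PubMed records with a publication date in the given year."""
    term = f"{year}[pdat]"          # restrict by publication date
    if medline_only:
        term += " AND medline[sb]"  # restrict to the Medline subset
    params = {"db": "pubmed", "term": term, "rettype": "count"}
    return f"{EUTILS}?{urlencode(params)}"

# Example: one count query per year over the span surveyed in Fig. 2
urls = [count_url(y) for y in range(1945, 2009)]
```

Requesting each such URL and reading the count from the XML response would reproduce a year-by-year series comparable to the Medline and PubMed distributions discussed above.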

Biology at that time in fact had a considerably lower profile than in the present age. There were ‘hard’ sciences, exemplified by physics and chemistry….and much softer ones. The triumphs of physics in the atomic and quantum domains were fresh and ongoing, and its prestige accordingly very high. In general attitudes, biology was often relegated to the softer sciences, if not always spelt out as such. This should be qualified by noting the many subdivisions within biology itself, which themselves were seen to span a spectrum of ‘hardness’. For example, biochemistry as an experimental science was certainly established long before the 1950s. This ‘harder’ area of biology stood in contrast with ‘softer’ areas such as descriptive zoology and botany, or psychology. But even given biochemical knowledge, biology in general was often considered a poor relation to the physical sciences in terms of its general rigor. Some might say that the essence of this still exists, but the development of molecular biology has done much to dispel such prejudices. Yet in the 1950s, molecular biology had not yet risen to its current prominence, and its offspring of biotechnology was primordial at best. (Obviously, this depends on how one defines ‘biotechnology’ itself. In its most-frequent current sense, where it is heavily underpinned by molecular biology, it clearly did not exist at all; but if traditional selective breeding and other long-standing approaches are included, then it certainly had a presence). In fact, a key feature of the ‘take-off’ of biological science in the 1960s (Fig. 2) and beyond is the increasing application of molecular approaches to a wide variety of studies.

Why 1953?

But even accepting the special nature of 1953 for scientific achievement, does it have any other significance? Or in other words, was there some external factor which somehow contributed to this spike of productivity? A simple answer is, ‘No’. It is simply a chance cluster of independent events without any causal linkages. There is little point in attempting any further statistical analysis, since a statistical cluster presupposes a data set of comparable items, and such comparability is very hard to establish when classifying scientific value. In terms of impact, even acknowledged ‘major’ scientific discoveries clearly have widely differing short- and long-term effects, and attempting to quantitate such factors is fraught with difficulty, even if one accepts that it is possible at all. But, since the discovery of the DNA structure and the Miller-Urey experiment both fell in 1953, along with several other notable achievements, there can be little doubt that this year does indeed ‘stand out’. And an ‘Annus Mirabilis’ is no less full of wonder if it is conferred by chance alone.

To conclude an unusual post, an unusual biopoly(verse) is offered, as a salute both to the central biological accomplishment of the year of interest and to the above-noted poem by Philip Larkin:


A Larkin Lark in Another Annus Mirabilis


DNA, of course, began

In nineteen fifty-three

(Which was earlier than me)

Between the end of the Marshall Plan

And Einstein’s mortality


Up till then there’d only been

Half-baked imagining

A wrangle for the thing

A molecule no-one had seen

A structure for a king


Then all at once (let’s be frank)

Everyone saw the same

And biology became

Molecular money in the bank

With DNA a household name


So labs were never better than

In nineteen fifty-three

(And yet no good to me)

Between the end of the Marshall Plan

And Einstein’s mortality


References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).

‘……60th anniversary of the discovery of the structure of DNA by Watson and Crick…….’     In fact, the date of this post (April 25th) is the exact publication date of their paper in 1953.

….. ‘Only a small minority of people …..Who first discovered the structure of DNA     From personal experience, despite their scientific fame, the names of Watson and Crick are poorly known in comparison with political figures or celebrities.

‘……the scientific or technological advances are far more meaningful and long-term than the vast majority of short-lived political developments….’     An interesting book in this regard is The 100: A Ranking of the Most Influential Persons in History, by Michael H. Hart. Hart Publishing, NY. 1978. Of the total, 37% were directly science-related.

‘… can conclude that an ‘Annus mirabilis’ for scientific discovery could easily ‘fly under the radar’ from gaining widespread recognition….’     In support of this, it can be noted that the Wikipedia entry for ‘Annus Mirabilis’ lists many years which have been cited as such over a 500-year range, but 1953 was not numbered among them, as of April  25th,  2013.

‘……Salk’s work towards a polio vaccine ……. was published in 1953 in the Journal of the American Medical Association. ‘ See Salk 1953.

Table 1 Publications:

DNA structure: Watson & Crick 1953.

Early earth organic synthesis (Miller-Urey): Miller 1953.

Sperm freezing: Bunge & Sherman 1953.

Polio vaccine: Salk 1953.

Piltdown man hoax exposure: See Oakley & Weiner 1953. For a very detailed annotated bibliography relating to the Piltdown fraud, see Turrittin 2006.

Linear B decipherment: Ventris, M. & Chadwick, John (1953). “Evidence for Greek Dialect in the Mycenaean Archives”.  J. Hellenic Studies 73: 84–103. For more detail on Ventris papers, see this source.

Demonstration of neutrinos: See Reines & Cowan 1953. Confirmation was achieved by 1956.

Demonstration of mid-Atlantic rift: See Ewing et al. 1953.

Original operation on HM (1953): See the New York Times 2008 obituary of this patient.  For a scientific account of studies with HM (up to 2002), including citation of the first publication of the case in 1957, see Corkin 2002.

‘… of Stanley Miller (and Harold Urey) ……. has been a scientific landmark…..’      This Miller-Urey experiment was also referred to in a previous post, from the perspective of it having characteristics of a ‘Kon-Tiki’ type experiment. (An experiment where the feasibility of a proposed pathway towards a known system or state is tested by recapitulating the pathway itself under defined experimental conditions. But a successful ‘Kon-Tiki’ experiment alone can only demonstrate possibility, and is by no means proof that the particular pathway studied is that which actually was used, whether in biological evolution or human history.)

‘…..the Piltdown man ‘missing link’ in 1912 was proven to be a hoax…..’     The identity of the fraudster has never been proven beyond reasonable doubt, although the discoverer of the ‘fossils’, Charles Dawson, has long been a prime candidate. For a book length account of the original events and the controversy regarding the perpetrator, see John Evangelist Walsh, Unravelling Piltdown, Random House, 1997. See also a 100th-anniversary article on this subject by C. Stringer.

‘…..the major technological innovation of the transistor……’      See the paper of Pinto et al 1997. (hosted by Imec corp.)

‘….Kathleen Lonsdale….’     See an article by Julian 1981 with respect to her benzene work. Removing the Causes of War (1st Edition) was published in 1953 by George Allen & Unwin Ltd.

‘…..George C. Marshall might need a little explanatory amplification……’     See an essay by David Brin that gives more detail on Marshall’s significance.

‘…..the first ascension of Mt Everest can be compared with some types of scientific discovery. ‘     This general point was raised in the additional information provided with the book Searching for Molecular Solutions (‘Cited Notes for Chapter 8, Genomic Introduction – From Stone Age to ‘Ome Age’) in the associated ftp site.

Information for Fig. 1

Year |   Achievements

1946:   1 (Animal behavior; discovery of bee dance communication. See Von Frisch 1946)

1948:   1 (Identification of the role of acetaminophen [paracetamol] in analgesia. See Brodie & Axelrod 1948)

1949:   3 ([1] Development of carbon-14 dating. See Arnold & Libby 1949; Libby et al. 1949. [2] Assignment of penicillin structure. See Hodgkin 1949. [3] Development of lithium treatment for mania. See Cade 1949.)

1950:   1 (Identification of the Calvin cycle in photosynthesis. See Bassham et al. 1950; Calvin et al. 1950.)

1952:   2 ([1] Renowned Alan Turing paper on chemical basis of morphogenesis. See Turing 1952. [2] Hershey-Chase experiment demonstrating informational role of DNA. See Hershey & Chase 1952.)

1953:    As for Table 1 (excluding physical / geophysical achievements)

1954:   2 ([1] First successful human organ (kidney) transplant by Joseph Murray. See his New York Times obituary. [2] The clinical deployment of antipsychotic phenothiazines (chlorpromazine / largactil). The history of chlorpromazine discovery as an antipsychotic agent is convoluted [See López-Muñoz et al. 2005], and assigning 1954 as the date of this innovation may be arguable, but this year is noted for three North American publications describing its efficacy. In this regard, see Bower 1954; Lehmann 1954, and Winkelman 1954.)

1956:   1 (First accurate finding of the human chromosome number. See Tjio & Levan 1956; also a historical overview by Harper 2006).

1957:   1 (Determination of myoglobin structure by X-ray crystallographic studies.  See Kendrew & Perutz 1957. Full myoglobin structure published in 1960; See Kendrew et al. 1960).

1958:   1 (Frog cloning by somatic nuclear transfer. See Gurdon et al. 1958).

1959:   1 (First successful human organ allotransplantation – See J. Murray Obituary).

1960:   1 (Total synthesis of chlorophyll. See Woodward et al. 1960).

‘DNA of course began‘…. Note to pedants: The ‘beginning’ of DNA of course refers to its structural elucidation; knowledge of its genetic significance preceded this. As an example, the famous Hershey-Chase experiment of 1952 is among the ‘major’ developments of that year in Fig. 1. The Marshall Plan ended in 1952 (as noted above); Einstein died in 1955.

Next post: September.

Bioorthogonal Chemistry and Recognition

February 24, 2013

This post further extends some themes raised in previous posts, concerning xenorecognition and its corresponding biological repertoires in relation to bioorthogonality (first noted in the context of the possibility of ‘weird life’ in a shadow biosphere on Earth, and further considered later in the context of the recognition of xenobiotics). Previously, the concept of true biological orthogonality was posed as a hypothetical state of complete ‘invisibility’ of a chemical entity and a biological system with respect to each other. Here, we are concerned with an artificial application of growing importance, where orthogonality refers to the specificity of chemical reactions of interest. This particular topic, noted briefly in earlier posts (see 30 May 2012), is thus further developed as the theme of this post. At the same time, it should be noted that it is not the intention to provide extensive chemical detail, for which many excellent sources are already available. Primarily, this post provides an overview with some additional views which are relevant to xenobiotic molecular recognition and the general concept of orthogonality in biology.

Chemistry and Life

Curious parallels and contrasts can be made between the modern concept of chemical bioorthogonality and non-scientific (pre-molecular) notions regarding the nature of life. In early times it seemed self-evident that living things were so fundamentally different from non-living matter that life must be constituted of radically different stuff, whatever that might be. This supposition was frequently interlinked with the doctrine of vitalism, or the need for a ‘vital spark’ to animate anything that can be called ‘alive’. In any case, the old belief in the ‘otherness’ of life would have ‘bioorthogonality’ (in a literal sense) follow on as a natural corollary. This statement carries the assumption that the supposed living stuff would not only be distinct from non-living matter, but also would remain aloof from it. By this reasoning, any conventional inorganic reaction (insofar as such reactions were known in earlier times) would fail to impact on a living organism.

But this viewpoint could be shot down very rapidly even in the earliest of times, with just a little thought. A great many natural substances can burn, damage or kill a living organism, ranging from acids and alkalis to toxic metals. For example, natural deposits of arsenic or mercury ores could easily be prepared and administered as poisons – the point being that such deadly agents come from the non-living world, irrespective of their chemical natures. Therefore, the lifeless material world can obviously impact dramatically on ‘life-stuff’, and ‘extreme bioorthogonality’ (the belief in the complete independence of life from non-living matter) could never have been a tenable proposition. (This might seem obvious as well from the need to take in water and air, but these facts might be more easily rationalized away than the dramatic action of inorganic poisons).

But having conceded this, to many early thinkers it still seemed ‘obvious’ that living things, though influenced by the non-living world,  could not be made of the same stuff. It was well known from observing the effects of combustion that carbon was present in anything alive, and as a primordial discipline of chemistry gradually emerged from nonscientific alchemy, it became clear that many pure materials could be obtained from specific living sources. These were accordingly viewed as ‘organic’ in the literal sense, but such products were held to be special by virtue of their origins within things that (at least before their demise) possessed the mysterious property of life. Clearly it was possible to influence a living organism with the products of another. Any number of natural medicines or poisons from (mainly) plants and (less commonly) animals could easily make this point. Yet a pre-modern ‘bio-philosopher’ could simply claim that all such products were made by life, and thus possessed the enigmatic ‘organic’ quality which set them apart from everything else. True, the above-noted action of many never-living materials proved that inorganic matter could modulate life, but apparently in a crude and battering-ram sort of way.

This mental picture of organic material as partitioned from everything else received its first major challenge when the German chemist Friedrich Wöhler prepared urea in the laboratory in 1828, without any recourse to the urine of animals. (Urea, of course, derives its name from the excretory source from which it was first isolated, and it therefore stood as a classic organic substance ‘impossible’ to synthesize, at least until Wöhler showed otherwise). Since that time, the science of organic chemistry has seen a series of increasingly intricate synthetic triumphs, where all manner of complex natural compounds have yielded to the ingenuity of synthetic chemists. Obviously, all notions of the ‘special’ nature of organic compounds and their dependence on a ‘life-force’ have long since been consigned to the proverbial trash-can of history. We retain an echo of this, though, in the continued use of the word ‘organic’ for covalent carbon compounds in general, whether or not they derive from living biosystems.

The demolition of the special status of the chemistry of life, and its integration with chemistry in general, might seem at first glance to remove the possibility of ‘orthogonality’ from biology. If life is, in essence, nothing more than souped-up chemistry, how could it stand aside in any aspect from general chemical reactivity? Would not the incredible complexity of living organisms demand an equally complex repertoire of chemical reactions? And if the latter were true, would not virtually any chemical reaction engineered by humans already have its biological counterpart? Clearly, were that to be the case, then the concept of a ‘bioorthogonal’ chemical reaction would be illusory, since any reaction component would be able to find a biological target with which to react. Nevertheless, this supposition is demonstrably incorrect.

It is an undeniable fact that life is a phenomenon of consummate complexity (often described as the most complex known system in the universe, with the emergence of intelligence at its apex). Yet while an enormous number of distinct chemical transactions exist across the biosphere, the vast majority of these are catalyzed by protein enzymes composed of a relatively small ‘alphabet’ of amino acids (usually the canonical 20) linked together by peptide bonds in highly specific sequences. Many such reactions have an absolute requirement for additional cofactors as well as the relevant protein enzymes, in the way of metal ions, small organic compounds (vitamins) or metal-sulfur clusters. Even so, a ‘take-home message’ from biology is that incredible complexity can emerge from combinations and permutations of a relatively small set of building blocks.
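The combinatorial point above can be made concrete with a little arithmetic. The sketch below is an illustrative aside (the 100-residue chain length is an arbitrary example, not a figure from this post), showing how a 20-letter amino acid alphabet yields an astronomically large sequence space even for modest polymer lengths:

```python
# Illustrative arithmetic: the number of distinct linear sequences that can be
# built from an alphabet grows exponentially with chain length. This is how a
# mere 20 amino acids can underpin enormous macromolecular diversity.

def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct linear polymers of the given length."""
    return alphabet_size ** length

# A modest 100-residue protein (length chosen arbitrarily for illustration):
n = sequence_space(20, 100)
print(f"20^100 is a number with {len(str(n))} digits")  # ~1.3 x 10^130
```

Even this short chain length gives a sequence space vastly exceeding the estimated number of atoms in the observable universe (~10^80), so only a minuscule fraction of possible sequences can ever have been sampled by evolution.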

A corollary of this biological insight is that it should not be surprising to find many precedents for chemical reactions which have no known biological parallels, although an important caveat here is the need to add the qualifier ‘under normal physiological conditions’. Obviously, chemistry operating only under conditions of temperature and pressure far removed from those tolerated by life cannot be applicable within living systems. Nevertheless, even after this restriction is enforced, chemical candidates for non-biological reaction status still remain. A pair of such interacting chemical groups could thus potentially ‘only have eyes for each other’, in a metaphorical sense, as long as they did not react with the plethora of functional groups commonly encountered in biological systems. This notion is shown schematically in the Figure below:


Figure 1: Depiction of a bioorthogonal reaction pair, in the presence of examples of common biological functional groups (not exhaustive), which do not interact with the participating non-biological groups. Geometric shapes represent any biological molecules to which the bioorthogonal groups have been appended in vitro, prior to introduction to the intracellular environment.


Many applications of bioorthogonal reactions have been envisaged, such as specific labeling of biocomponents and tracing of the intracellular traffic of specific molecules. But for such systems to work in practice, bioorthogonality of the reactions is necessary but not sufficient: the reactions must also be highly efficient under intracellular conditions. The required reaction properties of high yield and product specificity are satisfied by a class of reactions termed ‘click chemistry’, envisaged as ‘spring-loaded’ to rapidly and efficiently proceed down a specific reaction pathway. Some such reactions require catalysis to operate under normal ambient conditions, but this need can be obviated within certain molecular arrangements. A classic case of this is the reaction between alkynes (bearing a carbon-carbon triple bond) and the dipolar azide group. While this reaction requires a cuprous ion (Cu(I)) catalyst to work well at normal temperatures and pressures when the alkyne is linear, if the triple bond is placed under strain within a cyclic structure (an eight-membered cyclooctyne ring) then the uncatalyzed reaction is greatly accelerated. These observations are presented in the following Figure:


Figure 2: Alkyne-azide bioorthogonal reaction pairs, for (A) a linear alkyne (Cu(I)-catalyzed) or (B) a strained alkyne within a cyclooctyne ring (uncatalyzed).


But the rate at which any intermolecular chemical reaction proceeds is necessarily driven by the respective concentrations of the reactants. Within the intracellular environment the total concentration of solutes (all macromolecules and small molecules included) is very high, but the specific bioorthogonal reactants of interest are typically present at relatively low concentrations, at which even highly mutually reactive groups may show slow kinetics of interaction. A useful approach which solves this issue is to engineer the reaction process such that the pair of bioorthogonal participants is brought into close spatial proximity, as a direct consequence of the targeting strategy. This is illustrated in the Figure below:


Figure 3: Bioorthogonal click reaction engendered through spatial proximity produced through the targeting choice. Two related proteins (1 and 2) are depicted with two proximal sites (in principle, either for natural ligands, or sites for which artificial ligands can be designed). If two such ligands equipped with mutually interactive bioorthogonal click groups are introduced into the intracellular environment, then only Protein 1 can bind the correct pair such that the desired labeling product is formed.
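The concentration dependence of reaction rates noted above can be illustrated with elementary second-order kinetics. The sketch below is an illustrative aside, not a calculation from this post: the rate constant is an assumed, order-of-magnitude value, chosen only to show how half-life scales inversely with reactant concentration, and hence why dilute bioorthogonal partners react slowly unless proximity is engineered.

```python
# Second-order reaction A + B -> product, with rate = k[A][B].
# For equal starting concentrations [A]0 = [B]0 = C0, integrating the rate law
# gives 1/C - 1/C0 = k*t, so the half-life is t(1/2) = 1 / (k * C0):
# a 100-fold drop in concentration means a 100-fold longer half-life.

def half_life_s(k: float, c0: float) -> float:
    """Second-order half-life in seconds; k in M^-1 s^-1, c0 in M."""
    return 1.0 / (k * c0)

# Assumed rate constant, for illustration only (real click-reaction rate
# constants span several orders of magnitude depending on the system):
k = 1.0  # M^-1 s^-1

for c0 in (1e-3, 1e-5):  # 1 mM vs 10 uM of each bioorthogonal partner
    print(f"C0 = {c0:.0e} M -> half-life ~ {half_life_s(k, c0):.3g} s")
```

At 1 mM the illustrative half-life is on the order of minutes, but at 10 µM it stretches to more than a day, which is why proximity-driven strategies (as in Figure 3) or templating are so valuable in practice.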


Spatial proximity for promotion of specific reactions can be engineered in other ways, most notably via nucleic acid templating, a very large field beyond the scope of this post. But having given this very brief taste of how bioorthogonal reactions are practically engineered, let us move on to some general issues of the nature of chemical bioorthogonality.

Chemically Possible Bio-reactions

Chemical reactions described as bioorthogonal could in principle be compromised in two broad ways: by the discovery of biocatalysts capable of accelerating the reaction between the ‘orthogonal’ pair of chemical groups, or of promoting reactivity between either (or both) of the foreign functional groups and one or more host molecules. Either possibility would require recognition by an enzyme of at least one of the ‘foreign’ functional groups.

Here it is appropriate to reiterate a point noted in a previous post, to the effect that orthogonality in biological systems is a relative concept. This referred to the observation that while a xenobiotic compound might be completely orthogonal within an isolated biological system, that status is not necessarily the case within the biosphere as a whole. With respect to the potential bioorthogonality of chemical reactions, a particular contrast can be drawn between the relatively limited repertoire of most eukaryotes and the astonishing catalytic diversity of the prokaryotic world. A good example of this is the observation that the complex core structure of Vitamin B12, an essential nutritional cofactor for many (although not all) multicellular eukaryotes, is made solely by prokaryotes.

If a pair of inter-reacting functional groups attained ‘pan-biospheric bioorthogonality’, then all known biological systems would necessarily lack this kind of chemical functionality. But the absence of a reaction from biology certainly does not mean that it could never exist. A case in point, also noted in the same previous post, has been the apparently extremely rapid evolution of the bacterial enzyme phosphotriesterase. In fact, such apparent large jumps in an imaginary ‘catalytic space’ are not necessarily difficult in terms of the parallel evolutionary moves within protein sequences. A recent demonstration of this comes from the artificial adaptation of a specific cytochrome P450 enzyme (one of a large class involved with xenobiotic recognition and modification) from its normal catalytic transfer mechanism into one involving a hitherto non-biological process. The adapted cytochromes in this case were selected from a relatively small set of artificial enzyme variants.

Another useful precedent is that of the catalysis of the Diels-Alder reaction, a chemical cycloaddition of great importance in synthetic chemistry. By rational design and directed evolutionary approaches, both protein enzymes (catalytic antibodies) and ribozymes have been derived which are capable of catalyzing this reaction, albeit not with very high efficiency. Yet although there is suggestive evidence from certain biosynthetic pathways, there is at present no definitive evidence that a natural precedent exists for this particular catalytic activity. While this is a significant evolutionary question, the artificial generation of the specific catalytic activity in diverse macromolecular frameworks already shows its inherent biological feasibility.

So, with these precedents in mind, the assignment of a reaction process as ‘bioorthogonal’ should always be defined within specific limits of space and time. Spatially, in the sense that the orthogonality of a reactive pair can be assigned for a specific cell type, whole organism, phylogenetic group, or even the entire existing biosphere (‘pan-bioorthogonality’ once more). Temporally, in the sense that a ‘hole’ in the biological rendition of a chemical reaction may be plugged through evolution (either natural or artificial), where the required time can be surprisingly short in at least some instances. Of course, other catalytic challenges might require more fundamental protein adaptation. A specific example can be found in the above Figure 2A (showing the Cu(I)-catalyzed linear alkyne-azide cycloaddition reaction), if an enzyme evolved which could accomplish both catalysis (possibly via a copper ion itself bound near the reactive site, or through some other cofactor) and spatial proximity (through binding of reactants within an active-site pocket). This is portrayed below:



Figure 4: Schematic depiction of a hypothetical protein catalyst for an alkyne-azide cycloaddition reaction.


In fact, the alkyne (‘acetylenic’) triple bond is not at all absent from biological systems, being found in metabolic products of diverse organisms, ranging from prokaryotes to plants and certain animal lineages. Organic azides, on the other hand, are believed to be totally absent from known biosystems. Thus, although the hypothetical catalysis of Fig. 4 might be artificially engineered, in the absence of biological azides, its prospects for arising naturally would be remote. So any theoretical threat to the bioorthogonality of such reactions as a pair is most unlikely to be realized by natural agency. In principle, though, if organic azides were artificially introduced into specific cells synthesizing certain natural products bearing strained acetylenic bonds in ring structures (members of the enediyne natural product family), then an azide-associated reaction with the enediyne product could occur. In such circumstances, the resulting cycloaddition reaction, deriving from a natural host product and a non-biological reactant, would lack complete orthogonality.

Levels of Orthogonality

It should be kept in mind that the above considerations regarding bioorthogonality are at the level of chemical reactions, which is a distinct concept from bioorthogonality at the level of molecular recognition of the initial reactive groups or the products by xenobiotic recognition systems. Certainly these two categories have regions of overlap, as when a natural enzyme must bind (show molecular recognition for) a substrate as well as enact a catalytic change upon it. But binding per se is not necessarily linked with catalysis, as demonstrated by a vast array of ligand / receptor interactions, as well as antibodies and their cognate antigens. It was in fact noted in a previous post that the physiological action of some xenobiotics arises from non-covalent interactions with normal host receptors, with concomitant aberrant signaling induction.

Nevertheless, in terms of recognition of xenobiotics by natural defense mechanisms, binding and catalysis are usually linked, since catalytic modification of foreign compounds is an important part of detoxification processes (as discussed in another previous post). In any case, if we return to the example of Fig. 2, then assignment of the reaction as essentially bioorthogonal is acknowledged (with the above caveat of rare possible acetylenic host product interactions). Yet orthogonality may not necessarily exist at the level of recognition of either the reactants or the product(s) in terms of xenobiotic recognition, even in the same host organism. And at both reactant and product levels, recognition precedents exist for the components within Fig. 2.

In some organisms, alkyne groups are produced and recognized through biosynthetic machinery, as noted above for metabolites such as the enediynes. But more generally, substituted alkynes are clearly interactive with specific cytochrome P450 enzymes (important for xenobiotic processing, also noted above). In fact, certain compounds with carbon-carbon triple bonds have been well-characterized as P450 inhibitors, through the formation of covalent adducts. The action of a P450 enzyme on a simple substituted alkyne, propargyl alcohol, has been found to be responsible for converting it into the toxic propiolaldehyde, another example of the self-activation of a deleterious effect of a xenobiotic (this general phenomenon was discussed by way of an analogy to autoimmunity in a previous post).

While the azide functional group appears to be absent from biology, it can certainly have profound physiological consequences when artificially introduced as a moiety in certain molecular contexts. Probably the best-known example of this is the compound AZT (3’-azido-3’deoxythymidine), formerly used as a mainstay of treatment for HIV patients, although now superseded by more effective drugs. A major limitation of AZT was its tendency to produce often serious side-effects in certain patients. While ‘off-target’ activities are a bane to patients and clinicians, for the present purposes they help make the point that azido-compounds are far from orthogonal when introduced into complex (human) biosystems. Side-effects do not arise randomly; they result from unwanted interactions of drugs with otherwise-irrelevant specific host structures. Such interactions themselves surely provide a back-up demonstration of ‘non-bioorthogonality’ at the level of molecular recognition.

In addition, AZT is known to be recognized and metabolized by P450 enzymes. The final component of the reactions of Fig. 2, the 1,2,3-triazole group, is likewise recognized in a variety of biological contexts, including by P450 enzymes. The 1,2,3-triazole isomer is also a central moiety of a variety of drug candidates for a wide range of potential targets.

All of these pieces of information are proffered as evidence that the functional groups involved as reactants or products in a well-known bioorthogonal reaction are not at all orthogonal at the level of molecular recognition within an entire complex multicellular organism (where xenobiotic processing enzymes are expressed), as opposed to an isolated single cellular system. But the mutual interaction of bioorthogonal chemical components (such as azide and alkyne groups) and their avoidance of interaction with host functional groups in the vast majority of cases classifies the chemical reaction itself as bioorthogonal in almost all useful contexts.

The correlation of reactivity per se and orthogonality was considered in a previous post, where the remarkable fact was noted that highly unreactive compounds (such as saturated fluorocarbons) and even inert gases cannot be considered to be bioorthogonal, through their potent effects in inducing anesthesia. Therefore, even the complete preclusion of possible covalent bond formation does not necessarily exclude a chemical agent from the capability of strongly perturbing a biological system, the complete antithesis of a generalized notion of bioorthogonality. At the level of designing bioorthogonal chemical reactions, on the other hand, covalent bond formation is of central and defining importance. And new bioorthogonal reactions are being actively sought and found.

To finish up, a biopoly(verse) comment on an ideal biorthogonal reactive situation:

Make a molecule to enter a cell

Yet trigger no metaphorical bell

(Meaning bio-inert –

And thus causing no hurt)

While reacting with designed targets well.

References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).

‘…….it is not the intention to provide extensive chemical detail, for which many excellent sources are already available.’    Some reviews from the lab of a major pioneer of bioorthogonal chemistry (and related areas of click chemistry), Carolyn Bertozzi, are very useful. See Sletten & Bertozzi 2009; Jewett & Bertozzi 2010.  For another good recent review, see Best 2009.

‘…..Friedrich Wöhler prepared urea in the laboratory in 1828…..’    This occurred serendipitously while he was attempting to make ammonium cyanate (the ammonium salt of cyanic acid), which can undergo a conversion rearrangement from an ionic salt to a simple covalent organic compound:


A good source of further information is the relevant site from the Chemical Heritage Foundation. Here, a quote from Wöhler states, “the ammonium salt of cyanic acid is urea”. Of course, this is not literally correct as such, since the compounds have the same atomic compositions but very distinct structures and bonding states. But this discovery helped to frame the concept of chemical isomers, as coined by the Swedish chemist Berzelius, a contemporary of Wöhler.

‘………chemistry only operating under conditions of temperature and pressure far removed from those tolerated by life cannot be applicable within living systems.’     An example of this with a biological reply is the artificial Haber process for the reaction of atmospheric nitrogen with hydrogen to form ammonia, which requires both catalysts and high temperatures. But natural evolution has found a way to ‘fix’ nitrogen using specific enzymes (nitrogenases) and essential metal (typically iron and molybdenum, but sometimes vanadium) – sulfur clusters (See Hu and Ribbe 2011 for a recent review). For details on the Haber process itself, see Modak 2002.

‘………a class of reactions termed ‘click chemistry’…..’      For reviews on this topic, see Kolb et al. 2001; Best 2009.

‘……if the triple bond is placed under strain within a cyclic structure (an eight-membered cyclooctyne ring)……’  The smaller the ring, the more strain is placed on the normally linear alkyne bond; in fact an eight-membered ring is the smallest that remains stable while retaining a carbon-carbon triple bond under normal conditions.

‘……Spatial proximity for promotion of specific reactions can be engineered in other ways, most notably via nucleic acid templating…..’      For a review, see Li & Liu 2004.

‘……the complex core structure of Vitamin B12, an essential nutritional cofactor for many (although not all) multicellular eukaryotes, is made solely by prokaryotes.’      And this also demonstrates but one aspect of the dependence of ‘higher’ multicellular organisms upon the activities of prokaryotes. For details on Vitamin B12 biosynthesis, see Martens et al. 2002.

‘…..the artificial adaptation of a specific cytochrome P450 enzyme ………. from its normal catalytic transfer mechanism into one involving a hitherto non-biological process.’     For the recent paper, see Coelho et al. 2013. Their work converted the normal oxene transfer (the equivalent of atomic oxygen) mechanism of the P450 enzyme of interest into an isoelectronic (but non-biological) carbene (the equivalent of a substituted methylene) transfer.

‘…….the Diels-Alder reaction a chemical cycloaddition reaction of great importance…..’      The eponymous title of this reaction (named for the two German chemists who originated it) has also lent itself to the naming of the insecticides Dieldrin and Aldrin.

‘…..both protein enzymes (catalytic antibodies) and ribozymes have been derived which are capable of catalyzing this [Diels-Alder] reaction.’       For a catalytic antibody Diels-Alderase, see Heine et al. 1998; for an example of ribozyme Diels-Alderase work, see Agresti et al. 2005.

‘…..there is at present no definitive evidence that a natural precedent exists for this particular catalytic activity [Diels-Alderases]….’      See Kim et al. 2012.

‘……alkyne (‘acetylenic’) triple bond is not at all absent from biological systems, being found in metabolic products of diverse organisms……’      A note on terminology: strictly speaking, an ‘alkyne’ would refer only to a hydrocarbon containing at least one carbon-carbon triple bond. Here it denotes substituted alkynes, where one or both of the hydrogen atoms of an acetylene (C2H2) are replaced by other chemical groups, which could be as large as a protein molecule, or which could join cyclically to form ring structures. Triple-bonded carbons within larger molecules are also often referred to as ‘acetylenic’, after the prototypical alkyne molecule. For a good review of natural molecules containing such alkyne groups, see Minto & Blacklock 2008.

‘…..Organic azides, on the other hand, are believed to be totally absent from known biosystems.’     See Sletten & Bertozzi 2009.

‘……….if organic azides were artificially introduced into specific cells synthesizing certain natural products ….. (members of the enediyne natural product family), then an azide-associated reaction with the enediyne product could occur.’     The natural enediynes have 9- or 10-membered ring structures, less strained than cyclooctynes (as in Fig. 2 above), but still significantly strained relative to a linear alkyne. See Kim et al. 1993 for the structure of an enediyne antibiotic when complexed with its specific carrier protein.

‘……….. the resulting cycloaddition reaction, deriving from a natural host product and a non-biological reactant, would lack complete orthogonality.’      This hypothetical example is another illustration of how bioorthogonality may exist within a specific closed and defined biosystem, but not necessarily for the biosphere as a whole.

‘……alkyne groups are recognized through biosynthetic machinery……’      See Van Lanen & Shen 2008.

‘……certain compounds with carbon-carbon triple bonds have been well-characterized as P450 inhibitors….’     See Lin et al. 2011.

‘…..the best-known example of this is the compound AZT (3’-azido-3’deoxythymidine)…..’      AZT is a nucleoside analog, in common with numerous other antiviral compounds, aimed at impeding the function of virally-encoded polymerases.

‘…..limitation of AZT was its tendency to produce often serious side-effects…..’      One such side-target of AZT is human mitochondrial RNA polymerase II; see Arnold et al. 2012.

‘…..1,2,3-triazole group, is likewise recognized in a variety of biological contexts, including P450 enzymes….’    See Conner et al. 2012.

‘……1,2,3-triazole isomer is also a central moiety of a variety of drug candidates……’     For example, see Röhrig et al. 2012 (a dioxygenase target in cancer contexts), Manohar et al. 2011 (malaria), and Jordão et al. 2009; Jordão et al. 2011 (antiviral).

‘…….new bioorthogonal reactions are being actively sought and found.’     As an example, see Sletten and Bertozzi 2011.

Next post: April.

Synthetic Biology and ‘Kon-Tiki’ Experiments

December 16, 2012

In this post, we change tack from the themes of the past few months and return to a topic raised previously (posts of 30th May, 7th June, and 20th June 2011), which centers on synthetic biology. Here the power of future synthetic biological science will be highlighted by considering a potential application of it towards the analysis of ancient evolutionary pathways. At the same time, exploration of this theme has certain interesting implications concerning the accomplishments that natural evolving biological systems can and cannot achieve.

 The Kon-Tiki Expedition and Its Goals

The central theme of this post, which ultimately concerns the future of synthetic biology, is approached by means of an analogy. As a tool of thought, the stratagem of analogy-making should not be trivialized. Indeed, some opinion elevates it into a premier position in cognitive processes, without which progressive complex thinking would be stymied. At the heart of any analogy is cognitive ‘chunking’, where small concepts are joined together into successively bigger ones, to facilitate the manipulation of increasingly complex mental structures. (By this kind of analysis, ‘chunking’ itself thus necessarily involves a degree of chunking in order for it to emerge as a coherent thought).

The analogy to be made in this context comes from science itself, but within a very different and necessarily speculative field, which might optimistically be termed ‘experimental archaeology’. In 1947 the Norwegian adventurer and ethnographer Thor Heyerdahl led an expedition on a balsa raft from Peru in a westward direction across the Pacific to the Tuamotu archipelago, a string of islands among the very many which were long ago colonized by the Polynesians. Heyerdahl believed that these bold seafarers originated from South America and spread westwards, rather than eastwards from an Asian origin as usually thought. His ‘Kon-Tiki’ expedition was a grand adventure and a great story, and it directly proved that it was indeed possible for people to build a raft with stone-age technology and float across the Pacific from South America to Polynesia in a westerly direction. Later expeditions led by Vital Alsar even extended the demonstrable rafting range across the entire Pacific (the La Balsa and Las Balsas expeditions in 1970 and 1973, respectively). But impressive as these feats were, in themselves they had no bearing on how human colonization of the Pacific actually happened.

Genetic, linguistic and cultural lines of evidence converge on the conclusion that all Polynesian settlements have resulted from migrations radiating in a general eastward direction across the Pacific. It is ultimately Asia, rather than South America, from which the direct ancestors of Polynesians originally derive. Accordingly, the vast majority of researchers in this field reject the central tenet of  Heyerdahl’s hypothesis of colonization in the opposite direction by South American peoples. This was so even in 1947, and has only been reinforced across the intervening decades.  What Heyerdahl and his La Balsa successors showed is evidence of possibility. Their daring feats demonstrated that determined human beings, armed with only a low level of seafaring technology, could in principle accomplish voyages across even the greatest ocean of this planet. But evidence of possibility is not evidence of occurrence, and hypotheses, however logically appealing and dear to the hearts of those who propose them, must inevitably bow before the weight of factual evidence.

Yet the complete story may well be more complex. Genetic and carbon-dating evidence has been proffered suggesting that chickens found their way into domestic use by South Americans through the agency of Polynesians; if true, this would prove that Polynesian sailors ranged beyond Easter Island to the western coast of South America. Other data have nevertheless challenged this study, and the proposal remains controversial and unproven. Another item suggestive of South American interaction is the origin of the sweet potato grown for food in some Polynesian societies, but this too remains controversial. Since Polynesians indisputably did reach remote Easter Island, the possibility of Polynesian pre-Columbian communication with South America is hardly an absurd proposition in itself. But this is obviously a different matter from suggesting that such interactions originated in the reverse direction from South America.

In any case, applying the history of the Kon-Tiki expedition (and its like) as an analogy for other kinds of scientific investigation leads to an operational definition for the present post: A ‘Kon-Tiki’ scientific experiment is here defined as a scientific endeavor which attempts to demonstrate the inherent feasibility of a proposed past event, system, or process (necessarily historical or evolutionary) by re-creating the substance of the item of interest. Of course, if the analogy with Heyerdahl’s expedition was taken to its maximal extent, a ‘Kon-Tiki’ experiment would verify feasibility but ultimately be discredited as representing any true recapitulation of an actual historical or evolutionary pathway. Here the operational definition accordingly refers only to the ‘verification of feasibility’ aspect of a ‘Kon-Tiki’ test. In other words, any generic successful ‘Kon-Tiki’ experiment by the definition of this post proves feasibility; it may or may not be compatible with other lines of evidence obtained independently. Following on from this, it is important to note that there is nothing inherently negative about a ‘Kon-Tiki’ investigation in its own right. A successful demonstration of the feasibility of a system or process can be very powerful, and (depending on the circumstances) may provide important evidence favoring a specific hypothesis which accounts for an evolutionary development.

While positive ‘Kon-Tiki’ data is not proof of past events, in combination with other information it may become compelling (a point which is considered further below).

But on the other hand, failure of a ‘Kon-Tiki’ test of any description proves nothing. Had Heyerdahl and his friends failed to reach Polynesia, it could not be used as evidence that oceanic raft voyages by early South Americans were impossible. (Here, of course, we are only considering the rafting possibility itself, in isolation from the weight of evidence which favors eastward Pacific colonization). And the same principle would apply even if the subsequent La Balsa expeditions had also failed in their objectives. Naturally, if a dozen succeeding adventurers all had their rafts sink separately in the eastern Pacific, the likelihood that ancient South Americans could have succeeded where they failed would seem vanishingly small. Yet the possibility could not be formally excluded that conditions in the past were significantly different in some crucial manner which favored success: for example, a change in oceanic winds or currents. Thus, only positive ‘Kon-Tiki’ results can be taken any further.

 ‘Kon-Tiki’  Synthetic Biological Experiments

An early and famous experiment which has ‘Kon-Tiki’ aspects was published by Stanley Miller in 1953. The ultimate aim of this work was nothing less than an understanding of the origin of life, approached by designing experimental conditions such that they resembled what was believed to correspond to conditions on the prebiotic Earth in those times of the remote past. A mixture of water vapor, methane, ammonia, and hydrogen was subjected to a heat source and continual electrical spark discharges (simulating lightning), and after several weeks, the contents of the reaction flask proved to harbor many of the amino acids found as constituents of proteins. So this experiment has a ‘Kon-Tiki’ flavor in the sense that it tested the feasibility of a proposition (that organic building blocks could arise in prebiotic conditions) and used assumptions as to what the initial starting state should be (the above gaseous mixture). To relate the latter starting-point issue to the original Kon-Tiki and other rafting expeditions, Heyerdahl and colleagues made reasonable assumptions as to the level of technology available to early pre-Columbian South Americans. In fact, in this respect the rafting ‘experiments’ were probably on safer ground than Miller’s work, since the early prebiotic atmosphere may have contained considerably more nitrogen and carbon dioxide than Miller (and his supervisor Harold Urey) assumed.

But for present purposes, the intention is to relate ‘Kon-Tiki’ experimentation with what might be achieved in the relatively near future through powerful applications of synthetic biology. Indeed, it can be proposed that it is precisely the anticipated advanced state of synthetic biology during this century which will really enable ‘Kon-Tiki’ experiments to come into their own. In a previous post (of 20th June 2011) entitled ‘Synthetic Life Part III – Experimental ‘Dark Biology’, a number of ambitious future synthetic biological projects were listed. With that in mind, it’s time to consider that there are different levels at which the analogy can be made between the formal goals of the Kon-Tiki expedition and biological projects, which are noted in the Table below.


Table: Different levels of analogizing biological experiments to ‘Kon-Tiki’ investigations. In Type 1, the match with the Kon-Tiki approach is closest, corresponding to: (1) making the best possible assumption about a previous biological or historical state based on available information; (2) formulating a hypothesis to account for a system or process which occurred under the early conditions of (1); (3) constructing a system or process in order to assess whether hypothesis (2) is possible at least in principle; and (4) defining in advance the criteria for a successful test as in (3). This matching occurs when experimentation is specifically designed to assess the possibility of a past hypothesized state. The examples of Type 1 also have the characteristic of having been completed, or of being actively under current development. On the other hand, it is abundantly clear that synthetic biology can proceed far beyond merely attempting to recapitulate postulated states in early molecular evolution. Type 2 projects envisage synthetic accomplishments where there is not necessarily any link between the products and previous evolutionary forms, even in principle. The example given is the generation of different nucleic acid backbones, with different sugar moieties, including the adaptation of polymerases such that the novel nucleic acids can be replicated. (Certainly other examples could have been used, including the generation of nucleic acids with novel base pairs). Type 3 projects have a more clear-cut ‘recapitulation of proposed evolutionary events’ trajectory, as for Type 1, except that their realization is currently beyond our immediate abilities. (Consider that the specific example given for Type 3 here, ribocytes [described in a previous post], would be completely dependent on the prior recapitulation of a range of ribozymes, not least of which would be the ribozyme polymerase example of Type 1. And although there has been considerable recent progress, even that has a way to go before a truly self-replicating ribozyme polymerase is created).

See References & Details below for some links and more information on items in this Table.


Are we to exclude ‘Type 2’ experimental projects as ‘Kon-Tiki’ projects, where there is no underlying attempt to re-create a hypothetical state or system which existed once in the past? (Such projects necessarily include those involving structures for which no evidence whatsoever exists for prior involvement in known molecular evolution). Well, that depends on one’s willingness to extend the ‘hypothetical states’ beyond the boundaries of terrestrial biosystems. One could generalize synthetic biological studies to embrace not merely life on this planet, but all potential forms of chemical life anywhere. Accordingly, studies of any artificial biosystem configuration could be seen as testing hypothetical ‘others’ not just in terms of time (the past history of Earth), but elsewhere in space. As such, this kind of research can be viewed as an adjunct of astrobiology (a topic of a previous post). But of course, such theoretical considerations need not be a primary motivation for such work, which can have the simple and pragmatic aims of advancing the understanding of nucleic acid chemistry, and the generation of possibly useful novel molecular forms.

Major Pathways and Sidelines

The Kon-Tiki expedition and its aftermath remind us that mere re-creation of a past possibility (whether historical or evolutionary) can at best be only suggestive. Skepticism towards Heyerdahl’s theories regarding the South American origins of Polynesian peoples came not from any aspect of this voyage itself, but via comprehensive additional information, principally linguistic and genetic. A proposed historical or evolutionary pathway may enter the realm of greatly enhanced plausibility if its inherent feasibility was conclusively demonstrated experimentally, but additional information is essential to shore it up as a convincing hypothesis. (If multiple lines of evidence are in agreement and no attempts to falsify the hypothesis succeed, then by consensus it may become regarded as fact, in the provisional sense that all science is a continuous fine-tuning of our understanding of the universe, where current theories may need refinement in the light of newly emerging knowledge).

It is important to note that any hypothesis which purports to explain present arrangements through ancient underlying processes must necessarily restrict itself to broad trends for which a prima facie case can be made. The hypothesis might stipulate that an ancient state A resulted in modern state C through a definable intermediate state B, inferred by comparing certain features of C (the known and familiar) with the postulated primordial state A. Any number of side-events peripheral to such a ‘main game’ could in principle have occurred without leaving discernible traces in modern times, and without corroborating evidence it is profitless to speculate on them.

By an overwhelming majority, the ‘jury’ of modern paleoanthropologists has concluded that Heyerdahl was wrong with respect to his proposed migratory path for Pacific colonization, but the ‘side issue’ of pre-Columbian contact between Polynesians and South Americans cannot be so easily dismissed. (Although as noted above, both the proposed chicken and sweet potato precedents for cross-cultural transfer remain controversial). Again, numerous voyages in historical times bridging these cultures may possibly have occurred, either in an easterly direction by Polynesians (consistent with their known ocean-going prowess, and their attainment of remote Easter Island) or in a westerly direction by South Americans (rendered feasible by the voyages of Heyerdahl and his successors). But even if so, the long-term influence of such contacts must be inferred to have been minimal, and should therefore be clearly demarcated from the ‘main game’ of Pacific history.

Are there evolutionary analogs of historical side-issues? In animal biology, we can compare the major body plan ‘design schemes’ of evolution with the innumerable variations on such themes which can emerge. Thus, all of the major animal body plans arose during the ‘Cambrian explosion’ of around 545 million years ago, and have been maintained ever since. This preservation of the higher-order bodily organization of distinct major groups (termed phyla) is in marked distinction to the steady turnover of species within each phylum itself. (It has often been noted that species have finite lives just as do the individuals which comprise them). So here basic body plans are the ‘main game’, and the profusion of variations on specific body plan designs are the ‘side issues’. At an even more fundamental level of life system design are the basic operations of biological information storage and expression, founded in all existing cellular organisms on DNA, RNA, and protein. Regardless of the possibility of alternative fundamental biological configurations, the familiar arrangements are likely to remain fixed through the inherent unfeasibility of radical change within a pre-existing system.

Heyerdahl, of course, was not attempting to demonstrate the possibility of a minor historical anomaly, but rather a major influence. But as noted above, showing that something is possible does not in itself enable one to discern its genuine long-term impact. Thus, to use the ‘Type 3’ example from the above Table, if ribocytes were shown to be physically feasible by advanced synthetic chemistry, then the case for their possible early existence in molecular evolution would become stronger. Yet even these ‘RNA cells’ could in principle be ‘side-issues’ rather than the ‘main game’ in the development of life, if the precursor to modern DNA-RNA-protein cellular life split from even earlier developmental lineages. (In other words, peptides, proteins and DNA might have emerged during an earlier phase of the RNA World, with this lineage progressing towards modern biosystems. Other RNA-based lineages could have progressed independently towards becoming functional ribocytes, but constituted an evolutionary dead-end through competition from more efficient DNA-RNA-protein protocells).

So is that all that could be said about a successful synthetic biological ‘Kon-Tiki’ experiment? Perhaps, if it remained standing in isolation. But just as the proposed (and physically feasible) historical pathway of eastward Pacific migration from South America was discarded in the light of additional information, so too can synthetic biological insights be tested and tempered by aligning them with other biological information sources. A combination of these (perhaps involving multiple synthetic biological experiments) has the prospect of reconstructing the most probable molecular evolutionary pathways of the remote past. This higher-level data integration would necessarily involve assigning relative fitness levels to alternative molecular systems, where the bottom line for ‘fitness’ is replicative efficiency. To serve that fundamental end, other subsystems (such as nutrient acquisition and energy generation) must also compete on relative efficiency. Thus, if multiple lines of synthetic biological evidence suggested that a proposed specific pathway in molecular evolution had maximal fitness, its real former existence would become a solid hypothesis.

Chance vs. Necessity

But an important caveat has to be noted, which comes in at the level of deciding what in fact counts as a viable ‘alternative system’. If one is attempting to reconstruct ancient molecular evolutionary pathways (rather than pursuing ‘free-form’ synthetic biology paralleling the Type 2 projects of the above Table), then a proposed biosystem always has to be placed in the context of its immediate precursors, which determine how it could have developed in its own right. In other words, from an assumed starting state A (for example, the early RNA World), several alternative biosystem pathways might be proposed (B, C, D…), with the necessary restriction that all must be viable precursors for later cellular stages of life. It might be possible to recreate each theoretical state B, C, and D by synthetic biology (demonstrating their physical feasibility), but to be useful for evolutionary studies, a feasible transitional pathway leading from state A must also be modeled. So while B, C and D might all be physically attainable, evolutionary constraints may ‘lock in’ certain successors (B might be the only viable outcome). In more precise terms, going from A to C or from A to D might be achievable only through intermediate forms of reduced fitness; the transition to B is favored if all its intermediate forms have progressively greater fitness.
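The constraint just described, that a pathway from state A is evolutionarily accessible only if fitness never declines along its intermediates, can be sketched as a toy model. (All fitness values below are invented purely for illustration; nothing here reflects measured fitnesses of real or reconstructed biosystems.)

```python
def accessible(path_fitness):
    """A pathway is deemed accessible if fitness never decreases
    between consecutive intermediate states."""
    return all(b >= a for a, b in zip(path_fitness, path_fitness[1:]))

# Hypothetical fitness values for intermediates from state A to states B, C, D
paths = {
    "A->B": [1.0, 1.2, 1.5, 2.0],  # monotonically increasing
    "A->C": [1.0, 0.8, 1.6, 2.2],  # fitness valley at the second step
    "A->D": [1.0, 1.1, 0.9, 2.5],  # fitness valley later in the pathway
}

for name, fitnesses in paths.items():
    verdict = "accessible" if accessible(fitnesses) else "blocked by a fitness valley"
    print(name, verdict)
```

On these invented numbers only A->B passes, even though all three end states are ‘physically attainable’: the model separates physical feasibility of an end state from evolutionary accessibility of the route to it, which is precisely the distinction the caveat turns on.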

These kinds of restrictions on what are feasible evolutionary steps (even if alternative systems are physically possible) are ‘historical’ in the sense that they arise from the inherent nature of the pre-existing systems upon which future evolutionary change must build. An organized biosystem that has ‘invested’ in proceeding down one broad pathway may not be able to switch to another once the initial ‘system platform’ has been laid down. But at some earlier stage of molecular evolution, before such irreversible commitment has taken place, a point would exist where branching in either direction was feasible in principle. What determines the ‘road taken’? In an ideal analysis, among a population of competing biosystems, the complete range of different alternatives will emerge (both ‘roads’ in this analogy), where only the most reproductively competent ‘road’ (the fittest; in the face of environmental challenges) will predominate. But in practice, it may not be so simple. The initial emergence of one successful configuration may result in biosystems bearing it getting a ‘head start’ and developing even further, allowing them to out-compete potential earlier-stage alternatives showing up later. All subsequent building upon the successful system is then ‘locked in’ to its design plan. In evolutionary parlance, such scenarios have frequently been referred to as ‘frozen accidents’.
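The ‘head start’ effect behind such frozen accidents can be mimicked with a minimal neutral drift simulation (a Moran-style model with invented population sizes; a sketch only, not a claim about real protocell populations): even when two lineages are exactly equally fit, the one already established when the other first appears usually drifts to fixation.

```python
import random

def fixation(n_b1, n_b2, rng):
    """Neutral Moran process: at each step a random individual reproduces
    and its offspring replaces another random individual, until only one
    lineage ('B1' or 'B2') remains. Both lineages are equally fit."""
    pop = ["B1"] * n_b1 + ["B2"] * n_b2
    while len(set(pop)) > 1:
        pop[rng.randrange(len(pop))] = rng.choice(pop)
    return pop[0]

rng = random.Random(42)
trials = 2000
# B2 arrives as a single newcomer when B1 already numbers 9 individuals
wins = sum(fixation(9, 1, rng) == "B1" for _ in range(trials))
print(f"equally fit, but B1 fixed in {wins / trials:.0%} of trials")
```

Under neutral drift the fixation probability of a lineage equals its initial share of the population (9/10 for B1 here), so priority alone, with no fitness advantage whatsoever, is enough to make one configuration the platform on which everything subsequent is built.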

But all such early alternative systems at the outset may share certain invariant subsystems which cannot change without global loss of fitness in their normal environments. Debate over the interplay between such necessary features and chance events has been an ongoing theme in evolutionary science. ‘Chance’ in this context really refers to evolutionary ‘choices’ that are influenced by contingent factors which are difficult or impossible to predict. Such contingencies in principle could range from stochastic (purely probabilistic, flip-of-a-coin) events at the molecular level (such as the occurrence of different random mutations with very different resulting evolutionary trajectories), to external global-scale catastrophes (such as the bolide strike associated with the extinction of much of the dinosaurian lineage).

In one sense (as has been pointed out), invoking contingency in evolution says little, because selection for fitness is always contingent on environmental pressures, which inevitably will change with time. Proponents of the importance of contingency thus are usually referring to the intervention of chance-based and unpredictable strikes (metaphorically or literally) which profoundly shape evolutionary landscapes. This theme will be developed further in the succeeding post.

So what can ‘Kon-Tiki’ synthetic biological experiments, however sophisticated, have to say about evolutionary contingency? A positive result in a synthetic biological reconstruction of a postulated early biosystem demonstrates its physical possibility, in the same manner as for Kon-Tiki rafting. If more than one such system can be devised as a viable alternative, then in principle from starting state A, these ‘options’ are B1, B2, B3…and so on. The challenge is then to further reconstruct intermediate states between A and the alternative Bs, where each step is both evolutionarily feasible (by mutation or recombination) and of demonstrable fitness value. Each of these sub-projects is then a Kon-Tiki test in its own right. Failure to find viable intermediates between A and (say) B3, but success with the others, would suggest (although not prove) that B3 was physically possible but a low-probability outcome from state A. Where is contingency in this kind of hypothetical advanced technological analysis? Well, consider two alternative evolutionary pathways (from starting point A) as B1 and B2, where the former corresponds to ‘real world’ circumstances, and the latter is an artificial construct (physically validated by synthetic biology). Now, if both of these appeared to have equally plausible evolutionary intermediates and equivalent fitness, then the actual success of B1 over its potential rival B2 could be interpreted as having occurred through the intervention of an unpredictable contingency. For example, this might operate through the extinction of an emerging population of B2 biosystems by a local environmental disaster which spared a sample of B1 competitors. The progeny of the B1 alternative could then continue to develop into a state more or less unassailable by any newly emerging rivals, and take over all available habitats.

But it would be hard to make conclusive decisions in this regard, and that is why the above wording included the qualifier “appeared to have”, with respect to perceived relative fitnesses. It could continue to be argued that judging the relative fitness of alternatives B1 and B2 can only be accurately done with a full knowledge of (selective) environmental conditions in the ancient era of interest, which clearly cannot be given with complete confidence. Nevertheless, repeated tests under a range of different conditions could themselves help to resolve this, until a general consensus of scientific opinion is obtained. All of these hypothetical studies, of course, fall into the ‘Type 3’ category in the above Table, as a consequence of their ‘Kon-Tiki-like’ aspects, but which still lie in the future of the synthetic biological field. Yet the future very often reaches us much faster than we anticipate…..

Finally, a biopolyverse salute to the inspirer of this post, despite the lack of acceptance of his proposals:

Thor Heyerdahl took the Kon-Tiki

On a voyage that some thought was freaky

He took his bold notion

Across half an ocean

The idea, not raft, proved leaky

References & Details

(In order of citation, giving some key references where appropriate, but not an exhaustive coverage of the literature).

‘…..some opinion elevates it [analogy creation] into a premier position in cognitive processes….’    See an online discussion of the importance of making analogies in thought (“Analogy as the Core of Cognition”) by Douglas Hofstadter, the author of Gödel, Escher, Bach – An Eternal Golden Braid.

‘…..Thor Heyerdahl…..’    See the 2002 New York Times obituary. Heyerdahl’s best-selling account of the voyage was published in 1950. (The Kon-Tiki Expedition [various editions, including George, Allen and Unwin, London] 1950).

‘……Vital Alsar ……. expeditions….’    An account of the La Balsa expedition was published by Alsar in 1973. (La Balsa: The Longest Raft Voyage in History, by V. Alsar [with E. H. Lopez translation from Spanish]  Reader’s Digest Press, 1973). The subsequent Las Balsas expedition (involving three rafts) was even longer than the single-raft La Balsa. The Las Balsas voyage across the Pacific from South America ended in the coastal town of Ballina, New South Wales. One of these rafts is currently displayed in a local Ballina museum.

‘……a string of islands among the very many which were long ago colonized by the Polynesians….’     The term ‘long ago’ can have a very fluid definition, even on the time-scale of human existence. Although the origins of the oldest westerly Polynesian settlements can be measured in millennia, recent evidence suggests that the most remote colonizations (principally Hawaii, Easter Island [Rapa Nui] and New Zealand) occurred historically at much later times than previously thought, quite possibly as recently as 800 years ago (Hunt & Lipo 2006; Wilmshurst et al. 2011). By this reckoning, even at the time of the Norman conquest of England, all these islands still existed in their pristine pre-human contact states, with their rich array of uniquely evolved flora and fauna.

‘…..It is ultimately Asia, rather than South America, from which the direct ancestors of Polynesians originally derive…..’   Although the general eastward direction of colonization is not in dispute, the details of when, how and from exactly where this took place have been more controversial. But in general, an interesting feature of Pacific colonization is its extreme biphasic nature: it encompasses the earliest movement of modern humans out of Africa (into Australia and New Guinea) and the most recent colonization events (into eastern Polynesia). For a recent review on this topic, see Kayser 2010. Two competing models have been slow diffusion eastward from Melanesia as the main pathway (‘slow boat’), vs. a relatively rapid pulse of migrations stemming from Taiwan south-easterly through Melanesia and then eastward for the colonization of Polynesia (‘express train’). The most recent evidence, both linguistic (Gray et al. 2009) and human genetic (Friedlaender et al. 2008), appears to support the latter view. The ‘express train’ stance is also consistent with evidence from the spread of different strain profiles of the pathogenic human stomach bacterium Helicobacter pylori (Moodley et al. 2009).

‘……evidence has been proffered suggesting that chickens found their way into domestic use by South Americans through the agency of Polynesians….’     See Storey et al. 2007 for the presented evidence, and Gongora et al. 2008 for analyses contrary to their conclusions.  ‘……the proposal remains controversial and unproven.’    See Storey et al. 2012.

‘….the origin of the sweet potato grown for food in some Polynesian societies, but this too remains controversial …..’     It has been noted that the sweet potato could have made its way to Polynesia from South America not necessarily by human intervention, but through drifting of viable tubers on ocean currents. Recent modeling (Montenegro et al. 2007) has been offered in support of this.

‘……the possibility could not formally be excluded that conditions in the past were significantly different in some crucial manner ……. for example, a change in oceanic winds or currents.’     With this point in mind, it is of interest to note that it has been proposed that Polynesian voyages were enabled at specific historical times through favorable westward winds produced by El Niño effects (Anderson et al. 2006).

‘…..experiments which have ‘Kon-Tiki’ aspects were published by Stanley Miller in 1953.‘    For Miller’s original paper, see Miller 1953. (Miller was working in the laboratory of the Nobel Prize winner Harold Urey at that time, so although he was the sole author on this paper, this classic work is often referred to as the Miller-Urey [or Urey-Miller] experiment.) Miller and others continued work on this general theme, with numerous refinements, for many years after the initial publication.

‘…..the rafting ‘experiments’ were probably on safer ground than Miller’s work….’     While (as noted above) Heyerdahl’s beliefs regarding Pacific colonization have long been discredited, the physical ‘ancient’ design of the Kon-Tiki raft itself has not been a significant bone of contention.

‘…..the nature of the early prebiotic atmosphere…..’     There is evidence that carbon dioxide levels were very significant, and indeed without the resulting greenhouse effect, the weak sun of that era could not have kept the mean Earth temperature above freezing (see Kasting & Ackerman 1986).

Table reference details:

‘…..generation of different nucleic acid backbones, with different sugar moieties, including adaptation of polymerases……’     Although a number of different labs have been involved in this field, in a recent study (Pinheiro et al. 2012) nucleic acid backbones with unnatural sugars were generated, along with correspondingly adapted polymerases enabling their faithful complementarity-based replication.

‘…..selection of novel aptamers…..’   Aptamers are functional nucleic acids selected for binding specific ligands through repeated rounds of directed evolution in the laboratory. See Ellington & Szostak 1990 and Tuerk & Gold 1990 for the original papers in this area. The adaptation of polymerases to accommodate the unnatural nucleic acids of the above Pinheiro et al. (2012) study allowed the isolation of corresponding ‘unnatural’ aptamers.

‘….ribozyme polymerase……..considerable recent progress……’   See Wochner et al. 2011. This paper and the achievement it presented were also referred to in a previous post (3 May 2011).


‘….Debate over the interplay between such necessary features and chance events…..’     See the classic book Chance and Necessity (Vintage Books translation, 1971) on this theme by the French Nobel prize-winning molecular biologist Jacques Monod.

Next post: February 2015.