Natural Molecular Space

July 19, 2011

Many large sets of physical or theoretical entities can be modeled as ‘spaces’, through mapping of values for their specific properties as multi-dimensional arrays. Such modeling is useful to examine networks of similarities and inter-relationships between members of the comprehensive set of interest, and ‘chemical space’ for small molecules is one such conception. When we make use of natural products for specific purposes, we are in effect sampling from a local natural molecular space, which in turn is a tiny subset of an enormous space of all chemically possible molecules. This ‘local’ set of molecules is but a tiny corner of the larger universe, but the special conditions on Earth where life flourishes render its molecular endowment vastly more complex than what has been found in the universe elsewhere to date. And this will remain the case until truly independent alien life is identified, if it indeed exists. (See an earlier post on the subject of astrobiology.)
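As a rough illustration of what modeling molecules as a 'space' means in practice, the sketch below places a few compounds in a four-dimensional property space and finds each one's nearest neighbour. The compound names, the choice of descriptors and all numerical values are hypothetical placeholders, not data for real molecules, and Euclidean distance on rescaled properties is just one simple similarity measure among many.

```python
import math

# Hypothetical descriptor vectors: (molecular weight in Da, hydrogen-bond donors,
# hydrogen-bond acceptors, logP). All values are illustrative placeholders.
molecules = {
    "compound_a": (300.0, 2, 5, 1.5),
    "compound_b": (310.0, 2, 6, 1.7),
    "compound_c": (900.0, 8, 15, -2.0),
}

def scale(vectors):
    """Rescale every property to the range [0, 1] so that no single
    dimension (such as molecular weight) dominates the distances."""
    dims = list(zip(*vectors.values()))
    lows = [min(d) for d in dims]
    spans = [(max(d) - min(d)) or 1.0 for d in dims]
    return {
        name: tuple((v - lo) / span for v, lo, span in zip(vec, lows, spans))
        for name, vec in vectors.items()
    }

def nearest_neighbour(name, space):
    """Return the other compound lying closest to `name` in the space."""
    others = (n for n in space if n != name)
    return min(others, key=lambda n: math.dist(space[name], space[n]))

space = scale(molecules)
for name in space:
    print(name, "is closest to", nearest_neighbour(name, space))
```

With many more compounds and dimensions, the same idea underlies the similarity networks mentioned above: each molecule is a point, and relatedness is read off from distances between points.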

The previous post looked at ‘chrestomolecules’, or molecules classifiable as economically useful in some way or another, irrespective of their sizes or origins. The present post will primarily focus on small biologically-derived molecules (less than ~3500 Daltons, based on the largest known naturally-made examples of this type). As we will see at the end, though, a size-based distinction can become somewhat arbitrary when certain classification criteria are applied.

The Nature of Biomolecules on Earth

At this point we should consider in more detail some special features of natural molecular space as a whole. It is possible to classify all biomolecules by many schemes, including their chemical natures (protein, lipid, carbohydrate, nucleic acid, and so on). One way is to sort them by function, and two very broad functional categories are molecules which are required (directly or indirectly) for all biochemical activities of an organism relating to normal survival and growth, and those involved with some form of interaction with the external environment. The latter products constitute the subject material for the burgeoning fields of ‘molecular ecology’ and ‘ecological biochemistry’, and most natural products with medicinal value to human beings fall within this territory. The great majority of biomolecules employed by humans (especially until recent times) are of relatively low molecular weight and (as noted previously) often categorized as secondary metabolites (products of metabolism). Although still widely used, the term ‘secondary metabolites’ (as distinct from the products of ‘primary’ metabolism) does not meet with universal approbation, and has ‘no simple one-line definition’. Traditionally, secondary metabolites are products of certain plants, fungi and bacteria, but it is clear that biomolecules of demonstrated or potential medicinal use are found in a range of both vertebrate and invertebrate animals as well.

A fundamental aspect of these compounds embodied in the ‘secondary’ tag is that they are not essential for the growth and survival of the organism per se, and are found in very variable levels between different taxonomic groups. When present, however, secondary metabolites are synthesized from universally available biological precursor molecules, by specialized enzyme protein molecules, often in a series of sequential steps. Each stage in such a ‘biosynthetic pathway’ involves a separate enzyme operating on successively modified metabolic products until the final version(s) are made. The functions of secondary metabolites have been controversial, but they have usually been presumed to provide a selective advantage under some conditions, acting as toxins against predators, competitors, or parasites. The evolutionary origins of secondary metabolites have also been a long-standing source of debate, especially given that their specific functions are often poorly defined. Organisms producing these compounds are generally associated with competitive environments which may promote rapid diversification in response to strong selective pressures.

Notes for Figure: Utility of specific host modification for allowing broad-spectrum inhibition of targets of competing organisms. Following target molecule A modification (by an enzyme present in organism A but not competing organisms), a secondary metabolite binding to a conserved region can be produced, and will bind to the conserved sites of homologous targets X, Y and Z.

_______________________________________________________________________

Microbial populations (especially among soil organisms) may show great diversity in secondary metabolite production, but gene homologies between different organisms suggest common origins for some biosynthetic enzymes, probably by gene transfer. Gene duplication and divergence is acknowledged as a probable important driver of diversity in the enzymatic machinery of secondary metabolite production. Genes encoding enzymes which control the production of antibiotic secondary metabolites tend to cluster and to be selected as a group. A good example of this is biosynthesis of the antibiotic erythromycin in the bacterial organism Saccharopolyspora erythraea, which involves a complex series of reactions mediated by many proteins all encoded within a single large contiguous segment of its genomic DNA. In order to avoid self-toxicity, this organism also has a gene which confers resistance to erythromycin itself. (This works by the action of the gene’s protein product, which specifically modifies the organism’s ribosomal RNA where protein synthesis occurs, such that it is not susceptible to the antibiotic. The general principle of this is depicted in the above Figure.) This situation is reminiscent of bacterial restriction-modification systems, where enzymes targeting specific DNA sequences are produced in order to destroy incoming foreign viral or other DNAs, with host DNAs protected by a specific chemical modification (usually methylation). Restriction-modification enzymes of bacteria confer an ‘immunity’ to invading DNA, and secondary metabolite production has also been compared (at least in some respects) with an immune system against competitors.

One observation which must be reconciled with any evolutionary hypothesis is that the diversity of the metabolites (the ‘metabolome’) produced by specific organisms appears at first glance to exceed their genetic capacity to encode the relevant diversity of biosynthetic enzymes. With the increasing number of microbial and other genomes which have been completely sequenced this issue has become well-defined if not comprehensively resolved. The metabolome / genome size discrepancy has been attributed to relatively low substrate or catalytic specificities (‘catalytic promiscuity’) of the enzymes involved in the pathways of metabolite biosynthesis. This means simply that the product molecules resulting from the enzymes’ actions are not precisely defined, although in practice they will usually be related molecules. Low substrate specificity indicates that an enzyme is less choosy about the starting molecule that it will act on (its ‘substrate’) than in the case of a high-specificity enzyme. Low catalytic specificity refers to the range of enzyme action itself. A biosynthetic enzyme with low catalytic specificity might be capable of transferring multiple types of chemical groups to its substrate molecule(s), or might have the ability to transfer the same chemical group to different sites on such substrate(s). The end result is that a limited number of enzymes with reduced specificity can synthesize many different metabolite molecules. Although these products may be closely related chemically, small differences in chemical structure can radically change their biological effectiveness. Also, provision of alternative precursor substrates or alteration of culture conditions can induce dramatic changes in the types of product yielded from the same organism (‘one strain, many compounds’). In fact, deliberate inhibition of specific enzymes in a multi-enzymatic pathway can be a useful route towards artificial end-product modification.
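The arithmetic behind this point can be sketched with a toy pathway. The three-step pathway below, its substrate names, chemical groups and modification sites are all hypothetical; the only purpose is to show how a handful of enzymes with relaxed specificity multiplies the number of possible end products, compared with the single product of a strictly specific pathway.

```python
from itertools import product

# Hypothetical three-enzyme pathway. Each entry lists the alternatives
# available at that step; all names are placeholders for illustration.
promiscuous_pathway = [
    ["starter_1", "starter_2"],        # enzyme 1: low substrate specificity
    ["methyl", "acetyl", "glycosyl"],  # enzyme 2: can transfer several group types
    ["site_A", "site_B"],              # enzyme 3: can modify more than one site
]

# The same pathway if every enzyme were strictly specific.
strict_pathway = [["starter_1"], ["methyl"], ["site_A"]]

def enumerate_products(steps):
    """List every combination of choices along the pathway, treating
    each distinct combination as a distinct end product."""
    return list(product(*steps))

promiscuous = enumerate_products(promiscuous_pathway)
strict = enumerate_products(strict_pathway)

print(len(promiscuous_pathway), "promiscuous enzymes:", len(promiscuous), "products")
print(len(strict_pathway), "strict enzymes:", len(strict), "product")
```

Real pathways are of course not free combinations of independent choices, so this counts an upper bound under the stated assumptions rather than a prediction for any actual organism.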

Many secondary metabolites have shown at best only moderate measurable activity in their presumptive biochemical roles. An interesting hypothesis accounts for the diversity of apparently sub-optimal metabolites from a single organism by postulating that a selective advantage is conferred by maintaining such diversity, provided it is accomplished with minimal metabolic cost. This is based upon the supposition that the likelihood of inhibiting a competing organism (with a multitude of potential target molecules) is maximized by producing a wide range of compounds, some of which may have moderate but useful affinity for a target. In contrast, a high-affinity interaction between a metabolite ligand and a protein is a relatively rare event. One problem with this model is that failure to pinpoint a selectable (evolutionary fitness-conferring) function of any specific metabolite of an organism does not prove that such a function is absent. This is especially so given the complexity of environments in which secondary metabolites are typically formed, and the difficulty of screening all possible competing organisms. (Only a quite small component of all soil organisms appear to be amenable to in vitro cultivation at present.) Also, it has been suggested that secondary metabolites may have different effects at low concentrations (for example, by altering behavior of competitors rather than through direct toxicity), calling for more ‘ecologically relevant’ assays. Moreover, it’s not all about competition. In relatively recent times, the importance of bacterial ‘quorum sensing’ has been increasingly appreciated, and intensively investigated. In this ‘social’ effect, some aspects of gene regulation of specific bacterial species are controlled by their cell densities, through the action of secreted ‘autoinducers’. Chemical ‘cross-talk’ between different bacterial species has also been implicated, and even between bacteria and eukaryotic organisms (‘cross-kingdom’ communication). And at least in some circumstances, synergism between different bacterial species in the production of secondary metabolites has been documented. So the roles of metabolites produced by highly diverse bacterial communities may be very diverse as well.
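The reasoning behind the diversity hypothesis described above can be made concrete with a toy calculation. It assumes, purely for illustration, that each compound an organism makes has the same small, independent chance of usefully binding some target in a competitor; the 1% figure is a hypothetical value, not an estimate from the literature. Under those assumptions the chance that at least one compound 'hits' is 1 - (1 - p)^n for n compounds.

```python
def chance_of_any_hit(n_compounds, p_hit):
    """Probability that at least one of n independent compounds binds
    a useful target, if each succeeds with probability p_hit."""
    return 1.0 - (1.0 - p_hit) ** n_compounds

# Hypothetical per-compound success rate, chosen only for illustration.
P_HIT = 0.01

for n in (1, 10, 100):
    print(f"{n:>3} compounds: {chance_of_any_hit(n, P_HIT):.3f}")
```

Even with a low individual success rate, the combined chance climbs steeply as the repertoire grows, which is the sense in which cheaply maintained diversity could pay for itself.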

These complexities notwithstanding, competition is still a central ecological principle. Many competitors of any given organism may be closely related, with associated conservation of many potential target proteins. In fact, a great many targets which could be potentially bound by a metabolite will be shared by the host organism itself, especially at highly conserved active sites such as those in essential enzymes. So any metabolites modulating the activities of such shared conserved target proteins will be selected against (an organism producing a compound which is even slightly toxic to itself will be at a strong competitive disadvantage and rapidly eliminated). One way around this problem, as we have seen with erythromycin-producing organisms or the conceptually related restriction-modification systems, is for the host to specifically modify itself to avoid self-damage (as in the above figure). In this case, choosing a highly conserved target (protein synthesis in the case of erythromycin) is beneficial by maximizing the sweep of the counter-competitor response. An alternative solution with the same outcome is for the host to synthesize a third-party gene product which neutralizes the activity of the toxic bioproduct only while it is internalized, leaving it free to inhibit conserved targets in the environment. For example, some snake species appear to cope with their own venom production in this manner. The problem of avoiding host damage while targeting a pathogen (or distinguishing self from not-self) is also fundamental to the vertebrate immune system, which has evolved sophisticated (albeit not universally successful) mechanisms for eliminating or suppressing self-reactive immune responses. There is evidence that selection can positively drive the co-production of even structurally unrelated secondary metabolites, by means of synergy between the compounds at the functional level. Thus, chemically unrelated clavulanic acid (an inhibitor of β-lactamase enzymes which break down penicillins and related antibiotics) and β-lactam (penicillin-family) antibiotics themselves are co-produced with a frequency and pattern highly suggestive of strong selective pressures rather than chance.

In a complex environment, the dynamics of interactions between mutually competing organisms will also result in complex selective pressures on secondary metabolite production, with the potential for ‘arms races’ between offensive strategies and defensive countermeasures. An example of the latter can be found in adaptations of herbivorous insects for circumventing the action of repellent or toxic plant metabolites. In microbial populations, acquisition of resistance to counteract the effects of toxic secondary metabolites can be via mutation of target proteins themselves or through the modifying activity of another gene product, acquired through evolution ‘from scratch’ or (more commonly) lateral gene transfer from a different organism. Such resistance genes can operate in a variety of ways, including direct destruction or chemical modification of a secondary metabolite to render it inactive (as with the β-lactamases we noted above), prevention of the uptake of metabolites into cells or active export of them, or modification of a cellular target of the toxic metabolite. By way of example of the latter, the gene protecting erythromycin-synthesizing bacteria from self-toxicity will also confer erythromycin resistance if expressed alone in otherwise sensitive organisms.

We noted above the comparison of defensive / offensive secondary metabolite production with an ‘immune system’. Generation of ‘response diversity’ by degenerate syntheses of metabolites with promiscuous enzymes has some parallels with innate immune systems of invertebrates, although not with adaptive immune responses which have elaborate systems for selecting, amplifying and fine-tuning molecular solutions for recognition of a foreign antigen. A specific secondary metabolite can only be fine-tuned towards optimal target recognition by repeated rounds of selection for organisms with improved fitness as a result of suppressing competitors or predators. Where inhibition of a specific single target confers a strong fitness advantage, successive generations will be selected towards production of compounds with increasingly improved target-interactive properties. As we have seen, however, in complex environments selection may also tend to favor production of a wider range of structurally-related compounds.

The Molecular Hardware of Ecology

Since we have noted that natural product therapeutic molecules generally fall within the category of bioproducts with an ecologically-based function, a different and more general term than secondary metabolites would be useful. ‘Ecobiomolecules’ is a reasonably self-descriptive word (an abbreviation of ‘ecological biomolecules’) which is defined in this context as any biological molecule whose function involves direct interaction with molecules deriving from an organism’s environment. This term therefore encompasses all secondary metabolites with known functions, but also many more biomolecules beyond the generally-accepted secondary metabolite scope. (Here we use the definition of ‘ecology’ as the interactions of an organism with other organisms and the environment in general).

If we assume that all secondary metabolites originate from various environmental selective pressures, then it is automatically true that all secondary metabolites are ecobiomolecules, but not vice versa. (It could be pointed out that all biomolecules (or more accurately, the genes that directly encode them, or genes which encode the enzymes which synthesize them) can be considered as molded by the environment over time in response to selective pressures, to maximize the fitness of the lineage of organisms in which they are expressed. So when referring to ecobiomolecules, we can further restrict the definition to biological molecules whose major normal function involves an interaction with a physical or chemical environmental factor.) Within the total ecobiomolecular set, we can find molecules whose function is concerned with sexual or other forms of communication (such as pheromones or molecules mediating the above-mentioned phenomenon of quorum-sensing), defense (plant products selected to reduce herbivore activity; anti-bacterial products; sprayed or ejected chemical deterrence as with skunks, etc.), or offense (various animal toxins). A vast array of arthropod pheromones and defense / offense toxins are included here. Indeed, the success of arthropods among all animal groups (‘phyletic dominance’) has been attributed at least in part to their virtuosity in the production of such molecules, with associated improved levels of survival. In some cases, rather than synthesizing useful compounds themselves, insects can acquire a useful property (such as acquired toxicity towards predators) from ingesting host plant material. This phenomenon of ‘chemical sequestration’ involves the evolution of specific transport systems for host plant toxins.

We can see that an organism which possesses a toxic phenotype, whether through an endogenous toxin or an actively acquired one, gains a survival advantage if the presence of the toxin is a deterrent to predators. As a modulator of the interaction between two or more organisms, the associated toxin can thus be termed an ecobiomolecule, and its presence confers an ‘immunity’ of sorts. But what of the intricate immune systems of higher organisms? It would seem that we must technically also classify all components of both the innate and adaptive vertebrate immune systems as ecobiomolecules, as they have evolved to cope with the environmental pressures of a wide variety of pathogenic organisms, and are not intrinsically required for growth and metabolic survival. From this stance, the size of any participating molecule is irrelevant (antibodies, for example, are large multimeric proteins). It should be noted, though, that some functions of immune systems are concerned with internal homeostasis (such as surveillance against tumors), so not all aspects of immune systems can be viewed as ecologically based. But antibodies in particular have been extremely useful to humans in both basic research and therapeutic applications, and we therefore should group them in the broad category of ecobiomolecules which also includes small molecule secondary metabolites.

To conclude with another glance at chemical space, through a biopoly-verse lens:

Consider a figurative notion:

The Earth as a molecular ocean

– Yet still but a trace

Of chemical space

Whose vastness might inspire emotion.

…But we’ll see in subsequent posts that as far as the human usefulness of natural product molecules on Earth is concerned, it is very definitely a case of quality over quantity (even though the quantity in practical terms is still a huge and largely untapped resource, whatever its size with respect to the general potential of chemical space).

References & Details

(In order of citation. In most cases, only sample references are provided from many existing in the field.)

‘…….modeled as ‘spaces’…… and ‘chemical space’ for small molecules is one such conception.’    See Lipinski & Hopkins 2004; Rojas-Ruiz et al. 2011. Estimates of the number of chemically possible members of small-molecule chemical space have been made, with figures of up to 10^200 compounds postulated (See Fink et al. 2005).

‘…….the largest known naturally-made ‘small’ molecules…’    Currently the marine bioproduct maitotoxin (molecular weight 3422) holds this record. (See Nicolaou et al.  2008). Such molecules are ‘non-alphabetic’ in the sense that they are sequentially built by a series of enzymatic reactions, and are not macromolecules such as proteins or nucleic acids composed of specific sequences of distinct subunit ‘alphabets’. Neither are they long-chain carbohydrates built by enzymatic linking of repetitive monosaccharides.

‘……ecological biochemistry..’    See Bell 2001; Attardo & Sartori 2003.

‘…..the term ‘secondary metabolites’ (as distinct from the products of ‘primary’ metabolism) does not meet with universal approbation….’    See Firn & Jones 2009;  ‘……and has ‘no simple one-line definition’…’    See Challis & Hopwood 2003.

‘…….they [secondary metabolites] are found in very variable levels between different taxonomic groups…’    See Wink 2003.

‘………functions of secondary metabolites have been controversial, but they have usually been presumed to provide a selective advantage under some conditions….’    See Cavalier-Smith 1992; Maplestone et al., 1992.

‘………gene homologies between different organisms suggest common origins for some biosynthetic enzymes, probably by gene transfer.’    See Vining 1992.

‘…..Gene duplication and divergence is acknowledged as a probable important driver of diversity in the enzymatic machinery of secondary metabolite production.’    See Cavalier-Smith 1992; Kliebenstein et al. 2001.

‘….Genes encoding enzymes which control the production of antibiotic secondary metabolites tend to cluster and to be selected as a group…’    See Maplestone et al., 1992; Challis & Hopwood 2003.

‘…….biosynthesis of the antibiotic erythromycin….’    See Donadio et al., 1991.

‘……secondary metabolite production has also been compared ….with an immune system against competitors…’    See Firn & Jones 2003.

‘……..metabolome / genome size discrepancy has been attributed to……‘catalytic promiscuity’…’    See Schwab 2003; Bornscheuer & Kazlauskas 2004.

‘…..provision of alternative precursor substrates or alteration of culture conditions can induce dramatic changes in the types of product…. a useful route towards artificial end-product modification…’    See Bode et al. 2002.

‘An interesting hypothesis accounts for the diversity of apparently sub-optimal metabolites….’    See Firn & Jones 2003.

‘Only a quite small component of all soil organisms appear to be amenable to in vitro cultivation at present.’    This has been partially overcome by metagenomic techniques, discussed in a previous post’s References & Details.

‘…….more ‘ecologically relevant’ assays…’    See Engel et al. 2002.

‘……..the importance of bacterial ‘quorum sensing’ has been increasingly appreciated….’    See West et al. 2006; Ng & Bassler 2009.

‘Chemical ‘cross-talk’ …….even between bacteria and eukaryotic organisms……’     See Williams 2007.

‘…….synergism between different bacterial species in the production of secondary metabolites has been documented.’    See Angell et al. 2006.

‘……some snake species appear to cope with their own venom production in this manner….’     See Smith et al. 2000.

‘There is evidence that selection can positively drive the co-production of even structurally unrelated secondary metabolites….’    See Challis & Hopwood 2003.

‘…..adaptations of herbivorous insects for circumventing the action of repellent or toxic plant metabolites…’    See Wittstock et al. 2004.

‘……the gene protecting erythromycin-synthesizing bacteria from self-toxicity will also confer erythromycin resistance if expressed alone in otherwise sensitive organisms…’    See Teuber et al. 2003.

‘……the success of arthropods …….has been attributed at least in part to their virtuosity in the production of such molecules…’    See Meinwald & Eisner 1995.

‘…….insects can acquire a useful property …..from ingesting host plant material….’ See  Termonia et al. 2002.

‘…..phenomenon of ‘chemical sequestration’ involves the evolution of specific transport systems for host plant toxins……’     See Kuhn et al. 2004.
