DNA: The Software of Life?
In 1651, Thomas Hobbes, inspired by the rapid development of mechanical devices in his age, wrote in Leviathan: "What is the Heart, but a Spring; and the Nerves, but so many Strings; and the Joynts, but so many Wheeles, giving motion to the whole body...?". René Descartes, in his Treatise of Man (1629), likewise analysed the human body explicitly as a mechanical device. For Hobbes, not only was the human body a mechanical device, but so was human society.
The microscope and biochemical research subsequently proved that the body was not so simple and that the analogy was poor – but this did not stop scientists in the early 20th century in turn from interpreting the human brain in terms of the latest technology, the telephone exchange. There is, and always has been, in scientific circles an irresistible urge to explain the functioning of everything in terms of the latest technology – and thereby to present life as capable of rational control by the developers of these technologies. And so now, in the age of computing and genetics, everywhere we see DNA being described as "the software of life". But is it?
AGTTCGTCAAGTGCAT... and 11101000110101100... The vast three billion base-pair sequences of Adenine, Cytosine, Guanine and Thymine that constitute the 23 chromosomes of the human genome are seductively similar to the equally vast binary sequences that lie at the heart of computer software. Richard Dawkins, for example, in The Blind Watchmaker (1986) says "There is very little difference, in principle, between a two-state binary information technology, like ours, and a four-state information technology like that of the living cell." The analogy is especially seductive in the world of bioinformatics – where computing and genetics come together. If software is a sequence of instructions that controls the functioning of a computer then surely gene sequences do exactly the same in living organisms? Surely each gene will have a single specific function within an ordered program of routines and subroutines? And surely decoding these genes will allow us to completely understand and redesign living organisms? So the genetic determinist of the Dawkins school would have us believe.
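Dawkins' point about two-state and four-state information technologies can be made concrete in a few lines of code: each of the four bases carries exactly two bits of information, so any DNA sequence can be re-encoded losslessly as binary. The sketch below is purely illustrative – the particular base-to-bit assignment is arbitrary, not a biological convention.

```python
# Toy illustration of the "four-state vs two-state" comparison: each of
# the four DNA bases carries two bits, so any sequence maps losslessly
# to binary and back. The base-to-bit assignment here is arbitrary.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}

def dna_to_bits(seq):
    """Encode a DNA sequence as a binary string, two bits per base."""
    return "".join(BASE_TO_BITS[base] for base in seq.upper())

def bits_to_dna(bits):
    """Decode a binary string (even length) back into a DNA sequence."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

print(dna_to_bits("AGTTCGTC"))               # 16 bits for 8 bases
print(bits_to_dna(dna_to_bits("AGTTCGTC")))  # round-trips to the original
```

The two alphabets are informationally interchangeable – which is exactly why the analogy is so seductive, and why its limits lie elsewhere.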
The Lego brick model
The genetic determinist, in his zeal to oversimplify reality, often makes assumptions that cannot be justified even by this questionable analogy. For example, genetic engineers generally assume that each gene has one specific function and that when it is transferred into another organism it will perform the same function. This is the Lego brick model of genes. A programmer might be more cautious about claiming that a section of code from one subroutine would have the same effect if inserted into another subroutine, built for a different purpose, with different parameters, or in a different language. And if the function of the code had been identified vaguely as "file management" then the programmer would simply laugh at the sloppiness of the logic. We might as well suggest that a phrase could be taken from one play and be inserted in another one and have the same effect – it might, but it might well not, and the second play might not even have been written in the same dialect or language.
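The programmer's caution can be shown with a deliberately trivial sketch (the routines and values are invented for illustration): the very same statement behaves differently depending on the routine it is pasted into.

```python
# The identical line "result = a + b" appears in both routines, but its
# effect depends entirely on the surrounding context: numeric addition
# in one routine, string concatenation in the other.
def add_numbers(a=2, b=3):
    result = a + b      # numeric addition: 2 + 3 gives 5
    return result

def join_strings(a="2", b="3"):
    result = a + b      # string concatenation: "2" + "3" gives "23"
    return result

print(add_numbers())    # 5
print(join_strings())   # 23
```

The line of code is the same "Lego brick" in both places, yet its product is not – the context supplies half the meaning.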
As genes help construct specific proteins we can assume that they might still do this if inserted into a chromosome of a sufficiently closely related organism, yet there are many genetic engineers who will simply assume that a gene for a vague trait such as "frost resistance" may simply be transferred from an animal to a plant and function without adverse or unpredictable consequences. I remember attending a Linnean Society lecture by a past Director of London's Science Museum, at which he asserted that the transfer of a gene from any species to any other one would result in the transfer of the desired trait. I challenged this, but only when he was specifically asked whether a gene affecting "intelligence" in humans would have the same effect in cabbages did he admit that the process was not as simple as he was claiming. The gene may have produced a protein that had an indirect influence upon brain functions in humans, but how that protein might interact with the biochemical environment of an entirely different species is anyone's guess.
Sometimes such gene transfers may lead to the desired result, but often they do not – and the scientists involved would be overstating their comprehension if they claimed to understand exactly why it works in some cases and not in others. A classic example of this was reported in 1990. Petunias, genetically modified with an additional "red" gene, failed to display uniform red flowers. Half ended up white or changed colour during their development in response to stress or temperature changes. The phenomenon, known as "gene silencing", is now more clearly understood, but other unpredicted side-effects keep cropping up.
The gene shortage
In 1999 research in the USA revealed that soya beans with an added gene for herbicide resistance (it expresses an enzyme that breaks down glyphosate) suffered from severe stem splitting. They were found to have 20% more lignin than normal in their stems, making them brittle, even though the inserted gene had no known link to lignin formation. The death of gene therapy volunteer Jesse Gelsinger in 1999, as a result of immune reaction to the protein his body had been reprogrammed to create, was a massive setback to the development of the technology. Bacteria genetically modified to produce Showa Denko's nutritional supplement, L-tryptophan, unexpectedly produced a modified version of the molecule – leading in the 1980s to the deaths of around 100 people and the crippling of between 5,000 and 10,000 more (Newsday, 14/08/1990; Smith, J.M., 2003). The unpredicted consequences of treating genes as transferable Lego bricks are rarely as serious as this but they are very common.
Genes express proteins (indeed a DNA sequence is only called a gene if it does this) and it would be neat and tidy if each gene produced one corresponding protein – but it's not as simple as that. One puzzle when the human genome was mapped in February 2001 was that there seemed to be nothing like enough genes to enable the formation of humans. There were only about 27,000 genes [plus or minus 10,000] in the human genome – fewer than in rice plants – yet the human body contains over 200,000 proteins during its life. Some 98 per cent of the three billion base-pair sequence contained no genes, as defined above, and so was classified as "junk DNA".
We now know that genes can directly or indirectly produce many different proteins – they will help produce different proteins at different life stages or different proteins in response to different stresses or different biochemical environments in the cell. If such a gene has been inserted into a different species to produce one of these proteins how are we to know that it will not produce another one, possibly an unknown one, instead?
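This one-gene-many-products behaviour can be pictured as a toy model of context-dependent splicing. Every exon name, product and splice pattern below is invented purely for illustration; real gene expression is vastly more complex.

```python
# Toy model (all names invented): one "gene" is a set of exons, and the
# cell joins different subsets of them together depending on context,
# yielding different products from the same underlying sequence.
EXONS = {"e1": "MET", "e2": "ALA", "e3": "GLY", "e4": "SER"}

SPLICE_PATTERNS = {
    "embryonic": ["e1", "e2", "e4"],
    "adult":     ["e1", "e3", "e4"],
    "stressed":  ["e1", "e2", "e3", "e4"],
}

def splice(context):
    """Return the product of the gene for a given cellular context."""
    return "-".join(EXONS[exon] for exon in SPLICE_PATTERNS[context])

for context in SPLICE_PATTERNS:
    print(context, splice(context))
# The same gene yields three distinct products in three contexts.
```

Move such a gene into a species whose cells supply none of these contexts and there is no telling which product, if any, will result.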
Scientists are discovering, unsurprisingly, that junk DNA is not nearly as inactive as it had been assumed to be. It clearly influences the behaviour of cells in various ways, some of them now understood, but if this discovery is to be incorporated into genetic theory then genes will have to be much more flexibly defined or else abandoned as the basic "building block" of life. John S. Mattick, Director of the Institute for Molecular Bioscience in Queensland, Australia, says: "I think this will come to be a classic story of orthodoxy derailing objective analysis of the facts, in this case for a quarter of a century." (Gibbs, 2003)
Does not compute
Rarely does a week pass without the supposed identification of a "cancer gene", a "gay gene", an "obesity" gene or an "intelligence gene". Unsurprisingly, these claims are often based upon the loosest statistical correlation (often under 10 per cent) and usually they are unsubstantiated by any biochemical explanation. Alarmingly, employers want to use this information to select employees and the health insurance industry want to use it to exclude "high risk" policyholders, but with the present state of knowledge about human genes they might be better off relying on palm reading and graphology.
So genes and their actions are not as predictable as the genetic determinists would have us believe, but can we not still think of them as software? The most fundamental difference between genomes and software becomes apparent when one thinks about how each of them came into being. Software has designers. If it is open source software it may have been written and improved repeatedly by a large number of programmers – but it is still, hopefully, the product of intelligent minds. Any software worthy of the name will have a clear logical structure – it will be divisible into routines and sub-routines, all of which will have clear and distinct structures and functions.
Genomes do not have any designers – a fact that obviously has massive implications. Genomes exist purely because they work; they don't give a damn about being tidy or easily comprehended. As long as it has no negative implications for the survival of a species, a genome will happily contain several different genes with similar or identical functions – or a single gene that has several unrelated functions – or a group of interacting genes that only function in cooperation. Genes that in the distant evolutionary past had one function may now have a different function or none at all.
According to genetic orthodoxy the DNA is the source code of life and the cell is its product. In the computing world this is clearly so â€“ we write the program and it then carries out procedures. Programs do not, however, build computers or replicate themselves. Living self-replicating organisms, though, introduce a chicken-and-egg question: Did the DNA create the cell or did the cell create the DNA? All the DNA that we have access to has been created in living cells. We have no 'primordial DNA'. DNA on its own is an incredibly inert substance that could create nothing, so if, hypothetically, there was some kind of 'primordial DNA' that created the first cell then it would be nothing like the DNA we know today. Even in theory we could not work out how to create a cell given only a complete strand of DNA. As the leading Harvard geneticist, Professor Richard Lewontin, says in his recent attack on genetic determinism, The Triple Helix (2000b), "If we had the complete DNA sequence of an organism and unlimited computational power, we could not compute the organism because the organism does not compute itself from its genes."
Mix and match?
In reality there was no 'primordial DNA'. The very first cell-like entities probably replicated without DNA. The first DNA could only have developed within such a cellular environment. This DNA would never have replicated by being ejected naked into the environment to get on with the job of building new cells. DNA is always passed on in the company of a whole interacting package of other biochemicals. Change these other biochemicals and the replication will not work. Take out all the DNA, and nothing else, and put it into the cell of an entirely different species and that cell will be incapable of following the 'instructions' of the alien DNA. The inescapable fact is that the unit of inheritance is the entire DNA in conjunction with the other cell contents. DNA and its cell co-evolved – inheritance is impossible with DNA alone.
Richard Dawkins, the high priest of genetic determinism, goes to great trouble in his books – such as The Blind Watchmaker (1990) and Climbing Mount Improbable (1997) – to show how highly advanced life forms (unlike watches and computers) can develop without the intervention of an intelligent designer. It's an argument that has surely long been accepted – everywhere outside Kansas. However, as his introductory quote above reveals, he fails to recognise the vast differences between genes and software and between machines and living organisms that result from the very different ways in which they have come into being. Seeking, quite legitimately, purely evolutionary explanations for the way we are, he and his colleagues have reduced life to a simplistic model of mix-and-match genetic instructions – a model that simply does not stand up to the rigours of reality.
How has this happened?
Thomas Kuhn, in his now classic book, The Structure of Scientific Revolutions (1962), describes the process by which major breakthroughs in science begin as heresies and end as superstitions. Normal science progresses by increments but eventually more and more facts may be observed that fail to fit in with the orthodox theory. The flat earth model cannot explain Eratosthenes' observation that the sun is higher in the sky in the south of Egypt than in the north; the earth-centred universe model cannot explain Galileo's observation that moons orbit Jupiter. The defenders of orthodoxy may well at first ignore the inconvenient observations and ridicule any scientists who stick their necks out. Eventually, however, if the inconvenient observations are sufficiently numerous and consistent, and if they can be explained by a refined scientific theory, then more and more scientists endorse it and the remaining advocates of the old theory are forgotten by history.
Often scientific revolutions, the abandonment of old theories and their replacement by improved ones, can be painfully slow – physicists remained attached to concepts of cosmic ether long after they should have been abandoned, chemists likewise clung to their imaginary phlogiston long after the theory had become a nightmare, and geologists resisted continental drift theory for decades despite overwhelming evidence for it. There are many reasons for this – reluctance to abandon the security of a tidy and simple theory, refusal to believe that a scientist from another discipline could be right, or fear of ridicule by peers are just a few.
There are additional reasons why simplistic theories of genetic determinism remain prominent way past their sell-by date. Not least is the worrying recent trend of scientists being funded by corporations rather than governments. Professor Richard Lewontin (2000a) notes that "No prominent biologist of my acquaintance is without a financial stake in the biotechnology business". When major corporations have spent billions on finding and patenting genes and when thousands of geneticists have earned their living identifying the supposedly unique functions of individual genes it is more than just status that motivates them to defend the old theory.
The biotech industry, however, is probably regretting the distortion of science that results from corporate funding. Bad science does not lead to profits and the US$100 billion over-optimistically poured into the 'biotech revolution' has so far yielded a catastrophically small return of around US$40 billion (Paul & Steinbrecher, 2003). Whole sectors of research, such as gene therapy and cloning, have provided virtually no return at all in response to massive investment.
How inheritance works
None of the problems with genetic determinism mean that bioinformatics should be abandoned. Information itself never harms, and there is no justification for refusing to find out more about how our genes work – or for refusing to publish this information. Those who have the greatest capacity to abuse genetic information, for example to create bioweapons, also have the greatest access to unpublished information. Making genomes open source will enable independent scientists to highlight the deliberate or accidental misinterpretation of their structure and function.
When a scientific theory is falling apart what is needed is more research not less. Indeed, as Kuhn notes, it is at such times of scientific revolution that the most interesting and most important research takes place. It will be necessary, however, to abandon the narrow-minded attempt to identify single functions for single genes and to adopt much more lateral thinking about how inheritance works and how DNA might be structured. Those involved in the next phase of bioinformatics will have to come to terms with the fact that much of the information about humans and other species is not within individually patentable genes but is more deeply encrypted and dispersed within DNA than had previously been assumed or, in many cases, may be elsewhere in the cytoplasm and not within the genome at all.
Junk DNA and epigenetic markers
The orthodox model of the gene recognises only sequences that express proteins as units of heredity. The "non-coding" sequences that constitute maybe 98% of the human genome have been dismissed as "junk DNA" – a relic of the genome's past history. The model does not explain why this junk has not gradually been lost during billions of years of evolution. Now transcriptional units are being identified within the 'junk' that produce RNA that then interacts with other RNA, DNA or proteins in the cell. Research on the mouse genome reveals that the number of such units may exceed the number of conventional genes.
Another 'layer' of information is being found not in the double-helix of the DNA itself, but in the matrix of proteins that tightly surround it. These 'epigenetic markers', which unlike the DNA can vary during the life of an organism, have been shown, for example, to affect the coat colour of otherwise genetically identical mice. (Gibbs, 2003)
Does commercial confidentiality prevent bioweapon development?
It has been argued that genetic data could be misused if published – for example to develop genetically enhanced bioweapons. We should not forget, however, that the USA, UK, Russia and Germany have already made such weapons for "purely defensive" purposes (www.sunshine-project.org/bwintro/gebw.html) using unpublished data. One such product was the US 'military grade' anthrax posted to politicians in 2001; others include bacteria genetically engineered to degrade enemy explosives. In the USA "biodefense" research is contracted out to private companies and institutions – such as the University of Texas. The cost of such research is enormous and, presumably, enormously profitable to some. Impoverished Afghans and Iraqi insurgents, in contrast, cannot afford to genetically engineer bacteria from scratch using gene sequences found on the internet, and if they chose instead to culture natural pathogens they could do so without knowing anything about their gene sequences. The nations that can afford to create genetically enhanced bioweapons have done so already or could do so any time by sequencing their own pathogen genomes or by buying existing data from biotech corporations. For more information visit the websites of The Sunshine Project and the Council for Responsible Genetics.