Recombinomics | Elegant Evolution


H5N1 Random Mutations Are Not Random Or Recent Mutations 

Recombinomics Commentary

March 21, 2006

As the number of bird flu sequences increases, it becomes more apparent that the genetic drift of H5N1 is not due to random mutations, but is due to recombination. The widely held belief that the drifting is due to copying errors becomes less and less tenable as new sequences are made public.

Much attention has been focused on the Qinghai strain of H5N1 because it has become resident in long range migratory birds and is rapidly spreading worldwide.  Most recent confirmed cases are in Europe, the Middle East, and Africa, but the same strain is likely involved in Asian countries such as India, Pakistan, and Afghanistan also.

The Qinghai strain was initially isolated at the Qinghai Lake nature reserve in central China. The strain was readily distinguished from other H5N1 isolates from Asia. Its HA cleavage site was GERRRKKR instead of the more common RERRRKKR found in Asia. In addition, the PB2 polymorphism E627K was in all 16 isolates. Prior to Qinghai Lake, this change had never been reported in H5N1 isolated from birds. It was present in all human H1, H2, and H3 isolates and in H5N1 from mammals such as humans, wild cats, or the brains of experimental mice, but not in H5N1 from birds. Since Qinghai Lake, all H5N1 Qinghai strain PB2 sequences containing the 627 position have had the E627K polymorphism. Moreover, all of the subsequent Qinghai strain isolates are over 98% homologous to the original Qinghai Lake isolates and possess a number of Qinghai-specific polymorphisms in all 8 gene sequences.

Because of the heightened interest in, and rapid spread of, the H5N1 Qinghai strain, a number of sequences have been made publicly available, including sequences from Russia, Mongolia, Nigeria, Italy, France, Iraq, Iran, Turkey, and Denmark. All are greater than 98% homologous to the Qinghai strain and have the expected characteristics described above. However, each has a series of small changes, which are collectively called genetic drift and have been characterized as random mutations.

The vast majority of the small changes are not new mutations. These changes are readily found in the flu database. Moreover, they are not random. Most are from other recent H5N1 sequences from Asia and provide a roadmap of where the Qinghai strain isolate had been previously. Some of these polymorphisms trace back to other serotypes in North America from the 1970s and 1980s. This tracing is facilitated by the recent release of these sequences from isolates collected 25-30 years ago. Some polymorphisms trace back to the earliest sequence from 1902. Most of these historical tracks are single nucleotide polymorphisms, but their distribution is far from random, and their presence in earlier isolates shows that the changes are not recent mutations. Although the copying of genetic information does produce errors, these errors are only rarely stably incorporated into circulating influenza.

The vast majority of the changes are easily found and mapped, and are due to acquisition by recombination. Recombination happens when two distinct viruses infect the same host and part of one gene is copied, followed by part of the homologous gene from the other virus, producing a gene with genetic information from both parents. Recombination is distinguished from reassortment, which also requires a dual infection, but in which the newly emerging virus has a mixture of complete genes from each parent. In a reassortant, the individual genes do not have new sequences but are exact matches of one of the two parental genes. Dual infections can produce both reassortment and recombination, and the two processes are not mutually exclusive.
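The distinction between the two processes can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions: genes are short aligned strings of equal length, a single crossover point is used, and the function names reassort and recombine are hypothetical, not taken from any sequence analysis package.

```python
# Toy model: a genome maps each of the 8 influenza A segments to a sequence.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def reassort(parent_a, parent_b, from_b):
    """Reassortment: whole segments listed in from_b come from parent_b,
    all other segments come unchanged from parent_a."""
    return {
        seg: (parent_b[seg] if seg in from_b else parent_a[seg])
        for seg in SEGMENTS
    }


def recombine(gene_a, gene_b, crossover):
    """Recombination: copy gene_a up to the crossover point, then continue
    with the homologous gene_b, giving one gene with two parents."""
    if len(gene_a) != len(gene_b):
        raise ValueError("toy model assumes aligned genes of equal length")
    return gene_a[:crossover] + gene_b[crossover:]


# Two hypothetical parental viruses in a dual infection.
parent_a = {seg: "A" * 12 for seg in SEGMENTS}
parent_b = {seg: "B" * 12 for seg in SEGMENTS}

# Reassortant: PB1 is an exact copy of parent B's gene, the rest match parent A.
reassortant = reassort(parent_a, parent_b, from_b={"PB1"})

# Recombinant PB2: a new sequence that matches neither parent over its full length.
recombinant_pb2 = recombine(parent_a["PB2"], parent_b["PB2"], crossover=5)

print(reassortant["PB1"])   # whole gene from parent B
print(recombinant_pb2)      # first 5 positions from A, remainder from B
```

In this model every gene of a reassortant is an exact match of one parental gene, while a recombinant gene carries stretches of exact identity to each parent, which is the pattern described for the swine isolates below.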

Recent H1 swine sequences from Canada are good examples of isolates that have both recombined and reassorted. The recent H1N1 and H1N2 isolates have a human PB1 gene, demonstrating that each virus is a reassortant. Of the remaining 7 genes, either all are avian or five of the seven are avian (the other two, H and N, are also human in a couple of the isolates).

Seven of the recent isolates contain 5 or 7 avian genes, and all 7 isolates have avian PB2 and PA genes. In each case, both genes are recombinants, containing significant portions of gene sequences found in earlier isolates. Five of the 7 PB2 genes contain sequences from an H1N1 swine isolate from Tennessee collected in 1977. Similarly, 6 of the 7 recent isolates also have PA sequences from the same 1977 isolate from Tennessee. The Tennessee sequences are striking because they cover a large portion of the 1977 gene and are exact matches. Thus, these portions of the 1977 gene were copied with absolute fidelity for over 26 years and are unchanged in the 2003 and 2004 Canadian swine isolates.
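One way to picture how such exact-match portions are located is to scan two aligned sequences for unbroken runs of identity. The sketch below is a minimal illustration, assuming the two sequences are already aligned and of equal length; the function name identity_runs and the short example sequences are hypothetical and are not real influenza data.

```python
def identity_runs(seq_a, seq_b, min_len=1):
    """Return (start, end) pairs, 1-based and inclusive, for every unbroken
    run of identical positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    runs = []
    run_start = None
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b:
            if run_start is None:
                run_start = i          # a new run of identity begins here
        elif run_start is not None:
            if i - run_start >= min_len:
                runs.append((run_start + 1, i))
            run_start = None           # a mismatch ends the current run
    # Close a run that extends to the end of the sequences.
    if run_start is not None and len(seq_a) - run_start >= min_len:
        runs.append((run_start + 1, len(seq_a)))
    return runs


# Made-up sequences standing in for an older and a more recent gene.
older = "ACGTACGTACGTACGT"
recent = "TCGTACGTACCTACGA"

runs = identity_runs(older, recent, min_len=3)
print(runs)
```

Applied to real aligned genes, runs of this kind would correspond to regions of exact identity with an earlier isolate.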

Past arguments that used differential evolution to explain minimal changes in portions of genes do not explain the recent data, because the matches are identical and cover a number of different regions. In the PB2 gene, all five sequences contain a core region of identity at positions 1008-1326. Thus, one argument would hold that this region was important, and that even a single change would make the gene less fit and would be selected against. Therefore those changes would never be detected in circulating virus. However, only two of the recent isolates have 1977 identity limited to this region. A third isolate has 1977 identity over positions 768-1354. Thus, there is identity on both sides of the essential core sequence. This additional information was not essential to the first two isolates, but was still copied with absolute fidelity in the third isolate. This absolutely accurate copying of "non-essential" regions is even more obvious in the remaining two isolates. One has identity over positions 274-1880, while the other has identity over positions 274-1931. Thus, each of these isolates has over 500 bp of identity on either side of the core sequence. Although these more than 1000 bp were not retained in two of the isolates, the sequences are exact matches with the 1977 isolate, indicating that the non-essential regions were copied with absolute fidelity for over 26 years.
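The core region and the flanking lengths quoted above follow from simple interval arithmetic. The sketch below uses the PB2 positions given in the text; it assumes the two isolates described as "limited to this region" match exactly 1008-1326, and the function names core_region and flanks are hypothetical.

```python
# Regions of 1977 identity in the five recent PB2 sequences (start, end).
# The first two are assumed to span exactly the core region.
pb2_regions = [
    (1008, 1326),
    (1008, 1326),
    (768, 1354),
    (274, 1880),
    (274, 1931),
]


def core_region(regions):
    """Intersection of all regions: the stretch shared by every isolate."""
    start = max(s for s, _ in regions)
    end = min(e for _, e in regions)
    return (start, end) if start <= end else None


def flanks(region, core):
    """Length of identity to the left and right of the core region."""
    return core[0] - region[0], region[1] - core[1]


core = core_region(pb2_regions)
print("core:", core)
for region in pb2_regions:
    left, right = flanks(region, core)
    print(region, "left flank:", left, "right flank:", right)
```

For the two longest regions the flanks come to 734 and 554 positions, and 734 and 605 positions, consistent with over 500 bp of identity on either side of the core.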

The same type of conservation of sequences from the same 1977 isolate is present in the PA gene. For this gene, 6 of the 7 recent isolates have large portions of the 1977 PA gene which are exact matches. All six sequences overlap over a core region from position 992-1319, where all six are exact matches of the 1977 sequence. However, all 6 of the recent sequences have regions of identity that extend beyond this core region, so each of these extended regions was not essential for at least one of the recent isolates. Yet almost all of the 1977 gene, from position 25-2016, is an exact match in one or more of the recent isolates. Thus, once again, large portions of the 1977 sequence are copied with absolute fidelity in one or more recent isolates.

These data for multiple regions of two genes from the 1977 swine isolate from Tennessee indicate that the gene copying mechanism can replicate these genes for over 26 years, with circulating viruses still containing exact matches of the earlier sequence.

In contrast to this level of fidelity, circulating influenza isolates routinely have a number of sequence differences that accumulate over far shorter time periods. Thus, each of the Qinghai strain isolates has a discrete number of changes that distinguish one isolate from another. However, the vast majority of these changes are easily identified and traced in the influenza database because they are acquired by recombination and not random mutation.

© 2006 Recombinomics.  All rights reserved.