Paradigm Shift Intervention Monitoring
Coronavirus Sequence Confusion
The above comment from a recent ECDC summary of the novel betacoronavirus cases helps to explain why the ECDC put forward a risk assessment that cited an animal/environmental origin for the recent human cases. The ECDC also maintained that the absence of mild cases was due to the lack of human-to-human (H2H) transmission. Recent results have caused the ECDC to modify its position on H2H transmission frequencies, as well as its risk assessment, but the above comment indicates that the ECDC's understanding of the sequence data is still fatally flawed.
The human betacoronavirus sequences are MOST closely related to bat sequences, but they are not closely related. Therefore, none of the known bat sequences are candidates for the source of a recent jump to humans or to an intermediate species that led to the human cases. The human sequences are closely related to each other, but no closely related bat sequence has been found, in spite of extensive surveys prompted by the SARS outbreak in 2002/2003 and by the recent outbreak of the novel group 2c coronavirus, which includes HCoV-EMC/2012.
In 2003 the SARS CoV dramatically spread in the human population and closely related sequences were found in a variety of species sold in live markets in Hong Kong. Although the sequences in the animals (most notably civet cats) were virtually identical to the human sequences, the presence of the same sequences in multiple species suggested the animal infections were due to species jumping associated with the propagating or housing of the exotic animals in live markets. Attempts to find SARS CoV in these species in the wild failed, leading to a significant effort to find a natural reservoir for SARS CoV.
A wide variety of coronaviruses were identified in bats, including bat SARS CoV, which had an identity of 98% with human and exotic animal sequences. Since the viruses were found in bat guano, it was technically easy to collect a very large number of samples, which could then be screened with a panel of PCR primers designed to detect all known coronaviruses. This effort identified bat sequences that were closely related to human coronaviruses, as well as novel sequences not previously identified in humans. Among the betacoronaviruses (group 2), a human cold virus, OC43, had been isolated in the 1960s. It was distinct from SARS CoV, which also mapped to group 2, so the OC43-like sequences were designated group 2a and SARS CoV was designated group 2b. The additional bat sequences that were in group 2 but distinct from 2a and 2b were classified as members of 2c and 2d. Thus, the bat surveillance created multiple coronavirus sub-groups, including those not represented by human isolates or isolates from any other species.
Thus, when the first novel human CoV sequence was generated from a fatal case in Saudi Arabia, it was distinct from all known human coronaviruses and was most closely related to the bat 2c sequences (because they were the only known 2c sequences, due to the extended animal survey which targeted bat samples). The full 2c sequences were from two species of bats found in Guangdong Province. However, there were regions of the bat sequences with virtually no identity with the human sequences, and even when the analysis was limited to the more closely related regions (which covered about 90% of the genome), the identity between the human and bat sequences was less than 90%. That is well below the 98% identity between the human and bat SARS sequences, which covered 100% of the sequence.
A more recent survey, using probes more targeted toward group 2c sequences, screened previously collected bat samples from Europe and Africa. The survey identified additional group 2c sequences, but the closest match was an isolate from the Netherlands. That match covered only a short conserved region and produced an identity of 91%, which again would not support a recent jump from bats to humans, even if the jump involved an intermediate source, as was seen for SARS CoV.
The differences between the bat sequences and the human sequences can be most easily seen in the public sequences from four confirmed cases. When the first sequence was generated by EMC for the fatal case (60M) in Saudi Arabia, the closest matches among full sequences were the bat sequences from Guangdong Province (the HKU-4 and HKU-5 series). The universal set of primers was used to confirm the Qatari case (49M), who developed symptoms while performing Umrah in Saudi Arabia and was diagnosed by the HPA in the UK. The 206 BP insert (the sequence between the primers) matched the EMC sequence (Human betacoronavirus 2c EMC/2012) at 205 of the 206 positions, an identity of >99.5%. In contrast, the bat sequences had 35 differences (83% identity), dramatically more than the single difference between the two human sequences. Moreover, the full sequence from the Qatari case (over 30,000 positions), Betacoronavirus England 1, was also 99.5% identical to the EMC sequence; the differences were a series of single nucleotide changes and a 6 BP deletion in the Qatari sequence.
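The percentages quoted for the 206 BP insert follow directly from the mismatch counts. A minimal sketch of that arithmetic, assuming identity is simply matching positions divided by compared positions (the function and variable names are illustrative, not from any published pipeline):

```python
def percent_identity(matches: int, length: int) -> float:
    """Return the percentage of compared positions that are identical."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= matches <= length:
        raise ValueError("matches must be between 0 and length")
    return 100.0 * matches / length


# Length of the insert between the universal primers, as given in the text.
INSERT_LENGTH = 206

# HPA (Qatari case) vs EMC: 1 difference across the insert.
human_pair = percent_identity(INSERT_LENGTH - 1, INSERT_LENGTH)

# Bat sequences vs the human sequence: 35 differences across the insert.
bat_vs_human = percent_identity(INSERT_LENGTH - 35, INSERT_LENGTH)

print(f"human vs human: {human_pair:.2f}%")   # 99.51%
print(f"bat vs human:   {bat_vs_human:.2f}%")  # 83.01%
```

Over a window this short, each additional mismatch lowers the identity by roughly half a percentage point, so the gap between 1 and 35 differences is not a rounding effect.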
Two partial sequences from the Riyadh gym teacher (45M) were published in the paper describing that case. One sequence represented positions 18105-18414, and all 310 positions exactly matched the EMC sequence. The HPA sequence was subsequently released, and it was also identical. The same result was seen for the second sequence (positions 27278-27686), so the combined set of sequences, which represented 719 positions, was identical in all three human cases (while the best bat match was 83% for the first region and 78% for the second region).
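The position counts above can be checked from the coordinates, provided both end positions are counted (inclusive ranges). A short sketch under that assumption; the region labels are hypothetical names used only for this illustration:

```python
def region_length(start: int, end: int) -> int:
    """Number of positions in a range where both start and end are included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of the two published partial sequences, as quoted in the text.
regions = {
    "first region": (18105, 18414),
    "second region": (27278, 27686),
}

lengths = {name: region_length(start, end) for name, (start, end) in regions.items()}
total_positions = sum(lengths.values())

for name, length in lengths.items():
    print(f"{name}: {length} positions")
print(f"combined: {total_positions} positions")  # 310 + 409 = 719
```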
The same level of identity was seen for the case from Qatar (45M) who was diagnosed by RKI in Germany. Sequences from two regions in that isolate (designated as the Essen sequence) have also been published. One region, designated RdRp, covers positions 15073-15254. The Essen sequence matched the HPA sequence at all 182 positions and differed from the EMC sequence at one position. In contrast, the closest match from the HKU5 series was 82%, while the best match from Europe (Bat coronavirus BtCoV/8-724/Pip_pyg/ROU/2009 from Romania) was 89%. A second region (positions 29598-29838) did not have a match in any of the bat sequences. However, the Essen sequence matched the EMC sequence at 240 of the 241 positions. The mismatched position in the EMC sequence did match in the HPA sequence, but the Essen sequence did not have the 6 BP deletion.
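Because each published Essen region is short, a single mismatch moves the per-region identity noticeably, so it can help to pool the regions before comparing with the full-genome figure. A rough sketch of that pooling for the Essen vs EMC comparison, using only the counts quoted above (the data layout and names are assumptions made for illustration):

```python
# (matching positions, compared positions) for Essen vs EMC, from the text.
essen_vs_emc = {
    "RdRp region (15073-15254)": (181, 182),
    "second region (29598-29838)": (240, 241),
}

# Identity for each region on its own.
per_region = {
    name: 100.0 * matches / length
    for name, (matches, length) in essen_vs_emc.items()
}

# Pooled identity: sum matches and positions first, then divide once.
total_matches = sum(matches for matches, _ in essen_vs_emc.values())
total_length = sum(length for _, length in essen_vs_emc.values())
pooled_identity = 100.0 * total_matches / total_length

for name, identity in per_region.items():
    print(f"{name}: {identity:.2f}%")
print(f"pooled: {total_matches}/{total_length} = {pooled_identity:.2f}%")
```

Pooling weights each region by its length, which is why the sums are taken before dividing rather than averaging the two percentages.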
Thus, each of the four sets of sequences from the four human cases (from four different locations in two countries) had identities >99.5% with the other human sequences, but identities of only 78-89% with the bat sequences.
Moreover, the identities seen in the four human public sequences will be repeated for the other confirmed cases, because they were detected with PCR tests that included probes specific for the novel betacoronavirus. Thus, all of the human sequences are expected to be >99.5% identical to each other and significantly different from any known bat sequence.
There has been no data supporting an animal or environmental source for any of the confirmed cases, and as more sequences are made public, ECDC will have to further modify its risk assessment, which is based on a serious misconception regarding the sequence similarities between the human and bat group 2c sequences.