November 18th, 2020
Zhan, Deverman and Chan state that, “Our observations suggest that by the time SARS-CoV-2 was first detected in late 2019, it was already pre-adapted to human transmission to an extent similar to late epidemic SARS-CoV.” The analyses that form the basis for this conclusion are flawed.
Figure 1A depicts the early to late phases of SARS-CoV-1 emergence in humans. This phylogenetic tree and subsequent analyses are based on the false premise that SARS-CoV-1 evolution in humans during the early to late phases was monophyletic. The first eleven cases of SARS-CoV were geographically dispersed across a wide area from Foshan to Dongguan in the Pearl River Delta area of southern China and occurred over a 4-month period from November 2002 to March 2003 (1, 2). None of these eleven cases were epidemiologically linked. Therefore, it is extremely unlikely that SARS-CoV-1 from these human cases had a common viral ancestor from a human. Indeed, 7/11 of the early cases had documented contact with animals. Subsequent studies indicated that SARS-CoV-1 had already undergone considerable genetic diversification in animals before these first human cases were described (3-8). Genetically diverse SARS-CoV-1 has been found in civets and other animals from southern China as well as in civets from Hubei province, where the city of Wuhan is located. The high genetic diversity in the early phase of the outbreak thus reflects multiple independent spillovers of divergent SARS-CoV-1 from animals sold in the wildlife trade (1, 9). This is well established in the virology literature. The SARS-CoV-1 genetic diversity observed in the early phase of the outbreak is not, as the authors imply, due to rapid evolution during human-to-human transmission.
Contrary to the analyses in Zhan, Deverman and Chan, the early and mid-phases of the SARS-CoV-1 outbreak were polyphyletic, with multiple spillovers and limited expansion of several lineages that were not sustained by further human-to-human passages. This is similar to human infections by Lassa virus (LASV), a hemorrhagic fever virus that is endemic to West Africa (10). Yearly, there are thousands of independent spillovers of LASV from Mastomys natalensis and other rodent intermediates or reservoirs. In a similar manner to the early SARS-CoV-1 isolates, the high genetic diversity of LASV isolated from humans is accounted for by its diversity in its rodent reservoirs (11, 12). LASV transmission chains are limited in humans, and thus genetic changes occurring in humans are predominantly evolutionary dead ends, as were all but one of the early to mid-phase SARS-CoV-1 lineages.
Zhan, Deverman and Chan stated that, “we cannot ensure that the early-to-mid epidemic samples did not straddle deep splits in the tree.” In fact, it is abundantly clear that the SARS-CoV-1 isolates from early cases are indeed from different parts of the SARS-CoV-1 phylogenetic tree. That which the authors present as a caveat is the true explanation for the apparent high level of genetic diversity in the early stages of the SARS-CoV-1 outbreak. However, contrary to the authors’ flawed interpretations, the genetic diversity in SARS-CoV-1 was generated in animal intermediates and the undetermined SARS-CoV-1 reservoir prior to multiple spillovers to humans.
The late phase of the SARS-CoV-1 outbreak is characterized by extended monophyletic spread involving a variant of SARS-CoV-1 that had sustained a 27-nucleotide deletion in ORF8 (13). There is a known index patient, a physician from the HZS-2 Hospital of Guangzhou (14). After being exposed to SARS-CoV-1 at this hospital, he traveled to Hong Kong. During a one-night stay at Hotel ‘M’ in Hong Kong he transmitted SARS-CoV-1 to 16 other guests (15). These individuals seeded outbreaks in Hong Kong, Toronto, Singapore and Vietnam. SARS-CoV-1 continued to spread with minimal genetic variation, infecting over 8000 people in 25 countries on five continents and killing at least 774 people. The late phase of the SARS-CoV-1 outbreak is reminiscent of the spread of Ebola virus (EBOV), another hemorrhagic fever virus, during a typical outbreak (16, 17). EBOV is transmitted from human to human with higher efficiency than LASV. Once a spillover of EBOV occurs, the virus can spread over extended human-to-human transmission chains. The genetic diversity of EBOV is considerably less than that of LASV (11).
Their comparison of the evolutionary dynamics of SARS-CoV-1 and SARS-CoV-2 led the authors to conclude that SARS-CoV-2 was well adapted or pre-adapted to replicate in human cells, perhaps in a laboratory. Setting aside the naivety of this argument from a virological perspective, the perceived differences in genetic stability between SARS-CoV-1 and SARS-CoV-2 are in fact artifactual. The three-month interval used by Zhan, Deverman and Chan for calculations in Figures 1 and 2 is incorrect. The genetic diversification in animals noted above occurred over an undetermined time, but likely a period of years or longer, not months. Therefore, the substitution rates of SARS-CoV-1 and SARS-CoV-2 calculated in Figure 1C based on a 3-month interval and the analyses in Figure 2 are not correct. An extensive analysis utilizing a more complete dataset and taking into account progenitor viruses showed that the substitution rates of the SARS-CoV-1-like clade vary up to six-fold across the genome, with median rates between 4.0 × 10⁻⁴ and 1.9 × 10⁻³ (18). These substitution rates were slightly slower than those of the SARS-CoV-2-like clade, a finding at variance with this preprint.
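To make the dependence on the assumed interval concrete, here is a minimal sketch. It is not the preprint's calculation and not the analysis in reference 18. It assumes the simplest possible estimate, observed differences divided by (sites × elapsed time). The function name `apparent_rate` and all input numbers are hypothetical, chosen only to show that attributing diversity that accumulated over years to a 3-month window inflates the apparent rate.

```python
def apparent_rate(substitutions: int, sites: int, years: float) -> float:
    """Naive apparent rate in substitutions per site per year.

    Hypothetical helper for illustration only; real estimates use
    phylogenetic models, not a single division.
    """
    if sites <= 0 or years <= 0:
        raise ValueError("sites and years must be positive")
    return substitutions / (sites * years)


# Hypothetical inputs (not taken from the preprint or any dataset).
observed_substitutions = 30
genome_sites = 30_000

# Same observed diversity, two different assumptions about elapsed time.
rate_3_months = apparent_rate(observed_substitutions, genome_sites, 0.25)
rate_3_years = apparent_rate(observed_substitutions, genome_sites, 3.0)

# Ratio of the two estimates equals the ratio of the assumed intervals.
inflation = rate_3_months / rate_3_years

print(f"assuming 3 months: {rate_3_months:.2e} per site per year")
print(f"assuming 3 years:  {rate_3_years:.2e} per site per year")
print(f"inflation factor:  {inflation:.1f}")
```

Under this simple model the inflation factor depends only on the ratio of the two assumed intervals, which is why an incorrect time window propagates directly into the estimated rate.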
While the diversity of SARS-CoV-1 as it spread through various farmed and wild animals is known to be extensive based on available sequences (3-8), there remain gaps in this knowledge. The full extent of SARS-CoV-1 diversity in animals that existed at the time of the human spillovers in 2003-4 remains unknown. In this regard, the civet viruses SZ3 and SZ16 found at the Dongmen Market (3) are not the actual progenitors of "Outbreak 1" as depicted in Figure 5 of the preprint. The unknown diversity of SARS-CoV-1 in animals effectively precludes assigning any one of several known lineages of SARS-CoV-1 found across a broad geographic range in China to any of the human cases. Civets and other farmed animals in Hubei province are supplied to restaurants across China (19). There must be multiple viral progenitors for what is incorrectly referred to simply as Outbreak 1, which is a composite of multiple spillover events. The lone exception is the four cases with mild illness diagnosed between 16 December 2003 and 8 January 2004 that the authors of the preprint refer to as Outbreak 2. These cases have definitive epidemiological links to animals sold at the Xinyuan Live Animal Market (20).
The authors wrote that “… SARS-CoV-2 appeared without peer in late 2019, suggesting that there was a single introduction of the human-adapted form of the virus into the human population.” However, Pekar and colleagues (21) provide strong evidence that there were at least two introductions of SARS-CoV-2 into humans. Two lineages designated A and B emerged via the wildlife trade in Wuhan. As was the case with the genetic divergence of early SARS-CoV-1 isolates, diversification of SARS-CoV-2 into lineages A and B occurred in animals. It is pertinent to note that the genetic diversity of SARS-CoV-2 in intermediate animals in 2019 appears to have been considerably less than the diversity of SARS-CoV-1 in wildlife in 2002-3.
Since the preprint by Zhan, Deverman and Chan was posted in May 2020, it has become clear that SARS-CoV-2 is able to infect numerous other species besides humans (22). Besides this fact and the flaws discussed above, other flaws may help to explain in part why this preprint has, to date, failed to pass peer review. For example, the authors wrote, “The only site of notable entropy in the SARS-CoV-2 S, D614G, lies outside of the RBD and is not predicted to impact the structure or function of the protein (34).” However, the D614G mutation has a profound and important impact on the structure and function of the spike, shifting it to a mostly open configuration that more effectively binds ACE-2 (23, 24). Furthermore, SARS-CoV-2 has displayed remarkable “entropy” with the emergence of numerous variants of concern.
Despite numerous errors and flawed analyses, this preprint continues to be used to suggest that SARS-CoV-2 has a laboratory origin. For example, the misleading quote in the first sentence above is also found in the Prologue of a mass distribution book co-written by one of the authors (Alina Chan) (25). Matt Ridley, the co-author of that book, considered that this preprint was “probably a careful and reputable study.” It is not. Nevertheless, Chan and Ridley continue to promote the false premise that this preprint provides evidence that SARS-CoV-2 may have been pre-adapted for human transmission in a laboratory (26).
References
First, it seems disingenuous to claim "This phylogenetic tree and subsequent analyses are based on the false premise that SARS-CoV-1 evolution in humans during the early to late phases was monophyletic." when the authors explicitly state: "we cannot ensure that the early-to-mid epidemic samples did not straddle deep splits in the tree." Furthermore, a single clade can easily be defined that encompasses the entire tree, rendering the tree monophyletic. Instead, the question seems to be whether or not the Jan-Mar 2003 cases were the result of a single spillover into humans, and how far back in time the MRCA was. Here it is worth noting that the leading zoonotic hypothesis for SARS-CoV-2 also posits multiple spillovers. While this hypothesis seems deeply flawed, especially in light of Lv 2024 (1), accepting it would render the two scenarios comparable. The commenter notes that "Pekar and colleagues provide strong evidence that there were at least two introductions of SARS-CoV-2 into humans." As previously noted, this is contradicted by Lv 2024. Furthermore, Pekar 2022 later issued a correction in response to errors pointed out to them (https://pubpeer.com/publications/3FB983CC74C0A93394568A373167CE#1), which reduced the Bayes factors by roughly 10-fold, rendering the support only "moderate". Additional errors pointed out by the same person who found the initial errors (https://pubpeer.com/publications/3FB983CC74C0A93394568A373167CE#15) in fact change the conclusion to favor a single introduction, consistent with the findings of Lv 2024. There were numerous other qualitative reasons to not regard Pekar 2022 as providing "strong" support, but it is now safe to say that the data only support a single introduction of SARS-CoV-2.
The commenter notes that "the first eleven cases of SARS-CoV were geographically dispersed across a wide area from Foshan to Dongguan". However, characterizing this as a "wide area" is potentially misleading, given that the two cities are less than 100 km apart and are part of essentially the same contiguous urban area.
The commenter notes: "Genetically diverse SARS-CoV-1 has been found in civets and other animals from southern China as well as in civets from Hubei province where the city of Wuhan is located."
Furthermore, note that the "early-to-mid epidemic phase" data used by Alina et al. come from January to March 2003, after the initial probing infections that began in November. January 2003 is when major superspreading events began, such as that of Zhou Zuofen at the Sun Yat-sen Memorial Hospital. Notably, this superspreading event seems to have ultimately led to over 80% of the cases in Hong Kong, as well as the cases in Canada, Vietnam, and Singapore.
Even if the early-to-mid sequences did "straddle deep splits", this would indicate multiple probing infections before achieving efficient spread, which would contrast with the case for SARS-CoV-2, and thus the central contention of Alina et al. that "SARS-CoV-2 resembles SARS-CoV in the late phase of the 2003 epidemic after SARS-CoV had developed several advantageous adaptations for human transmission" remains valid.
Furthermore, the commenter makes basic mistakes. The statement "civet viruses SZ3 and SZ16 found at the Dongmen Market (3) are not the actual progenitors of 'Outbreak 1' as depicted in Figure 5 of the preprint" clearly misunderstands the figure, which does not depict SZ3 and SZ16 as "the actual progenitors of 'Outbreak 1'", but rather implies a recent common ancestor.
In summary, the conclusions of Alina et al. remain well supported. Evidence does not suggest that the Jan-Mar 2003 cases straddle deep splits in the tree, and the early phase of SARS-CoV-2 does resemble the late stage of SARS-CoV-1 in many ways, which is the central point of Alina et al.'s article.
The major errors of Alina et al. are with regard to the role of D614G, and the optimistic statement (keeping in mind the date of 2020) that "SARS-CoV-2 appears genetically stable and not under much pressure to adapt, which bodes well for diagnostics, vaccine, and therapeutics development". This was clearly not the case. As SARS-CoV-2 spread worldwide in a naïve population, it soon faced immense immune pressure that was not present at the start of the pandemic.
Still the warnings remain valid: "no precursors or branches of evolution stemming from a less human-adapted SARS-CoV-2-like virus have been detected. The sudden appearance of a highly infectious SARS-CoV-2 presents a major cause for concern that should motivate stronger international efforts to identify the source and prevent near future reemergence. Any existing pools of SARS-CoV-2 progenitors would be particularly dangerous if similarly well adapted for human transmission" and "We conclude by describing and advocating for measured and effective approaches implemented in the 2002-2004 SARS outbreaks to identify lingering population(s) of progenitor virus."
While Alina et al. did mention the possibility of lab escape ("Even the possibility that a non-genetically-engineered precursor could have adapted to humans while being studied in a laboratory should be considered, regardless of how likely or unlikely" / "The lack of definitive evidence to verify or rule out adaptation in an intermediate host species, humans, or a laboratory, means that we need to take precautions against each scenario to prevent re-emergence."), another major flaw is that Alina et al. did not go far enough in emphasizing the need to ensure that laboratory activity does not cause (another?) pandemic.
Citations
Jia-Xin Lv, Xiang Liu, Yuan-Yuan Pei, Zhi-Gang Song, Xiao Chen, Shu-Jian Hu, Jia-Lei She, Yi Liu, Yan-Mei Chen, Yong-Zhen Zhang, Evolutionary trajectory of diverse SARS-CoV-2 variants at the beginning of COVID-19 outbreak, Virus Evolution, Volume 10, Issue 1, 2024, veae020, https://doi.org/10.1093/ve/veae020
Molecular evolution of the SARS coronavirus during the course of the SARS epidemic in China. Science 303, 1666-1669 (2004).
Zhang CY, Wei JF, He SH. Adaptive evolution of the spike gene of SARS coronavirus: changes in positively selected sites in different epidemic groups. BMC Microbiol. 2006 Oct 4;6:88. doi: 10.1186/1471-2180-6-88. PMID: 17020602; PMCID: PMC1609170.
My comment was not disingenuous. All the analyses in the preprint totally ignore their own caveat that "we cannot ensure that the early-to-mid epidemic samples did not straddle deep splits in the tree."
I stand by my comment that the preprint is fatally flawed. In fact, the conclusions of "Alina et al" remain unsupported by the scientific literature. The early SARS-CoV-1 diversity as analyzed by Zhan, Deverman and Chan is entirely accounted for by multiple spillovers [7 or more] over several months [greater than the assumed 3 months] from an already diversified reservoir.
It is also clear that SARS-CoV-2 spilled over at least twice in quick succession from a reservoir with more limited diversity.
The analyses of Pekar et al., 2022 have not been disproven.
I also stand by my interpretation of Figure 5. The Figure is incorrect and at best misleading.
The caption for Figure 5 and the main text make the following point, which is not correct: "In the SARS-CoV outbreaks, >99.9% genome or S identity was only observed among isolates collected within a narrow window of time from within the same species (Figure 5)."
The logic here is the same as that throughout the manuscript -- nearly identical genomes sampled within one species are claimed to be indicative of transmission within a single species subsequent to a period of host adaptation, assuming the necessity of adaptive mutations in any new host.
Figure 5 and the manuscript discuss samples SZ3 and SZ16 collected from palm civets in 2003 in Dongmen market in Shenzhen. However, sample SZ13 (accession AY304487.1) is not mentioned. It was collected from a raccoon dog, the only raccoon dog sample reported by Guan et al (Science 2003). It is 100% identical to SZ16 over 8581 bases in a partial genome sequence inclusive of S.
Furthermore, Guan et al also reported neutralization of SZ16 infection by sera from masked palm civets (3 of 6 samples), raccoon dog (1 of 1), Chinese ferret badger (1 of 2), and human (12/55; 8/20 for humans in wild animal trade). This indicates widespread infection of diverse species by SARS with >99.9% identical genomes without requiring further host adaptation.
It's possible this error is inherited from Song et al (PNAS 2005), which twice states that SZ13 is a sample from a civet. This is plausible because, while Guan et al is cited by Zhan et al, it is cited in a way that suggests the paper was not read. Zhan et al cite Guan et al here: "In contrast to the thorough and swift animal sampling executed in response to the 2002-2004 SARS-CoV outbreaks to identify intermediate hosts (37,53), no animal sampling prior to the shut down and sanitization of the market was reported." Yet that sampling was done in May 2003, 6 months after disease onset for the earliest known SARS case and 5 months after onset of the earliest case with a known link to Shenzhen. Zhan et al posted their preprint in May 2020, 5 months after disease onset for the earliest known SARS-CoV-2 case.
In reality, on-the-ground investigations into Huanan market were widely reported on 31 December 2019, a few days rather than a few months after recognition of a novel coronavirus. Environmental sampling focused on case locations took place largely on 1 January 2020, and sampling focused on shops in the wildlife trade took place largely on 12 January 2020.
Zhan et al neglect to describe where positive samples were disproportionately found in the report that they cite (Google Translate of reference 51): “There is wildlife trading in the west area of the South China Seafood Market, especially in the areas of the 7th and 8th Streets in the west area close to the interior of the market, where there are many wildlife trading shops, and the positive samples in this area are also relatively concentrated, accounting for 42.4% (14/33) of all positive samples. In summary, it is highly suspected that the epidemic is related to wildlife trading.”
There is, however, a relevant contrast that can be drawn between the SARS and SARS-CoV-2 outbreaks. What's less well known is that Dongmen market continued to be sampled after May (Yaqing et al, Disease Surveillance 2004). Civets, raccoon dogs, and badgers were still available for sampling and still positive by PCR for SARS in October, November, and December 2003. It wasn't until spillover from civets to humans was definitively shown for the Guangzhou outbreak (Wang et al, Emerg Infect Dis 2005) that Guangdong province acted to close wildlife markets and end the civet trade.
Lastly, Zhan et al write, "The human and civet isolates of the 2003/2004 outbreak, which were collected most closely in time and at the site of cross-species transmission, shared only up to 99.79% S identity." This is also incorrect. Wang et al (Emerg Infect Dis 2005) write, "The 286-bp S gene sequences from isolates from the waitress and the physician were identical to 4 of 5 S gene sequences from palm civets from the restaurant." The 99.79% identical sequence is instead from a different patient without a known link to the civet or other animal trade, a patient identified about one month before the first patient in the restaurant cluster.
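For readers who want to check identity figures like those quoted above, the following is a minimal sketch of how percent identity is computed over an ungapped alignment. It is an illustration only, not the method used by Zhan et al, Guan et al, or Wang et al. The function names and the toy sequences are hypothetical, and gaps and ambiguous bases are deliberately ignored. The lengths 286 and 8581 are the fragment sizes mentioned in this thread.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two aligned, equal-length sequences.

    Hypothetical helper: assumes an ungapped alignment and exact base calls.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identity_after_mismatches(length: int, mismatches: int) -> float:
    """Percent identity implied by a given number of mismatches over `length` sites."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("need length > 0 and 0 <= mismatches <= length")
    return 100.0 * (length - mismatches) / length


# Toy 20-nt sequences (made up), differing at exactly one position.
ref = "ACGT" * 5
alt = ref[:7] + "A" + ref[8:]  # position 7 changed from T to A
print(f"toy pair: {percent_identity(ref, alt):.2f}% identical")

# Effect of a single mismatch at the fragment lengths cited in the thread.
for length in (286, 8581):
    pct = identity_after_mismatches(length, 1)
    print(f"1 mismatch over {length} sites: {pct:.3f}% identical")
```

The granularity of a percent-identity figure depends on the length compared, so thresholds such as >99.9% are only meaningful when the compared region is stated alongside them.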