Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis

Abstract

Background

MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Because bat and MERS coronaviruses are structurally related, it is of interest to estimate how many antigenic sites are conserved among them. Elucidating the shared antigenic sites and the extent of their conservation is important for understanding the evolutionary dynamics of MERS-CoV.

Results

Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was used to identify sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used several in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shares 30 % of its S protein antigenic sites with HKU4 and 70 % with HKU5 bat-CoV, whereas 100 % of the antigenic sites of its E, M and N proteins are conserved with those in HKU4 and HKU5.

Conclusion

This sharing suggests that, with respect to pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and the ancestry of their pathogenicity.

Background

Coronaviruses, the members of the family Coronaviridae, are a diverse group of viruses that infect domestic animals, birds and humans. They are enveloped RNA viruses classified into four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus [1]. Six human coronaviruses, HCoV-229E, HCoV-OC43, SARS-CoV, HCoV-NL63, HCoV-HKU1 and MERS-CoV, emerged between 1960 and 2015, of which MERS-CoV is the most recent. This newly emerged, highly fatal virus belongs to lineage C of the genus Betacoronavirus [2]. Human coronaviruses have been traced to zoonotic origins. Among the six human coronaviruses, the first, HCoV-229E, has structural similarity with bat coronaviruses. The other members are likewise thought to have originated from animal coronaviruses: HCoV-OC43 from bovine coronavirus, SARS-CoV and HCoV-NL63 from bat or palm civet coronaviruses, and HCoV-HKU1 from mouse hepatitis virus (MHV). Like the other human coronaviruses, MERS-CoV is assumed to have a zoonotic origin, but its zoonotic source remains unknown [3–5].

Some studies identified close amino acid similarity between MERS-CoV and coronaviruses found in Nycteris and Pipistrellus bat species [6]. However, recent reports found that MERS-CoV is more closely related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5) [7]. MERS-CoV and Bat-CoV HKU5 share a high degree of amino acid similarity in their RNA polymerase (92.1 to 92.3 %), 3C-like protease (82 %), polymerase (92 %), proofreading exonuclease (91 %) and nucleocapsid (N) protein (68 %) [8, 9], but MERS-CoV is more closely related to Ty-BatCoV HKU4 in S and N. The major difference between MERS-CoV and these bat coronaviruses lies in the region between the spike and envelope genes, where MERS-CoV has five ORFs and the bat viruses have four [3–5, 10].

Although MERS-CoV is structurally related to the bat-CoVs, there is no report on the sharing of antigenic sites among these coronaviruses. To better understand the evolutionary origin of MERS-CoV pathogenicity, it is necessary to know to what extent their immunogenicity is conserved.

In this study, we identify the antigenic sites conserved between MERS and bat coronaviruses. To do so, we carried out bioinformatics analyses of their spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins to find the conserved antigenic sites and to map them onto 3D structures predicted by threading.

Methods

Retrieving MERS and Bat coronavirus protein sequences

The five available spike (S), membrane (M), envelope (E) and nucleocapsid (N) protein sequences each of HKU4 and HKU5 bat-CoV, together with 62 S, 64 E, 64 M and 72 N protein sequences of MERS-CoV, were retrieved from the NCBI GenBank sequence database [11] (Additional file 1: Table S1).

Identification of conserved regions

The retrieved sequences were aligned with the EBI ClustalW program [12], using the Gonnet matrix [12], to find the conserved regions. Their phylogenetic relationships were inferred by maximum parsimony (MP) in MEGA 5.0 [13]. From the multiple sequence alignment (MSA), the region containing the highest number of identical and similar amino acids was selected as the conserved region and then used for antigenic site prediction.
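As an illustration only, the idea of picking a conserved region out of an alignment can be sketched in Python. This is not the procedure used in this study: the function names, the identity-only scoring (residue similarity under the Gonnet matrix is ignored) and the "longest run of fully conserved columns" rule are all simplifying assumptions, and the three short sequences are toy data.

```python
from typing import List, Tuple


def column_identity(alignment: List[str]) -> List[float]:
    """Fraction of sequences that carry the most common residue in each column.

    Gap characters ('-') never count towards conservation.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in range(length):
        counts = {}
        for seq in alignment:
            residue = seq[col]
            if residue != "-":
                counts[residue] = counts.get(residue, 0) + 1
        top = max(counts.values()) if counts else 0
        scores.append(top / len(alignment))
    return scores


def longest_conserved_run(scores: List[float], threshold: float = 1.0) -> Tuple[int, int]:
    """Return (start, end) of the longest run of columns scoring >= threshold.

    The interval is half-open, so alignment[i][start:end] is the region.
    """
    best = (0, 0)
    start = None
    # A trailing sentinel below any valid score closes a run that reaches the end.
    for i, score in enumerate(scores + [-1.0]):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


# Toy alignment (hypothetical sequences, not from GenBank).
toy_alignment = ["MKTAYLV", "MKTAYIV", "MRTAYLV"]
toy_scores = column_identity(toy_alignment)
region = longest_conserved_run(toy_scores)
print(region, toy_alignment[0][region[0]:region[1]])
```

Lowering `threshold` below 1.0 would tolerate columns with a minority of substitutions, which is closer in spirit to counting both identical and similar residues.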

Detection of immunogenicity of conserved peptides

The immunogenicity of the conserved peptides was assessed with the B cell epitope prediction tools of the Immune Epitope Database (IEDB) [14]. Of these tools, the BepiPred linear epitope prediction method [15] and the ElliPro structure-based discontinuous epitope prediction method [14] were applied. The antigenic sites of the MERS coronavirus spike, envelope, membrane and nucleocapsid proteins were also determined with BepiPred and ElliPro. Among the epitopes predicted by the two methods, those that overlapped fully or by at least 90 % were chosen as the final epitopes.
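The overlap criterion can be made concrete with a minimal Python sketch. It assumes that each epitope is represented by inclusive start and end residue positions and that the overlap is measured as the fraction of the linear epitope covered by a structure-based epitope; the text does not say which epitope serves as the denominator, so that choice, the function names and the coordinates below are hypothetical.

```python
from typing import List, Tuple

Epitope = Tuple[int, int]  # inclusive (start, end) residue positions


def overlap_fraction(linear: Epitope, structural: Epitope) -> float:
    """Fraction of the linear epitope that is covered by the structural one."""
    start = max(linear[0], structural[0])
    end = min(linear[1], structural[1])
    shared = max(0, end - start + 1)
    return shared / (linear[1] - linear[0] + 1)


def select_overlapping(linear: List[Epitope],
                       structural: List[Epitope],
                       min_fraction: float = 0.9) -> List[Epitope]:
    """Keep linear epitopes covered by at least min_fraction of some structural epitope."""
    return [
        ep for ep in linear
        if any(overlap_fraction(ep, s) >= min_fraction for s in structural)
    ]


# Hypothetical coordinates, for illustration only.
linear_predictions = [(10, 19), (30, 39), (50, 59)]
structural_predictions = [(8, 25), (31, 45), (100, 110)]
print(select_overlapping(linear_predictions, structural_predictions))
```

Here the first epitope is fully covered, the second is covered over nine of its ten residues, and the third has no counterpart, so only the first two are retained.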

Prediction of epitope conservancy

To check the conservancy of the predicted epitopes, the epitope conservancy analysis tool of the IEDB analysis resource [16] was used. This tool calculates the conservancy level by searching the given protein sequences for identities with each epitope.
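A rough idea of what such a conservancy calculation involves is given by the Python sketch below. It is not the IEDB implementation: it assumes ungapped matching, scores each protein by its best-matching window of the same length as the epitope, and reports the fraction of proteins that reach a chosen identity threshold. The function names, the threshold parameter and the toy sequences are assumptions made for illustration.

```python
from typing import List


def best_identity(epitope: str, sequence: str) -> float:
    """Highest fraction of identical positions over all ungapped windows."""
    k = len(epitope)
    if k == 0 or len(sequence) < k:
        return 0.0
    best = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        matches = sum(1 for a, b in zip(epitope, window) if a == b)
        best = max(best, matches)
    return best / k


def conservancy(epitope: str, sequences: List[str], min_identity: float = 1.0) -> float:
    """Fraction of sequences containing the epitope at or above min_identity."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    hits = sum(1 for seq in sequences if best_identity(epitope, seq) >= min_identity)
    return hits / len(sequences)


# Toy peptide and sequences (hypothetical, not taken from Tables 1 to 3).
toy_epitope = "ACDEF"
toy_sequences = ["GGACDEFGG", "GGACDEYGG", "ACDEFKKKK", "KKKKKKKKK"]
print(conservancy(toy_epitope, toy_sequences))                    # exact matches only
print(conservancy(toy_epitope, toy_sequences, min_identity=0.8))  # one mismatch allowed
```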

Prediction and evaluation of protein 3D models

Because experimental structures of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins of MERS coronavirus isolates were not found in the Protein Data Bank (PDB), their 3D structures were predicted with the I-TASSER server [17], which builds protein 3D models from multiple threading alignments [17]. The quality of the top models returned by I-TASSER was then verified by PROCHECK analysis [18]. The model with the largest number of amino acid residues in the most favoured region was selected as the best model and used to locate the epitopes with the UCSF Chimera [19] visualization tool.
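The model selection rule can be written down as a short Python sketch. It assumes that the Ramachandran region counts reported for each candidate model have already been transcribed by hand into a small record; the class name, field names and all counts below are hypothetical and are not output from this study. Because all candidate models of one protein contain the same residues, ranking by fraction and ranking by raw count give the same result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RamachandranSummary:
    """Residue counts per Ramachandran region for one candidate model."""
    model: str
    most_favoured: int
    additionally_allowed: int
    generously_allowed: int
    disallowed: int

    @property
    def favoured_fraction(self) -> float:
        total = (self.most_favoured + self.additionally_allowed
                 + self.generously_allowed + self.disallowed)
        return self.most_favoured / total if total else 0.0


def pick_best_model(models: List[RamachandranSummary]) -> RamachandranSummary:
    """Select the model with the largest share of residues in the most favoured region."""
    if not models:
        raise ValueError("no candidate models supplied")
    return max(models, key=lambda m: m.favoured_fraction)


# Hypothetical counts for three candidate models of the same protein.
candidates = [
    RamachandranSummary("model1", 700, 200, 60, 40),
    RamachandranSummary("model2", 820, 130, 30, 20),
    RamachandranSummary("model3", 760, 170, 40, 30),
]
best = pick_best_model(candidates)
print(best.model, round(best.favoured_fraction, 2))
```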

Results

MERS and bat (HKU4, HKU5) coronaviruses are most conserved in the envelope protein

In the envelope protein, MERS coronaviruses are highly conserved with HKU4 and HKU5 bat coronaviruses (Figs. 1 and 2, respectively) compared with the other proteins (data not shown). Maximum parsimony phylogenetic analysis in MEGA 5.0 showed that the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins of MERS-CoV are related to those of bat (HKU4 and HKU5) coronaviruses (Additional file 2: Figure S1, Additional file 3: Figure S2, Additional file 4: Figure S3 and Additional file 5: Figure S4, respectively).

Fig. 1

Multiple sequence alignment of MERS and HKU4 coronavirus envelope (E) protein: Multiple sequence alignment of 64 MERS-CoV and 5 HKU4 bat coronavirus sequences indicates that they are highly conserved in the envelope (E) protein. The conservation shown is based on a scale of 11, where a yellow bar and a star sign indicate full conservation. Alignment quality is based on the BLOSUM 62 substitution matrix score, where yellow indicates good quality. The colors change according to conservation and alignment quality. Black bars show the consensus sequence. The alignment was visualized with Jalview 2.8 [22] using the Clustalx color scheme

Fig. 2

Multiple sequence alignment of MERS and HKU5 coronavirus envelope (E) protein: Figure legend is as in Fig. 1

Conserved regions of the S, E, M and N proteins are predicted to be antigenic

The MSA-derived conserved regions were used to determine the antigenic sites with the IEDB B cell epitope prediction tools [14]. This analysis found 3 epitopes from the S protein, 1 from the E protein, 4 from the M protein and 5 from the N protein in the region conserved between HKU4 bat and MERS coronaviruses (Table 1). Similarly, 7 epitopes from the S protein, 1 from the E protein, 4 from the M protein and 5 from the N protein were found in the region conserved between HKU5 bat and MERS coronaviruses (Table 2).

Table 1 Predicted antigenic sites, their lengths and their conservancy using IEDB [14] analysis tool from MERS and HKU4 Bat coronavirus conserved protein region
Table 2 Predicted antigenic sites, their lengths and their conservancy using IEDB [14] analysis tool from MERS and HKU5 Bat coronavirus conserved protein region

One epitope each of the S, M and N proteins is fully conserved among MERS and bat coronaviruses

The conservancy of all epitopes was determined with the IEDB conservancy analysis tool [16]. Most of the IEDB-predicted epitopes are >90 % conserved among MERS and bat (HKU4, HKU5) coronaviruses (Tables 1, 2). Among them, one epitope each of the S, M and N proteins is 100 % conserved.

MERS and bat coronaviruses share common B cell epitopes

Comparison with the IEDB-predicted epitopes of the MERS coronavirus S, E, M and N proteins (Table 3) shows that most epitopes are common to MERS and bat coronaviruses. They share approximately 100 % of the E, M and N protein epitopes. For the S protein, around 70 % of the MERS-CoV epitopes are shared with HKU5, whereas only 30 % are shared with HKU4 (Fig. 3).
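The percentages above are simple ratios, which the following Python sketch makes explicit. It assumes that an epitope counts as shared when the same epitope label appears in both the MERS-CoV set and the set found in a conserved region; the function name and the placeholder labels are hypothetical, and the set sizes were chosen only so that the arithmetic is easy to follow, not to reproduce Table 3.

```python
from typing import Iterable


def shared_percent(reference_epitopes: Iterable[str],
                   conserved_epitopes: Iterable[str]) -> float:
    """Percentage of reference epitopes that also occur among the conserved ones."""
    reference = set(reference_epitopes)
    if not reference:
        raise ValueError("the reference epitope set is empty")
    shared = reference & set(conserved_epitopes)
    return 100.0 * len(shared) / len(reference)


# Placeholder labels standing in for predicted epitopes (hypothetical data).
mers_spike = [f"S{i}" for i in range(1, 11)]
conserved_with_hku4 = ["S1", "S2", "S3"]
conserved_with_hku5 = ["S1", "S2", "S3", "S4", "S5", "S6", "S7"]

print(shared_percent(mers_spike, conserved_with_hku4))
print(shared_percent(mers_spike, conserved_with_hku5))
```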

Table 3 MERS coronavirus spike, envelope, membrane and nucleocapsid protein antigenic sites predicted by IEDB analysis [14]
Fig. 3

MERS-CoV shares S, E, M and N protein epitopes with HKU4 and HKU5 bat coronaviruses: a MERS-CoV shares more spike protein epitopes with HKU5 bat-CoV than with HKU4 bat-CoV. The Y axis indicates the coronavirus strain and the X axis the epitopes. b MERS-CoV shares an equal number of envelope protein epitopes with HKU4 and HKU5 bat-CoV. c They share an equal number of membrane protein epitopes. d MERS and bat coronaviruses share an equal number of nucleocapsid protein epitopes

Tertiary structures of the S, E, M and N proteins were predicted and validated in silico

Because experimental tertiary structures of the S, E, M and N proteins are not available, we modeled their 3D structures with the I-TASSER server [17] by multiple threading alignments. I-TASSER produced 5 different models (data not shown) for each protein. The quality of all the protein models was checked by PROCHECK analysis [18]. The model with the largest number of amino acid residues in the most favoured region was selected as the best model. Using the UCSF Chimera visualization tool [19], all the conserved (>90 %) epitopes were mapped onto the predicted S, E, M and N protein structures (Fig. 4).

Fig. 4

3D structure of MERS-CoV S, E, M and N proteins: a Spike (S) protein: Predicted conserved S protein epitopes are mapped onto the protein 3D structure using the UCSF Chimera [19] visualization tool. Epitopes are labelled in red. b Envelope (E) protein: Legend as in Fig. 4(a). Epitopes are labelled in green. c Membrane (M) protein: Legend as in Fig. 4(a). Epitopes are labelled in magenta. d Nucleocapsid (N) protein: Legend as in Fig. 4(a). Conserved epitopes are labelled in orange

Discussion

Coronaviruses are among the most diverse groups of viruses, and some have emerged as deadly pathogens over time. Most human coronaviruses have a zoonotic origin, and in most cases bats serve as the reservoir for zoonotic viruses [20]. SARS-CoV originated from animals, with horseshoe bats as the natural reservoir and palm civets as the intermediate host allowing animal-to-human transmission. HCoV-229E has structural similarity with bat coronaviruses [21]. Similarly, SARS-CoV, HCoV-229E and HCoV-NL63 originated from bats, but the zoonotic source of MERS-CoV is still not clear [3]. Although MERS-CoV is structurally related to the bat coronaviruses (HKU4 and HKU5), there is no report on the sharing of antigenic sites among them. To better understand the evolutionary origin of MERS-CoV pathogenicity, we need to know to what extent their immunogenicity is conserved.

To address the pathogenic relationship, we constructed phylogenetic trees and analyzed the relationship of MERS and bat coronaviruses using the spike (S), envelope (E), membrane (M) and nucleocapsid (N) protein sequences. MERS-CoV was found to be phylogenetically related to HKU4 and HKU5 bat-CoV. We also predicted conserved antigenic sites and found that MERS and HKU4 bat coronaviruses share 30 % of S protein epitopes and 100 % of E, M and N protein epitopes, while MERS and HKU5 bat coronaviruses share 70 % of S protein epitopes and 100 % of E, M and N protein epitopes. In most cases the conservation level was >90 %. These findings suggest that, in terms of antigenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. This study shows how closely HKU4 and HKU5 bat-CoV are related to MERS-CoV pathogenically. The level of conservation among antigenic sites therefore provides evidence in support of their shared ancestry of pathogenicity.

Conclusions

This study reveals that MERS and bat coronaviruses share some common antigenic sites in their spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins. The shared epitopes are over 90 % conserved throughout their evolution. The shared epitopes also show that, in terms of antigenic sites, MERS-CoV is more closely related to HKU5 bat coronaviruses than to HKU4 bat coronaviruses. The conserved antigenic sites strongly support their ancestral relationship.

References

1. Sharmin R, Islam AB. A highly conserved WDYPKCDRA epitope in the RNA directed RNA polymerase of human coronaviruses can be used as epitope-based universal vaccine design. BMC Bioinformatics. 2014;15:161.

2. Cheng VC, Lau SK, Woo PC, Yuen KY. Severe acute respiratory syndrome coronavirus as an agent of emerging and reemerging infection. Clin Microbiol Rev. 2007;20(4):660–94.

3. Huynh J, Li S, Yount B, Smith A, Sturges L, Olsen JC, Nagel J, Johnson JB, Agnihothram S, Gates JE, Frieman MB, Baric RS, Donaldson EF. Evidence supporting a zoonotic origin of human coronavirus strain NL63. J Virol. 2012;86(23):12816–25.

4. Li W, Wong SK, Li F, Kuhn JH, Huang IC, Choe H, Farzan M. Animal origins of the severe acute respiratory syndrome coronavirus: insight from ACE2-S-protein interactions. J Virol. 2006;80(9):4211–9.

5. To KK, Hung IF, Chan JF, Yuen KY. From SARS coronavirus to novel animal and human coronaviruses. J Thorac Dis. 2013;5 Suppl 2:S103–8.

6. Annan A, Baldwin HJ, Corman VM, Klose SM, Owusu M, Nkrumah EE, Badu EK, Anti P, Agbenyega O, Meyer B, Oppong S, Sarkodie YA, Kalko EK, Lina PH, Godlevska EV, Reusken C, Seebens A, Gloza-Rausch F, Vallo P, Tschapka M, Drosten C, Drexler JF. Human betacoronavirus 2c EMC/2012-related viruses in bats, Ghana and Europe. Emerg Infect Dis. 2013;19(3):456–9.

7. Anthony SJ, Ojeda-Flores R, Rico-Chávez O, Navarrete-Macias I, Zambrana-Torrelio CM, Rostal MK, Epstein JH, Tipps T, Liang E, Sanchez-Leon M, Sotomayor-Bonilla J, Aguirre AA, Ávila-Flores R, Medellín RA, Goldstein T, Suzán G, Daszak P, Lipkin WI. Coronaviruses in bats from Mexico. J Gen Virol. 2013;94(Pt 5):1028–38.

8. van Boheemen S, de Graaf M, Lauber C, Bestebroer TM, Raj VS, Zaki AM, Osterhaus AD, Haagmans BL, Gorbalenya AE, Snijder EJ, Fouchier RA. Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans. MBio. 2012;20:3(6).

9. Lau SK, Li KS, Tsang AK, Lam CS, Ahmed S, Chen H, Chan KH, Woo PC, Yuen KY. Genetic characterization of Betacoronavirus lineage C viruses in bats reveals marked sequence divergence in the spike protein of pipistrellus bat coronavirus HKU5 in Japanese pipistrelle: implications for the origin of the novel Middle East respiratory syndrome coronavirus. J Virol. 2013;87(15):8638–50.

10. Woo PC, Lau SK, Huang Y, Yuen KY. Coronavirus diversity, phylogeny and interspecies jumping. Exp Biol Med (Maywood). 2009;234(10):1117–27.

11. Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2008;41(Database issue):D36–42.

12. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22(22):4673–80.

13. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.

14. Vita R, Zarebski L, Greenbaum JA, Emami H, Hoof I, Salimi N, Damle R, Sette A, Peters B. The immune epitope database 2.0. Nucleic Acids Res. 2010;38(Database issue):D854–62.

15. Larsen JE, Lund O, Nielsen M. Improved method for predicting linear B-cell epitopes. Immunome Res. 2006;2:2.

16. Bui HH, Sidney J, Li W, Fusseder N, Sette A. Development of an epitope conservancy analysis tool to facilitate the design of epitope-based diagnostics and vaccines. BMC Bioinformatics. 2007;8(1):361.

17. Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008;9:40.

18. Laskowski RA, MacArthur MW, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst. 1993;26:283–91.

19. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. UCSF Chimera - a visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–12.

20. Graham RL, Baric RS. Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission. J Virol. 2010;84(7):3134–46.

21. Ge XY, Li JL, Yang XL, Chmura AA, Zhu G, Epstein JH, Mazet JK, Hu B, Zhang W, Peng C, Zhang YJ, Luo CM, Tan B, Wang N, Zhu Y, Crameri G, Zhang SY, Wang LF, Daszak P, Shi ZL. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature. 2013;503(7477):535–8.

22. Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview Version 2 - a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25(9):1189–91.

Acknowledgement

This study was supported by the Department of Genetic Engineering and Biotechnology, University of Dhaka.

Author information

Correspondence to Abul B. M. M. K. Islam.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contribution

RS and AI performed the analysis. AI conceived the idea. RS and AI wrote the manuscript. Both authors read and approved the final manuscript.

Additional files

Additional file 1: Table S1.

Sequence related information. (XLSX 12 kb)

Additional file 2: Figure S1.

Phylogenetic analysis of MERS and Bat (HKU4 and HKU5) coronavirus S protein: The evolutionary history was inferred using the Maximum Parsimony method. Tree #1 out of 5 most parsimonious trees (length = 3378) is shown. The consistency index is 0.990823 (0.990823), the retention index is 0.996655 (0.996655), and the composite index is 0.987508 (0.987508) for all sites and parsimony-informative sites (in parentheses). The MP tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm with search level 0 in which the initial trees were obtained by the random addition of sequences (10 replicates). The analysis involved 72 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 1347 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [13]. (TIF 328 kb)

Additional file 3: Figure S2.

Phylogenetic analysis of MERS and Bat (HKU4 and HKU5) coronavirus E protein: The evolutionary history was inferred using the Maximum Parsimony method. Tree #1 out of 10 most parsimonious trees (length = 40) is shown. The consistency index is 1.000000 (1.000000), the retention index is 1.000000 (1.000000), and the composite index is 1.000000 (1.000000) for all sites and parsimony-informative sites (in parentheses). The MP tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm with search level 0 in which the initial trees were obtained by the random addition of sequences (10 replicates). The analysis involved 74 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 82 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [13]. (TIF 360 kb)

Additional file 4: Figure S3.

Phylogenetic analysis of MERS and Bat (HKU4 and HKU5) coronavirus M protein: The evolutionary history was inferred using the Maximum Parsimony method. Tree #1 out of 2 most parsimonious trees (length = 312) is shown. The consistency index is 0.990385 (0.990033), the retention index is 0.995940 (0.995940), and the composite index is 0.986364 (0.986014) for all sites and parsimony-informative sites (in parentheses). The MP tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm with search level 0 in which the initial trees were obtained by the random addition of sequences (10 replicates). The analysis involved 74 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 154 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [13]. (TIF 352 kb)

Additional file 5: Figure S4.

Phylogenetic analysis of MERS and Bat (HKU4 and HKU5) coronavirus N protein: The evolutionary history was inferred using the Maximum Parsimony method. Tree #1 out of 9 most parsimonious trees (length = 590) is shown. The consistency index is 0.996610 (0.996599), the retention index is 0.999179 (0.999179), and the composite index is 0.995792 (0.995780) for all sites and parsimony-informative sites (in parentheses). The MP tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm with search level 0 in which the initial trees were obtained by the random addition of sequences (10 replicates). The analysis involved 82 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 411 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [13]. (TIF 391 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Sharmin, R., Islam, A.B.M.M.K. Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis. Source Code Biol Med 11, 3 (2016) doi:10.1186/s13029-016-0049-7

Keywords

  • MERS-CoV
  • HKU4
  • HKU5
  • Epitope