Open Access

Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis

Source Code for Biology and Medicine 2016, 11:3

DOI: 10.1186/s13029-016-0049-7

Received: 21 May 2015

Accepted: 26 February 2016

Published: 9 March 2016

Abstract

Background

MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Because bat and MERS coronaviruses are structurally related, it is of interest to estimate how many antigenic sites are conserved among them. Identifying the shared antigenic sites and the extent of their conservation helps to clarify the evolutionary dynamics of MERS-CoV.

Results

Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was used to identify sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used several in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shares 30 % of its S protein antigenic sites with HKU4 and 70 % with HKU5 bat-CoV, whereas 100 % of its E, M and N protein antigenic sites are conserved with those of HKU4 and HKU5.

Conclusion

This sharing suggests that, in terms of pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and ancestry of pathogenicity.

Keywords

MERS-CoV, HKU4, HKU5, Epitope

Background

Coronaviruses, members of the family Coronaviridae, are a diverse group of viruses that infect domestic animals, birds and humans. They are enveloped RNA viruses classified into four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus [1]. HCoV-229E, HCoV-OC43, SARS-CoV, HCoV-NL63, HCoV-HKU1 and MERS-CoV are the six human coronaviruses that emerged between 1960 and 2015, of which MERS-CoV is the most recent. This highly fatal virus belongs to lineage C of the genus Betacoronavirus [2]. Human coronaviruses have been traced to zoonotic origins. Among the six human coronaviruses, the first, HCoV-229E, has structural similarity with bat coronaviruses. The other members are likewise thought to have originated from animal coronaviruses: HCoV-OC43 from bovine coronavirus, SARS-CoV and HCoV-NL63 from bat or palm civet coronaviruses, and HCoV-HKU1 from mouse hepatitis virus (MHV). Like the other human coronaviruses, MERS-CoV is assumed to have a zoonotic origin, but its zoonotic source remains unknown [3–5].

Some studies identified close amino acid similarity between MERS-CoV and coronaviruses found in Nycteris and Pipistrellus bat species [6]. More recent reports, however, found that MERS-CoV is more closely related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5) [7]. MERS-CoV and Pi-BatCoV HKU5 share a high degree of amino acid similarity in their RNA polymerase (92.1 to 92.3 %), 3C-like protease (82 %), polymerase (92 %), proofreading exonuclease (91 %) and nucleocapsid (N) protein (68 %) [8, 9], but MERS-CoV is more closely related to Ty-BatCoV HKU4 in the S and N proteins. The major difference between MERS-CoV and these bat coronaviruses lies in the region between the spike and envelope genes, where MERS-CoV has five ORFs while the bat viruses have four [3–5, 10].

Although MERS-CoV is structurally related to these bat coronaviruses, no report has described antigenic sites shared among them. To better understand the evolutionary origin of MERS-CoV pathogenicity, it is necessary to know to what extent their immunogenicity is conserved.

In this study, we identify the antigenic sites conserved among MERS and bat coronaviruses. Bioinformatics analyses of their spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins were carried out to find the conserved antigenic sites and to map them onto 3D structures predicted by threading-based modeling.

Methods

Retrieving MERS and Bat coronavirus protein sequences

The five available spike (S), membrane (M), envelope (E) and nucleocapsid (N) protein sequences of HKU4 and HKU5 bat-CoV, together with 62 S, 64 E and M, and 72 N protein sequences of MERS-CoV, were retrieved from the NCBI GenBank sequence database [11] (Additional file 1: Table S1).

Identification of conserved region

Retrieved sequences were aligned with the EBI ClustalW program [12] to find the conserved regions. The multiple sequence alignment (MSA) was performed with the Gonnet matrix [12], and the phylogenetic relationships were inferred by maximum parsimony (MP) using MEGA 5.0 [13]. From the MSA, the region containing the highest number of identical and similar amino acids was selected as the conserved region and then used for antigenic site prediction.
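
The selection step can be sketched roughly as follows, assuming the alignment is available as equal-length gapped strings. The sliding-window approach, the window length, the function names and the toy sequences are all illustrative assumptions; the study counts similar as well as identical residues and does not state that a window-based procedure was used.

```python
def identical_columns(alignment):
    """Return a list of booleans, True where every sequence has the same non-gap residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    flags = []
    for column in zip(*alignment):
        first = column[0]
        flags.append(first != "-" and all(res == first for res in column))
    return flags


def most_conserved_window(alignment, window):
    """Return (start, end, score) of the window with the most identical columns.

    start and end are 0-based with end exclusive; ties resolve to the leftmost window.
    """
    flags = identical_columns(alignment)
    if window > len(flags):
        raise ValueError("window longer than alignment")
    best_start, best_score = 0, -1
    for start in range(len(flags) - window + 1):
        score = sum(flags[start:start + window])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_start + window, best_score


# Toy alignment (hypothetical sequences, not taken from the study's data).
toy = [
    "MKT-LLSGTPPQVYAA",
    "MRTALLSGTPPQVYSA",
    "MKSALLSGTPPQVYTA",
]
start, end, score = most_conserved_window(toy, window=10)
print(toy[0][start:end], score)
```

Scoring only identical columns keeps the sketch short; a substitution matrix such as Gonnet would be needed to credit similar residues as well.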

Detection of immunogenicity of conserved peptides

Immunogenicity of the conserved peptides was assessed with the B cell epitope prediction tools of the Immune Epitope Database (IEDB) [14]. Of these tools, the BepiPred linear epitope prediction method [15] and the ElliPro structure-based discontinuous epitope prediction method were applied [14]. The antigenic sites of the MERS coronavirus spike, envelope, membrane and nucleocapsid proteins were also determined with BepiPred and ElliPro. Epitopes predicted by both methods that overlapped fully or by at least 90 % were chosen as the final epitopes.
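
The 90 % overlap criterion can be illustrated with a small sketch. It assumes each prediction is reduced to a contiguous, 1-based inclusive residue range, which simplifies discontinuous ElliPro output, and it measures overlap as the fraction of the linear epitope covered by a structural one; the paper does not state which denominator was used. The residue ranges and function names are hypothetical.

```python
def overlap_fraction(linear, structural):
    """Fraction of the linear epitope's residues that fall inside the structural epitope.

    Both arguments are (start, end) residue ranges, 1-based and inclusive.
    """
    lin_start, lin_end = linear
    str_start, str_end = structural
    shared = min(lin_end, str_end) - max(lin_start, str_start) + 1
    return max(shared, 0) / (lin_end - lin_start + 1)


def consensus_epitopes(linear_hits, structural_hits, threshold=0.9):
    """Keep linear epitopes covered at or above the threshold by any structural epitope."""
    kept = []
    for lin in linear_hits:
        if any(overlap_fraction(lin, st) >= threshold for st in structural_hits):
            kept.append(lin)
    return kept


# Hypothetical residue ranges, for illustration only.
bepipred_hits = [(10, 19), (40, 49), (80, 85)]
ellipro_hits = [(8, 22), (41, 60), (90, 99)]
print(consensus_epitopes(bepipred_hits, ellipro_hits))
```

Here the first range is fully covered, the second is covered at exactly 90 %, and the third has no structural counterpart, so only the first two are retained.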

Prediction of epitope conservancy

To check the conservancy of the predicted epitopes, the epitope conservancy analysis tool of the IEDB analysis resource [16] was used. This tool calculates the conservancy level by searching for identities in the given protein sequences.
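
The general idea behind a conservancy calculation can be sketched as below. This is not the IEDB implementation: it assumes ungapped sliding-window matching and reports the percentage of sequences that contain the epitope at or above an identity threshold. The function names and test sequences are hypothetical.

```python
def best_identity(epitope, sequence):
    """Highest fraction of matching positions over all ungapped windows of the sequence."""
    k = len(epitope)
    if len(sequence) < k:
        return 0.0
    best = 0
    for i in range(len(sequence) - k + 1):
        matches = sum(a == b for a, b in zip(epitope, sequence[i:i + k]))
        best = max(best, matches)
    return best / k


def conservancy(epitope, sequences, threshold=1.0):
    """Percentage of sequences that contain the epitope at or above the identity threshold."""
    hits = sum(best_identity(epitope, seq) >= threshold for seq in sequences)
    return 100.0 * hits / len(sequences)


# Hypothetical sequences: three carry the epitope exactly, one carries a single substitution.
seqs = [
    "AAIADPGYMQGTT",
    "GGIADPGYMQGSS",
    "IADPGYMQGKKKK",
    "AAIADPGYLQGTT",
]
print(conservancy("IADPGYMQG", seqs))          # exact matches only
print(conservancy("IADPGYMQG", seqs, 8 / 9))   # allow one mismatch in nine residues
```

Lowering the threshold shows how a single substitution changes the reported conservancy of an otherwise shared peptide.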

Prediction and evaluation of protein 3D models

As no experimental structures of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins of any MERS coronavirus isolate were found in the Protein Data Bank (PDB), their 3D structures were predicted with the I-TASSER server [17], which builds protein 3D structures from multiple threading alignments [17]. The quality of the top models returned by I-TASSER was then verified by PROCHECK analysis [18]. The model with the largest number of amino acid residues in the most favoured regions was selected as the best model and used to locate the epitopes with the UCSF Chimera visualization tool [19].
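
The model selection rule amounts to taking the maximum over a per-model quality summary. The sketch below assumes the share of residues in the most favoured Ramachandran regions has already been read from each PROCHECK report; the model names and percentages are placeholders, not results from the study.

```python
def pick_best_model(favoured_percent):
    """Return the model name with the highest share of residues in the most favoured regions."""
    return max(favoured_percent, key=favoured_percent.get)


# Hypothetical PROCHECK summaries for five I-TASSER models (values are placeholders).
favoured = {
    "model1": 78.4,
    "model2": 81.9,
    "model3": 74.2,
    "model4": 80.1,
    "model5": 69.7,
}
print(pick_best_model(favoured))
```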

Results

MERS and bat (HKU4, HKU5) coronaviruses are most conserved in the envelope protein

In the envelope protein, MERS coronaviruses are highly conserved with HKU4 and HKU5 bat coronaviruses (Figs. 1 and 2, respectively) compared with the other proteins (data not shown). The maximum parsimony phylogenetic analysis in MEGA 5.0 shows that the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins of MERS-CoV are related to those of bat (HKU4 and HKU5) coronaviruses (Additional file 2: Figure S1, Additional file 3: Figure S2, Additional file 4: Figure S3 and Additional file 5: Figure S4, respectively).
Fig. 1

Multiple sequence alignment of MERS and HKU4 coronavirus envelope (E) protein: Multiple sequence alignment of 64 MERS-CoV and 5 HKU4 bat coronavirus sequences indicates that they are highly conserved in the envelope (E) protein. Conservation is shown on an 11-level scale, where a yellow bar and a star sign indicate full conservation. Alignment quality is based on the BLOSUM62 substitution matrix score, where yellow indicates good quality. All colors change according to the conservation and alignment quality. Black bars show the consensus sequence. The alignment was visualized with Jalview 2.8 [22] using the Clustalx color scheme

Fig. 2

Multiple sequence alignment of MERS and HKU5 coronavirus envelope (E) protein: Figure legend is as in Fig. 1

Conserved regions of the S, E, M and N proteins are predicted to be antigenic

The MSA-derived conserved regions were used to determine the antigenic sites with the IEDB B cell epitope prediction tools [14]. This analysis identified 3 epitopes from the S protein, 1 from the E protein, 4 from the M protein and 5 from the N protein in the region conserved between HKU4 bat and MERS coronaviruses (Table 1). Similarly, 7 epitopes from the S protein, 1 from the E protein, 4 from the M protein and 5 from the N protein were found in the region conserved between HKU5 bat and MERS coronaviruses (Table 2).
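Table 1

Predicted antigenic sites, their lengths and their conservancy in the region conserved between MERS and HKU4 bat coronaviruses, determined with the IEDB [14] analysis tool

| Protein | Peptide | Length (aa) | Identity (%) |
| --- | --- | --- | --- |
| Spike (S) | LLSGTPPQVY | 10 | 92.54 |
| Spike (S) | IADPGYMQG | 9 | 100.00 |
| Spike (S) | DAVNNNAQ | 8 | 92.54 |
| Envelope (E) | DSKPPLPPDEWV | 12 | 92.75 |
| Membrane (M) | WSFNPE | 6 | 100.00 |
| Membrane (M) | DRLPNEV | 7 | 92.75 |
| Membrane (M) | SYGTNS | 6 | 92.75 |
| Membrane (M) | AGNYRSPPIT | 10 | 92.75 |
| Nucleocapsid (N) | DRKINT | 6 | 100.00 |
| Nucleocapsid (N) | TGPEAAL | 7 | 93.51 |
| Nucleocapsid (N) | LRGPGDLQGN | 10 | 93.51 |
| Nucleocapsid (N) | TEDPRWPQI | 9 | 93.51 |
| Nucleocapsid (N) | HQNNDDHGN | 9 | 93.51 |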

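Table 2

Predicted antigenic sites, their lengths and their conservancy in the region conserved between MERS and HKU5 bat coronaviruses, determined with the IEDB [14] analysis tool

| Protein | Peptide | Length (aa) | Identity (%) |
| --- | --- | --- | --- |
| Spike (S) | SQYSRS | 6 | 92.54 |
| Spike (S) | KSSQSSPIIPGFG | 13 | 92.54 |
| Spike (S) | SISTGSRSARS | 11 | 89.55 |
| Spike (S) | IADPGYMQG | 9 | 100.00 |
| Spike (S) | DAVNNNAQ | 8 | 92.54 |
| Spike (S) | IQSDRK | 6 | 92.54 |
| Spike (S) | LLSGTPPQVY | 10 | 92.54 |
| Envelope (E) | DSKPPLPPDEWV | 12 | 97.25 |
| Membrane (M) | WSFNPE | 6 | 100.00 |
| Membrane (M) | DRLPNEV | 7 | 92.75 |
| Membrane (M) | SYGTNS | 6 | 92.75 |
| Membrane (M) | AGNYRSPPIT | 10 | 92.75 |
| Nucleocapsid (N) | DRKINT | 6 | 100.00 |
| Nucleocapsid (N) | TGPEAAL | 7 | 94.74 |
| Nucleocapsid (N) | LRGPGDLQGN | 10 | 94.74 |
| Nucleocapsid (N) | TEDPRWPQI | 9 | 100.00 |
| Nucleocapsid (N) | HQNNDDHGN | 9 | 94.74 |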

One epitope each of the S, M and N proteins is fully conserved among MERS and bat coronaviruses

The conservancy of all epitopes was determined with the IEDB conservancy analysis tool [16]. Most of the IEDB-predicted epitopes are >90 % conserved among MERS and bat (HKU4, HKU5) coronaviruses (Tables 1, 2). Among these, one epitope each of the S, M and N proteins (IADPGYMQG, WSFNPE and DRKINT, respectively) is 100 % conserved in both comparisons.
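
As a quick check on these counts, the identity values can be tallied directly. The sketch below transcribes the Identity (%) column of Tables 1 and 2 and counts, for each comparison, the epitopes above 90 % and those at 100 %; the variable and function names are illustrative.

```python
# Identity values (%) transcribed from Tables 1 and 2, grouped by protein.
table1 = {
    "S": [92.54, 100.00, 92.54],
    "E": [92.75],
    "M": [100.00, 92.75, 92.75, 92.75],
    "N": [100.00, 93.51, 93.51, 93.51, 93.51],
}
table2 = {
    "S": [92.54, 92.54, 89.55, 100.00, 92.54, 92.54, 92.54],
    "E": [97.25],
    "M": [100.00, 92.75, 92.75, 92.75],
    "N": [100.00, 94.74, 94.74, 100.00, 94.74],
}


def summarize(table, cutoff=90.0):
    """Return (total epitopes, epitopes above cutoff, fully conserved epitopes)."""
    values = [v for identities in table.values() for v in identities]
    above = sum(v > cutoff for v in values)
    full = sum(v == 100.0 for v in values)
    return len(values), above, full


print("HKU4:", summarize(table1))
print("HKU5:", summarize(table2))
```

In the HKU5 comparison one spike epitope (89.55 %) falls just below the 90 % cutoff, which is why the text says "most" rather than "all".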

MERS and bat coronaviruses share common B cell epitopes

Of the IEDB-predicted epitopes of the MERS coronavirus S, E, M and N proteins (Table 3), most are common to MERS and bat coronaviruses. They share approximately 100 % of the E, M and N protein epitopes. For the S protein, HKU5 shares around 70 % of the epitopes with MERS-CoV, while HKU4 shares only 30 % (Fig. 3).
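Table 3

MERS coronavirus spike, envelope, membrane and nucleocapsid protein antigenic sites predicted by IEDB analysis [14]

| Protein | Peptide | Length (aa) |
| --- | --- | --- |
| Spike (S) | GNFSDG | 6 |
| Spike (S) | IQSDRK | 6 |
| Spike (S) | SYTGSSFYAPEPITS | 15 |
| Spike (S) | QYGTDTNSV | 9 |
| Spike (S) | SQYSRS | 6 |
| Spike (S) | KSSQSSPIIPGFG | 13 |
| Spike (S) | SISTGSRSARS | 11 |
| Spike (S) | IADPGYMQG | 9 |
| Spike (S) | DAVNNNAQ | 8 |
| Spike (S) | LLSGTPPQVY | 10 |
| Envelope (E) | DSKPPLPPDEWV | 12 |
| Membrane (M) | WSFNPE | 6 |
| Membrane (M) | DRLPNEV | 7 |
| Membrane (M) | SYGTNS | 6 |
| Membrane (M) | AGNYRSPPIT | 10 |
| Nucleocapsid (N) | DRKINT | 6 |
| Nucleocapsid (N) | TGPEAAL | 7 |
| Nucleocapsid (N) | LRGPGDLQGN | 10 |
| Nucleocapsid (N) | TEDPRWPQI | 9 |
| Nucleocapsid (N) | HQNNDDHGN | 9 |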

Fig. 3

MERS-CoV shares S, E, M and N protein epitopes with HKU4 and HKU5 bat coronaviruses: a MERS-CoV shares more spike protein epitopes with HKU5 bat-CoV than with HKU4 bat-CoV. The Y axis indicates the coronavirus strain and the X axis the epitopes. b MERS-CoV shares an equal number of envelope protein epitopes with HKU4 and HKU5 bat-CoV. c MERS-CoV shares an equal number of membrane protein epitopes with HKU4 and HKU5 bat-CoV. d MERS and bat coronaviruses share an equal number of nucleocapsid protein epitopes
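
The sharing percentages follow directly from the peptide lists in Tables 1 to 3. The short sketch below transcribes those lists and computes, per protein, the share of MERS-CoV epitopes that also appear in each bat coronavirus comparison; the variable and function names are illustrative.

```python
# MERS-CoV epitopes from Table 3.
mers = {
    "S": ["GNFSDG", "IQSDRK", "SYTGSSFYAPEPITS", "QYGTDTNSV", "SQYSRS",
          "KSSQSSPIIPGFG", "SISTGSRSARS", "IADPGYMQG", "DAVNNNAQ", "LLSGTPPQVY"],
    "E": ["DSKPPLPPDEWV"],
    "M": ["WSFNPE", "DRLPNEV", "SYGTNS", "AGNYRSPPIT"],
    "N": ["DRKINT", "TGPEAAL", "LRGPGDLQGN", "TEDPRWPQI", "HQNNDDHGN"],
}
# Epitopes in the MERS/HKU4 conserved regions (Table 1).
hku4 = {
    "S": ["LLSGTPPQVY", "IADPGYMQG", "DAVNNNAQ"],
    "E": ["DSKPPLPPDEWV"],
    "M": ["WSFNPE", "DRLPNEV", "SYGTNS", "AGNYRSPPIT"],
    "N": ["DRKINT", "TGPEAAL", "LRGPGDLQGN", "TEDPRWPQI", "HQNNDDHGN"],
}
# Epitopes in the MERS/HKU5 conserved regions (Table 2).
hku5 = {
    "S": ["SQYSRS", "KSSQSSPIIPGFG", "SISTGSRSARS", "IADPGYMQG",
          "DAVNNNAQ", "IQSDRK", "LLSGTPPQVY"],
    "E": ["DSKPPLPPDEWV"],
    "M": ["WSFNPE", "DRLPNEV", "SYGTNS", "AGNYRSPPIT"],
    "N": ["DRKINT", "TGPEAAL", "LRGPGDLQGN", "TEDPRWPQI", "HQNNDDHGN"],
}


def shared_percent(reference, other):
    """Per protein, the percentage of reference epitopes also present in the other set."""
    return {
        protein: 100.0 * len(set(peptides) & set(other[protein])) / len(peptides)
        for protein, peptides in reference.items()
    }


print("HKU4:", shared_percent(mers, hku4))
print("HKU5:", shared_percent(mers, hku5))
```

Three of the ten MERS-CoV spike epitopes appear in the HKU4 comparison and seven in the HKU5 comparison, giving the 30 % and 70 % figures, while every E, M and N epitope is shared.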

Tertiary structures of the S, E, M and N proteins were predicted and validated in silico

As experimental tertiary structures of the S, E, M and N proteins are not available, we modeled their 3D structures with the I-TASSER server [17] by multiple threading alignments. I-TASSER produced 5 different models (data not shown) for each protein. The quality of all predicted protein models was checked by PROCHECK analysis [18]. The model with the largest number of amino acid residues in the most favoured regions was selected as the best model. Using the UCSF Chimera visualization tool [19], all conserved (>90 %) epitopes were mapped onto the predicted S, E, M and N protein structures (Fig. 4).
Fig. 4

3D structures of the MERS-CoV S, E, M and N proteins: a Spike (S) protein: predicted conserved S protein epitopes are mapped onto the protein 3D structure using the UCSF Chimera [19] visualization tool. Epitopes are shown in red. b Envelope (E) protein: legend as in Fig. 4(a). Epitopes are shown in green. c Membrane (M) protein: legend as in Fig. 4(a). Epitopes are shown in magenta. d Nucleocapsid (N) protein: legend as in Fig. 4(a). Conserved epitopes are shown in orange
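
Before epitopes can be highlighted on a model, each peptide has to be translated into residue positions. A minimal sketch of that lookup is shown below, assuming exact string matching against the modeled sequence; the function name and the short sequence fragment are hypothetical stand-ins for a full protein sequence.

```python
def epitope_ranges(sequence, epitopes):
    """Map each epitope to its 1-based inclusive residue range, or None if absent."""
    ranges = {}
    for pep in epitopes:
        idx = sequence.find(pep)
        ranges[pep] = (idx + 1, idx + len(pep)) if idx >= 0 else None
    return ranges


# Hypothetical fragment standing in for a full protein sequence.
fragment = "MASWSFNPETNDRLPNEVQ"
print(epitope_ranges(fragment, ["WSFNPE", "DRLPNEV", "SYGTNS"]))
```

The resulting ranges can then be used to select and color the corresponding residues in a structure viewer.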

Discussion

Coronaviruses are among the most diverse groups of viruses, and some have emerged as deadly pathogens over time. Most human coronaviruses have a zoonotic origin, and in most cases bats serve as the reservoir for zoonotic viruses [20]. SARS-CoV originated in animals, with horseshoe bats as the natural reservoir and palm civets as the intermediate host allowing animal-to-human transmission. HCoV-229E has structural similarity with bat coronaviruses [21]. Similarly, SARS-CoV, HCoV-229E and HCoV-NL63 originated from bats, but the zoonotic source of MERS-CoV is still not clear [3]. Although MERS-CoV is structurally related to the bat coronaviruses HKU4 and HKU5, no report has described antigenic sites shared among them. To better understand the evolutionary origin of MERS-CoV pathogenicity, we need to know to what extent their immunogenicity is conserved.

To address this pathogenic relationship, we constructed phylogenetic trees and analyzed the relationship between MERS and bat coronaviruses using the spike (S), envelope (E), membrane (M) and nucleocapsid (N) protein sequences. MERS-CoV is phylogenetically related to HKU4 and HKU5 bat-CoV. We also predicted conserved antigenic sites and found that MERS and HKU4 bat coronaviruses share 30 % of S protein epitopes and 100 % of E, M and N protein epitopes, while MERS and HKU5 bat coronaviruses share 70 % of S protein epitopes and 100 % of E, M and N protein epitopes. In most cases the conservation level was >90 %. These findings suggest that, in terms of antigenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. This study shows how closely HKU4 and HKU5 bat-CoV are related to MERS-CoV in terms of pathogenicity. The level of conservation among antigenic sites therefore provides evidence in support of their ancestry of pathogenicity.

Conclusions

This study reveals that MERS and bat coronaviruses share common antigenic sites in their spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins. Most of the shared epitopes have remained over 90 % conserved throughout their evolution. These shared epitopes also show that, in terms of antigenic sites, MERS-CoV is more closely related to HKU5 bat coronaviruses than to HKU4 bat coronaviruses. The conserved antigenic sites strongly support their ancestral relationship.

Declarations

Acknowledgement

This study was supported by the Department of Genetic Engineering and Biotechnology, University of Dhaka.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

(1) Research and Development Department, Incepta Vaccine Ltd.
(2) Department of Genetic Engineering and Biotechnology, University of Dhaka

References

  1. Sharmin R, Islam AB. A highly conserved WDYPKCDRA epitope in the RNA directed RNA polymerase of human coronaviruses can be used as epitope-based universal vaccine design. BMC Bioinformatics. 2014;15:161.
  2. Cheng VC, Lau SK, Woo PC, Yuen KY. Severe acute respiratory syndrome coronavirus as an agent of emerging and reemerging infection. Clin Microbiol Rev. 2007;20(4):660–94.
  3. Huynh J, Li S, Yount B, Smith A, Sturges L, Olsen JC, Nagel J, Johnson JB, Agnihothram S, Gates JE, Frieman MB, Baric RS, Donaldson EF. Evidence supporting a zoonotic origin of human coronavirus strain NL63. J Virol. 2012;86(23):12816–25.
  4. Li W, Wong SK, Li F, Kuhn JH, Huang IC, Choe H, Farzan M. Animal origins of the severe acute respiratory syndrome coronavirus: insight from ACE2-S-protein interactions. J Virol. 2006;80(9):4211–9.
  5. To KK, Hung IF, Chan JF, Yuen KY. From SARS coronavirus to novel animal and human coronaviruses. J Thorac Dis. 2013;5 Suppl 2:S103–8.
  6. Annan A, Baldwin HJ, Corman VM, Klose SM, Owusu M, Nkrumah EE, Badu EK, Anti P, Agbenyega O, Meyer B, Oppong S, Sarkodie YA, Kalko EK, Lina PH, Godlevska EV, Reusken C, Seebens A, Gloza-Rausch F, Vallo P, Tschapka M, Drosten C, Drexler JF. Human betacoronavirus 2c EMC/2012-related viruses in bats, Ghana and Europe. Emerg Infect Dis. 2013;19(3):456–9.
  7. Anthony SJ, Ojeda-Flores R, Rico-Chávez O, Navarrete-Macias I, Zambrana-Torrelio CM, Rostal MK, Epstein JH, Tipps T, Liang E, Sanchez-Leon M, Sotomayor-Bonilla J, Aguirre AA, Ávila-Flores R, Medellín RA, Goldstein T, Suzán G, Daszak P, Lipkin WI. Coronaviruses in bats from Mexico. J Gen Virol. 2013;94(Pt 5):1028–38.
  8. van Boheemen S, de Graaf M, Lauber C, Bestebroer TM, Raj VS, Zaki AM, Osterhaus AD, Haagmans BL, Gorbalenya AE, Snijder EJ, Fouchier RA. Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans. MBio. 2012;20:3(6).
  9. Lau SK, Li KS, Tsang AK, Lam CS, Ahmed S, Chen H, Chan KH, Woo PC, Yuen KY. Genetic characterization of Betacoronavirus lineage C viruses in bats reveals marked sequence divergence in the spike protein of pipistrellus bat coronavirus HKU5 in Japanese pipistrelle: implications for the origin of the novel Middle East respiratory syndrome coronavirus. J Virol. 2013;87(15):8638–50.
  10. Woo PC, Lau SK, Huang Y, Yuen KY. Coronavirus diversity, phylogeny and interspecies jumping. Exp Biol Med (Maywood). 2009;234(10):1117–27.
  11. Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2008;41(Database issue):D36–42.
  12. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22(22):4673–80.
  13. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.
  14. Vita R, Zarebski L, Greenbaum JA, Emami H, Hoof I, Salimi N, Damle R, Sette A, Peters B. The immune epitope database 2.0. Nucleic Acids Res. 2010;38(Database issue):D854–62.
  15. Larsen JE, Lund O, Nielsen M. Improved method for predicting linear B-cell epitopes. Immunome Res. 2006;2:2.
  16. Bui HH, Sidney J, Li W, Fusseder N, Sette A. Development of an epitope conservancy analysis tool to facilitate the design of epitope-based diagnostics and vaccines. BMC Bioinformatics. 2007;8(1):361.
  17. Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008;9:40.
  18. Laskowski RA, MacArthur MW, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst. 1993;26:283–91.
  19. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. UCSF Chimera - a visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–12.
  20. Graham RL, Baric RS. Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission. J Virol. 2010;84(7):3134–46.
  21. Ge XY, Li JL, Yang XL, Chmura AA, Zhu G, Epstein JH, Mazet JK, Hu B, Zhang W, Peng C, Zhang YJ, Luo CM, Tan B, Wang N, Zhu Y, Crameri G, Zhang SY, Wang LF, Daszak P, Shi ZL. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature. 2013;503(7477):535–8.
  22. Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25(9):1189–91.

Copyright

© Sharmin and Islam. 2016
