Document Type: Original Research Article
Authors
- Saeedeh Mohammadi 1, 2
- Esmail Doustkhah 3
- Ahmad Reza Salehi Chaleshtori 4
- Mohammad Esmailpour 2
- Farzad Zamani 5
- Ayoub Esmailpour 1
1 Department of Physics, Shahid Rajaee Teacher Training University, Lavizan, Tehran 16788-15811, Iran
2 Department of Physics, Azarbaijan Shahid Madani University, Tabriz 53714-161, Iran
3 International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan
4 Department of Medical Genetics, Faculty of Medical Sciences, Tarbiat Modares University, P.O. Box 14115-331, Tehran, Iran
5 The Institute of Scientific and Industrial Research (ISIR), Osaka University, Ibaraki-shi, Osaka 567-0047, Japan
Abstract
In this paper, the SARS-CoV-2 spike-encoding gene sequences were analyzed to find the structural homology of S proteins. The structure of the SARS-CoV-2 S protein was obtained by homology modeling, and protein-protein docking was performed to elucidate the active sites of the S protein for ACE2, dipeptidyl peptidase 4 (DPP4), chemokine receptor 5 (CCR5), and AXL. The two crucial binding sites of the S protein, known as RBD and CTD, were investigated. Three-dimensional structures of 8 possible RBD/CTD-receptor complexes were evaluated using molecular dynamics (MD) simulations. The best simulation models of the SARS-CoV-2 S protein active sites with the receptors were obtained for the ACE2 receptor (PDB: 6VW1), providing 99.5% and 98.5% coverage for CTD and RBD, respectively. The SARS-CoV-2 S protein may connect with the ACE2 receptor via the RBD sites of the S protein and the ACE2 peptidase domain (PD), an interaction that can be blocked at the gene-encoded active sites of the S protein, offering an attractive protection approach against this novel virus.
Introduction
In 2019, a highly contagious novel coronavirus appeared in China [1], followed by a mutated variant in England [2], with a high rate of morbidity and mortality [3,4]. Owing to the absence of efficient antiviral medication against SARS-CoV-2, the pandemic soon became a global catastrophic risk to human civilization [5,6]. By 12 March 2021, over 118 million confirmed cases of the infection, including approximately 2.6 million deaths, had been reported globally [7].
SARS-CoV-2 has a positive-sense RNA genome [8,9]. The spike (S) protein plays a key role in facilitating viral entry into the cell [10,11]. This protein is generally cleaved by host peptidases into two functional subunits, S1 and S2 [12]. The S1 subunit includes an N-terminal domain (NTD) and a C-terminal domain (CTD), and attachment to the host angiotensin-converting enzyme 2 (ACE2) receptor occurs via the CTD and the receptor-binding domain (RBD) [13]. ACE2 has a protective role in regulating the renin-angiotensin system (RAS), an endocrine system that helps prevent lung injury in humans by controlling blood pressure and electrolyte balance [14,15]. ACE2 expression is known to be reduced by blocking the renin-angiotensin pathway [16]. Given the importance of the S protein interactions with host receptors, therapeutic approaches that inhibit binding of the protein to its target receptors are a promising way to block virus entry into human cells. Recently, we suggested nicotine and caffeine as possible drugs that bind to and block the S protein [17].
Although there are many ongoing reports on inhibiting SARS-CoV-2, the interaction between the active sites of the virus, e.g., RBD and CTD, and target cellular receptors such as ACE2, dipeptidyl peptidase 4 (DPP4), chemokine receptor 5 (CCR5), and AXL has not yet been investigated through homology modeling. To pursue our investigation, bioinformatics analysis of the S proteins encoded by SARS-CoV-2 genes was carried out, and the results were compared with SARS-CoV spike proteins. Homology modeling was also performed to estimate all possible protein structures, and the interactions of RBD and CTD with the ACE2, DPP4, CCR5, and AXL receptors were explored.
Method
To explore inhibition of the S protein, the binding of the encoded S protein to cellular receptors, i.e., ACE2 [18], DPP4 [19], CCR5 [20-22], and AXL [23], was examined and compared with the complex of the SARS-CoV S protein with ACE2. Further, we evaluated the possible interactions and the potential for blocking the binding of the encoded S protein to these cellular receptors.
Homology modeling of Spike protein
All genome sequences of the SARS-CoV-2 S protein (NC_045512.2) were selected from the NCBI database. The nucleotide sequences were aligned with BLASTn to investigate the homology of SARS-CoV-2 genomes. Through this homology search, we found 32 sequences in GenBank.
The RBD and CTD structures were selected as active sites for the interaction with cellular receptors. The nucleotide sequences were edited with the BioEdit program v7.0.5 [24] and then aligned with ClustalW [25]. The evolutionary history was inferred using the Neighbor-Joining method in MEGA X [26]. The phylogenetic tree was generated with the JTT evolutionary model [27]. Homology modeling was performed with SWISS-MODEL [28] based on the highest sequence identity and coverage, and each model was superimposed on its template. The root-mean-square deviation (RMSD) was measured in Å with Swiss-PdbViewer 4.1.0 [29]. The generated structures were analyzed using UCSF Chimera [30,31].
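As an illustration of the RMSD measure used to compare a model with its template, the following minimal sketch computes the RMSD between two sets of already superimposed coordinates. It is not the Swiss-PdbViewer implementation; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.

    The result has the same unit as the input coordinates (e.g. Å).
    No superposition is performed; the inputs are assumed to be
    already superimposed and listed in matching atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates in Å (illustration only)
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(model, template))
```

In this toy case every atom is displaced by the same distance along one axis, so the RMSD equals that displacement.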
Molecular docking
Molecular docking has been shown to be an accurate computational approach for investigating the binding of cellular receptors [32]. According to previous studies, the DPP4 [19], CCR5 [20-22], and AXL [23] receptors serve as the key receptors for MERS-CoV, human immunodeficiency virus (HIV), and Zika virus, respectively. Considering the importance of the RBD and CTD active pockets of SARS-CoV-2 in virus infection [5], the interface binding of these active domains with the ACE2, DPP4, CCR5, and AXL cellular receptors was evaluated by molecular docking using the HDOCK software [33]. The crystal structures were obtained from the PDB for DPP4 (4L72), ACE2 (1R42), CCR5 (3ODU), and AXL (2C5D). Water molecules, ions, and ligands were removed from the reported crystal structures before protein-protein docking, and hydrogen atoms were added so that hydrogen bonds could be analyzed. The grid size of the system was 72×76×62, and the distance between two grid points was set to 0.375 Å. The protein-protein docking was performed using a hybrid algorithm of ab initio free docking, and the results were then analyzed with the PRODIGY software [34]. Finally, the results were clustered and assessed based on both the binding interaction energies and the residues involved in the complexes.
Molecular dynamics
To investigate the binding stability of the SARS-CoV-2 S protein active domains (RBD and CTD) with the receptors, molecular dynamics (MD) simulations were performed in an explicit solvent model with GROMACS using the GROMOS96 43a1 force field parameters [35,36]. The complexes of RBD and CTD with the receptors were placed in a water box [37] with the simple point charge (SPC) water model [38] under periodic boundary conditions, and NaCl (0.154 M) was added to the protein-water system to make it electrostatically neutral. The size of the system was 10.70×8.65×6.61. Energy minimization of the structures was first performed using the steepest descent method [39] with a convergence criterion of 1000. To equilibrate the solvent, NaCl, and the S protein-receptor complex, NVT and NPT ensemble simulations were performed for 100 ps (50000 steps with a time step of 2 fs) at 300 K using the Berendsen weak coupling method. For the NVT and NPT relaxation and the energy minimization calculations, short-range interactions were truncated at a cutoff of 1.0 nm, and long-range electrostatics were treated using the particle mesh Ewald (PME) formalism [40]. The pressure was set to 1 bar using the Parrinello-Rahman barostat. Bonds involving hydrogen atoms were constrained using the LINCS algorithm [41,42]. The MD simulation results were finally analyzed with GROMACS and Chimera [43]. A system is considered to have reached equilibrium when the fluctuations of parameters such as energy and temperature fall within a range of 5%. The equilibration of the system is shown in Figure 1, in which panels a and b show the temperature and density profiles versus simulation time for SARS-CoV-2 RBD, SARS-CoV-2 CTD, and SARS-CoV RBD in complex with the ACE2 receptor.
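To give a feel for the docking grid described above, the following sketch converts the reported grid counts and spacing into an approximate box extent in Å. It assumes that the three counts can be treated as the number of grid intervals along each axis, which the text does not state; the function name `grid_extent` is hypothetical.

```python
GRID_COUNTS = (72, 76, 62)   # grid size reported in the text
SPACING_ANGSTROM = 0.375     # grid spacing reported in the text


def grid_extent(counts, spacing):
    """Approximate box edge lengths, assuming `counts` are interval counts."""
    return tuple(n * spacing for n in counts)


extent = grid_extent(GRID_COUNTS, SPACING_ANGSTROM)
print("Approximate grid extent (Å):", extent)
```

Under this assumption the grid spans roughly 27 × 28.5 × 23 Å, so it would cover an interface region rather than a whole receptor.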
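The equilibration settings quoted above can be cross-checked with simple arithmetic: a 100 ps run with a 2 fs time step corresponds to the stated 50000 integration steps. The sketch below performs this check; the helper name `n_steps` is hypothetical.

```python
def n_steps(duration_ps, timestep_fs):
    """Number of integration steps for a run of `duration_ps` picoseconds."""
    # 1 ps = 1000 fs
    return round(duration_ps * 1000.0 / timestep_fs)


steps = n_steps(duration_ps=100, timestep_fs=2)
print(steps)
```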
FIGURE 1 Temperature and density versus simulation time for SARS-CoV-2 RBD, SARS-CoV-2 CTD, and SARS-CoV RBD interacting with the ACE2 receptor
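The 5% equilibrium criterion mentioned in the Method can be expressed as a simple check on a time series. The sketch below uses one possible reading of that criterion, namely that every sample lies within ±5% of the series mean; the exact definition used by the authors is not given, and the function name `is_equilibrated` and the sample temperatures are hypothetical.

```python
def is_equilibrated(series, tolerance=0.05):
    """Return True if every value lies within `tolerance` of the series mean.

    This is one interpretation of "fluctuations within 5%"; other
    definitions (e.g. standard deviation over mean) are equally possible.
    """
    if not series:
        raise ValueError("series must not be empty")
    mean = sum(series) / len(series)
    if mean == 0:
        raise ValueError("relative fluctuation is undefined for a zero mean")
    return all(abs(x - mean) <= tolerance * abs(mean) for x in series)


# Hypothetical temperature samples in K (not simulation output)
temperature_K = [298.7, 301.2, 299.5, 300.9, 300.1]
print(is_equilibrated(temperature_K))
```

The same check could be applied to energy or density traces exported from the simulation.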
Results and discussion
Herein, we present in silico analyses of the binding affinities of the SARS-CoV-2 S protein active domains (RBD and CTD) to the cellular receptors (ACE2, DPP4, CCR5 and AXL), highlighting SARS-CoV-2 CTD-ACE2 and SARS-CoV-2 RBD-ACE2 complexes as interesting candidates for experimental evaluations.
Homology modeling of encoded SARS-CoV-2 proteins
Knowledge of a protein's 3D structure is a basic requirement for exploring its function. Homology modeling is the most common in silico procedure for predicting the 3D structure of proteins from structural templates [44].
We obtained homologous genomes based on the genome sequence of SARS-CoV-2/WHU02, which was aligned using the BLASTn online tool. The phylogenetic analysis then produced a tree of 50 coronavirus S proteins, as shown in Figure 2a. We found that three S proteins (QIZ13861.1, QHU79173.2, and QJD20632.1; 97%) had the highest genome homology to the SARS-CoV-2 S protein. According to a recent study on SARS-CoV-2, SARS-CoV (80%) showed the highest homology to SARS-CoV-2 [45]. The SARS-CoV-2 RBD and CTD are the most active sites according to the previous literature [46,47]. In this study, we compared the aligned sequences of the RBD and CTD in the S protein of SARS-CoV-2 with the RBD of SARS-CoV (Figure 2b). The sequence alignment revealed 70% homology of the SARS-CoV-2 RBD and CTD sequences with SARS-CoV. We then aligned the S protein RBD and CTD sequences from SARS-CoV-2 against the PDB database.
FIGURE 2 The phylogenetic tree analysis and sequence alignment of SARS-CoV-2 S protein. (a) The homology tree from the alignment of 50 nucleotide sequences. (b) Comparison of the aligned sequences of the S protein RBD/CTD from SARS-CoV-2 and SARS-CoV
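Homology percentages such as those quoted above are typically derived from pairwise comparison of aligned sequences. The sketch below shows one common definition of percent identity, counting matches over the aligned positions where neither sequence has a gap. Alignment tools differ in how they treat gaps, so this is an illustration rather than the procedure used in the study; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real spike sequences)
seq_a = "ACDEFGHIKL"
seq_b = "ACDEYGH-KL"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```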
Structural analyses of RBD- and CTD-receptors complexes
Homology models of the SARS-CoV-2 S protein RBD and CTD were obtained. Then, the crystal structure of the S protein RBD of SARS-CoV and the homology models of the SARS-CoV-2 RBD and CTD were docked with different receptors, namely ACE2, DPP4, CCR5, and AXL. The structural representation of the interaction of the SARS-CoV-2 S protein RBD and CTD with the ACE2, DPP4, CCR5, and AXL receptors is shown in Figures 3 and 4. The molecular docking results show that the best model was obtained using the crystal structure of the S protein-ACE2 complex (PDB: 6VW1) as a template, with 99.5% and 98.5% coverage for CTD and RBD, respectively. This crystal structure was resolved by X-ray diffraction at a resolution of 2.68 Å [48]. For SARS-CoV-2 S protein-DPP4, we selected the crystal structure PDB: 4KR0, with 76.7% and 76.3% coverage for CTD and RBD, respectively; the 4KR0 structure was resolved by X-ray diffraction at a resolution of 2.70 Å [49]. The best models selected for the RBD/CTD-CCR5 complex are 5DSG (receptor) and 6VW1 (RBD/CTD), with 99.5% and 98.5% coverage for CTD and RBD, respectively. The crystal structure PDB: 6XC3 is the best model for the RBD/CTD-AXL complex, with 87.6% and 87.3% coverage for CTD and RBD, respectively.
Molecular docking
In this section, molecular docking analyses of the binding of RBD and CTD to the cellular receptors are presented. The binding energy arises from the energies contributed by the residues at the interface of the target proteins. The residue energies are due to interactions such as hydrogen bonding and π-π stacking [50]. The human ACE2 receptor is a protein with two main domains, NPD and CTD [13]. Therefore, the binding of ACE2 and the S protein involves the active sites of the S protein (RBD and CTD) and the ACE2 peptidase domain (PD). There are short loop-interface interactions that may act as a bridge in binding the S protein and ACE2.
FIGURE 3 Three-dimensional schematic of SARS-CoV-2 CTD with the receptors. (a) Complex of ACE2 and SARS-CoV-2 CTD, where ACE2 and CTD are shown in green and pink, respectively. (b) Complex of ACE2 and SARS-CoV, where ACE2 and SARS-CoV are shown in orange and blue, respectively. (c) Complex of DPP4 and SARS-CoV-2 CTD, where DPP4 and CTD are shown in green and pale blue, respectively. (d) Complex of CCR5 and SARS-CoV-2 CTD, where CCR5 and CTD are shown in blue and green, respectively. (e) Complex of AXL and SARS-CoV-2 CTD, where AXL and CTD are shown in orange and green, respectively
FIGURE 4 Three-dimensional view of SARS-CoV-2 RBD with the receptors. (a) Complex of ACE2 and SARS-CoV-2 RBD, where ACE2 and RBD are shown in green and pink, respectively. (b) Complex of ACE2 and SARS-CoV, where ACE2 and SARS-CoV are shown in orange and blue, respectively. (c) Complex of DPP4 and SARS-CoV-2 RBD, where DPP4 and RBD are shown in green and pale blue, respectively. (d) Complex of CCR5 and SARS-CoV-2 RBD, where CCR5 and RBD are shown in blue and green, respectively. (e) Complex of AXL and SARS-CoV-2 RBD, where AXL and RBD are shown in orange and green, respectively
We found that SARS-CoV-2 CTD-ACE2 (-15.3) may be helpful for viral infection treatment. Similarly, SARS-CoV-2 RBD showed the second-lowest binding affinity (-12.5) for the SARS-CoV-2 S protein, compared with SARS-CoV (-11.4), as shown in Figure 5a. The stability of the system was assessed using the equilibrium dissociation constant, $K_d = [L][P]/[PL]$, in which $[L]$, $[P]$, and $[PL]$ are the equilibrium concentrations of ligand, protein, and protein-ligand complex, respectively [51], as illustrated in Figure 5b. The dissociation constant plays a key role in designing adequate models for systems biology and in ranking pharmaceuticals in drug delivery.
Our results are consistent with the previous prediction for protein-protein docking of the SARS-CoV-2 spike RBD and ACE2 [52]. The contacting residues in the binding of SARS-CoV-2 spike protein CTD-ACE2 and SARS-CoV-ACE2 are listed in Table 1. Residues such as Asn25, Asp46, Asn16, and Pro209 are observed to act as a salt bridge between SARS-CoV-2 CTD and ACE2. For SARS-CoV, by contrast, different residues are involved in binding ACE2, e.g., Gln175, Gln180, Tyr131, Thr182, Phe168, Asn183, and Tyr171.
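The binding free energy and the dissociation constant are linked by the standard thermodynamic relation ΔG = RT ln K_d (with a 1 M standard state). The sketch below applies this relation to the three values quoted in the text. The energy unit is missing from the source, so kcal/mol is assumed here, as is a temperature of 25 °C; both are assumptions and not values confirmed by the study, and the function name `kd_from_dg` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol, temperature_K=298.15):
    """Dissociation constant (M) from binding free energy via dG = RT ln(Kd)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_K))


# Values quoted in the text; the kcal/mol unit is an assumption
binding_energies = [
    ("SARS-CoV-2 CTD", -15.3),
    ("SARS-CoV-2 RBD", -12.5),
    ("SARS-CoV", -11.4),
]
for label, dg in binding_energies:
    print(f"{label}: Kd ~ {kd_from_dg(dg):.2e} M")
```

A more negative ΔG gives a smaller K_d, so under these assumptions the ranking by K_d follows the ranking by binding energy.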
FIGURE 5 Binding affinity (ΔG) and dissociation constant (Kd) for the binding between SARS-CoV-2 RBD/CTD and the receptors
The microenvironment of the ACE2 receptor allows the amino acids in the active site to interact efficiently and produce a significant number of stabilizing electrostatic interactions. Meanwhile, there are short loops in the active interaction sites of the S protein with the cellular receptor, and these loops stabilize the complex. We found these loops in our structures, though they were weak (Table 1). These results are due to the presence of hydrogen bonding between the proteins [53].
The hydrophobicity and shape complementarity of the receptors and RBD/CTD play a crucial role in inhibiting the S protein [54]. As a result, these loops and mutations can play a key role in blocking the interface between the S protein and the receptors (see Table 1). The details of the interface binding of RBD/CTD to the receptors, i.e., ACE2, DPP4, CCR5, and AXL, were evaluated (Figures 6 and 7). The binding of the cellular receptors to the active residues of CTD is shown in Figure 6, and the contact residues of the RBD-receptor interface are shown in Figure 7.