Coronavirus SARS-CoV-2 Proteases

Coronaviruses (CoVs) are enveloped, positive-sense RNA viruses. CoVs have a large genome, usually 25 to 32 kb, and can infect a variety of species, including animals and humans. CoV infections are usually mild unless the infected person is immunocompromised. However, the newly emerged CoVs are highly pathogenic. SARS-CoV caused the global outbreak of severe acute respiratory syndrome (SARS) in 2002 to 2003. MERS-CoV caused another outbreak, Middle East respiratory syndrome (MERS), in 2012. The most recent outbreak, caused by SARS-CoV-2 (the virus responsible for COVID-19), exploded into a worldwide pandemic. Figure 1 illustrates the genomic organization of SARS-CoV-2.

Figure 1: Genomic organization of the coronavirus SARS-CoV-2 (COVID-19).

SARS-CoV-2 binds to angiotensin-converting enzyme 2 (ACE2) via its spike (S) protein, allowing the virus to enter and infect cells. However, for complete entry into a cell, the spike protein needs to be primed by a protease, an enzyme that breaks down proteins and peptides into smaller fragments. The transmembrane protease serine 2 (TMPRSS2) completes this process. Two independent mechanisms have been proposed for how TMPRSS2 facilitates viral entry: (i) proteolytic cleavage of ACE2, which is thought to promote viral uptake, and (ii) cleavage of the coronavirus spike glycoprotein, which activates the glycoprotein for cathepsin L-independent host cell entry. TMPRSS2 proteolytically cleaves and activates the glycoproteins of several viruses, including the spike glycoproteins of human coronavirus 229E (HCoV-229E) and human coronavirus EMC (HCoV-EMC), and the fusion glycoproteins F0 of Sendai virus (SeV), human metapneumovirus (HMPV), and human parainfluenza 1, 2, 3, 4a and 4b viruses (HPIV). TMPRSS2 is also essential for the spread and pathogenesis of influenza A virus (strains H1N1, H3N2 and H7N9), where it is involved in the proteolytic cleavage and activation of the hemagglutinin (HA) protein, which is essential for viral infectivity.

The binding efficiency of the S protein to ACE2 in SARS-CoV-2 is 10- to 20-fold higher than that of SARS-CoV, as indicated by the cryo-EM structure of the SARS-CoV-2 spike in the prefusion conformation. For SARS-CoV, the cleavage of the trimeric S protein is triggered by the cell surface-associated transmembrane protease serine 2 (TMPRSS2) and cathepsin. However, it is still unclear which molecules facilitate membrane invagination for SARS-CoV-2 endocytosis to occur.

Typical coronaviruses contain at least six open reading frames (ORFs) in their genome. Usually, the first ORFs (ORF1a/b), which make up approximately two-thirds of the whole genome length, encode 16 non-structural proteins (nsps; nsp1-16). A ribosomal frameshift between ORF1a and ORF1b produces two polypeptides: pp1a and pp1ab.
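The size relationship described above can be turned into a quick back-of-the-envelope estimate. The following is a minimal sketch in Python; the function name and the example genome size of 29.9 kb are assumptions made for illustration and do not come from the article, and the two-thirds fraction is only the approximation quoted in the text.

```python
# Rough estimate of the ORF1a/b length from the total genome length,
# using the "approximately two-thirds" rule of thumb quoted in the text.
# The function name and the example genome sizes are illustrative assumptions.

ORF1AB_FRACTION = 2.0 / 3.0  # approximation from the text, not an exact value


def estimate_orf1ab_kb(genome_kb):
    """Return the approximate ORF1a/b length in kb for a genome of genome_kb."""
    if genome_kb <= 0:
        raise ValueError("genome length must be positive")
    return genome_kb * ORF1AB_FRACTION


if __name__ == "__main__":
    # 25 and 32 kb are the bounds given in the text; 29.9 kb is an assumed example.
    for size in (25.0, 29.9, 32.0):
        print(f"{size:.1f} kb genome: ORF1a/b is roughly {estimate_orf1ab_kb(size):.1f} kb")
```

For the 25 to 32 kb range given earlier, this puts ORF1a/b at roughly 17 to 21 kb, which leaves about one-third of the genome for the structural and accessory genes.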

Two CoV proteases are instrumental for processing the polyprotein: the papain-like protease (PLpro; nsp3), which cleaves the junctions between nsp1 and nsp4, and the main chymotrypsin-like protease (Mpro; 3CLpro, nsp5), which cleaves the junctions between nsp4 and nsp11/16. These proteases are essential for virus replication; hence, they have been extensively investigated regarding the interplay of their structure and function, as well as their suitability as drug targets [Krichel et al. 2020].
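To illustrate what polyprotein processing by a protease means in practice, the sketch below scans a protein sequence for candidate Mpro cleavage sites and splits it into fragments. It rests on an assumption that is not stated in the article: the commonly cited Mpro recognition pattern of Leu-Gln followed by Ser, Ala, or Gly, with cleavage after the Gln. The function names and the toy sequence are hypothetical, and a motif match alone does not prove that a site is cleaved in the real polyprotein.

```python
import re

# Assumed consensus (not from the article): Mpro cleaves after Leu-Gln (LQ)
# when the next residue is Ser, Ala, or Gly. The lookahead keeps that
# residue out of the match so it stays with the downstream fragment.
MPRO_MOTIF = re.compile(r"LQ(?=[SAG])")


def candidate_mpro_sites(protein):
    """Return 0-based cut indices; a cut at i splits protein[:i] from protein[i:]."""
    return [match.end() for match in MPRO_MOTIF.finditer(protein.upper())]


def cleave(protein, sites):
    """Split the sequence at the given cut indices, in order."""
    fragments = []
    start = 0
    for site in sorted(sites):
        fragments.append(protein[start:site])
        start = site
    fragments.append(protein[start:])
    return fragments


if __name__ == "__main__":
    # Toy sequence invented for this example; it is not a real viral polyprotein.
    toy = "MKTAVLQSGFRKTTLQAEELQPWW"
    sites = candidate_mpro_sites(toy)
    print(sites)
    print(cleave(toy, sites))
```

In the toy sequence, the third Leu-Gln pair is followed by proline, so it is not reported as a candidate site under the assumed pattern.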

All structural and accessory proteins are translated from the subgenomic RNAs (sgRNAs). ORFs in the one-third of the genome near the 3’-end encode the four main structural proteins: spike (S), membrane (M), envelope (E), and nucleocapsid (N). Furthermore, different CoVs encode special structural and accessory proteins, such as the HE protein, 3a/b protein, and 4a/b protein, which are responsible for several essential functions in genome maintenance and virus replication.

SARS-CoV-2 papain-like protease

The papain-like protease (PLpro) is one of two cysteine proteases encoded by coronavirus genomes. PLpro proteolytically processes the virus polyproteins. It also shows significant deubiquitinating and deISGylating activities in vitro. However, the exact mechanism behind these activities is still unclear.