Preprint
Brief Report

SARS-CoV-2 and the CGG-CGG Furin Site Genetic Fingerprint: Five Years Later

This version is not peer-reviewed.

Submitted: 05 March 2025

Posted: 06 March 2025

Abstract
The key evolutionary step leading to the pandemic virus was the acquisition of the furin cleavage motif at the S1/S2 junction of the S protein. This insertion conferred a gain of function on SARS-CoV-2: the viral S protein became a substrate for human furin. The corresponding 12-nucleotide fragment inserted into the S gene of a SARS-CoV-2 precursor included the CGG-CGG genetic fingerprint, which codes for the arginine pair of the furin site. The arginine codon CGG was (and still is) rare in the virus, and two consecutive CGG codons are rarer still. A probable human origin of that motif was subsequently proposed (BMC Genomic Data 24:71, 2023). Synonymous base substitutions (arginine codon usage bias) at the CGG-CGG fingerprint were one line of evidence supporting that hypothesis. Based on 2025 SARS-CoV-2 isolates, the aim of this work is to follow the evolution of the code for the furin site arginine pair. A total of 17,506 complete SARS-CoV-2 genomes, with collection dates from January 1, 2025 to February 18, 2025, were downloaded from the GISAID database. The S gene sequences were retrieved using Perl programs. Of 15,390 S-protein sequences, 62 (0.4028%) showed arginine codon usage bias at the S gene CGG-CGG fingerprint. The SARS-CoV-2 lineage distribution of the 2025 sample is shown. The XEC (44.5%) and KP.3.1.1 (13.8%) lineages were the most frequent. Lineage KP.3.1.1 was also the most frequent in the CGG-CGG codon usage bias analyses, grouped into two main population groups, originating from Japan and Canada. In the 2025 working sample, 125 out of 1,620 (7.71%) Japan and 47 out of 4,793 (0.98%) Canada (Ontario) KP.3.1.1 isolates showed CGG-CGG optimization. These results agree with previous studies: although the percentage (probability) of arginine codon optimization at the SARS-CoV-2 S gene furin site appears low in large samples, it increases significantly when focusing on specific lineages or population groups.
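The screening step summarized in the abstract can be sketched as follows. This is a minimal Python illustration, not the authors' Perl programs. It assumes each S gene sequence is in frame from its first nucleotide and that the first RRAR in the translation marks the furin site. The function names (`translate`, `furin_pair_codons`, `is_synonymous_variant`) and the two toy fragments are hypothetical and were chosen only for this example.

```python
from itertools import product

# Standard genetic code, built in TCAG order (a common compact encoding).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# The six synonymous arginine codons.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def translate(seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def furin_pair_codons(s_gene, motif="RRAR"):
    """Return the two codons encoding the arginine pair, or None if the
    motif is absent. Assumption: the first match is the furin site."""
    idx = translate(s_gene).find(motif)
    if idx < 0:
        return None
    start = 3 * idx
    seq = s_gene.upper()
    return seq[start:start + 3], seq[start + 3:start + 6]


def is_synonymous_variant(pair):
    """True if both codons still encode arginine but the pair is not CGG-CGG."""
    if pair is None:
        return False
    return pair != ("CGG", "CGG") and all(c in ARG_CODONS for c in pair)


# Toy in-frame fragments around the site (T N S P R R A R S).
reference_like = "ACTAATTCTCCTCGGCGGGCACGTAGT"
made_up_variant = "ACTAATTCTCCTCGGCGTGCACGTAGT"  # second CGG changed to CGT

for fragment in (reference_like, made_up_variant):
    pair = furin_pair_codons(fragment)
    print(pair, is_synonymous_variant(pair))
```

Locating the site by motif rather than by a fixed nucleotide offset keeps the sketch usable when upstream insertions or deletions shift the coordinates, at the cost of missing sequences in which the RRAR motif itself is mutated.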
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2025 MDPI (Basel, Switzerland) unless otherwise stated