Preprint Article, Version 1. This version is not peer-reviewed.

BacSPaD: A Robust Bacterial Strains’ Pathogenicity Resource Based on Integrated and Curated Genomic Metadata

Version 1 : Received: 10 July 2024 / Approved: 10 July 2024 / Online: 11 July 2024 (00:01:42 CEST)

How to cite: Ribeiro, S.; Chaumet, G.; Alves, K.; Nourikyan, J.; Shi, L.; Lavergne, J.-P.; Mijakovic, I.; de Bernard, S.; Buffat, L. BacSPaD: A Robust Bacterial Strains’ Pathogenicity Resource Based on Integrated and Curated Genomic Metadata. Preprints 2024, 2024070837. https://doi.org/10.20944/preprints202407.0837.v1

Abstract

The vast array of omics data in microbiology presents significant opportunities for studying bacterial pathogenesis and for creating computational tools that predict pathogenic potential. However, the field lacks a comprehensive, curated resource that catalogs bacterial strains and their ability to cause human infections. Current methods for identifying pathogenicity determinants often introduce biases and miss critical aspects of bacterial pathogenesis. In response to this gap, we introduce BacSPaD (Bacterial Strains’ Pathogenicity Database), a thoroughly curated database focusing on pathogenicity annotations for a wide range of high-quality, complete bacterial genomes. Our rule-based annotation workflow combines metadata from trusted sources with automated keyword matching, extensive manual curation, and detailed literature review. Our analysis classified 5,502 genomes as pathogenic to humans (HP) and 490 as non-pathogenic to humans (NHP), encompassing 532 species, 193 genera, and 96 families. Statistical analysis demonstrated a significant but moderate correlation between virulence factors and HP classification, highlighting the complexity of bacterial pathogenicity and the need for ongoing research. This resource is poised to enhance our understanding of bacterial pathogenicity mechanisms and to aid the development of predictive models. To improve accessibility and provide key summary statistics and visualizations, we developed a user-friendly web interface, accessible at https://bacspad.altrabio.com.
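To illustrate the automated keyword-matching step of a rule-based annotation workflow such as the one described above, the following is a minimal sketch only. The keyword lists, the metadata field names, and the function `classify_metadata` are hypothetical and are not taken from BacSPaD; the actual rules are not given in this abstract. Records with conflicting or missing evidence are routed to manual review, mirroring the role that manual curation plays alongside keyword matching.

```python
import re

# Hypothetical keyword lists for illustration; not the BacSPaD rule set.
HP_KEYWORDS = ("infection", "sepsis", "abscess", "pneumonia", "bacteremia")
NHP_KEYWORDS = ("healthy", "commensal", "probiotic")


def _has_keyword(text, keywords):
    """Return True if any keyword occurs as a whole word (case-insensitive)."""
    return any(
        re.search(rf"\b{re.escape(k)}\b", text, re.IGNORECASE) for k in keywords
    )


def classify_metadata(record):
    """Label a genome metadata record as 'HP', 'NHP', or 'review'.

    `record` is a dict of free-text metadata fields (field names are assumed).
    """
    # Concatenate all non-empty field values into one searchable string.
    text = " ".join(str(v) for v in record.values() if v)
    hp = _has_keyword(text, HP_KEYWORDS)
    nhp = _has_keyword(text, NHP_KEYWORDS)
    if hp and not nhp:
        return "HP"
    if nhp and not hp:
        return "NHP"
    # Conflicting or absent evidence is deferred to manual curation.
    return "review"


if __name__ == "__main__":
    print(classify_metadata({"host": "Homo sapiens", "disease": "Pneumonia"}))
    print(classify_metadata({"host_health": "healthy", "isolation_source": "feces"}))
    print(classify_metadata({"host": "Homo sapiens"}))
```

Whole-word matching is used here so that, for example, "unhealthy" does not trigger the "healthy" rule; a production rule set would likely need negation handling and field-specific rules as well.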

Keywords

Bacterial pathogenicity; Genomic metadata analysis; Bioinformatics; Microbiology research; Public health surveillance

Subject

Biology and Life Sciences, Immunology and Microbiology
