1. Introduction
Microbial contamination is one of the main problems in the pharmaceutical industry. It is of particular concern during the manufacture of thermosensitive sterile products such as vaccines and other immunobiologicals, as these cannot be subjected to terminal sterilization. Consequently, these products need to be produced aseptically according to strict quality assurance requirements [
1,
2]. In 2022, the European Medicines Agency established the Contamination Control Strategy (CCS) approach. This aims to identify microorganisms isolated from clean areas and to evaluate the risk of their presence in the production environment. Subsequently, appropriate measures can be put in place to eliminate high-risk contamination and ensure safe, high-quality products [
3]. To achieve this goal, the identification of microorganisms isolated in pharmaceutical production plants is essential [
4,
5].
Various methodologies can be employed for microbial identification in the pharmaceutical industry, ranging from biochemical methods to genomics. However, these are laborious and expensive to implement [
1,
2,
6]. Although biochemical methods are cheaper and easier to perform than DNA sequence-based methods, their databases are often limited to the more frequently isolated species. In addition, microorganisms isolated from nutrient-limited clean production areas, which are subject to the frequent use of disinfectants, may not express their standard characteristics in biochemical tests, resulting in inaccurate or even incorrect results [
1,
4,
7]. Previous studies have shown that matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) is a significantly faster and more accurate identification system than conventional biochemical methods. The method is based on the extraction of proteins from whole microbial cells. Although MALDI-TOF MS databases have improved over the years, some environmental species cannot be identified. This limitation applies to
Bacillus and related genera, which are among the main bacteria isolated in pharmaceutical industries [
1,
4,
7,
8]. In these cases, DNA sequencing of the 16S rRNA gene and other housekeeping genes like
rpoB (encoding the beta subunit of RNA polymerase) and
gyrB (encoding the beta subunit of DNA gyrase) can be performed [
1,
9,
10].
Until 2020, the
Bacillus genus was composed of more than 280 species. However, studies by Gupta et al. [
11] and Patel and Gupta [
12] demonstrated how taxonomically and phylogenetically varied the genus was. Consequently, they proposed the reclassification of more than a hundred species into new genera [
11,
12]. Currently, the genus
Bacillus comprises 111 species with validly published and correct names [
13].
Pharmaceutical industry facilities are potential environments for the discovery of new microbial species, which have not been well studied and, for commercial reasons, may not have been publicized [
1,
7,
8]. In two previous studies, Costa et al. [
1,
8] characterized 97
Bacillus and related genera strains isolated from an immunobiological pharmaceutical facility by MALDI-TOF MS and full-length 16S rRNA gene sequencing. These earlier studies reported the potential isolation of new bacterial species; however, further investigation was required before their formal recognition could be proposed.
The aim of this study was to characterize taxonomically the novel
Bacillus strain B190/17 and to introduce its spectrum into two MALDI-TOF MS system databases. The strain was first analyzed by Costa et al. [
8] and is proposed, in this study, as
Bacillus lumedeirus sp. nov., with the type strain B190/17
T (= CBAS1225
T=CCGBXXX
T). In this study, the genome sequence of B190/17 was analyzed to determine genomic and phenotypic features of the new species.
3. Results and Discussion
Costa et al. [
1,
8] previously isolated the strain B190/17 from an air monitoring sample at an immunobiological production facility and proposed that it could be designated as a new bacterial species. B190/17 was not identified by VITEK
® 2 [
8], but was given the bionumber 0303101000000000 for its biochemical profile. It was positive only for the following biochemical tests: alanine arylamidase, Ala-Phe-Pro arylamidase, ELLMAN, leucine arylamidase, phenylalanine arylamidase and tyrosine arylamidase. The strain was negative for all the biochemical tests of API
® 50 CH and was not motile. The VITEK
® 2 biochemical test results are presented in
Table 1. A comparison of phenotypic characteristics of B190/17 with reference strains is given in
Table 2. Biochemical methods are not suitable for a reliable identification of
Bacillus and related genera species.
The BCL card of VITEK
® 2, which has a limited and outdated database, claims to identify 21
Bacillus species [
4,
7,
25] (
Table 3). However, currently the number of identifiable
Bacillus species is only seven, since 14 species are now classified in other genera; for example,
Bacillus megaterium to
Priestia megaterium and
Bacillus circulans to
Niallia circulans [
11,
13,
25].
In the pharmaceutical industry, accurate identification of microorganisms is very important, as a misidentification could lead to product release based on a false negative, or product withdrawal due to a false positive result [
5]. Furthermore, according to EU Annex 1, microorganisms detected in grade A and B areas (cleanrooms used for high-risk activities), as well as spore-forming microorganisms isolated from grade C and D areas (cleanrooms used for low-risk activities), should be identified to species level to enable a robust risk assessment that can find and eliminate the source of contamination [
3]. Therefore, there is a need to apply rapid, yet accurate identification methods.
MALDI-TOF MS is an appropriate methodology for use by microbiological control laboratories in the pharmaceutical industry. It is not laborious and the results can be quickly obtained. However, its database needs to be regularly expanded and updated [
1,
2,
4]. In this study, MALDI Biotyper® (Bruker) and VITEK® MS RUO (bioMérieux) were used, as these are the only companies that sell MALDI-TOF MS systems in Brazil. At the time of these studies, VITEK
® MS prime had not been launched. Neither of these MALDI-TOF MS systems was able to identify the B190/17 strain.
Figure 1 shows the comparison between the spectra provided by MALDI Biotyper
® for the B190/17 strain and the two closest strains of the database:
B. badius strains DSM 23
T and DSM 30822. It was apparent that the spectrum of the B190/17 strain matched just four and seven peaks, respectively, with
B. badius DSM 23
T and
B. badius DSM 30822. These matches were not enough for a reliable identification.
Most of the studies related to
Bacillus species taxonomy are based on the 16S rRNA gene sequences [
12]. If the 16S rRNA sequence of the target strain is compared to sequences in EzBioCloud or GenBank and there are no species with similarity >98.7%, then it is an indication that it may be a new species. Therefore, further genomic analysis is recommended [
26]. 16S rRNA gene sequence analysis showed that all the
Bacillus strains, including the strains without validly published nomenclatural status, and
Domibacillus strains compared with the B190/17 strain had a similarity of <98.7%. This supported the earlier proposal by Costa et al. [
1,
8] that B190/17 was a new bacterial species. By convention, B190/17 would be the type strain of the new species.
The phylogenetic tree based on 16S rRNA gene sequences revealed that the designated type strain B190/17 formed a separate branch closest to
B. badius MTCC 1458
T and
B. wudalianchiensis FJAT 27215
T strains (
Figure 2), with similarities of 97.96% and 98.51%, respectively.
B. xiapuensis FJAT 46582
T, ‘
B. aerolatus’ CX253
T,
B. ectoiniformans NU-14
T and
B. thermotolerans SGZ8
T were placed in different clusters and showed similarities of 97.63%, 98.28%, 96.49% and 97.21%, respectively, with the 16S rRNA gene sequence of the B190/17 strain. All other type strains shared less than 96.49% 16S rRNA gene similarity with the B190/17 strain. ‘
Bacillus aerolatus’ was described in 2020 by Chen et al. [
27]. However, this species name has not been validly published under the International Code of Nomenclature of Prokaryotes (ICNP). Although the species name is not validly published, its current taxonomic status is listed as ‘preferred name’ according to the List of Prokaryotic names with Standing in Nomenclature [
13]. As the species ‘
Bacillus aerolatus’ was one of the species most closely related to the B190/17 strain, it was retained in this study for comparative purposes.
Bacillus smithii was used as an outgroup. This species was chosen because, according to a maximum-likelihood phylogenetic tree for 303 genome-sequenced
Bacillaceae species (based on concatenated sequences for 650 core proteins), it was the closest species to the genera
Domibacillus and
Pseudobacillus [
11]. At the time of the Gupta et al. [
11] study,
B. badius and
B. wudalianchiensis were classified as
Pseudobacillus badius and
Pseudobacillus wudalianchiensis [
28].
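The 16S rRNA screening logic described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the similarity values are those reported in the text for strain B190/17, while the dictionary layout and the function name `may_be_new_species` are assumptions introduced here.

```python
# Illustrative sketch only: the similarity values (%) are those reported in the
# text; the data layout and function name are assumptions, not the authors' code.
SPECIES_THRESHOLD = 98.7  # 16S rRNA gene similarity cut-off cited in the text

similarity_16s = {
    "B. badius MTCC 1458": 97.96,
    "B. wudalianchiensis FJAT 27215": 98.51,
    "B. xiapuensis FJAT 46582": 97.63,
    "'B. aerolatus' CX253": 98.28,
    "B. ectoiniformans NU-14": 96.49,
    "B. thermotolerans SGZ8": 97.21,
}


def may_be_new_species(similarities, threshold=SPECIES_THRESHOLD):
    """Return True when no reference strain exceeds the similarity threshold."""
    return all(value <= threshold for value in similarities.values())


# Reference strain with the highest 16S rRNA gene similarity to B190/17.
closest = max(similarity_16s, key=similarity_16s.get)

if __name__ == "__main__":
    print("Closest reference strain:", closest)
    print("Candidate new species:", may_be_new_species(similarity_16s))
```

A result of `True` only indicates that further genomic analysis is warranted; it does not by itself establish a new species.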
Phylogenetic analysis of
Bacillus species has also been performed using other housekeeping genes [
1,
27]. Therefore, to confirm the 16S rRNA gene analysis, further analysis was carried out using the concatenated sequences of 16S rRNA (obtained by Sanger sequencing),
rpoB and
gyrB genes from the whole-genome sequence of the B190/17 strain. No cut-off percentage is suggested in the literature for species and genus identification based on
rpoB and
gyrB gene analysis, as there is for the 16S rRNA gene. Therefore, results should be analyzed on a case-by-case basis by considering the phylogenetic analysis of the genes [
1]. The concatenated phylogenetic tree also showed the B190/17 strain was in a separate cluster from ‘
B. aerolatus’ CX253,
B. badius MTCC 1458 and NBRC 15713 and
B. wudalianchiensis FJAT 27215 type strains (
Figure 3). The
rpoB and
gyrB sequence similarities between the B190/17 strain and its closest neighbors were, respectively: ‘
B. aerolatus’ CX253
T (87.50%, 85.43%),
B. badius NBRC 15713
T (86.91%, 80.90%) and
B. wudalianchiensis FJAT 27215
T (87.38%, 81.17%).
The genomic taxonomy results of B190/17 are shown in
Table 4 and
Table 5. The size of the B190/17 genome was 3.43 Mb. The DNA G+C content was 41.6 mol% and the coverage was 73x. The assembly produced 89 contigs with a total of 3,434,160 bp, an N50 of 219,177 bp and 3,544 coding sequences. Other genome metrics provided by RAST are shown in
Table 5. The 16S rRNA gene sequence obtained with Sanger sequencing (OK586830.1; 1,499 bp) was also compared with the 16S sequence of B190/17 genome (JAUIYO000000000; 1,355 bp) and the percentage identity was 99.92%.
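For readers less familiar with the assembly metrics quoted above, the following Python sketch shows how N50 and G+C content are conventionally computed. It is not the software used in this study, and the toy contig lengths and sequences are hypothetical values chosen only to make the example easy to follow.

```python
# Minimal sketch of two standard assembly metrics. The example inputs below are
# hypothetical and are NOT data from strain B190/17.


def n50(contig_lengths):
    """Return the length L such that contigs of length >= L, taken from the
    longest downwards, together cover at least half of the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


def gc_percent(sequence):
    """Return the G+C content (%) of a DNA sequence, ignoring ambiguous bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    toy_contigs = [50, 40, 30, 20, 10]  # hypothetical contig lengths in bp
    print("N50:", n50(toy_contigs))
    print("G+C %:", gc_percent("AATTGGCC"))
```

In practice these values are reported directly by assembly and annotation tools; the sketch only makes the definitions explicit.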
The ANI values of B190/17 strain and the related species ‘
B. aerolatus’ CX253
T,
B. badius NBRC 15713
T,
B. wudalianchiensis FJAT 27215
T were 79.55%, 76.47% and 77.64%, respectively, which are lower than the cut-off value (95-96%) established for strains belonging to the same species [
29]. Moreover, the in silico DNA–DNA hybridization (in silico DDH) values estimated with the GGDC against the same type strains were 24.00%, 21.60% and 22.50%, respectively. All values are also lower than the cut-off value (70%) proposed for species delineation, indicating that these are distinct species [
29].
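The species-delineation reasoning in this paragraph can be expressed as a short Python check. The ANI and in silico DDH values are those reported above, whereas the data structure and the function name `is_distinct_species` are illustrative assumptions; the lower bound of the 95-96% ANI range is used as the cut-off.

```python
# Illustrative sketch: values are those reported in the text; the layout and
# function name are assumptions made for this example.
ANI_CUTOFF = 95.0  # lower bound of the 95-96% range quoted in the text
DDH_CUTOFF = 70.0  # in silico DDH cut-off quoted in the text

genome_comparisons = {
    "'B. aerolatus' CX253": {"ani": 79.55, "ddh": 24.00},
    "B. badius NBRC 15713": {"ani": 76.47, "ddh": 21.60},
    "B. wudalianchiensis FJAT 27215": {"ani": 77.64, "ddh": 22.50},
}


def is_distinct_species(ani, ddh):
    """Return True when both ANI and in silico DDH fall below their cut-offs."""
    return ani < ANI_CUTOFF and ddh < DDH_CUTOFF


# B190/17 is treated as a separate species only if it is distinct from every
# related type strain in the comparison.
distinct_from_all = all(
    is_distinct_species(**values) for values in genome_comparisons.values()
)

if __name__ == "__main__":
    for name, values in genome_comparisons.items():
        print(name, "distinct:", is_distinct_species(**values))
    print("Distinct from all related type strains:", distinct_from_all)
```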
The phylotaxonomic tree constructed on the TYGS server provided further evidence for the distinct taxonomic status of the B190/17 strain within the genus
Bacillus.
Figure 4 shows the position of the B190/17 strain in comparison with the most closely related type strains based on whole-genome sequences. B190/17 is in a different cluster from its closest relatives ‘
B. aerolatus’,
B. badius and
B. wudalianchiensis. These results supported the earlier conclusion that the B190/17 strain represents a novel species of the genus
Bacillus. The new species is proposed as
Bacillus lumedeirus sp. nov., with the type strain B190/17
T.
In the past, the criterion used by taxonomists to classify a
Bacillus species was its ability to produce spores in the presence of oxygen. However, whole-genome studies of the
Bacillus genus have resulted in the reclassification of many former
Bacillus species into new genera. Gupta et al. [
11] and Patel and Gupta [
12] proposed that the
Bacillus genus should be composed only of two clades and that strains not in these clades should be transferred to new genera. The two clades are the “Subtilis clade”, composed of
Bacillus sensu stricto, and the “Cereus clade” containing a variety of important pathogenic species. However, some species that do not form part of these two clades are still part of the genus
Bacillus. These include species close to the B190/17 strain:
B. badius,
B. thermotolerans,
B. wudalianchiensis and
B. xiapuensis [
13]. Verma et al. [
28] proposed the reclassification of
B. badius and
B. wudalianchiensis into the new genus
Pseudobacillus, as mentioned before. Further studies are still needed to justify the reclassification of ‘
B. aerolatus’,
B. badius,
B. wudalianchiensis, and
Bacillus lumedeirus sp. nov. into a new genus.
Costa et al. [
1] added the B190/17 strain to the VITEK
® MS RUO database simply as
Bacillus spp. However, at that time the genomic taxonomy analysis had not been concluded. Based on the conclusion that the B190/17 strain was a novel species of the genus
Bacillus, it was added to the MALDI Biotyper
® database. The strain was again submitted to proteomic analysis and was identified as
Bacillus lumedeirus sp. nov. with the highest score (2.32). Therefore, the addition of B190/17 to the MALDI Biotyper
® database will facilitate the identification of any further isolates.
According to 16S rRNA gene and whole genome analysis, the closest valid species to
Bacillus lumedeirus sp. nov. are
B. badius,
B. thermotolerans,
B. wudalianchiensis and
B. xiapuensis. With the exception of
B. badius, these species were not present in the MALDI Biotyper
® database. The database represents only 32 of the 110 species already described [
30]. This shows how the database can be further expanded to improve the identification of
Bacillus and related genera in microbiological control laboratories. Consequently, MALDI-TOF MS analysis can be used to ensure greater safety in the release of pharmaceutical products, and for tracing sources of contamination within a pharmaceutical facility.
Description of Bacillus lumedeirus sp. nov.
Bacillus lumedeirus (lu.me.dei.rus. N.L. gen. fem. n. medeirus) of Luciane Martins Medeiros, in memoriam, a Brazilian scientist at the Institute of Immunobiological Technology (Bio-Manguinhos) of Fiocruz in Rio de Janeiro who made significant contributions to the Laboratory Microbiological Control of Bio-Manguinhos, including introducing new microorganism identification systems.
Cells are Gram-positive, spore-forming and non-motile, and grow under aerobic conditions (24–48 h) at 30–37 °C (optimum temperature: 37 °C). Colonies are cream, non-mucoid, circular with an irregular edge, smooth, shiny and medium-sized (up to 6 mm in diameter) after 24 h of incubation on TSA at 37 °C. The type strain is positive for alanine arylamidase, Ala-Phe-Pro arylamidase, ELLMAN, leucine arylamidase, phenylalanine arylamidase and tyrosine arylamidase. Activities of α-glucosidase, β-glucosidase and β-mannosidase were not detected. Acid production was not observed from galactose, glycogen, glucose, mannitol and ribose.
The genome size of the B190/17 strain is estimated at 3.43 Mb, with a G+C content of 41.6 mol%. The genome is deposited in GenBank under accession number JAUIYO000000000. The type strain is B190/17 (= CBAS 1225T = CCBG XXXXT), isolated in 2017 from environmental air monitoring in an immunobiological production facility in Brazil.