1. Summary
In materials science, comprehensive datasets are indispensable for investigating complex materials such as covalent organic frameworks (COFs) [1], metal-organic frameworks (MOFs) [2], and zeolites [3,4]. These datasets facilitate the exploration of structural diversity and analysis of properties such as porosity, adsorption, and permeation. Moreover, they empower researchers to apply data intelligence in understanding trends in material properties and in predicting behaviors under various conditions, including applications in catalysis and sieving [5].
Polyoxometalates (POMs) are versatile metal-oxo clusters with diverse applications in catalysis [6], life sciences [7,8,9,10,11], and nanoelectronics [12,13]. Despite their structural complexity and promising potential in smart applications, there has been limited development of curated POM datasets crucial for advancing AI-driven technologies in exploring inorganic chemical spaces. The creation of these datasets is essential for accelerating POM chemistry through the understanding of POM speciation in solution via techniques such as nuclear magnetic resonance and mass spectrometry [14,15,16,17], utilizing POMs as building blocks in hybrid composite materials [18,19,20], and developing metallodrugs [11].
Over the past century, the designation of POM formulas has posed significant challenges at the forefront of inorganic chemistry [21], as crystalline POM materials are among the most information-dense [22]. The structural complexity, intricacies of coordination bonding, and charge (de)localization make common chemical identifiers like InChI (IUPAC International Chemical Identifier) inadequate for fully capturing their nuances [23,24]. Although SciFinder has developed some collections of POM information, the proprietary nature of this database limits access to the data [25]. Consequently, the development of open and curated datasets of POM formulations could significantly accelerate research and collaboration within the POM community.
In this work, we present the recently completed “Curated Polyoxometalate Formula Dataset,” now published on a Git platform under a Creative Commons license. This dataset addresses a critical data gap by linking POM formulas with their corresponding materials and provenance, specifically through Digital Object Identifiers (DOIs) [26]. It facilitates searches based on elements, charges, molecular mass, and POM formulas. Additionally, the structure and implementation of the dataset are designed for future reusability and expansion, enabling the development of new data-driven methods for exploring POM synthesis and other related research avenues.
2. Data Description
Polyoxometalates (POMs), as cluster materials, appear as crystallographic motifs in a variety of POM-containing or POM-based materials. For instance, the Wells-Dawson polyoxometalate composed of P, W, and O atoms with the formula [P2W18O62]6−, as illustrated in Figure 1, exemplifies such motifs. This specific POM has been reported in diverse materials, such as its Li6 salt (DOI: 10.1039/C3DT51120K), and the same POM motif can be described by several different formulas. The given formula, [P2W18O62]6−, is a coordination formula easily interpreted by chemists, indicating an overall charge of 6− and the elemental composition of the cluster. From this composition, empirical formulas can be derived and explored further in synthetic studies.
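The step from a composition to an empirical formula can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names are hypothetical, the parser assumes a flat composition string without brackets or charges, and P2W18O62 is used as the composition of the phosphotungstate Wells-Dawson anion.

```python
import re
from functools import reduce
from math import gcd


def parse_composition(formula: str) -> dict[str, int]:
    """Parse a flat composition string such as 'P2W18O62' into element counts.

    Assumption: no brackets, parentheses, or charges. Nested coordination
    formulas would need a fuller parser.
    """
    counts: dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def empirical_formula(counts: dict[str, int]) -> str:
    """Divide all element counts by their greatest common divisor."""
    divisor = reduce(gcd, counts.values())
    parts = []
    for symbol, n in counts.items():
        reduced = n // divisor
        parts.append(symbol if reduced == 1 else f"{symbol}{reduced}")
    return "".join(parts)


composition = parse_composition("P2W18O62")
print(composition)                     # {'P': 2, 'W': 18, 'O': 62}
print(empirical_formula(composition))  # PW9O31
```

Because all three counts share the common divisor 2, the composition reduces to PW9O31, which no longer reveals the nuclearity of the cluster. This is one reason the coordination formula and the empirical formula are worth storing as separate descriptions.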
Consider the connectivity outlined in Figure 2.a, which depicts a data connection schema for polyoxometalates and polyoxometalate materials. Each material is linked to its formula and a Digital Object Identifier (DOI), and contains a POM that is treated as a distinct entity with a unique identifier. This POM entity connects to attributes such as elemental composition, from which both the molecular and POM formulas (typically serving as coordination formulas) are derived. These features, together with the charge and molecular mass, enable domain chemists to label and classify the structure effectively.
In terms of implementation (see Figure 2.b), our JSON data structure organizes information hierarchically about a specific POM entity, encapsulated within a top-level object. For example, we assign a unique four-letter key (e.g., "POM_GUKA") as a POM identifier, referencing the phosphotungstate Wells-Dawson structure from Figure 1. This key points to a nested JSON object detailing the POM’s properties such as formulas, elemental composition, and related materials. Keys like "POM Formula" and "Molecular Formula" store formulas as strings, "Contains Elements" maps to an object listing each element’s count, "Molecular Mass" is recorded as a string for precision, and "Charge" as a floating-point number. An array under "Labels" collects descriptive tags, while "POM Material" includes nested objects with unique identifiers detailing each material’s formula and DOI, enhancing data retrieval for scientific analysis and database integration.
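The layout described above can be illustrated with a small Python sketch that loads one record and runs a simple search. The top-level key names follow the description in the text, but the values, the material identifier "MAT_0001", the inner key names "Formula" and "DOI", and the helper `find_poms` are assumptions made for illustration and are not copied from the published dataset.

```python
import json

# Illustrative record only. The molecular mass is an approximate value
# computed from standard atomic masses, and the material entry is a placeholder.
raw = """
{
  "POM_GUKA": {
    "POM Formula": "[P2W18O62]6-",
    "Molecular Formula": "P2W18O62",
    "Contains Elements": {"P": 2, "W": 18, "O": 62},
    "Molecular Mass": "4363.01",
    "Charge": -6.0,
    "Labels": ["Wells-Dawson"],
    "POM Material": {
      "MAT_0001": {"Formula": "Li6[P2W18O62]", "DOI": "10.1039/C3DT51120K"}
    }
  }
}
"""

dataset = json.loads(raw)


def find_poms(data, elements=(), max_mass=None, charge=None):
    """Return identifiers of POMs that contain all `elements` and pass the optional filters."""
    hits = []
    for pom_id, pom in data.items():
        if not set(elements) <= set(pom["Contains Elements"]):
            continue
        # "Molecular Mass" is stored as a string, so convert before comparing.
        if max_mass is not None and float(pom["Molecular Mass"]) > max_mass:
            continue
        if charge is not None and pom["Charge"] != charge:
            continue
        hits.append(pom_id)
    return hits


print(find_poms(dataset, elements=("P", "W"), max_mass=5000))
dois = [m["DOI"] for m in dataset["POM_GUKA"]["POM Material"].values()]
print(dois)
```

Keeping the element counts in a nested object makes element-based searches a simple subset test, while the nested "POM Material" object lets each POM carry any number of materials together with their DOIs.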
3. Methods
For the creation of the curated polyoxometalate formula dataset, we have meticulously compiled chemical data on polyoxometalates (POMs) by analyzing over 1300 distinct research papers, each identified by a unique Digital Object Identifier (DOI). The resulting dataset currently includes 1984 unique entries, linking to more than 2500 POM-containing materials. Given the crucial role of nuclearity and charge in characterizing POMs, our data are presented in terms of overall charge plotted as a function of molecular mass, as illustrated in Figure 3.a. This panel depicts the broad distribution of molecular weights and corresponding charges, highlighting the chemical diversity within polyoxometalates. Notably, extreme examples such as the polyoxomolybdate [Mo368O1032H16(H2O)240(SO4)48]48− [27], the polyoxotungstate [28], the dense polyoxouranate [29], and the robust polyoxoniobate [30] are easily distinguishable, demonstrating the current extents of the POM chemical space.
A further qualitative analysis of the plot in Figure 3.b reveals that few or no POMs are observed where the negative charge becomes larger while the molecular mass decreases or remains constant. This trend suggests an over-reduction of the addenda centers, which likely requires coordination changes not captured by this dataset. Nonetheless, this pattern can serve as empirical guidance to evaluate the feasibility of proposed POM formulas. The majority of POMs are situated in the region with a charge magnitude below 75 (charges between −75 and 0) and a molecular mass below 30,000 g/mol. Coloring the data points by POM type shows that many polyoxomolybdates are located in the domain of high molecular mass but relatively low overall charge. Conversely, polyoxoniobates are among the most negatively charged species at lower molecular weights, highlighting the diverse charge and mass relationships within POMs.
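One way to turn this empirical guidance into a quick check is sketched below. It is a rough heuristic under stated assumptions, not a method from this work: the records are made-up (charge, mass) pairs, the densely populated region is taken to mean charges between −75 and 0 with a mass below 30,000 g/mol, and the helper names and the charge-density criterion are hypothetical.

```python
# Toy records with made-up charges and molecular masses (g/mol), for illustration only.
records = [
    {"id": "POM_A", "charge": -6.0, "mass": 4363.0},
    {"id": "POM_B", "charge": -8.0, "mass": 1365.0},
    {"id": "POM_C", "charge": -48.0, "mass": 79000.0},
    {"id": "POM_D", "charge": -3.0, "mass": 2877.0},
]


def in_dense_region(rec, min_charge=-75.0, max_mass=30_000.0):
    """True if the record falls in the assumed densely populated region."""
    return min_charge < rec["charge"] <= 0 and rec["mass"] < max_mass


def charge_density(rec):
    """Absolute charge per 1000 g/mol of molecular mass."""
    return abs(rec["charge"]) / rec["mass"] * 1000.0


def looks_feasible(candidate, reference):
    """Heuristic: the candidate's charge density must not exceed the largest
    charge density found among the reference records."""
    limit = max(charge_density(r) for r in reference)
    return charge_density(candidate) <= limit


dense = [r["id"] for r in records if in_dense_region(r)]
print(dense)
print(looks_feasible({"charge": -20.0, "mass": 1500.0}, records))
print(looks_feasible({"charge": -6.0, "mass": 4000.0}, records))
```

A candidate with a charge of −20 at 1500 g/mol is rejected here because its charge density exceeds every toy reference entry. With the full dataset, a mass-dependent limit would probably be more informative than a single global maximum.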
The current set of POM entries is dominated by polyoxotungstates, with 1023 formulas. The prevalence of tungstates is likely due to tungsten’s ability to form strong, non-labile bonds in high oxidation states and to generate lacunary POM species that facilitate further chemical reactions, enhancing the diversity of tungstate-based POMs. Following tungstates are polyoxomolybdates with 482 entries, polyoxovanadates with 206, polyoxoniobates with 102, and polyoxotantalates with 30 entries; an additional 141 entries are categorized under "other," including diverse metal-oxo clusters. Together these categories account for all 1984 entries. The dataset covers 75 unique elements across all entries, illustrating a broad range of structural combinations. The number of unique elements per POM varies from two to eight, and the largest group, 603 structures, contains five elements, often reflecting the integration of heteroatoms and hybrid species involving carbon.
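The category counts and the elements-per-POM distribution can be reproduced from the "Contains Elements" objects with a short tally. The sketch below is illustrative: the rule of assigning a POM to its most abundant addenda metal is an assumption about how the categories might be derived, and the toy compositions are not taken from the dataset. Only the reported category counts come from the text.

```python
from collections import Counter

# Addenda metals used for the categories named in the text.
ADDENDA = ("W", "Mo", "V", "Nb", "Ta")


def classify(elements: dict[str, int]) -> str:
    """Assign a POM to its most abundant addenda metal, or 'other' if none is present."""
    present = {metal: elements[metal] for metal in ADDENDA if metal in elements}
    if not present:
        return "other"
    return max(present, key=present.get)


# Toy element-count objects for illustration only.
toy = {
    "POM_1": {"P": 2, "W": 18, "O": 62},
    "POM_2": {"Mo": 7, "O": 24},
    "POM_3": {"Si": 1, "W": 9, "V": 3, "O": 40},
    "POM_4": {"Nb": 6, "O": 19},
    "POM_5": {"Pd": 13, "As": 8, "O": 34},
}

by_class = Counter(classify(c) for c in toy.values())
elements_per_pom = Counter(len(c) for c in toy.values())
print(by_class)
print(elements_per_pom)

# Consistency check on the counts reported in the text.
reported = {"W": 1023, "Mo": 482, "V": 206, "Nb": 102, "Ta": 30, "other": 141}
print(sum(reported.values()))  # 1984
```

The final line confirms that the six reported categories sum to the 1984 entries stated for the dataset.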
4. Conclusions
This paper presents a comprehensive dataset of nearly 2000 POMs, currently featuring details such as molecular formulas, charges, and DOIs, with the aim of expanding it in the near future to include topological, synthetic, and functional properties. The dataset is designed to evolve, incorporating charge distributions and mass projections to support materials data projects through machine learning and cheminformatics. Upcoming enhancements will also cover the speciation of polyoxometalates in various environments, to assess protonation levels, isomerism, and charge distribution for automated computational analyses, as well as integrated components for retrosynthetic strategies. Together, these additions are intended to advance both fundamental and applied research in polyoxometalate chemistry.
Author Contributions
AK, NG, and AR conceptualized the need for the dataset. AK designed the data structure. AK and NG were involved in data curation. NG performed validation checks to ensure the reliability of the data. The manuscript was completed with contributions from all authors.
Funding
This research was funded in part by the Austrian Science Fund (FWF) [DOI 10.55776/P33089 (to A.R.); DOI 10.55776/P33927 (to N.G.)].
Data Availability Statement
Acknowledgments
The University of Cambridge and the University of Vienna are thanked for research support. We extend our gratitude to Ella Duvanova for her invaluable assistance with data verification.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
POM: Polyoxometalate
DOI: Digital Object Identifier
CSV: Comma-separated values
JSON: JavaScript Object Notation
References
- Ongari, D.; Yakutovich, A.V.; Talirz, L.; Smit, B. Building a consistent and reproducible database for adsorption evaluation in covalent–organic frameworks. ACS Cent. Sci. 2019, 5, 1663–1675.
- Moghadam, P.Z.; Li, A.; Wiggin, S.B.; Tao, A.; Maloney, A.G.; Wood, P.A.; Ward, S.C.; Fairen-Jimenez, D. Development of a Cambridge Structural Database subset: a collection of metal–organic frameworks for past, present, and future. Chem. Mater. 2017, 29, 2618–2625.
- Yang, S.; Lach-Hab, M.; Vaisman, I.I.; Blaisten-Barojas, E.; Li, X.; Karen, V.L. Framework-type determination for zeolite structures in the inorganic crystal structure database. J. Phys. Chem. Ref. Data 2010, 39.
- Zheng, C.; Li, Y.; Yu, J. Database of open-framework aluminophosphate structures. Sci. Data 2020, 7, 107.
- Kancharlapalli, S.; Snurr, R.Q. High-throughput screening of the CoRE-MOF-2019 database for CO2 capture from wet flue gas: a multi-scale modeling strategy. ACS Appl. Mater. Interfaces 2023, 15, 28084–28092.
- Liu, R.; Streb, C. Polyoxometalate-single atom catalysts (POM-SACs) in energy research and catalysis. Adv. Energy Mater. 2021, 11, 2101120.
- Gumerova, N.I.; Rompel, A. Interweaving disciplines to advance chemistry: Applying polyoxometalates in biology. Inorg. Chem. 2021, 60, 6109–6114.
- Barba-Bon, A.; Gumerova, N.I.; Tanuhadi, E.; Ashjari, M.; Chen, Y.; Rompel, A.; Nau, W.M. All-Inorganic Polyoxometalates Act as Superchaotropic Membrane Carriers. Adv. Mater. 2024, 36, 2309219.
- Bijelic, A.; Rompel, A. Ten good reasons for the use of the tellurium-centered Anderson–Evans polyoxotungstate in protein crystallography. Acc. Chem. Res. 2017, 50, 1441–1448.
- Bijelic, A.; Aureliano, M.; Rompel, A. The antibacterial activity of polyoxometalates: structures, antibiotic effects and future perspectives. ChemComm 2018, 54, 1153–1169.
- Bijelic, A.; Aureliano, M.; Rompel, A. Polyoxometalates as potential next-generation metallodrugs in the combat against cancer. Angew. Chem. Int. Ed. 2019, 58, 2980–2999.
- Kondinski, A. Metal–metal bonds in polyoxometalate chemistry. Nanoscale 2021, 13, 13574–13592.
- Kondinski, A.; Banerjee, A.; Mal, S.S. Hervé- and Krebs-Type Magnetic Polyoxometalate Dimers. Magnetochemistry 2022, 8, 96.
- Gumerova, N.I.; Rompel, A. Speciation atlas of polyoxometalates in aqueous solutions. Sci. Adv. 2023, 9, eadi0814.
- Gumerova, N.I.; Rompel, A. Polyoxometalates in solution: speciation under spotlight. Chem. Soc. Rev. 2020, 49, 7568–7601.
- Surman, A.J.; Robbins, P.J.; Ujma, J.; Zheng, Q.; Barran, P.E.; Cronin, L. Sizing and discovery of nanosized polyoxometalate clusters by mass spectrometry. J. Am. Chem. Soc. 2016, 138, 3824–3830.
- Kondinski, A.; Rasmussen, M.; Mangelsen, S.; Pienack, N.; Simjanoski, V.; Näther, C.; Stares, D.L.; Schalley, C.A.; Bensch, W. Composition-driven archetype dynamics in polyoxovanadates. Chem. Sci. 2022, 13, 6397–6412.
- Vilà-Nadal, L. POMzites: A roadmap for inverse design in metal oxide chemistry. Int. J. Quantum Chem. 2021, 121, e26493.
- Soria-Carrera, H.; Atrián-Blasco, E.; de la Fuente, J.M.; Mitchell, S.G.; Martín-Rapún, R. Polyoxometalate–polypeptide nanoassemblies as peroxidase surrogates with antibiofilm properties. Nanoscale 2022, 14, 5999–6006.
- Cameron, J.M.; Guillemot, G.; Galambos, T.; Amin, S.S.; Hampson, E.; Haidaraly, K.M.; Newton, G.N.; Izzet, G. Supramolecular assemblies of organo-functionalised hybrid polyoxometalates: from functional building blocks to hierarchical nanomaterials. Chem. Soc. Rev. 2022, 51, 293–328.
- Pope, M.T.; Jeannin, Y.; Fournier, M. Heteropoly and isopoly oxometalates; Vol. 8, Springer, 1983.
- Krivovichev, S.V. Which inorganic structures are the most complex? Angew. Chem. Int. Ed. 2014, 53, 654–661.
- Bard, A.J.; Dance, I.G.; Day, P.; Ibers, J.A.; Kunitake, T.; Meyer, T.J.; Mingos, D.M.P.; Roesky, H.W.; Sauvage, J.P.; Simon, A.; et al. Bonding and charge distribution in polyoxometalates: a bond valence approach; Springer, 1999.
- Heller, S.R.; McNaught, A.; Pletnev, I.; Stein, S.; Tchekhovskoi, D. InChI, the IUPAC international chemical identifier. J. Cheminform. 2015, 7, 1–34.
- Gabrielson, S.W. SciFinder. J. Med. Libr. Assoc. 2018, 106, 588.
- Paskin, N. Digital object identifier (DOI®) system. Encycl. Libr. Inf. Sci. 2010, 3, 1586–1592.
- Müller, A.; Beckmann, E.; Bögge, H.; Schmidtmann, M.; Dress, A. Inorganic Chemistry Goes Protein Size: A Mo368 Nano-Hedgehog Initiating Nanochemistry by Symmetry Breaking. Angew. Chem. Int. Ed. 2002, 41, 1162–1167.
- Fang, X.; Kögerler, P.; Furukawa, Y.; Speldrich, M.; Luban, M. Molecular growth of a core–shell polyoxometalate. Angew. Chem. Int. Ed. 2011, 50, 5212–5216.
- Ling, J.; Qiu, J.; Burns, P.C. Uranyl peroxide oxalate cage and core–shell clusters containing 50 and 120 uranyl ions. Inorg. Chem. 2012, 51, 2403–2408.
- Wu, Y.L.; Li, X.X.; Qi, Y.J.; Yu, H.; Jin, L.; Zheng, S.T. {Nb288O768(OH)48(CO3)12}: A Macromolecular Polyoxometalate with Close to 300 Niobium Atoms. Angew. Chem. Int. Ed. 2018, 57, 8572–8576.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).