Preprint
Data Descriptor

Curated Polyoxometalate Formula Dataset

This version is not peer-reviewed

Submitted:

29 August 2024

Posted:

02 September 2024

Abstract
Reticular and cluster materials often feature complex formulas, making a comprehensive overview challenging due to the need to consult various resources. While datasets have been collected for metal-organic frameworks (MOFs), covalent organic frameworks (COFs), and zeolites, among others, there remains a gap in systematically organized information for polyoxometalates. This paper introduces a carefully curated dataset of 1984 polyoxometalate (POM) and related cluster metal oxide formula instances, currently connecting over 2,500 POM material instances. These POM instances incorporate 75 different chemical elements, with compositions ranging from binary to octonary element clusters. This dataset not only enhances accessibility to polyoxometalate data but also aims to facilitate further research and development in the study of these complex inorganic compounds.
Subject: Chemistry and Materials Science  -   Other

1. Summary

In materials science, comprehensive datasets are indispensable for investigating complex materials such as covalent organic frameworks (COFs) [1], metal-organic frameworks (MOFs) [2], and zeolites [3,4]. These datasets facilitate the exploration of structural diversity and analysis of properties such as porosity, adsorption, and permeation. Moreover, they empower researchers to apply data intelligence in understanding trends in material properties and in predicting behaviors under various conditions, including applications in catalysis and sieving [5].
Polyoxometalates (POMs) are versatile metal-oxo clusters with diverse applications in catalysis [6], life sciences [7,8,9,10,11], and nanoelectronics [12,13]. Despite their structural complexity and promising potential in smart applications, there has been limited development of curated POM datasets crucial for advancing AI-driven technologies in exploring inorganic chemical spaces. The creation of these datasets is essential for accelerating POM chemistry through the understanding of POM speciation in solution via techniques such as nuclear magnetic resonance and mass spectrometry [14,15,16,17], utilizing POMs as building blocks in hybrid composite materials [18,19,20], and developing metallodrugs [11].
Over the past century, the designation of POM formulas has posed significant challenges at the forefront of inorganic chemistry [21], as crystalline POM materials are among the most information-dense [22]. The structural complexity, intricacies of coordination bonding, and charge (de)localization make common chemical identifiers like InChI (IUPAC International Chemical Identifier) inadequate for fully capturing their nuances [23,24]. Although SciFinder has developed some collections of POM information, the proprietary nature of this database limits access to the data [25]. Consequently, the development of open and curated datasets of POM formulations could significantly accelerate research and collaboration within the POM community.
In this work, we present the recently completed “Curated Polyoxometalate Formula Dataset,” now published on a Git platform under a Creative Commons license. This dataset addresses a critical data gap by linking POM formulas with their corresponding materials and provenances, specifically through Digital Object Identifiers (DOIs) [26]. It facilitates searches based on elements, charges, molecular mass, and POM formulas. Additionally, the structure and implementation of the dataset are designed for future reusability and expansion, enabling the development of new data-driven methods for exploring POM synthesis and other related research avenues.

2. Data Description

Polyoxometalates (POMs), as cluster materials, are featured as crystallographic motifs in a variety of POM-containing or POM-based materials. For instance, the α-Wells-Dawson polyoxometalate composed of P, W, and O atoms with the formula [(PO4)2(WO3)18]6−, as illustrated in Figure 1, exemplifies such motifs. This specific POM, reported in diverse materials like Li6[(PO4)2(WO3)18]·28H2O (DOI: 10.1039/C3DT51120K), showcases the potential for various formula descriptions within the same POM motif. The given formula, [(PO4)2(WO3)18]6−, is a coordination formula easily interpreted by chemists, indicating an overall charge of 6− and a composition that can be described as P2W18O62. This leads to possible empirical derivations like PW9O31, which can be explored further in synthetic studies.
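The step from the coordination formula to the composition P2W18O62 and the empirical formula PW9O31 can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used to build the dataset; the function names `parse_formula`, `empirical`, and `to_string` are hypothetical, and the parser assumes integer multipliers, no charge suffix, and no hydrate dot in the input string.

```python
import re
from collections import Counter
from functools import reduce
from math import gcd

OPEN, CLOSE = "([{", ")]}"

def parse_formula(formula: str) -> Counter:
    """Count atoms in a bracketed formula such as '[(PO4)2(WO3)18]'."""
    tokens = re.findall(r"[A-Z][a-z]?|\d+|[()\[\]{}]", formula.replace(" ", ""))
    stack = [Counter()]  # one counter per open bracket level
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in OPEN:
            stack.append(Counter())
            i += 1
            continue
        if tok.isdigit():
            raise ValueError(f"unexpected multiplier {tok!r}")
        # An integer directly after an element or closing bracket multiplies it.
        mult = 1
        if i + 1 < len(tokens) and tokens[i + 1].isdigit():
            mult = int(tokens[i + 1])
            i += 1
        if tok in CLOSE:
            if len(stack) < 2:
                raise ValueError("unbalanced brackets")
            group = stack.pop()
            for element, n in group.items():
                stack[-1][element] += n * mult
        else:
            stack[-1][tok] += mult
        i += 1
    if len(stack) != 1:
        raise ValueError("unbalanced brackets")
    return stack[0]

def empirical(counts: Counter) -> Counter:
    """Divide all atom counts by their greatest common divisor."""
    divisor = reduce(gcd, counts.values())
    return Counter({element: n // divisor for element, n in counts.items()})

def to_string(counts: Counter, order: list) -> str:
    """Write counts as a flat formula in the given element order."""
    return "".join(f"{el}{counts[el] if counts[el] > 1 else ''}" for el in order)

composition = parse_formula("[(PO4)2(WO3)18]")
print(to_string(composition, ["P", "W", "O"]))
print(to_string(empirical(composition), ["P", "W", "O"]))
```

The bracket stack keeps the nesting of the coordination formula explicit, so more deeply nested formulas of the same plain-text form can be handled by the same routine.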
Consider the connectivity outlined in Figure 2.a, which depicts a data connection schema for polyoxometalates and polyoxometalate materials. Each material is linked to its formula and a Digital Object Identifier (DOI), encapsulating a POM as a distinct entity within the class, assigned a unique identifier. This POM entity connects to various attributes such as elemental composition, from which both the molecular and POM formulas—typically serving as coordination formulas—are derived. These features, coupled with specific charge and molecular mass data, enable domain chemists to label and classify the structure effectively.
In terms of implementation (see Figure 2.b), our JSON data structure organizes information hierarchically about a specific Polyoxometalate (POM) entity, encapsulated within a top-level object. For example, we assign a unique four-letter key (e.g., "POM_GUKA") as a POM identifier, referencing the phosphotungstate Wells-Dawson structure from Figure 1. This key points to a nested JSON object detailing the POM’s properties such as formulas, elemental composition, and related materials. Keys like "POM Formula" and "Molecular Formula" store formulas as strings, "Contains Elements" maps to an object listing each element’s count, "Molecular Mass" is recorded as a string for precision, and "Charge" as a floating-point number. An array under "Labels" collects descriptive tags, while "POM Material" includes nested objects with unique identifiers detailing each material’s formula and DOI, enhancing data retrieval for scientific analysis and database integration.
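A record of this shape, and a simple search over it, might look like the following sketch. The top-level keys follow the description above; everything else is an assumption made for illustration: the material identifier "MAT_0001", the nested key names "Formula" and "DOI", the label text, the way the charge is written in the "POM Formula" string, and the mass value (approximated here from standard atomic weights rather than copied from the repository). The helper names `search` and `dois_for` are likewise hypothetical.

```python
import json

# Illustrative record only; values marked in the lead-in as assumptions
# are not taken from the published dataset.
RECORD_JSON = """
{
  "POM_GUKA": {
    "POM Formula": "[(PO4)2(WO3)18]6-",
    "Molecular Formula": "P2W18O62",
    "Contains Elements": {"P": 2, "W": 18, "O": 62},
    "Molecular Mass": "4363.0",
    "Charge": -6.0,
    "Labels": ["Wells-Dawson"],
    "POM Material": {
      "MAT_0001": {
        "Formula": "Li6[(PO4)2(WO3)18]·28H2O",
        "DOI": "10.1039/C3DT51120K"
      }
    }
  }
}
"""

dataset = json.loads(RECORD_JSON)

def search(data, required_elements=(), charge=None, mass_range=None):
    """Return POM identifiers matching element, charge, and mass filters."""
    hits = []
    for pom_id, entry in data.items():
        if not set(required_elements) <= set(entry["Contains Elements"]):
            continue
        if charge is not None and entry["Charge"] != charge:
            continue
        if mass_range is not None:
            mass = float(entry["Molecular Mass"])  # stored as a string
            if not (mass_range[0] <= mass <= mass_range[1]):
                continue
        hits.append(pom_id)
    return hits

def dois_for(data, pom_id):
    """Collect the DOIs of all materials linked to one POM entry."""
    return [m["DOI"] for m in data[pom_id]["POM Material"].values()]

print(search(dataset, required_elements={"P", "W"}, charge=-6.0))
print(dois_for(dataset, "POM_GUKA"))
```

Keeping the materials nested under the POM entry means one lookup by identifier returns both the cluster attributes and its literature provenance.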

3. Methods

For the creation of the curated polyoxometalate formula dataset, we have meticulously compiled chemical data on polyoxometalates (POMs) by analyzing over 1300 distinct research papers, each identified by a unique Digital Object Identifier (DOI). The resulting dataset currently includes 1984 unique entries, linking to more than 2500 POM-containing materials. Given the crucial role of nuclearity and charge in characterizing POMs, our data are presented in terms of overall charge plotted as a function of molecular mass, as illustrated in Figure 3.a. This panel depicts the broad distribution of molecular weights and corresponding charges, highlighting the chemical diversity within polyoxometalates. Notably, extreme examples such as the polyoxomolybdate [Mo368O1032H16(H2O)240(SO4)48]48− [27], the polyoxotungstate [(P8W48O184){(P2W14Mn4O60)(P2W15Mn3O58)2}4]152− [28], the dense polyoxouranate [(UO2)120(O2)120(C2O4)90]180− [29], and the robust polyoxoniobate [Nb288O768(OH)48(CO3)12]180− [30], are easily distinguishable, demonstrating the current extents of the POM chemical space.
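The two quantities plotted against each other can be derived from an entry's elemental composition and charge. The sketch below is an illustration under stated assumptions: the atomic-weight table holds rounded standard values for only the elements used here, and the helper names `molecular_mass` and `charge_per_kda` are hypothetical rather than part of the dataset's code.

```python
# Rounded standard atomic weights (g/mol) for the elements in the examples.
ATOMIC_WEIGHT = {
    "H": 1.008,
    "C": 12.011,
    "O": 15.999,
    "P": 30.974,
    "Nb": 92.906,
    "W": 183.84,
}

def molecular_mass(elements: dict) -> float:
    """Sum atomic weights over an element -> count mapping."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in elements.items())

def charge_per_kda(charge: float, mass: float) -> float:
    """Overall charge normalised to 1000 g/mol of cluster mass."""
    return charge / (mass / 1000.0)

# Compositions follow from the formulas quoted in the text.
wells_dawson = {"P": 2, "W": 18, "O": 62}            # [(PO4)2(WO3)18]6-
nb288 = {"Nb": 288, "O": 852, "H": 48, "C": 12}      # [Nb288O768(OH)48(CO3)12]180-

for name, elements, charge in [
    ("Wells-Dawson", wells_dawson, -6.0),
    ("Nb288", nb288, -180.0),
]:
    mass = molecular_mass(elements)
    print(name, round(mass, 1), round(charge_per_kda(charge, mass), 2))
```

Normalising charge by mass gives one number per cluster that can be compared across very different nuclearities, which is the comparison the charge versus mass plot makes visually.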
A further qualitative analysis of the plot in Figure 3.b reveals that as the charge increases while the molecular mass decreases or remains constant, there are fewer or no POMs observed. This trend suggests an over-reduction of the addenda centers, which likely requires coordination changes not captured by this dataset. Nonetheless, this pattern can serve as empirical guidance to evaluate the feasibility of proposed POM formulas. Focusing on the region characterized by a charge of less than -75 and a molecular mass below 30,000 g/mol, we observe that this is where the majority of POMs are situated. By employing different colorings for analysis, it becomes apparent that many polyoxomolybdates are located in the domain of high molecular mass but relatively low overall charge. Conversely, polyoxoniobates appear as one of the most negatively charged species with lower molecular weights, highlighting the diverse charge and mass relationships within POMs.
The current set of POM entries is primarily dominated by 1023 polyoxotungstate formulas. The prevalence of tungstates is likely due to tungsten’s ability to form strong, non-labile bonds in high oxidation states and to generate lacunary POM species that facilitate further chemical reactions, enhancing the diversity of tungstate-based POMs. Following tungstates are polyoxomolybdates with 482 entries, polyoxovanadates with 206, polyoxoniobates with 102, and polyoxotantalates with 30 entries; an additional 141 entries are categorized under "other," including diverse metal-oxo clusters. The dataset showcases chemical diversity with 75 unique elements across all entries, illustrating a broad range of structural combinations. Specifically, the distribution of unique elements per POM varies from two to eight, with the majority of structures, 603, containing five elements, often reflecting the integration of heteroatoms and hybrid species involving carbon, which underscores the complex compositional variety within this dataset.
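The class counts above add up to the 1984 entries of the dataset, and a tally of this kind can be reproduced from the "Contains Elements" mappings. The sketch below assumes a simple classification rule (the most abundant addenda metal among W, Mo, V, Nb, and Ta decides the class, otherwise "other"); the paper does not state the rule the authors applied, so `classify` is a hypothetical stand-in, and the three sample compositions are derived from formulas quoted in this paper.

```python
from collections import Counter

# Class sizes as reported in the text.
REPORTED = {
    "polyoxotungstate": 1023,
    "polyoxomolybdate": 482,
    "polyoxovanadate": 206,
    "polyoxoniobate": 102,
    "polyoxotantalate": 30,
    "other": 141,
}

ADDENDA = {
    "W": "polyoxotungstate",
    "Mo": "polyoxomolybdate",
    "V": "polyoxovanadate",
    "Nb": "polyoxoniobate",
    "Ta": "polyoxotantalate",
}

def classify(elements: dict) -> str:
    """Assumed rule: the most abundant addenda metal names the class."""
    present = {el: n for el, n in elements.items() if el in ADDENDA}
    if not present:
        return "other"
    return ADDENDA[max(present, key=present.get)]

def element_count_histogram(entries: list) -> Counter:
    """Count how many entries contain 2, 3, 4, ... distinct elements."""
    return Counter(len(elements) for elements in entries)

samples = [
    {"P": 2, "W": 18, "O": 62},                  # [(PO4)2(WO3)18]6-
    {"U": 120, "O": 840, "C": 180},              # [(UO2)120(O2)120(C2O4)90]180-
    {"Nb": 288, "O": 852, "H": 48, "C": 12},     # [Nb288O768(OH)48(CO3)12]180-
]

print(sum(REPORTED.values()))
print([classify(s) for s in samples])
print(element_count_histogram(samples))
```

Run over all entries, the histogram function would yield the two-to-eight element distribution described above.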

4. Conclusions

This paper presents a comprehensive dataset of nearly 2000 POMs, currently featuring details such as molecular formulas, charges, and DOI identifiers, with the aim of expanding it in the near future to include topological, synthetic, and functional properties. The dataset is designed to evolve, incorporating charge distributions and mass projections to enhance material data projects through machine learning and cheminformatics. Upcoming enhancements will also cover polyoxometalates’ speciation in various environments to assess protonation levels, isomerism, and charge distribution for automated computational analyses, as well as integrated components for retrosynthetic strategies, advancing both fundamental and applied research in polyoxometalate chemistry.

Author Contributions

AK, NG, and AR conceptualized the need for the dataset. AK designed the data structure. AK and NG were involved in data curation. NG performed validation checks to ensure the reliability of the data. The manuscript was completed with contributions from all authors.

Funding

This research was funded in part by the Austrian Science Fund (FWF) [DOI 10.55776/P33089 (to A.R.); DOI 10.55776/P33927 (to N.G.)].

Data Availability Statement

One can access the “Curated Polyoxometalate Formula” dataset using the Digital Chemistry git repository at the following URL: https://github.com/digital-chemistry/Curated_POMs (accessed on 27 August 2024).

Acknowledgments

The University of Cambridge and the University of Vienna are thanked for research support. We extend our gratitude to Ella Duvanova for her invaluable assistance with data verification.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
POM Polyoxometalate
DOI Digital Object Identifier
CSV Comma-separated values
JSON JavaScript Object Notation

References

  1. Ongari, D.; Yakutovich, A.V.; Talirz, L.; Smit, B. Building a consistent and reproducible database for adsorption evaluation in covalent–organic frameworks. ACS Cent. Sci. 2019, 5, 1663–1675.
  2. Moghadam, P.Z.; Li, A.; Wiggin, S.B.; Tao, A.; Maloney, A.G.; Wood, P.A.; Ward, S.C.; Fairen-Jimenez, D. Development of a Cambridge Structural Database subset: a collection of metal–organic frameworks for past, present, and future. Chem. Mater. 2017, 29, 2618–2625.
  3. Yang, S.; Lach-Hab, M.; Vaisman, I.I.; Blaisten-Barojas, E.; Li, X.; Karen, V.L. Framework-type determination for zeolite structures in the inorganic crystal structure database. J. Phys. Chem. Ref. Data 2010, 39.
  4. Zheng, C.; Li, Y.; Yu, J. Database of open-framework aluminophosphate structures. Sci. Data 2020, 7, 107.
  5. Kancharlapalli, S.; Snurr, R.Q. High-throughput screening of the CoRE-MOF-2019 database for CO2 capture from wet flue gas: a multi-scale modeling strategy. ACS Appl. Mater. Interfaces 2023, 15, 28084–28092.
  6. Liu, R.; Streb, C. Polyoxometalate-single atom catalysts (POM-SACs) in energy research and catalysis. Adv. Energy Mater. 2021, 11, 2101120.
  7. Gumerova, N.I.; Rompel, A. Interweaving disciplines to advance chemistry: Applying polyoxometalates in biology. Inorg. Chem. 2021, 60, 6109–6114.
  8. Barba-Bon, A.; Gumerova, N.I.; Tanuhadi, E.; Ashjari, M.; Chen, Y.; Rompel, A.; Nau, W.M. All-Inorganic Polyoxometalates Act as Superchaotropic Membrane Carriers. Adv. Mater. 2024, 36, 2309219.
  9. Bijelic, A.; Rompel, A. Ten good reasons for the use of the tellurium-centered Anderson–Evans polyoxotungstate in protein crystallography. Acc. Chem. Res. 2017, 50, 1441–1448.
  10. Bijelic, A.; Aureliano, M.; Rompel, A. The antibacterial activity of polyoxometalates: structures, antibiotic effects and future perspectives. ChemComm 2018, 54, 1153–1169.
  11. Bijelic, A.; Aureliano, M.; Rompel, A. Polyoxometalates as potential next-generation metallodrugs in the combat against cancer. Angew. Chem. Int. Ed. 2019, 58, 2980–2999.
  12. Kondinski, A. Metal–metal bonds in polyoxometalate chemistry. Nanoscale 2021, 13, 13574–13592.
  13. Kondinski, A.; Banerjee, A.; Mal, S.S. Hervé- and Krebs-Type Magnetic Polyoxometalate Dimers. Magnetochemistry 2022, 8, 96.
  14. Gumerova, N.I.; Rompel, A. Speciation atlas of polyoxometalates in aqueous solutions. Sci. Adv. 2023, 9, eadi0814.
  15. Gumerova, N.I.; Rompel, A. Polyoxometalates in solution: speciation under spotlight. Chem. Soc. Rev. 2020, 49, 7568–7601.
  16. Surman, A.J.; Robbins, P.J.; Ujma, J.; Zheng, Q.; Barran, P.E.; Cronin, L. Sizing and discovery of nanosized polyoxometalate clusters by mass spectrometry. J. Am. Chem. Soc. 2016, 138, 3824–3830.
  17. Kondinski, A.; Rasmussen, M.; Mangelsen, S.; Pienack, N.; Simjanoski, V.; Näther, C.; Stares, D.L.; Schalley, C.A.; Bensch, W. Composition-driven archetype dynamics in polyoxovanadates. Chem. Sci. 2022, 13, 6397–6412.
  18. Vilà-Nadal, L. POMzites: A roadmap for inverse design in metal oxide chemistry. Int. J. Quantum Chem. 2021, 121, e26493.
  19. Soria-Carrera, H.; Atrián-Blasco, E.; de la Fuente, J.M.; Mitchell, S.G.; Martín-Rapún, R. Polyoxometalate–polypeptide nanoassemblies as peroxidase surrogates with antibiofilm properties. Nanoscale 2022, 14, 5999–6006.
  20. Cameron, J.M.; Guillemot, G.; Galambos, T.; Amin, S.S.; Hampson, E.; Haidaraly, K.M.; Newton, G.N.; Izzet, G. Supramolecular assemblies of organo-functionalised hybrid polyoxometalates: from functional building blocks to hierarchical nanomaterials. Chem. Soc. Rev. 2022, 51, 293–328.
  21. Pope, M.T.; Jeannin, Y.; Fournier, M. Heteropoly and isopoly oxometalates; Vol. 8, Springer, 1983.
  22. Krivovichev, S.V. Which inorganic structures are the most complex? Angew. Chem. Int. Ed. 2014, 53, 654–661.
  23. Bard, A.J.; Dance, I.G.; Day, P.; Ibers, J.A.; Kunitake, T.; Meyer, T.J.; Mingos, D.M.P.; Roesky, H.W.; Sauvage, J.P.; Simon, A.; others. Bonding and charge distribution in polyoxometalates: a bond valence approach; Springer, 1999.
  24. Heller, S.R.; McNaught, A.; Pletnev, I.; Stein, S.; Tchekhovskoi, D. InChI, the IUPAC international chemical identifier. J. Cheminform. 2015, 7, 1–34.
  25. Gabrielson, S.W. SciFinder. J. Med. Libr. Assoc. 2018, 106, 588.
  26. Paskin, N. Digital object identifier (DOI®) system. Encycl. Libr. Inf. Sci. 2010, 3, 1586–1592.
  27. Müller, A.; Beckmann, E.; Bögge, H.; Schmidtmann, M.; Dress, A. Inorganic Chemistry Goes Protein Size: A Mo368 Nano-Hedgehog Initiating Nanochemistry by Symmetry Breaking. Angew. Chem. Int. Ed. 2002, 41, 1162–1167.
  28. Fang, X.; Kögerler, P.; Furukawa, Y.; Speldrich, M.; Luban, M. Molecular growth of a core–shell polyoxometalate. Angew. Chem. Int. Ed. 2011, 50, 5212–5216.
  29. Ling, J.; Qiu, J.; Burns, P.C. Uranyl peroxide oxalate cage and core–shell clusters containing 50 and 120 uranyl ions. Inorg. Chem. 2012, 51, 2403–2408.
  30. Wu, Y.L.; Li, X.X.; Qi, Y.J.; Yu, H.; Jin, L.; Zheng, S.T. {Nb288O768(OH)48(CO3)12}: A Macromolecular Polyoxometalate with Close to 300 Niobium Atoms. Angew. Chem. Int. Ed. 2018, 57, 8572–8576.
Figure 1. Illustration of an α-Wells-Dawson polyoxometalate instance, showcasing its connection to different chemical formulas on the right and various crystalline materials on the left, each linked to literature (e.g., DOI: 10.1039/C3DT51120K).
Figure 2. Hierarchical schema of polyoxometalate data. (a) Shows the relationship between polyoxometalate materials and their characteristics, including elemental composition, molecular formulas, and bibliographic references. (b) Provides a JSON example of how these attributes are digitally organized, illustrating a specific polyoxometalate entity with its associated material details.
Figure 3. Charge vs. Molecular Mass in Polyoxometalates: (a) shows the overall distribution for reported polyoxometalates, highlighting broad molecular weight trends. (b) zooms into molecular masses up to 30,000 Da, detailing finer distribution characteristics. These plots explore the chemical space of polyoxometalates.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.