Preprint
Article

Standards for Data Spaces Building Blocks

A peer-reviewed article of this preprint also exists.

Submitted:

26 July 2024

Posted:

29 July 2024

Abstract
Data spaces are a relatively recent concept for a trusted and secure distributed data ecosystem through which to exchange resources on the Web. Several efforts are currently under way to define guidance for data space implementation, with some initiatives and organizations providing solutions and standards that address interoperability and good data exchange practices, especially in the domain of geospatial information. Geospatial use cases often require effective and high-quality data sharing between systems dealing with different aspects of real-world objects. However, the different solutions proposed do not yet provide a common interoperability framework with mature implementation options. This gap between concept and implementation risks confusing users and developers, instead of reciprocally strengthening the respective data spaces' infrastructures and capabilities. This paper reviews and compares some of the proposed solutions, mapping and integrating these blueprints with available interoperability standards. This concrete mapping is designed to support effective practical implementation of data spaces and to guide the development of future solutions.
Subject: Engineering  -   Other

1. Introduction

The concept of data space [2,3] was mentioned in the early 2000s in relation to the web of data, enabled by Linked Data technologies [1], or as an abstraction for the need for integrated management of diverse data sources for different applications. Data spaces should minimize the bottlenecks to data exchange that result from diverse dataset storage and representation choices and from the lack of trusted channels to communicate data between different stakeholders. A key concern is the need to improve semantic interoperability, i.e., the ability for stakeholders to find, access, and interpret available data and processing services, and ultimately evaluate and make effective re-use of resources to deliver value. These factors, in particular, generate issues that impede the efficiency of data retrieval and harvesting as well as the automation of processing pipelines.
In parallel to the conceptualisation of data spaces, a number of open data paradigms, theories and sharing implementations have been formulated and developed, as well as national or regional open data strategies [4,5,6]. The underlying concept driving such open data approaches is that the value of the data is not in its cost but in its usage and the value it brings to the society.
The potential value of open data was immediately grasped in relation to scientific data and especially for environmental use cases, for which a cross-border approach is obviously essential [22]. At the same time, in Europe, the Directive for an Infrastructure for Spatial Information in the European Community (INSPIRE) was formulated with similar goals (Directive 2007/2/EC [23]). It was preceded by the Directive 2003/98/EC [24] on the re-use of public sector information, known as the PSI Directive, or Open Data Directive. However, the INSPIRE Directive introduced governance means to support higher efficiency in implementation, such as a roadmap and sanctions.
Other international institutions have also been developing successful architectures and standards for data exchange within their specific domains. For example, the World Meteorological Organisation (WMO) has used defined standards to exchange data since 1951. WMO’s approach originally focused on so-called "push" systems (data broadcast to everyone listening), which exploited telecommunications to exchange standardized open data in real time. From 1999, it moved to Web services and "pull" systems (data is available and can be requested, which is closer to the current data space concept) with the WMO Information System (WIS) [25].
These initial formulations and the successful example of WMO were followed by numerous other initiatives and organisations in various fields, and standards and principles were developed, which are now well-known. Examples include the standards developed within the World Wide Web Consortium (W3C) [26], founded in 1994; within the Open Geospatial Consortium (OGC), founded in 1994 [27]; and the principles defined by the Group on Earth Observations (GEO), established in 2005 [28]. Geospatial data-related organizations quickly understood the need for good data-sharing solutions and practices to successfully support their use cases (e.g., Earth Observations, satellite information, environment-related applications). Therefore, they actively started to collaborate to define standards, and build consensus over those standards.
More recently, other organisations have reformulated and further specified the concept of “data space” (Section 3.1.1) and begun to propose solutions to some of the implied challenges. For example, the International Data Spaces Association (IDSA) defines data spaces as data exchange infrastructures characterized by uniform rules, certified data providers and recipients, and trust among public and private partners [29]. In recent years, the European Union has developed a greater interest in providing a shared European infrastructure through which both public and private data may be exchanged and retrieved in a reliable and cost-effective way, with the goal of providing shared data-driven products and services across Member States. The European Commission understands data spaces as infrastructures and data that are open for the participation of all organisations and individuals; are secure and privacy-preserving to pool, access, share, and use data; respect EU rules and values, especially personal data protection, consumer protection, and competition law; and empower data holders to make their data available for reuse for free or against compensation. This goal became part of the European Data Strategy, and several projects were funded by the European Commission to investigate the topic and provide working solutions (Section 3.1.1). The same concept is being investigated and developed in other parts of the world, often without specifically referring to it as a ‘data space’, for example, by the NGA National Unclassified Data Lake (NUDL) in the US.
Several projects, initiatives, and organizations have provided valuable blueprints and architectures, representing the challenges to be addressed when planning data spaces (see a review in Deliverable 3.2 [30] of the HORIZON Europe USAGE [31] project). However, in these proposals, the connection to the huge palette of existing standards and solutions, often already adopted in practice, is not always straightforward.
Therefore, in this paper, various data space initiatives are mapped to the interoperability and sharing solutions and standards provided by or adopted by several key international organisations. This mapping especially considers the standards relevant to the geospatial domain (e.g., OGC, GEO, W3C). This offers an initial starting point to consolidate a common framework to which different solutions, coming from different initiatives, can be related.
This review provides an overview of how far the currently available standards and services address the defined challenges. On the one hand, it identifies the scope of expertise of the geospatial information management-related organizations on which this paper focuses. On the other hand, it may help to identify the remaining gaps in data space developments.
The European Commission has proposed 14 relevant domains for which data spaces are being investigated and developed [32], namely: Agriculture, Cultural Heritage, Energy, Finance, Green Deal, Health, Language, Manufacturing, Media, Mobility, Public Administration, Research and innovation, Skills, and Tourism. While for some of those sectors, geospatial information is not identified as essential (e.g. Finance, Language, Media, Research and Innovation, Skills), for others, it is at the core of any analysis and decision-making activities (Energy, Green Deal, Mobility, Tourism). For the remaining domains, geospatial data might not be essential but deserves to be considered and included because of the high probability of its relevant added value. Therefore, considering current solutions for geospatial information is useful to build an architecture on established infrastructures, besides being a potential reference for other kinds of data.

1.1. Principles and Recommendations for Data Interoperability and Management

In the efforts to enable such a data exchange ecosystem, some principles have been formulated to guide good practices for data management and publication. They come from different sectors, e.g., research, for the FAIR principles [9]; legal directives, such as the European Interoperability Framework (EIF) [10]; or practice and operational initiatives, such as the GEO Data Sharing and Data Management Principles.
Moreover, their scope and goals are slightly different and, in some cases, complementary rather than overlapping.
For example, the focus area is different. The EIF mostly addresses the practices of European public administrations, while the FAIR principles consider researchers and stakeholders from a global perspective. The four interoperability layers in the EIF are legal, organisational, technical, and semantic, while the GEO Principles consider primarily technical aspects. The FAIR principles require consideration of all layers. However, a decision to reuse is ultimately the result of the evaluation of many factors, such as the semantic relevance of resources to a problem, the ease of use from a legal or licensing perspective, and the cost-effectiveness of access and integration. Therefore, it is valid to directly compare EIF, FAIR, and GEO here.
By addressing different sectors and having distinct objectives and components, the EIF and GEO can be seen to serve complementary but different roles in the landscape of interoperability and data sharing.

1.1.1. Findable Accessible Interoperable and Reusable (FAIR) Principles

In 2016, Wilkinson et al. [9] published the ‘FAIR guiding principles for scientific data management and stewardship’ (Table 1), to support good practices allowing Findability, Accessibility, Interoperability and Reusability (FAIR) of data for the benefit of science and society [36]. These became a reference for the practices, laws and research environments of the main interoperability and digitalisation-related institutions. The FAIR principles are an enabling factor for data spaces as well.

1.1.2. The European Interoperability Framework

The European Union [10] gives recommendations and guidance to support a shared and interoperable digital environment for communication and exchange of data with the public administrations in Europe. There, interoperability is defined as “the ability of organisations to interact towards mutually beneficial goals, involving the sharing of information and knowledge between these organisations, through the business processes they support, by means of the exchange of data between their ICT systems.”
It provides 12 interoperability principles, divided into 4 categories. For each principle, recommendations are proposed. In addition, the European Interoperability Framework defines an interoperability model, structured on four layers (legal, organisational, semantic, technical) plus a transversal one related to integrated public service governance, for which other recommendations are provided. Finally, it defines a conceptual model of the interoperability components, again with related recommendations.

1.1.3. GEO Data Sharing and Data Management Principles

Since 2015, GEO has promoted fundamental principles for data sharing, recognising data sharing as a key factor in realising the potential societal benefits from Earth observation. GEO Data Sharing Principles are [37]:
  • data, metadata and products will be shared as Open Data by default, by making them available as part of the GEOSS Data Collection of Open Resources for Everyone (Data-CORE) without charge or restrictions on reuse, subject to the conditions of registration and attribution when the data are reused;
  • where international instruments, national policies or legislation preclude the sharing of data as Open Data, data should be made available with minimal restrictions on use and at no more than the cost of reproduction and distribution;
  • all shared data, products and metadata will be made available with minimum time delay.
Beyond the GEO Data Sharing Principles, ten GEO Data Management Principles were defined by the GEO Data Management Principles Task Force in April 2015 [37]. They are grouped under five categories: discoverability, accessibility, usability, preservation, and curation.
The GEO Data Sharing and Data Management Principles Subgroup of the GEO Data Working Group has made a first mapping of these GEO Data Management Principles to the FAIR Principles and is working on a more refined realignment. They also advocate the Transparency, Responsibility, User focus, Sustainability (TRUST) principles [11] for digital repositories and the Collective Benefit, Authority to Control, Responsibility, Ethics (CARE) [38] principles for Indigenous data governance. All these principles are of interest for data spaces.

1.2. Established Interoperability and Standardisation Organisations and Initiatives

Some of the groups working towards consensus on interoperability across diverse operations are particularly relevant to geospatial information, including the TC 211 committee of the International Organization for Standardization (ISO), the ‘Spatial Data on the Web’ Working Group of the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC). Others, such as the Group on Earth Observations (GEO), recommend standards and interoperability by publishing principles and best practices and by providing demonstrative portals.
The International Organization for Standardization (ISO) [33] is recognised as a global institution for publishing standards, covering a wide range of domains. Founded in 1947, it is an independent, non-governmental international organisation with more than 150 national standards bodies as members. ISO brings together experts to share knowledge and develop voluntary, consensus-based International Standards, supporting market innovation and interoperability. ISO standards are a general reference and are usually considered a priority for compliance.
The Open Geospatial Consortium (OGC) [27] is a global consortium that provides open standards and solutions to support interoperability for geospatial data. The OGC operates a range of liaisons with ISO, W3C, and others within this specialised context. It has a particular emphasis on improving Findability, Accessibility, Interoperability and Reuse (FAIR) (Section 1.1.1) through the openness of its standards. Over the past thirty years, OGC has brought together approximately 500 members from different sectors (research, industry, government), mostly with common interests in geospatial information management. Several standards and solutions published by the OGC are well known and widely adopted, and can be considered as options to address the required functionalities of data spaces.
The World Wide Web Consortium (W3C) [26] was founded in 1994 and publishes standards enabling the development of the World Wide Web. Many of the other standardisation actions for interoperability across the web build on W3C standards. In particular, a joint W3C-OGC working group, namely the ‘Spatial Data on the Web Working Group’, running since around 2017, gathered interoperability and data sharing experts from major organisations around the world and is active in providing and maintaining vocabularies, best practices and documents supporting effective use and better sharing of spatial data on the web. Moreover, the work of the group identifies where joint action in developing standards is needed from W3C and OGC [34]. Although formulated in terms of joint standards publishing, and focused on spatial data, the scope of the working group is very much aligned with the data spaces concept. The deliverables produced can, therefore, be considered an effective base to be extended in data space implementation. In particular, the Spatial Data on the Web Best Practices (latest version [7]) proposes practices and solutions for publishing spatial data FAIRly through the web (including principles, data representation and documentation, validation, data access, metadata, and data ethics), and analyses current limitations for future development. It also proposes an interesting extension to the FAIR principles, including web accessibility for humans and machines and data quality. Another part of the group's work concerns the legal and ethical implications of sharing data over the web and provides guidance [8].
Some very relevant requirements have emerged from summits and policies related to sustainability, resilience, and disaster risk reduction. At the global level and initiated in 2003 to support work on these challenges, the Group on Earth Observations (GEO) is a partnership of national governments and participating organisations wishing to coordinate and collaborate on the management and use of Earth Observations. The GEO community developed the Global Earth Observation System of Systems (GEOSS) [35] aiming at the integration of the different observing systems and connection of infrastructures through common standards.

1.3. Software Interoperability Standards

Data spaces need software components that support data exchange, discovery, evaluation and use. The field of software development has also developed standards related to software quality, including interoperability and reciprocal connections between components; these should also be taken into account. For example, the ISO/IEC 25000 series defines, in ISO/IEC 25010 ‘System and software quality models’, software product quality criteria that include categories such as ‘compatibility’ and ‘interaction capability’. These overlap with some principles and issues addressed within the data space blueprints and proposed architectures (Figure 1). In the same ISO/IEC 25000 series, the ISO/IEC 25012 data quality model [29] indicates parameters for assessing the quality of data, several of which relate to the capability of data to be accessed, understood and tracked. These examples clearly fall within the same scope as the data space objectives.
Another relevant standard related to software interoperability is the Open Systems Interconnection (OSI) model, ISO/IEC 7498. This defines a layered architecture for systems interconnections and communication, including data exchange. Each layer includes definitions and guidance which may align with the concerns of data spaces: application, presentation, session, transport, network, data link, physical [41].
Finally, the Reference Model of Open Distributed Processing (RM-ODP) [70], standard ISO/IEC 10746 [71], is considered for the technical blueprint description in the GREAT project, as well as in several other initiatives and organisations (e.g. OGC). The viewpoints recommended by the standard for the modelling of software architectures are:
  • Enterprise - Business requirements of the system (purpose, scope, policies);
  • Information - Information managed by the system and the structure and content type of the supporting data (semantics and information processing);
  • Computational - Functionality provided by the system and its functional decomposition (objects which interact at interfaces);
  • Engineering - Distribution of processing performed by the system to manage the information and provide the functionality (mechanisms and functions required to support distributed interactions between objects in the system);
  • Technology - Technologies chosen to provide the processing, functionality and presentation of information.

1.4. Data Spaces Concept

A "data space", as currently defined, is a data exchange paradigm that envisions a distributed architecture or federated data ecosystem in which data remains close to its owners or providers, who keep full control over their data and manage the scope of, and conditions for, its use throughout the life cycle of that data. Data would be effectively exchanged over the Internet using trusted connections for the benefit of various use cases. A data space is defined by a governance framework that enables secure and trustworthy data transactions between participants (DSSC Glossary [42] and Starter Kit [43]).
A data space is implemented by one or more infrastructures and enables one or more use cases. A data space is supposed to have decentralized data storage, i.e., data physically remains with the respective data owner until it is transferred to a trusted party [12]. The users of such data spaces should be trusted parties enabled to access data in a secure, transparent, trusted, easy, and unified fashion, according to commonly agreed principles [13]. Access and usage rights can only be granted by persons or organisations entitled to dispose of the data [14].
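The idea that data stays with its owner and is released only to trusted parties under owner-defined conditions can be illustrated with a minimal sketch. It is not taken from any of the cited specifications: the class names `UsagePolicy` and `DataOwner`, the policy fields, and the participant names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class UsagePolicy:
    """Conditions a data owner attaches to a dataset (hypothetical model)."""
    allowed_purposes: frozenset
    valid_until: date


@dataclass
class DataOwner:
    """A participant that keeps its data locally and decides who may use it."""
    name: str
    trusted_parties: set = field(default_factory=set)
    _datasets: dict = field(default_factory=dict)  # dataset id -> (payload, policy)

    def publish(self, dataset_id, payload, policy):
        # The payload never leaves the owner here; only the policy governs release.
        self._datasets[dataset_id] = (payload, policy)

    def request(self, dataset_id, requester, purpose, today):
        """Return the payload only if requester, purpose and date satisfy the policy."""
        payload, policy = self._datasets[dataset_id]
        if requester not in self.trusted_parties:
            raise PermissionError(f"{requester} is not a trusted party")
        if purpose not in policy.allowed_purposes:
            raise PermissionError(f"purpose '{purpose}' is not permitted")
        if today > policy.valid_until:
            raise PermissionError("usage policy has expired")
        return payload


# Hypothetical participants and dataset, for illustration only.
owner = DataOwner(name="city-a", trusted_parties={"university-b"})
owner.publish(
    "air-quality-2024",
    {"pm10": [21, 18, 25]},
    UsagePolicy(frozenset({"research"}), date(2025, 12, 31)),
)
data = owner.request("air-quality-2024", "university-b", "research", date(2025, 1, 15))
```

In a real data space, the trust decision and the policy evaluation would be delegated to identity, trust and usage-control building blocks rather than to an in-memory set; the sketch only shows where the owner's control sits in the exchange.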
The identified advantages of data spaces are [15]:
  • New services relying on enhanced transparency and data sovereignty;
  • Level playing field for data sharing and exchange;
  • A new user behaviour and digital culture, including a higher awareness of digital related ethics. Users will likely become more mindful about treating their data as an asset;
  • Availability of large pools of data;
  • Infrastructure to use and exchange data;
  • Appropriate governance mechanisms.
Data spaces with an appropriate underlying interoperability infrastructure enable shared understanding and reuse of data and processing capabilities.
Although this paper primarily considers the technical aspects of data spaces, these are only one part of the entire challenge, which also includes business, organisational, governance and legal aspects. A recent Joint Research Centre (JRC) document on European Data Spaces [15] summarises their principles, requirements and features, including technical aspects but also a range of other facets (e.g. data sovereignty, citizen centricity, inclusion, self-determination, trust, innovation, scalability, and so on).

1.4.1. Data Spaces beyond Technical Features

The implementation of a data space implies decisions and infrastructures which go beyond the bare technical ones. In fact, specific business models, governance, organisational models and legal aspects need to be taken into account [43].
A specific business model, or a set of multi-sided business models, needs to be developed for a data space to ensure a sustainable business case. It should consider network effects, serve both the supply of and the demand for data and related services, and be built on consensus among the multiple users and organizations involved. Data spaces go through different life cycle stages: the preparatory stage; the implementation stage; the operational stage; and the growing and scaling stages, including maintenance and improvement. The scaling stage is the one with the highest potential, and it is therefore essential that the interoperability solutions chosen are well supported by an established organisation and community.
From a legal point of view, several aspects should be taken into account: the different local and EU legal entitlements to data; the intense European legislative agenda; and the intricate interplay between different relevant regulatory instruments. Legal and governance building blocks therefore need to be proposed. Proposals on the topic come, for example, from the H2020 OpenDEI [44] design principles, the EU Support Centre for Data Sharing (SCDS) [45], and the Data Sharing Coalition [46], which proposes following the Business, Legal, Operational, Functional and Technical (BLOFT) Framework.
The operational aspects of a data space include operational governance agreements, such as compliance with GDPR, onboarding of organisations, decision making, and dispute resolution. Business operations such as process streamlining, automation, marketing, and awareness activities are also important components of operational activities. Examples include monitoring and logging data exchanges to detect and resolve reconciliation issues in a timely manner, and monitoring the whole infrastructure (software, energy, and resources) to ensure efficiency and trustworthy service levels. Guides and documentation should support correct use, management and maintenance of the system, together with means for technical and organisational support.

1.4.2. Main Projects and Initiatives for Data Spaces

The International Data Spaces Association (IDSA) [39] is a global membership association founded in 2016, comprising approximately 140 members, in cross-sectoral fields (research, industry, lawmakers and others). The goal of the association is to provide solutions for data spaces, implying the sharing of data in a data economy model in which everyone can keep full control over their data (data sovereignty), when exchanging them. The focus of IDSA is to ensure reciprocal trust between the actors involved (data providers and consumers), security and data sovereignty, building on and further developing existing standards, technologies, measures and governance models [12].
The Gaia-X European Association for Data and Cloud (Gaia-X) [47] is a non-profit association, founded in 2021, aiming at the formulation and development of a technical framework for federated open data infrastructure based on European values regarding data and cloud sovereignty. The mission of Gaia-X is to design and implement a data exchange architecture that consists of common standards for data sharing, best practices, tools, and governance mechanisms [16]. Gaia-X proposes a complementary Data Ecosystem and Infrastructure Ecosystem, which will build upon each other.
FIWARE Foundation [48], established in 2016, develops and proposes components intended to support open-source platforms and connected solutions. It is a member-based organisation proposing open standards for interoperability. The focus of FIWARE is information and communication technology (ICT). It provides components, called ‘Generic Enablers’, and a Reference Architecture intended to be further extended and developed based on an application’s needs.
Big Data Value Association (BDVA) [49]/DAIRO, FIWARE Foundation, Gaia-X, and IDSA formed the Data Spaces Business Alliance (DSBA) and in April 2023 they published the ‘Technical Convergence Discussion Document’ [17], also endorsed by DSSC [50]. It defines a common technology framework, based on the technical convergence of the respective architectures and models, leveraging mutual infrastructure and implementation efforts.
Within the DSBA [17], the four organisations, which define themselves as a reference for data space deployment, assume the following roles (Figure 21):
  • BDVA: knowledge and general understanding for data usage
  • FIWARE Foundation: components for digital twin data exchange, decentralized Identity and Access Management relying on existing trust frameworks, and data services publication/trading
  • Gaia-X: Global cross-dataspace governance based on European values
  • IDSA: Data Space Connectors, Usage Contract Negotiation, and general creation of a data space
The Data Spaces Support Centre (DSSC) is funded by the European Union Digital Europe Programme under grant agreement n° 101083412. It started in October 2022 and will run until May 2025 [51]. DSSC aims to support the deployment of data spaces by providing assets in cooperation with a network of stakeholders. It developed a stack of high-level data space building blocks, starting from an evolution of the proposal by the OpenDEI project [44], which has found consensus among several organisations and initiatives working on the topic of data spaces. This stack was mostly developed based on consensus among the DSSC partners.
The data space for smart and sustainable cities and communities (DS4SSCC) [52] developed a Catalogue of Specifications providing an overview of 11 identified building blocks (BBs) (technical and non-technical). These are based on, or compliant with, most of the DSSC specifications, especially in their technical aspects, and consist of: Data Models; Data Exchange; Provenance and Traceability; Identity Management; Trust; Access and Usage policies and control; Data, Services and Offerings descriptions; Publication and Discovery; Marketplaces; Business Agreements; Organisational and Operational Agreements.
For each BB, the commonly used standards, industry specifications and reference implementations are identified. In addition, the building blocks are mapped to the Minimum Interoperability Mechanisms defined by the Open Agile and Smart Cities Association.
The Green Deal Data Space (GREAT) project [53] is funded by the Digital Europe programme, runs from 2022 to 2024, and works towards defining the foundations for a Green Deal Data Space as well as the connected community of practice. It works on five pillars: the Community of Practice, priority datasets, a technical blueprint, governance and business models, and a roadmap.
They published the Technical Blueprint [54] for Green Deal Data Spaces in November 2023. A digital ecosystem service design is proposed, as a “secure, trusted and seamless sharing of data to support Green Deal applications” [54]. They state the general principles of: Inclusiveness (i.e., all datasets should be allowed to participate in the data space); Fairness (i.e., provide equal possibility of access to the data space); Autonomy (i.e., each data source should maintain its own management, while being allowed to be included in the data space). Moreover, the following design principles were identified (GREAT Blueprint):
  • Low Entry Barrier: for both data providers and data consumers.
  • System of Systems: the GDDS is designed as a System of Systems (SoS) to interconnect many independent, autonomous systems, frequently of large dimensions, to satisfy a global goal (i.e., the GDDS Digital Ecosystem service) while keeping them autonomous.
  • Standardization and Mediation: the GDDS will rely on interoperability standards, developed at community level, complementing them with mediation/brokering to enable cross-domain interoperability.
  • Data as entry point: the GDDS focuses on the sharing and use of data, independently of how the data is generated (e.g., off-line, on-the-fly, etc.).
  • Loose-coupling: the GDDS Digital Ecosystem is enabled by a set of APIs which can be used by data consumers to leverage (and enrich) the GDDS resources.
  • Interoperability/Security Orthogonality: the GDDS security architecture is orthogonal to the GDDS interoperability architecture, so that the two issues can be tackled independently.
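The ‘Standardization and Mediation’ principle above can be sketched as a small adapter registry that translates provider-specific records into one community-agreed model. This is only an illustration under stated assumptions: the provider names, field names and the `mediate` function are hypothetical and do not come from the GREAT Blueprint.

```python
from typing import Callable, Dict

# Common target model agreed at community level (hypothetical field names).
COMMON_FIELDS = ("station_id", "temperature_celsius")


def from_provider_a(record: dict) -> dict:
    """Provider A already reports Celsius; only the field names differ."""
    return {"station_id": record["id"], "temperature_celsius": record["temp_c"]}


def from_provider_b(record: dict) -> dict:
    """Provider B reports Fahrenheit, so the mediator converts the unit."""
    celsius = (record["tempF"] - 32) * 5 / 9
    return {"station_id": record["stationCode"], "temperature_celsius": round(celsius, 1)}


# Registry of mediators: one adapter per provider schema.
MEDIATORS: Dict[str, Callable[[dict], dict]] = {
    "provider-a": from_provider_a,
    "provider-b": from_provider_b,
}


def mediate(source: str, record: dict) -> dict:
    """Translate a provider-specific record into the common model."""
    if source not in MEDIATORS:
        raise ValueError(f"no mediator registered for {source!r}")
    result = MEDIATORS[source](record)
    missing = set(COMMON_FIELDS) - set(result)
    if missing:
        raise ValueError(f"mediated record lacks fields: {sorted(missing)}")
    return result


a = mediate("provider-a", {"id": "A1", "temp_c": 20.0})
b = mediate("provider-b", {"stationCode": "B7", "tempF": 68.0})
```

Consumers then work only against the common model, so a new provider can join by registering one adapter, without any change on the consumer side.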
GREAT proposes to use the ISO Reference Model for Open Distributed Processing (RM-ODP) for describing the proposed architecture. They consider a number of viewpoints as follows:
  • Enterprise (purpose, scope and policies governing the activities of the specified system within the organization of which it is a part);
  • Information (concerned with the kinds of information handled by the system and constraints on the use and interpretation of that information);
  • Computational (functional decomposition of the system into a set of objects that interact at interfaces - enabling system distribution);
  • Engineering (infrastructure required to support system distribution);
  • Technology (technology to support system distribution).
In addition they identify and define some solutions supporting security and trust, especially regarding:
  • authentication (process of verifying a claim that a system entity or system resource has a certain attribute value);
  • access control (protection of system resources against unauthorized access);
  • confidentiality (data is not disclosed to system entities unless they have been authorized to know the data);
  • integrity (data has not been changed, destroyed, or lost in an unauthorized or accidental manner);
  • non-repudiation (protection against false denial of involvement in an association).
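As an illustration of the integrity property only, the following minimal Python sketch compares a digest published alongside a dataset with the digest of the payload actually received. The function names `digest` and `is_intact` and the sample payload are hypothetical; the sketch does not reflect how the GREAT Blueprint implements integrity, and it does not cover authentication or non-repudiation, which would require keys or signatures.

```python
import hashlib
import hmac


def digest(payload: bytes) -> str:
    """Return the SHA-256 digest of a payload as a hexadecimal string."""
    return hashlib.sha256(payload).hexdigest()


def is_intact(payload: bytes, expected_digest: str) -> bool:
    """Check that the received payload matches the digest published by the provider."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(digest(payload), expected_digest)


original = b"station_id,temperature\nA1,21.5\n"
published = digest(original)            # shared by the provider with the data
tampered = original.replace(b"21.5", b"31.5")

print(is_intact(original, published))
print(is_intact(tampered, published))
```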
The GREAT Blueprint (2023) finally lists in its annex the relevant services and components that might be useful to address the identified functionalities and building blocks, including some standardised solutions already in place in practice.
The EU is funding several initiatives related to Common European Data Spaces for sector and domain-specific initiatives, notably the Data Spaces Support Centre and Smart Open-source Middleware (Simpl) [56] under the Digital Europe programme and Horizon Europe. Simpl is intended to develop an open-source, smart and secure ecosystem (a middleware platform supporting safe and secure access and interoperability). Simpl is composed of three parts: Simpl-Open (open-source software stack for data spaces and other cloud-to-edge federation initiatives); Simpl-Labs (environment for data spaces to experiment with deployment, maintenance, and support of the open-source software and assess their level of interoperability with Simpl); Simpl-Live (instances of Simpl-Open deployed for specific sectoral data spaces where the European Commission itself plays an active role in their management). Initial requirements are being developed by Simpl, which can be considered among the available reference implementations and solutions for standards, although they are not yet mature. These requirements are grouped into three levels: L0 Business Processes; L1 High-Level Requirements; L2 Detailed Requirements [57].
Simpl adds to the principles for software quality described in Section 1.3, by establishing 5 principles:
  • Anchored to specific use cases: Simpl will ensure that data sets and their infrastructure can be seamlessly interconnected and made interoperable.
  • Smart and modular: Simpl will allow the replacement or addition of components without affecting the rest of the system.
  • Open source: Simpl will be publicly accessible, and allow anyone to see, modify and distribute it.
  • Green, scalable and agile: Simpl will allow the monitoring of its environmental performance, and the addition of new users without affecting performance.
  • Secure and interoperable: Simpl will ensure trust, confidence and compliance with regulations are built into the system. This implies an effortless sharing of resources between participants, regardless of their data processing environment. It creates an abstraction layer that enables data to flow across multiple providers and Member States.
Finally, input and discussion on the interoperability and data exchange topic also come from ‘Open Agile and Smart Cities’ (OASC) [58], a global network of cities and actors that have been joining efforts since 2015 to define and agree on solutions for digital transformation. They define the Minimal Interoperability Mechanisms (MIMs) as minimal technical requirements to facilitate digital solutions for cities. The MIMs are mostly based on leveraging open standards and APIs. The following MIMs have been adopted by OASC members:
  • MIM1 – Context – Context Information Management;
  • MIM2 – Data Modules – Shared Data Models;
  • MIM3 – Contracts – Ecosystem Transaction Management;
  • MIM4 – Trust – Personal Data Management;
  • MIM5 – Transparency – Fair Artificial Intelligence;
  • MIM6 – Security – Security Management;
  • MIM7 – Places – Geospatial Information Management;
  • MIM8 – Indicators – Ecosystem Indicator Management;
  • MIM9 – Analytics – Data Analytics Management;
  • MIM10 – Resources – Resource Impact Assessment.

2. Methodology

Mapping these diverse approaches to data spaces to the existing interoperability and sharing solutions and standards widely adopted at European and global level required a robust, detailed and systematic methodology. The starting point was the work of organisations and projects that have worked extensively on the definition of data spaces and related frameworks in recent years. Among all the options, the Data Spaces building blocks stack proposed by the Data Spaces Support Centre project was identified as the most inclusive and high-level, as well as being agreed by several other organisations and project consortia. The building blocks describe the different challenges for producing an effective data space. In this paper, the stack is considered and further refined using a bottom-up approach, starting by mapping the solutions already provided and adopted in practice, with a particular focus on technical standards from the geospatial domain. Interoperability-related principles are also considered for the mapping.
To further refine the mapping, integrating the chosen framework, a workshop was held (Section 2.1) with a panel of experts involved in standardisation organisations and in two of the four Horizon Europe projects funded on the topic of Data spaces for Green Deal (USAGE and AD4GD). As a result, data space building blocks were identified, having a sufficient granularity level to allow consistent mapping of each building block to single functionalities addressed by different standards.
An initial set of the main standards and solutions, as currently available, is then mapped to the identified building blocks, connecting concepts and ideas to proven solutions. This helps to identify where the solutions provided (i.e. working and adopted standards) can already represent a basis for further data space development, facilitating their uptake in general.
Finally, to assist in navigating the wide range of available solutions, some criteria are proposed to provide metrics for consistent description and assessment of each standard. This allows users to evaluate and choose the standards they need. At the same time it can guide standardisation organisations to improve their standards and present them transparently and effectively. However, this is an initial proposal, which will need to be improved through additional research, including testing and a consensus-based process.
Figure 2 summarises the different parts of the work described in this paper.

2.1. Finding a Consensus over the Revised Data Spaces Building Blocks

This study began with work developed in the Green Deal Data Spaces HORIZON Europe project USAGE and extended within the AD4GD project addressing the same topic (Section 3.2 and Section 3.2.1 respectively).
The data spaces building blocks stack provided by the Data Spaces Support Centre project was analysed and mapped to the available standards and shared data management principles within the USAGE project. It was further integrated with reference to specific implementation needs in later phases of the projects, especially considering insights and pilot cases from the AD4GD project. Both the USAGE and AD4GD project consortia include several partners deeply involved in standardisation actions and activities, as well as developers of interoperability solutions. Their agreement on the proposed integration of the DSSC building blocks is therefore also valuable as an expert validation, reinforcing the bottom-up approach adopted in the first stage.
To further extend the consensus over the integrated building blocks stack and to outline possible shortcomings, a workshop was organised, involving the authors of this paper (who are partly involved in USAGE or AD4GD and partly in other similar data ecosystems-focussed projects and standardisation initiatives) and external experts with similar extensive experience from research, implementation and practice perspectives (Linda van den Brink, Bart de Lathouwer - Geonovum, Giacomo Martirano – EPSIT).
The result from USAGE and AD4GD projects and the proposed evolution of building blocks (Section 3.2.1) was shared with and explained to the panel of experts. The panel were then asked to give their feedback on the adopted choices, considering their own solutions and standards, other frameworks they know, and general experiences in research, work, projects. A form was provided which guided them through each step and choice in the DSSC Data Spaces Building Blocks evolution, asking them for specific feedback and level of agreement. A final meeting was organised to discuss the possible discrepancies and different points of view, and to find a common agreement. The final results were summarised as reported in Section 3.2.2 and shared with the panel for last comments and feedback.

3. Results

3.1. A Revised Data Spaces Building Block Stack

3.1.1. Review and Comparison of Data Spaces-Related Initiatives and Blueprints

After reviewing the most prominent initiatives regarding data spaces definitions and related solutions and standards proposals, these were mapped to each other to highlight the reciprocal relationships, the respective scopes or main focuses and the progression and dependencies in the reference documents [30]. Figure 3 summarises the progression of the different initiatives and related proposed architectures developed over time.
Aspects of interoperability and data sharing (see Section 1.1) are relevant to enable effective data spaces. However, several data spaces-related conceptualisations [12,16] are focussed on data sovereignty and trust. By contrast, the institutions involved in (geo)data standardisation (e.g. Joint Research Centre - INSPIRE, Open Geospatial Consortium) or information sharing through the web (e.g. World Wide Web Consortium) have been developing solutions and standards to support data interoperability for some time.

3.1.2. Identification of an Initial Baseline Suite of Building Blocks

Considering the timeline of different initiatives, as well as the participation of main actors in the data spaces conceptualisation and implementation, in this study the building blocks proposed by the DSSC [18] were taken as the initial baseline for data space building blocks. As a European funded project, DSSC supports the participation of other (not only) European funded activities such as Gaia-X, IDSA, BDVA (Section 3.1.1), besides building on European efforts and regulations. In addition, the project DS4SSCC considers the same building blocks as a reference for its catalogue, in which some standards are already mapped, alongside the Minimum Interoperability Mechanisms defined by OASC. The GREAT Blueprint [19] acknowledges the proposed building blocks as a reference, in addition to the mapping of building blocks and components proposed by the different initiatives. Grothe [13] also maps the proposed blueprints and architectures onto an initial version of the building blocks, pointing out the main scope of the considered projects and initiatives.
The DSSC building blocks are grouped into ‘Technical’ building blocks, and ‘Governance and Business’ building blocks. Although governance and business are essential aspects and will be addressed in the near future, in this paper, the focus is on the technical building blocks, and the provision for semantic interoperability. For each building block the alternative solutions are mapped, starting from the mapping already done by the DS4SSCC, critically re-considered against these concerns and relevance of further standards and solutions, primarily coming from the OGC standards, INSPIRE and GEO references. When the direct reference to DS4SSCC remains in the Tables in Annex 1, it is because additional standards are reported by DS4SSCC, with respect to the main references considered in this paper.

3.1.3. Improved Mapping to Available Solutions through Increased Building Block Granularity

Such a mapping to current solutions highlighted the need for an increase in granularity in the proposed building blocks to allow their consistent use (Figure 4). In particular, under the ‘data exchange’ aspect, both the communication technology, such as Application Programming Interfaces (APIs), and the format to encode data should be considered, and these are two separate issues, for which different standards and solutions apply.
Similarly, usage policy specification and the control over the compliance to the terms established in the policy should be considered separately as they regard (a) the way in which to express and encode the policy terms and (b) the solutions used to read and enforce such terms.
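The separation between (a) policy expression and (b) policy enforcement can be illustrated with a minimal Python sketch. The `UsagePolicy` structure, its field names and the `is_allowed` function are hypothetical and chosen for illustration only; they do not follow any specific policy language or enforcement component.

```python
from dataclasses import dataclass
from datetime import date


# (a) Policy expression: the terms are plain data that could be
# serialised and exchanged. Field names are hypothetical.
@dataclass(frozen=True)
class UsagePolicy:
    permitted_purposes: frozenset
    expires: date


# (b) Policy enforcement: a separate component reads the terms
# and decides on a concrete request.
def is_allowed(policy: UsagePolicy, purpose: str, on: date) -> bool:
    return purpose in policy.permitted_purposes and on <= policy.expires


policy = UsagePolicy(frozenset({"research"}), date(2025, 12, 31))
print(is_allowed(policy, "research", date(2025, 6, 1)))
print(is_allowed(policy, "marketing", date(2025, 6, 1)))
```

Because the policy is only data, the way it is encoded can be standardised independently of the component that evaluates it, which is the reason for treating the two as separate building blocks.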
“Data services and offerings description” is intended to describe metadata. However, these should be described for each of data, software and services. It may be noted that, in a workflow, such elements correspond to the core concepts of the W3C Provenance model: Entities, Agents and Activities. To keep these distinctions clear and provide appropriate interoperability standards for each case, additional building blocks are needed.
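The three core concepts of the W3C Provenance model can be sketched in Python as follows. The class layout, the example names and the `lineage` helper are hypothetical; the relation names in the comments (`prov:used`, `prov:wasAssociatedWith`, `prov:wasGeneratedBy`) are those of the PROV model, while the assignment of datasets, software and service executions to the three classes is only one plausible reading, stated here as an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:      # e.g. a dataset
    name: str


@dataclass(frozen=True)
class Agent:       # e.g. a piece of software or an organisation
    name: str


@dataclass
class Activity:    # e.g. one execution of a processing service
    name: str
    used: list = field(default_factory=list)                 # prov:used
    was_associated_with: list = field(default_factory=list)  # prov:wasAssociatedWith
    generated: list = field(default_factory=list)            # inverse of prov:wasGeneratedBy


def lineage(entity, activities):
    """Return the names of the entities that `entity` was directly derived from."""
    sources = []
    for act in activities:
        if entity in act.generated:
            sources.extend(e.name for e in act.used)
    return sources


raw = Entity("raw_observations")
clean = Entity("cleaned_observations")
run = Activity("cleaning_run", used=[raw],
               was_associated_with=[Agent("cleaning_tool")],
               generated=[clean])
print(lineage(clean, [run]))
```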
Figure 4. Re-factored Technical Building Blocks stack, following the mapping to solutions and their scopes.

3.2. Building Blocks Integration Aligning to Agreed International Principles

International principles and recommendations for good management and sharing of data (Section 1.1) were mapped to the building blocks baseline. These principles are a major driver in the digital economy that cannot be ignored, and they have to be explicitly addressed to benefit from related resources and be adopted by a broad community. Additional changes were generated in the proposed building blocks stack in order to make it consistent with such principles, as well as able to include additional standards and components being developed and tested for data spaces developments in the green deal data spaces-related HORIZON projects, in particular USAGE (Figure 5).
Considering FAIR principles, the ‘Data Interoperability’ category was generalised to address a wider range of aspects, and the title was revised to ‘Data FAIRness’. For each dataset involved in a data space, it will be important that all the aspects listed under the ‘Data FAIRness’ category are properly documented in extended versions of metadata, including or linking to:
  • Data Model used, which should be based on standards, typically specifying the profile used (i.e., the subset of a more comprehensive data model, if applicable) and possible extensions, with standardised documentation and machine-readable encodings that can also support data validation. Data models and profiles should specify:
    - the semantic and structural description of the data,
    - the use of geometry and any related aspect, such as the kind of representation used (as solids, as surfaces, polygons, lines, raster), kind of geometry stored, any topology representation required, level of detail or resolution, accuracy, and so on.
  • Data Exchange – Encodings, related to the syntax.
  • Data description (metadata) (moved from the ‘Data value creation’ category after splitting from the related service).
  • Provenance and Traceability, which extends the ‘provenance’ or ‘lineage’ attribute of typical "discovery" metadata, making visible the underlying data supply chains that are critical to understanding and subsequently reusing the data.
  • Data Licences (moved from the ‘Data Sovereignty and Trust’ category after splitting from the related control service), indicating the conditions for use of the data and reusability of derived outputs.
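A minimal Python sketch of how such an extended metadata record could be checked for completeness is given below. The key names, the example values and the `missing_aspects` function are hypothetical and do not correspond to any specific metadata standard; the sketch only illustrates that each of the five aspects is either included or linked from the record.

```python
# The five aspects listed under 'Data FAIRness'; the key names are hypothetical.
REQUIRED_ASPECTS = ("data_model", "encoding", "description", "provenance", "licence")


def missing_aspects(record: dict) -> list:
    """Return the aspects that are absent or empty in a metadata record."""
    return [a for a in REQUIRED_ASPECTS if not record.get(a)]


record = {
    "data_model": "https://example.org/models/land-cover-profile",  # link to a profile
    "encoding": "application/geo+json",
    "description": "Land cover polygons for a pilot area",
    "provenance": "",   # not yet documented
}
print(missing_aspects(record))
```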
The ‘Data Value Creation’ category is renamed as ‘Tools for FAIRness’ to emphasise that the contained solutions are not adding to the intrinsic value of data, but rather enabling full leveraging of this value through comprehensive support of the FAIR principles.
The ‘Data exchange - communication (e.g. APIs for data exchange)’ building block (previously under ‘Data Interoperability’) has been moved to the ‘Tools for FAIRness’ category. Moreover, a building block for ‘Data requirements specification and data validation’ was added. In fact, it is essential to agree on standardised methodologies and, possibly, supporting tools, to define exactly the data requirements, considering all the building blocks contained in the ‘Data FAIRness’ category.
Data requirements specifications connect the needs of a use case and the processing required to the datasets that need to be retrieved, or produced to order. A highly detailed definition of many aspects of such data requirements, including the needed data quality, is necessary to support evaluation, planning and any possible automation of data re-use and integration steps.
When such data requirements are defined in a machine-readable encoding, automatic data validation against them becomes possible, ensuring that the data input into processing have sufficient quality and the required characteristics, and that the result of processing or analysis therefore has the expected reliability. Machine-readable requirements thus enable pipeline automation and help to realise the advantages offered by data spaces. In addition, several aspects related to services were not placed within the building block stack, despite playing a relevant role in supporting data preparation for sharing, data use, analysis, processing and so on. Such software should exist for the whole data space in order to deliver maximum benefit. It also needs to be suitable for connection to the ecosystem, and able to manage standardised data, or any shared data, properly. Therefore, we proposed an additional category addressing ‘Services FAIRness’, which contains the related building blocks: both those from the previous stack, such as the ‘Software descriptions (metadata)’, and new proposals, for example those to support documentation and definition of access to software and the related licences.
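The automatic validation against machine-readable data requirements described above can be sketched in Python as follows. The attribute names, types and thresholds in `REQUIREMENTS` and the `validate` function are hypothetical; in practice a standardised schema language (for example JSON Schema or SHACL) would typically play the role of the dictionary used here.

```python
# A machine-readable requirement: attribute name -> (expected type, allowed range).
# Attribute names and thresholds are hypothetical.
REQUIREMENTS = {
    "height_m": (float, (0.0, 500.0)),
    "accuracy_m": (float, (0.0, 1.0)),
}


def validate(feature: dict, requirements: dict) -> list:
    """Return a list of human-readable violations (empty if the feature conforms)."""
    problems = []
    for name, (expected_type, (low, high)) in requirements.items():
        if name not in feature:
            problems.append(f"{name}: missing")
            continue
        value = feature[name]
        if not isinstance(value, expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    return problems


features = [
    {"height_m": 12.5, "accuracy_m": 0.3},   # conforms
    {"height_m": 12.5, "accuracy_m": 2.0},   # accuracy out of range
    {"height_m": "tall"},                    # wrong type, missing attribute
]
for f in features:
    print(validate(f, REQUIREMENTS))
```

A processing pipeline could call such a check before accepting an input dataset and reject or flag features that return a non-empty list.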
Figure 5. Integrated proposed building block stack aligning with principles and solutions components (USAGE D3.2).

3.2.1. Building Blocks Refinement with Input from Experts and Green Deal Data Spaces Projects

Each building block proposed by USAGE (Figure 5) was critically reconsidered by the AD4GD consortium, during a hackathon in Turin, in early February 2024, capitalising on the experience gained by the consortium during the first part of the project, including the development of AD4GD use cases and supporting architecture. Moreover, experience of the consortium partners (deeply involved in OGC, FIWARE and other interoperability-related initiatives and projects) in longer term interoperability solutions and technology developments was brought to the discussion to justify the proposed changes (Figure 6).
The pillar on ‘Data Sovereignty & Trust’ only underwent minor changes. For example, one more building block, the ‘Sharing traceability’, was added to address traceability of the data. This one came from the splitting of the original DSSC ‘Provenance and Traceability’ building block in two: one about solutions helping to track the data and their use, and one about expressing provenance according to standards, i.e., the ‘Data Provenance models’, which remained within the pillar ‘Data FAIRness’.
Within the ‘Data FAIRness’ pillar, we retained unchanged the ‘Data Models’, ‘Data Descriptions (metadata)’, ‘Data Exchange (encodings)’ and ‘Data Licences’ building blocks. In addition, ‘Data requirements schemas’ was added, as complementary to ‘Data requirements system (definition + validation)’ under ‘Tools for FAIRness’, both resulting from the split of the previous ‘Data requirements specification’. Symmetrically to ‘Data Licences’, a ‘Licences for Services’ building block was added under ‘Services FAIRness’.
Under the category ‘Tools for FAIRness’, the following building blocks were added:
  • Vocabularies and Meaning service, i.e. services enabling semantic interoperability by defining and providing terms and mechanisms to represent and leverage "shared knowledge" in different domains;
  • Data transformation, i.e. any mapping tools and routines facilitating data integration and conversion.
Moreover, ‘Vocabularies’ were added as a transversal building block for both Data FAIRness and Services FAIRness categories, since they are relevant tools to support all the blocks in those categories.
Finally, we added a category, which we considered as not being entirely part of the data spaces, but rather interacting with it (possibly under the concept of ‘digital twin’), to host all computation-related building blocks, i.e.: processing, software access (previously under ‘Services FAIRness’), workflows and actuators.
We also considered adding the data space assets (contents) themselves, which were classified as requirements, metadata, provenance, semantic tagging, data (including personal data) and licences.
Figure 6. Abstract architecture diagram for the GDDS from AD4GD hackathon and discussion.

3.2.2. Rediscussed Building Blocks Validation

As the result of the final workshop, the panel of experts described in Section 2.1 reported, through a form, their feedback about each choice made to map the DSSC building blocks to the available standards and to the components used in the projects and current pilots. These results were used as a basis for a discussion to find a consensus over such a mapping and extension. It was the opportunity to clarify which components could be interpreted in different ways, and whether the mapping was aligned with general experience of standards and interoperable systems, as well as with the similar discussions in national and international venues in which the panel experts are taking part.
Finally, the DSSC blueprint evolved in parallel with this work. The study started at the end of 2023 on the basis of the DSSC Blueprint v.0.5, published in September 2023; we therefore compared the results of the discussion with the structure and definitions of the updated DSSC Blueprint v.1.0, published in March 2024, and applied the necessary adjustments.
Figure 7 depicts the result of the DSSC data spaces building blocks extension based on the mapping done (below), compared to the DSSC Blueprint v.1.0 technical building blocks (above).
In Table 2, the definitions of each building block and the possible changes with respect to the DSSC Blueprint are reported, as well as the reasons behind them.
Figure 7. Data spaces building blocks stack, coming from the DSSC blueprint (above) and as resulting from the final comments and results of the discussion in this paper (below).

3.3. Available Standards for Data Spaces Building Blocks

The standards and solutions proposed by different organisations and institutions were mapped to the integrated building blocks stack (Annex X) and can be considered as an initial reference catalogue for data spaces developers to address each relevant data space aspect by choosing among a set of available solutions.
The focus of the mapping reported in the Annex is especially on open and international standards. The list reported is intended as an initial and provisional mapping, which can be improved with additional discussion and clarifications by the mentioned organisations about their own standards and solutions, which might still be under development in some cases. Ideally, the mentioned organisations, and possibly others, could agree on refining such a mapping and maintaining it over time through joint collaboration, to the advantage of all of them and of data spaces in general.
It is hard to comment on and interpret the mapping of standards to the building blocks, although from reading the tables in Annex X it is clear how diverse the situation is for different building blocks. In some cases, several well-known and mature standards are available (e.g. data models, data encodings), while other building blocks are still quite new and can only count on rather general high-level guidelines, or on initial solutions that still need to be thoroughly tested and improved (e.g. data requirements specification, trust framework). The tables therefore give an initial idea of where the major gaps with respect to data spaces solutions are, as well as which organisations can already provide skills related to the different issues, but we are not yet able to draw conclusions, because deeper insights and tests would be necessary for each building block.
However, it is possible to see from the tables how standards coming from geospatial information-related organisations, which have been tackling the interoperability problem for a long time due to the nature of their use cases, can offer an extensive set of solutions, especially for all the building blocks in the category ‘Data Specification for FAIRness’: Data Models; Data Exchange – Encodings; Data Descriptions (metadata); Data Requirements and quality Schemas; Data provenance model. For ‘Data Sovereignty and Trust’, although some solutions are available and current developments are addressing the needs stated by the building blocks in that category, other organisations can provide a wider set of solutions. An extensive set of standards and solutions is again available for the building blocks in the category ‘Data Value Enhancement’.
To improve the reported mapping and integrate this study, it would be useful to provide a discussion over the standards available and solutions proposed, which can be very diverse in terms of reference technology, maturity stage, level of uptake, for different reasons. However, agreed criteria and a measurement matrix would be necessary to compare them to each other, in order to guide the users through the whole list. For this reason, an initial list is proposed in the next Section 3.4, to be improved and discussed in future activities, preferably by a joint standardisation organisation working group.

3.4. Measuring Standards Quality and Uptake

To further guide users and data spaces developers in the choice of suitable standards, some indications would be useful about the quality of the standards, supported technology and level of uptake (e.g. available compliant datasets, available supporting software, supported use cases in research or in operational environment and so on).
When choosing a set of technical standards from many options, with varying degrees of overlap and interdependencies, it is essential to apply a systematic approach to evaluate the quality of standards and their appropriateness for collective adoption. Standards quality is separate from, but related to, uptake and maturity. All three aspects are factors in the evaluation of, and decision to reuse, specific standards. It should be noted that, whilst quality is not the ultimate arbiter of uptake, adoption by the developer community is a key enabler and driver of solutions, and is usually driven by some aspect that is perceived as providing advantages, e.g. quality of design.
Therefore, we should explore standards quality from the perspective of Data Spaces requirements, and then follow up to explore how best to allow this quality to be seen as attractive to the developer community.
It is worth noting that the ultimate goal of the data interoperability and management principles (Section 1.1) is realised through data reuse itself. The evaluation processes that trigger reuse are supported by the specific principles, but other factors also play a significant role. For data this includes design factors, such as how the data is gathered and how appropriate this is to the end use requirements. To achieve this ultimate goal, significant attention must be paid to the FAIR principle I1: “(Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.” This has implications for the way metadata are handled: it is not possible to specify metadata standards for all possible aspects of all possible data sets. Thus, it is advisable to provide extensible graph solutions that can adapt to the available forms of metadata, even if some aspects of the metadata need to be standardised to meet data space requirements.
This language must be expressive enough to convey, in a standardised and recognisable way, the details required for evaluation. The technical implication is that syntactical languages such as JSON, RDF serialisations, XML etc. are necessary but not sufficient: the description of the data itself must also be standardised, and in turn this means that common aspects must be standardised, as per FAIR R1.3: “(Meta)data meet domain-relevant community standards”.
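The point that a shared syntax is necessary but not sufficient can be illustrated with a minimal Python sketch. Both documents below are valid JSON, yet they only become comparable once their local keys are mapped to shared terms and their units are normalised. The keys, the shared term names and the `to_shared` function are hypothetical and do not come from any specific vocabulary.

```python
import json

# Two syntactically valid JSON documents describing the same kind of observation.
doc_a = json.loads('{"temp": 21.5, "unit": "C"}')
doc_b = json.loads('{"airTemperature": 294.65, "uom": "K"}')

# Hypothetical mappings from local keys to shared vocabulary terms.
MAPPING_A = {"temp": "air_temperature", "unit": "unit"}
MAPPING_B = {"airTemperature": "air_temperature", "uom": "unit"}


def to_shared(doc: dict, mapping: dict) -> dict:
    """Rename local keys to shared terms and normalise the temperature to kelvin."""
    out = {mapping[k]: v for k, v in doc.items()}
    if out["unit"] == "C":
        out["air_temperature"] = round(out["air_temperature"] + 273.15, 2)
        out["unit"] = "K"
    return out


# The raw documents share no keys; the mapped ones describe the same value.
print(set(doc_a) & set(doc_b))
print(to_shared(doc_a, MAPPING_A) == to_shared(doc_b, MAPPING_B))
```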
For data spaces to function as intended, the interoperability of standards for data and function description thus becomes paramount. As no two datasets or functions are identical, it follows that such standards need to be composable into application-specific descriptions from standardised components. In addition, the composition process needs to be standardised, hence the concept of such components as reusable modules (see for example the OGC Location Building Blocks [66]).
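The idea of composing application-specific descriptions from reusable modules with explicit dependencies can be sketched in Python as follows. The block names in `BLOCKS` and the `resolve` function are hypothetical and are not taken from the OGC Location Building Blocks; the sketch assumes that the declared dependencies contain no cycles.

```python
# Each building block declares the blocks it depends on (names are hypothetical).
BLOCKS = {
    "geometry": [],
    "units": [],
    "feature": ["geometry"],
    "observation": ["feature", "units"],
}


def resolve(name: str, blocks: dict, seen=None) -> list:
    """Return `name` and all its transitive dependencies, dependencies first.

    Assumes the dependency declarations are acyclic.
    """
    seen = [] if seen is None else seen
    for dep in blocks[name]:
        resolve(dep, blocks, seen)
    if name not in seen:
        seen.append(name)
    return seen


print(resolve("observation", BLOCKS))
```

Because the dependencies are declared as data, the same resolution step can be applied to a block composed from other blocks, which is what allows larger descriptions to be assembled with a single, standardised composition mechanism.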
Some approaches to standards assessment have been proposed in the past. The most widely known in the European context is the ‘Common Assessment Method for Standards and Specifications’ (CAMSS) [67] provided by the European Commission. This is intended to guide public authorities in the choice of suitable standards through an online questionnaire, supporting the European Interoperability Framework recommendations. Other examples come from the Open Data Initiative [68] organisation, which carried out investigations by interviewing users and experts, and drafted guidelines for choosing open standards [69].
Building on these examples and on the experience gained in the field of standards development, in this Section we propose initial criteria that might be useful to present standards transparently and quantitatively (Table X), so that similar ones can be more easily compared and chosen by the users, facilitating effective data space development. This might also support discussion and collaboration between different standardisation organisations, to complement each other or to support reciprocal compliance for the sake of mutual advantage.
In future work, we will further test and investigate these criteria, to provide a more robust and agreed reference matrix.
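One possible way to use such criteria quantitatively is a weighted scoring matrix, sketched below in Python. This is an assumption made for illustration, not a method proposed in this paper: the standards `standard_a` and `standard_b` are fictitious, and the scores, weights and the `weighted_score` function are hypothetical.

```python
# Hypothetical weights for a few criteria categories (assumed to sum to 1).
WEIGHTS = {"relevance": 0.4, "flexibility": 0.2, "interoperability": 0.3, "support": 0.1}

# Hypothetical scores (0-5) for two fictitious standards; not measured values.
SCORES = {
    "standard_a": {"relevance": 4, "flexibility": 2, "interoperability": 5, "support": 3},
    "standard_b": {"relevance": 3, "flexibility": 5, "interoperability": 3, "support": 4},
}


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum of per-criterion scores."""
    return sum(scores[criterion] * weight for criterion, weight in weights.items())


# Rank the standards from highest to lowest weighted score.
ranking = sorted(SCORES, key=lambda s: weighted_score(SCORES[s], WEIGHTS), reverse=True)
print(ranking)
```

The choice of weights would itself need to be agreed, for example through the consensus-based process mentioned above.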
Table 3. Proposed criteria to describe and assess standards
Relevance and Scope
Fit for Purpose: Ensure that the standards align with the specific needs and objectives of the domain. Adoption within a domain is an obvious indicator of quality, but should be qualified by any activities to improve such standards based on experience of usage.
Scope Coverage: Evaluate whether the standards cover all necessary areas without significant gaps. Overlapping standards should be assessed to see if they offer complementary benefits or if they introduce redundancy.
Alignment: Check whether alignments between related standards, in use or proposed, have been published and are available for re-use. Transformations of data are a significant overhead in most re-use scenarios, so the availability of tested transformation mechanisms makes it easier to combine different standards in practice.
Flexibility and Extensibility
Modularity: Building block composition mechanisms must be explicit and supported by tooling to realise the principle of reuse. Such mechanisms must include explicit traceability of interoperability design, through transparent and standardised description of building block dependencies. Furthermore, such building blocks need to be composable into larger building blocks that use the same composition mechanisms, to allow the rich metadata required to evaluate and reuse resources to be assembled and understood. This has been a critical weakness of formal standardisation processes, leading to many alternative ad-hoc approaches to standardisation in application profiles.
Adaptability: Standards should be flexible enough to accommodate future changes and extensions. Avoid overly rigid standards that may hinder innovation or adaptation to new requirements.
Special vs. General Solutions: Prefer standards that provide flexible, general solutions over those offering special case solutions, unless the special case is critical and cannot be addressed adequately by the general standard.
Interoperability and Integration
Compatibility: Standards should work well together and integrate smoothly, minimizing the need for custom adapters or significant modifications.
Interdependencies: Assess the interdependencies between standards. Strongly interdependent standards should be adopted together only if they provide a cohesive, integrated solution.
Transparency: Machine-readable declarations of compatibility and interdependencies allow for cost-effective and scalable testing and reuse of integrated suites of standards.
Simplicity and Clarity
Simplicity: Choose standards that support simplification through encapsulation, i.e., the ability to test and compose arbitrarily complex complete solutions from simple components.
Ease of Understanding: Standards should be clear, well-documented, and easy to understand. Complex standards can lead to misinterpretation and implementation errors.
Examples: Standards should be supported by clear examples that allow practitioners to easily understand the scope of the standards. Examples should be tested for conformance to the standards, as discrepancies are common and cause significant confusion.
Support and Ecosystem
Tool Support: Consider the availability of software tools and libraries that support the standards. Robust tool support can significantly ease implementation and maintenance.
Community and Vendor Support: Evaluate the level of community and vendor support. Widely adopted standards with active communities and strong vendor backing are often more reliable and future-proof.
Maturity and Stability
Proven Track Record: Mature standards that have been widely adopted and tested in various contexts are usually more reliable.
Stability: Prefer standards that are stable and have a clear roadmap for updates and maintenance. Frequent changes can disrupt development and integration processes.
Compliance and Security
Regulatory Compliance: Ensure that the standards comply with relevant regulations and industry best practices.
Security: Assess the security implications of the standards. They should support the implementation of secure systems and not introduce vulnerabilities.
Cost and Resource Considerations
Implementation Cost: Consider the cost of implementing and maintaining the standards, including licensing fees, if any.
Resource Availability: Ensure that the necessary skills and resources are available for adopting and maintaining the standards.
When evaluating and selecting a set of standards, a structured approach will support sustainability:
  • Requirements Analysis: Define the specific needs and objectives the standards must meet.
  • Standards Identification: Identify potential standards and gather detailed information about each.
  • Evaluation and Comparison: Apply the criteria in Table 3 to evaluate and compare the standards. A matrix will be proposed in future work to assess each standard against these criteria.
  • Integration Assessment: Examine how the standards will work together, considering interdependencies and potential conflicts.
  • Pilot Implementation: Conduct a pilot implementation to test the chosen standards in a real-world scenario.
  • Decision and Adoption: Based on the evaluation and pilot results, make an informed decision and proceed with the adoption of the selected standards.
By carefully considering these principles, organizations can choose a set of technical standards that are high-quality, appropriate for their needs, and conducive to building a robust, scalable, and maintainable system.
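The Evaluation and Comparison step above could be supported by a simple weighted scoring matrix. The following is only a minimal sketch of that idea: the group names mirror the row groups of Table 3, while the weights, the 0 to 5 scoring scale and the names `WEIGHTS`, `Candidate`, `weighted_score` and `rank` are illustrative assumptions, not the matrix announced for future work.

```python
from dataclasses import dataclass

# Criteria groups taken from Table 3. The weights are illustrative
# assumptions and would need to be agreed for each data space.
WEIGHTS = {
    "relevance_and_scope": 0.20,
    "flexibility_and_extensibility": 0.15,
    "interoperability_and_integration": 0.20,
    "simplicity_and_clarity": 0.10,
    "support_and_ecosystem": 0.10,
    "maturity_and_stability": 0.10,
    "compliance_and_security": 0.10,
    "cost_and_resources": 0.05,
}


@dataclass
class Candidate:
    """A candidate standard with one score (assumed scale 0 to 5) per group."""
    name: str
    scores: dict


def weighted_score(candidate, weights=WEIGHTS):
    """Return the weighted sum of the scores of a candidate standard."""
    missing = set(weights) - set(candidate.scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[group] * candidate.scores[group] for group in weights)


def rank(candidates, weights=WEIGHTS):
    """Sort candidate standards from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)
```

A single number hides trade-offs between criteria, so such a ranking would at most complement, not replace, the integration assessment and pilot implementation steps.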

3.5. Discussion

The work described in this paper provides guidance that takes into account the major efforts in data space development, bridging them to international standardisation activities and to agreed principles for good data management and sharing. The resulting configuration of the data spaces building blocks stack reflects the current granularity of the issues addressed by the individual categories of standards and by recognised data sharing and management principles, as well as the kinds of components needed to address data space needs.
In the intermediate versions of the building block stack, as worked out in the Green Deal data spaces projects (USAGE and AD4GD), a building block and a corresponding category were added for the software and services intended to use and process the data. These were ultimately removed in order to remain closely aligned with the latest DSSC proposal, which groups a very large set of services, of any scope, in the ‘Added value services’ building block. In a future revision, this building block will probably need to be specified further, so that it can concretely support planning the range of services a data space requires. In the AD4GD version in particular (Section 3.2), the Digital Twins concept was included to encapsulate processing capabilities sometimes attributed to data spaces, as a conceptualisation relating the two paradigms within the overall stack. The reference to Digital Twins was also later removed, because some of the panel experts rightly noted the need for closer alignment with documents and conceptualisations specifically about digital twins. In future elaborations, in collaboration with the Digital Twins domain, it would be useful to revisit this proposal and investigate the connections and interactions between cross-domain building blocks, to make data spaces and digital twins mutually stronger and consistent.
The tables in Appendix A, which map the available standards to each building block, can serve as a reference to support the development and description of data space solutions, and as a tool to ensure that the FAIR, GEO DMP and European Interoperability Framework recommendations are respected as far as possible.
The mapping reflects the current status of standards. Many parts are still empty or could be extended. Standardisation organisations not involved in this initial effort, as well as other projects and initiatives, can later add their solutions to the overview. The organisations involved may decide to collaborate in the future to maintain such a catalogue as a joint effort, directly linked to the standardisation organisations. In this way, the most recent updates to standards and proposed solutions could be reflected, and the rapidly changing technology offer could be represented flexibly, to the advantage of data space designers, developers and users.

3.6. Conclusion

The landscape of blueprints and reference architectures for data spaces has grown considerably in recent years, even though it stems from rather recent projects and initiatives. For this reason, the study considered the different proposals for conceptualising data space structures and components and mapped them to well-established standards for data interoperability and sharing, as well as to guiding principles acknowledged at international level. Some additions to the DSSC proposal were put forward, together with a mapping to the relevant standards and solutions available.
In addition, some criteria were proposed that might support users and developers in assessing and choosing between available standards, to address each building block according to their own needs.
The work is complex and situated in a very dynamic field, requiring extensive collaboration among different organisations. It can therefore only be a foundation stone for a more robust overview and agreed blueprint, which we hope will be developed in the future through wider and sustainable collaborations. Nevertheless, it was critical to start bridging the different initiatives outlining the relatively novel concept of data spaces with the current standards supporting the well-established principles and concepts of interoperability and good sharing of data through the web.
Future work will improve, systematise and maintain this conceptualisation and mapping in collaboration with other organisations. Moreover, the organisational and business building blocks and related challenges will need to be considered and investigated in the near future to provide a complete framework.

Author Contributions

Conceptualization, F.N.; methodology, F.N., R.A., L.B. and I.S.; validation, R.A., L.B., J.M., I.M., A.V., M.V. and P.Z.; investigation, F.N.; writing – original draft preparation, F.N. and R.A.; writing – review and editing, L.B., J.M., I.S., A.V., M.V. and P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the USAGE project under the European Union’s Horizon Europe programme (Grant Agreement No. 101059950) and by the AD4GD project, co-funded by the European Union’s Horizon Europe programme (Grant Agreement No. 101061001), Switzerland and the United Kingdom.

Informed Consent Statement

Not applicable

Data Availability Statement

No new data were created or analyzed in this study. All the references to the reviewed documents are available in the text. Data sharing is not applicable to this article.

Acknowledgments

We would like to acknowledge the panel of experts who took part in the discussions on the data spaces building blocks: Linda van den Brink, Bart de Lathouwer and Giacomo Martirano.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Mapping of Available Standards to Data Spaces Building Blocks

The tables in this appendix, starting from Table A1 and reported in Appendix A.1, Appendix A.2, Appendix A.3 and Appendix A.4, map the useful solutions and standards from the different reference initiatives and standardisation organisations to the data spaces building blocks described in this paper (Section 3.1.1). The order of the organisations in the left column follows the order in which each organisation started its activities.
Each table mapping standards to a building block is followed, where applicable, by a second table reporting the good data sharing and management principles that the building block addresses, as introduced in Section 1.1 (i.e., the FAIR principles, the GEOSS Data Management Principles (DMP) and the European Interoperability Framework (EIF) principles and recommendations).

Appendix A.1. Technical Building Blocks – Data Specification Enabling FAIRness

Table A1, Table A2, Table A3, Table A4, Table A5, Table A6, Table A7, Table A8 and Table A9 report the mapping related to building blocks in the category ‘Data Specification enabling FAIRness’.
Table A1 reports the mapping of standards to the ’Data Models’ building block, defined as: "The model provides semantics and a shared vocabulary, as well as a structure for the data (hierarchies and relationships)".
Table A1. Standards and solutions for the Data Models building block.
Ref Specification / implementation(s) recommended
W3C SSN-SOSA
OGC CityGML/CityJSON, LandInfra, IndoorGML, Indoor Mapping Data Format (IMDF), MUDDI, PipelineML, WaterML, Augmented Reality ML (ARML), SensorThings API data model, SWE Common Data Model, SensorML, Semantic Sensor Network (SSN), STAplus, Time Ontology in OWL, TimeseriesML, GeoPose, Geoscience Markup Language (GeoSciML), Zarr (https://portal.ogc.org/files/100727), GroundwaterML, network Common Data Form (netCDF) standards suite, Observations, Measurements and Samples
GEO et al. Essential Variables (https://www.earthdata.nasa.gov/learn/backgrounders/essential-variables). Topics/domains: Climate; Ocean; Biodiversity; Geodiversity; Agriculture
INSPIRE INSPIRE Themes and UML model (https://inspire.ec.europa.eu/Themes/Data-Specifications/2892)
OASC MIM2 – Data Models - Smart Data Models. NGSI-LD compliant data models for aspects of the smart city have been defined by organisations and projects including OASC, FIWARE, GSMA and the SynchroniCity project, and there is an ongoing joint activity of TM Forum and FIWARE to specify more. Existing data models and ontologies, e.g. the SAREF (Smart Applications REFerence ontology) standard by ETSI/oneM2M, can be mapped for use with NGSI-LD by identifying their entities, properties and relationships, which can then be managed and requested through the NGSI-LD API. The oneM2M base ontology is compatible with SAREF. Additionally, oneM2M provides the means to instantiate ontologies in order to provide semantic descriptions of the data exchanged (through the use of metadata). The extension SAREF4Cities provides an ontology focused on smart cities. ISA Core Vocabularies, such as the Core Public Service Vocabulary Application Profile (used as the basis for the Single Digital Gateway Regulation, which affects local governments), Core Person and Core Organization. DTDL is the Digital Twin Definition Language developed by Microsoft; it is built on top of JSON-LD, and the existing FIWARE data models are converted into this format. MIM7 - Places
DSBA SmartDataModels
IDSA IDS RA – Functional Layer – Ecosystem of Data – Vocabularies; Information Layer - Hexagon of concerns
In addition, domain standardisation organisations are likely to propose their own data models, ontologies and vocabularies for specific domains and applications of interest (e.g. CIDOC-CRM (https://cidoc-crm.org/) for cultural heritage, and many more). The different standards can foresee extension mechanisms for cases in which the existing data model, or previous related extensions, are not sufficient. Usually, however, only a profile of the comprehensive domain data model is necessary, or a combination of profile and extension. It is therefore recommended to document such profiles and extensions properly, in machine-readable format, and to associate the resulting data model with datasets through metadata. The OGC Data Exchange Toolkit, for example, is intended to support this [21].
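One possible way to associate a documented profile with a dataset through metadata is the `dct:conformsTo` property in a DCAT record. The sketch below only illustrates that idea and is not the mechanism of the OGC Data Exchange Toolkit; the function name `describe_dataset` and the `example.org` IRIs are hypothetical.

```python
import json


def describe_dataset(dataset_iri, title, profile_iri):
    """Build a minimal DCAT dataset description, as a JSON-LD dictionary,
    that declares which data model profile the dataset conforms to."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        # Link to the machine-readable profile (hypothetical IRI in the example).
        "dct:conformsTo": {"@id": profile_iri},
    }


record = describe_dataset(
    "https://example.org/datasets/buildings-2024",   # hypothetical dataset IRI
    "Buildings 2024",
    "https://example.org/profiles/citygml-buildings",  # hypothetical profile IRI
)
print(json.dumps(record, indent=2))
```

A consumer can then check the declared profile before attempting to interpret or validate the data.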
Table A2. Principles addressed by the Data Models building block.
FAIR principles EIF
R1.3: (Meta)data meet domain-relevant community standards Recommendation 4 (Openness): Give preference to open specifications, taking due account of the coverage of functional needs, maturity and market support and innovation.
Table A3 reports the mapping of standards to the ’Data Exchange - Encodings’ building block, defined as: "Encoding is the format in which the data are encoded".
Table A3. Standards and solutions for the Data Exchange - Encodings building block.
Ref Specification / implementation(s) recommended
ISO SQL, JSON
W3C RDF (RDF/XML, Turtle, JSON-LD), SPARQL, OWL
OGC 3D Tiles, Cloud Optimised GeoTIFF, CoverageJSON, GML in JPEG2000, GeoPackage, GeoSPARQL, GML, GeoTIFF, I3S, Hierarchical Data Format Version 5 (HDF5), KML, LAS, Moving Features, SWE Service Model Implementation Standard, Sensor Observation Service (SOS), WKT CRS, Simple Features, OpenGeoSMS, GeoXACML
Table A4. Principles addressed by the Data Exchange - Encodings building block.
FAIR principles DMP EIF
I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation DMP-3 (Usability). Data will be structured using encodings that are widely accepted in the target user community and aligned with organizational needs and observing methods, with preference given to non-proprietary international standards. Recommendation 9 (Technological neutrality and data portability): Ensure data portability, namely that data is easily transferable between systems and applications supporting the implementation and evolution of European public services without unjustified restrictions, if legally possible.
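To illustrate what the choice of encoding means in practice, the sketch below expresses the same point location as a GeoJSON feature and as Well-Known Text (WKT) for geometries. It is a minimal example handling Point geometries only; the function names and the sample coordinates are assumptions made for illustration.

```python
import json


def point_feature(lon, lat, properties):
    """Encode a point as a GeoJSON Feature (coordinates in lon, lat order)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


def geometry_to_wkt(geometry):
    """Encode a GeoJSON Point geometry as WKT. Other types are out of scope here."""
    if geometry["type"] != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    lon, lat = geometry["coordinates"][:2]
    return f"POINT ({lon} {lat})"


feature = point_feature(7.5, 45.1, {"name": "sample station"})
print(json.dumps(feature))
print(geometry_to_wkt(feature["geometry"]))
```

The information content is the same in both cases; what changes is which tools and communities can consume it directly, which is the concern behind DMP-3 and EIF Recommendation 9.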
Table A5 reports the mapping of standards to the ’Data descriptions (metadata)’ building block, defined as: "Metadata, technical guidance and schemas for describing datasets, exchanged within a data space between providers and recipients. Metadata allow consistent data retrieval, generation or reuse, ensuring reliability in the results for which data are used as input and related decision making process."
Table A5. Standards and solutions for the Data descriptions (metadata) building block.
Ref Specification / implementation(s) recommended
ISO ISO 19115, ISO 15836 (Dublin Core)
W3C DCAT
OGC GeoDCAT – 3dP, EO Dataset Metadata GeoJSON(-LD) Encoding Standard (EO-GeoJSON)
INSPIRE INSPIRE metadata (based on ISO19115, ISO19119 and ISO 15836 (Dublin Core))
OASC MIM1 - Context, MIM7 - Places
IDSA IDS Reference Architecture – Functional Layer – Ecosystem of Data – Data source description; Information Layer
Gaia-X Federated catalogue – Self description
SIMPL Self-description (ID SECAV-FUNC-002-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l2-detailed-requirement/attributes-self-description-dataset)
Table A6. Principles addressed by the Data descriptions (metadata) building block.
FAIR principles DMP
F2: Data are described with rich metadata
R1: (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.3: (Meta)data meet domain-relevant community standards
I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2: (Meta)data use vocabularies that follow the FAIR principles
I3: (Meta)data include qualified references to other (meta)data
DMP-4 (Usability). Data will be comprehensively documented, including all elements necessary to access, use, understand, and process, preferably via formal structured metadata based on international or community-approved standards. To the extent possible, data will also be described in peer-reviewed publications referenced in the metadata record.
Q - quality: Data should be of sufficient quality for the user’s task (extension to the FAIR principles [7]). DMP-6 (Usability). Data will be quality-controlled and the results of quality control shall be indicated in metadata; data made available in advance of quality control will be flagged in metadata as unchecked.
F1: (Meta)data are assigned globally unique and persistent identifiers.
F3: Metadata clearly and explicitly include the identifier of the data they describe
DMP-10 (Curation). Data will be assigned appropriate persistent, resolvable identifiers to enable documents to cite the data on which they are based and to enable data providers to receive acknowledgement of use of their data.
Table A7 reports the mapping of standards to the ’Data Requirements and Quality Schemas’ building block, defined as: "Data Requirements Schemas definition: Data requirements specification is essential for a successful data retrieval or generation, leading to reliable results for use cases. Referring to standardised schemas to specify data requirements allows interoperability between systems".
Table A7. Standards and solutions for the Data Requirements and Quality Schemas building block.
Ref Specification / implementation(s) recommended
ISO ISO19131 on Data Product Specification, ISO/IEC25012 on Data Quality Model (https://iso25000.com/index.php/en/iso-25000-standards/iso-25012)
OGC Schema used by the Data Exchange Toolkit [21]. It addresses a complementary part of ISO19131, to support data requirements definition and semantic validation. Other experiments are trying to address the issue starting from profiling the PROV vocabulary, originally intended to represent provenance information (https://github.com/ogcincubator/prov-cwl/tree/master)
Committee on Earth Observation Satellites (CEOS) (https://ceos.org/ard/) Analysis Ready Data Framework, which is planned to be extended within OGC for application to geospatial data (https://www.ogc.org/press-release/ogc-forms-new-analysis-ready-data-standards-working-group/)
SIMPL Quality dimension and quality rules (ID SEGOA-FUNC-012-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l2-detailed-requirement/quality-dimension-and-quality-rules-0)
There is no specific mapping of the interoperability-related principles to this building block. However, it addresses needs very similar to those of metadata and data descriptions, although from a different point of view: metadata document existing datasets, while data requirements specifications describe the needs of use cases, to be matched against metadata for efficient data retrieval or generation.
In addition, the extension to the FAIR principles [7] includes the ‘Q - Quality’ principle, referring in turn to other best practices proposed by the W3C-OGC working group ‘Spatial data on the Web’, among which ‘Best practice 6: Provide data quality information’ (https://www.w3.org/TR/dwbp/#DataQuality) and ‘Best practice 21: Provide data up to date’ (https://www.w3.org/TR/dwbp/#AccessUptoDate).
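The matching between data requirements and metadata described above can be sketched as a filter over catalogue records. This is a minimal illustration under strong assumptions: the field names (`theme`, `themes`, `min_year`, `year`, `bbox`) are hypothetical and are not taken from ISO 19131, ISO 19115 or the Data Exchange Toolkit schema.

```python
def bbox_intersects(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def satisfies(requirement, record):
    """Check one metadata record against one use case requirement."""
    return (
        requirement["theme"] in record["themes"]
        and record["year"] >= requirement["min_year"]
        and bbox_intersects(requirement["bbox"], record["bbox"])
    )


def candidates(requirement, catalogue):
    """Return the identifiers of the records that satisfy the requirement."""
    return [record["id"] for record in catalogue if satisfies(requirement, record)]
```

When both sides refer to standardised schemas and vocabularies, comparisons of this kind can be automated across systems instead of being re-implemented for each catalogue.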
Table A8 reports the mapping of standards to the ’Data Provenance Model’ building block, defined as: "Data Provenance model definition: Standards intended to represent and document provenance and lineage of data".
Table A8. Standards and solutions for the Data Provenance Model building block.
Ref Specification / implementation(s) recommended
ISO ISO19115 geospatial lineage model
W3C PROV-O (https://www.w3.org/TR/prov-o/)
OGC OGC Provenance chains (https://ogcincubator.github.io/bblock-prov-schema/build/generateddocs/slate-build/ogc-utils/prov/index.html?json#examples), W3C PROV-O extension
OASC MIM5 - Transparency
IDSA Part of the information layer – hexagon of concerns
Others (reported by DS4SSCC) ETSI-CIM; DCAT-AP – property ‘provenance’
Table A9. Principles addressed by the Data Provenance Model building block.
FAIR principles DMP
R1.2: (Meta)data are associated with detailed provenance DMP-5 (Usability). Data will include provenance metadata indicating the origin and processing history of raw observations and derived products, to ensure full traceability of the product chain.
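As an illustration of the provenance metadata that R1.2 and DMP-5 call for, the sketch below records how a derived product was generated, using terms from the W3C PROV-O vocabulary (`prov:Entity`, `prov:Activity`, `prov:wasGeneratedBy`, `prov:used`, `prov:wasDerivedFrom`) in a JSON-LD structure. The function names and the `example.org` IRIs are hypothetical, and the OGC provenance chain building block may structure this information differently.

```python
PROV = "http://www.w3.org/ns/prov#"


def derived_product(product_iri, activity_iri, source_iris):
    """Describe a product, the activity that generated it and its sources."""
    sources = [{"@id": iri} for iri in source_iris]
    return {
        "@context": {"prov": PROV},
        "@graph": [
            {
                "@id": product_iri,
                "@type": "prov:Entity",
                "prov:wasGeneratedBy": {"@id": activity_iri},
                "prov:wasDerivedFrom": sources,
            },
            {"@id": activity_iri, "@type": "prov:Activity", "prov:used": sources},
        ]
        + [{"@id": iri, "@type": "prov:Entity"} for iri in source_iris],
    }


def sources_of(document, product_iri):
    """Return the IRIs of the entities a product was derived from."""
    for node in document["@graph"]:
        if node["@id"] == product_iri:
            return [ref["@id"] for ref in node.get("prov:wasDerivedFrom", [])]
    return []


doc = derived_product(
    "https://example.org/products/ndvi-2024",        # hypothetical IRIs
    "https://example.org/activities/ndvi-run-42",
    ["https://example.org/raw/scene-001", "https://example.org/raw/scene-002"],
)
print(sources_of(doc, "https://example.org/products/ndvi-2024"))
```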

Appendix A.2. Technical Building Blocks – Data Sovereignty and Trust

Table A10, Table A11, Table A12, Table A13, Table A14, Table A15, Table A16 and Table A17 report the mapping related to building blocks in the category ‘Data Sovereignty and Trust’.
Table A10 reports the mapping of standards to the ’Data Policies’ building block, defined as: "Either an offer from the data provider or an agreement between the data provider and the data recipient."
Table A10. Standards and solutions for the Data Policies building block.
Ref Specification / implementation(s) recommended
W3C W3C Open Digital Rights Language (ODRL) (https://www.w3.org/TR/odrl-model/), W3C Verifiable Credentials.
OGC RAINBOW for licences; GeoXACML
OASC MIM3 - Contracts
IDSA RA – Functional layer – Security Data Sovereignty – Usage Policies and Usage enforcement; RA – Functional layer – Data markets – Usage restrictions and governance; RA – Security perspective
Gaia-X Identity and Trust – Federated Access; Sovereign Data Exchange – Policies and Usage control
Others (reported by DS4SSCC) Standards: OASIS XACML (https://docs.oasis-open.org/xacml/3.0/xacml-3.0-core-spec-os-en.html) Policy Definition Language; Industry Body Specifications: Rego, Open Policy Agent, JSON-LD; Implementations: i4Trust, Prometheus-X
Others Creative Commons
The European Commission provides tools to guide the choice of a licence suited to specific data sharing needs, such as the Joinup Licensing Assistant (https://joinup.ec.europa.eu/collection/eupl/solution/joinup-licensing-assistant/jla-find-and-compare-software-licences?etrans=fr; see an explanation of the tool at https://www.youtube.com/watch?v=DhEhKtlsjQ0). Other websites (for example, https://choosealicence.com) provide guidance for the specific case in which an open licence is needed.
Table A11. Principles addressed by the Data Policies building block.
FAIR principles DMP
R1.1: (Meta)data are released with a clear and accessible data usage licence. DMP-1b (Discoverability). [...] and data access and use conditions, including licences, will be clearly indicated.
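A data policy in the sense of this building block ("an offer from the data provider") could be expressed with the W3C ODRL vocabulary listed in Table A10. The sketch below builds a minimal ODRL Offer as a JSON-LD dictionary and checks whether it permits a given action; the helper names and the `example.org` IRIs are hypothetical, and prohibitions, duties and constraints are deliberately left out.

```python
def odrl_offer(policy_iri, dataset_iri, provider_iri, action="use"):
    """Build a minimal ODRL Offer granting one action on one dataset."""
    return {
        "@context": "http://www.w3.org/ns/odrl.jsonld",
        "@type": "Offer",
        "uid": policy_iri,
        "permission": [
            {"target": dataset_iri, "assigner": provider_iri, "action": action}
        ],
    }


def permits(policy, dataset_iri, action):
    """True if the policy contains a permission for this action and dataset.
    Simplification: prohibitions, duties and constraints are ignored."""
    return any(
        rule["target"] == dataset_iri and rule["action"] == action
        for rule in policy.get("permission", [])
    )


offer = odrl_offer(
    "https://example.org/policies/1",            # hypothetical IRIs
    "https://example.org/datasets/buildings-2024",
    "https://example.org/orgs/provider",
)
print(permits(offer, "https://example.org/datasets/buildings-2024", "use"))
```

Expressing the licence or usage conditions in this machine-readable form is one way to meet R1.1 and DMP-1b while keeping the policy open to automated checks.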
Table A12 reports the mapping of standards to the ’Access and usage control and management’ building block, defined as: "Mechanisms in place to ensure access and usage policies related to certain data are respected".
Table A12. Standards and solutions for the Access and usage control and management building block.
Ref Specification / implementation(s) recommended
W3C W3C Web Access Control (WAC) (https://www.w3.org/wiki/WebAccessControl)
OGC OGC APIs rely on the access control of the underlying OpenAPI mechanisms (https://docs.ogc.org/is/19-072/19-072.html#rc_oas30-security), included in the service model (STA+)
OASC MIM3 - Contracts
IDSA RA – Functional layer – Security Data Sovereignty – Usage Policies and Usage enforcement; RA – Functional layer – Data markets – Usage restrictions and governance; RA – Security perspective
Gaia-X Identity and Trust – Federated Access; Sovereign Data Exchange – Policies and Usage control + Logging service and Data agreement service
Others (reported by DS4SSCC) Industry Body Specifications: Rego, Open Policy Agent, JSON-LD; Implementations: i4Trust, Prometheus-X
SIMPL Federated Authentication (ID SEGOA-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/federated-authentication)
Table A13 reports the mapping of standards to the ’Identity and Attestation Management’ building block, defined as: "Information provided on the relevant entities must be verifiable to enable the onboarding and offboarding processes. The trustworthiness of information is linked to the trustworthiness of the Trust Anchors (or Trust Service Providers, specifically for identities), who are entitled to issue the respective attestations" (https://dssc.eu/space/BVE/357075352/Identity+and+Attestation+Management).
Table A13. Standards and solutions for the Identity and Attestation Management building block.
Ref Specification / implementation(s) recommended
W3C W3C Decentralized Identifiers (DID) (https://www.w3.org/TR/did-core/)
OASC MIM4 - Trust, MIM6 - Security
IDSA RA – functional layer – Trust – Identity Management + user certification; RA – Functional layer – Security & Data Sovereignty – Authentication & Authorisation; RA – Security perspective + Certification perspective
Gaia-X Identity and Trust – Federated Identity Management; Sovereign Data Exchange – logging service; Compliance – Onboarding and certification
Others (reported by DS4SSCC) Standards: LDAP, OAuth2, X.500, X.509; Industry body specifications: CEF eID, OpenID Connect; SAML 2.0; SOLID
Table A14. Principles addressed by the Identity and Attestation Management building block.
FAIR principles
A1.2: The protocol [for metadata publication] allows for an authentication and authorisation procedure where necessary
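Identifiers such as the W3C Decentralized Identifiers (DID) listed in Table A13 follow the general form `did:<method>:<method-specific-id>`. The sketch below performs only a syntactic check of that form, following the grammar in the DID Core specification; it does not resolve the identifier or verify any attestation, and the function name `parse_did` is an assumption made for illustration.

```python
import re

# Characters allowed in the method-specific identifier: letters, digits,
# ".", "-", "_" or a percent-encoded byte.
_IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"

# did:<method-name>:<method-specific-id>, where the method name uses
# lowercase letters and digits, and the identifier may contain ":" separators.
_DID_PATTERN = re.compile(rf"^did:([a-z0-9]+):((?:{_IDCHAR}*:)*{_IDCHAR}+)$")


def parse_did(did):
    """Return (method, method_specific_id), or None if the syntax is invalid."""
    match = _DID_PATTERN.match(did)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_did("did:example:123456789abcdefghi"))
print(parse_did("not-a-did"))
```

Trust in the identity still depends on the Trust Anchors mentioned in the building block definition; a syntactically valid DID says nothing about who controls it.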
Table A15 reports the mapping of standards to the ’Trust Framework’ building block, defined as: "Verification that a participant in a data space adheres to certain rules and a common set of standards." (https://dssc.eu/space/BVE/357075333/Data+Sovereignty+and+Trust).
Table A15. Standards and solutions for the Trust Framework building block.
Ref Specification / implementation(s) recommended
W3C Verifiable Credentials (https://www.w3.org/TR/vc-data-model-2.0/)
OGC OGC Web Services Security
OASC MIM4 - Trust, MIM6 - Security
IDSA RA – Functional layer – Security & Data Sovereignty – Trustworthy communication; + Security by design; + Technical certification; RA – Security perspective + Certification perspective
Gaia-X Identity and Trust – Trust Management; Compliance – Relation between Providers and consumers + Rights and obligations of participants
Others (reported by DS4SSCC) Standards: EUDI; Industry body specifications: EBSI; Reference implementations: European Blockchain, i4Trust
Table A16. Principles addressed by the Trust Framework building block.
EIF
Recommendation 15 (Security and privacy): Define a common security and privacy framework and establish processes for public services to ensure secure and trustworthy data exchange between public administrations and in interactions with citizens and businesses.
Table A17 reports the mapping of standards to the ’Sharing Traceability’ building block, defined as: "Standards and services intended to keep track of the data processing and sources along their lifecycle."
Table A17. Standards and solutions for the Sharing Traceability building block.
Ref Specification / implementation(s) recommended
OASC MIM5 - Transparency
Others (reported by DS4SSCC) ETSI-CIM; DCAT-AP – property ‘provenance’

Appendix A.3. Technical Building Blocks – Data Value Enhancement

Table A18, Table A19, Table A20, Table A21 and Table A22 report the mapping related to building blocks in the category ‘Data Value Enhancement’.
Table A18 reports the mapping of standards to the ’Vocabulary Services’ building block, defined as: "Services intended to leverage vocabularies for the related functionalities."
Table A18. Standards and solutions for the Vocabulary Services building block.
Ref Specification / implementation(s) recommended
W3C ?
OGC ?
OASC ?
IDSA ?
... ?
Table A19 reports the mapping of standards to the ’Data Exchange - Communication (APIs)’ building block, defined as: "The data exchange building block focuses on data transmission once the conditions for interchange authorisation are met".
Table A19. Standards and solutions for the Data Exchange - Communication (APIs) building block.
Ref Specification / implementation(s) recommended
W3C W3C APIs (https://api.w3.org/doc)
OGC OGC APIs (https://ogcapi.ogc.org/), OGC Web Services.
OASC MIM1 – context; MIM7 - Places
IDSA Connectors
Others (reported by DS4SSCC) NGSI-LD; LDES; MQTT; JSON-LD
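As an example of the kind of request such APIs standardise, the sketch below builds the URL for retrieving features from a collection exposed through OGC API - Features, using the `/collections/{collectionId}/items` path and the `bbox` and `limit` query parameters defined in Part 1 of that standard. The server root and collection name are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def items_url(api_root, collection_id, bbox=None, limit=None):
    """Build an OGC API - Features request URL for the items of a collection.

    bbox is (min_lon, min_lat, max_lon, max_lat); limit caps the number of
    features returned in one response.
    """
    params = {}
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    if limit is not None:
        params["limit"] = limit
    url = f"{api_root.rstrip('/')}/collections/{collection_id}/items"
    # Keep commas readable in the bbox parameter instead of percent-encoding them.
    return f"{url}?{urlencode(params, safe=',')}" if params else url


print(items_url("https://example.org/api", "buildings", bbox=(7.0, 45.0, 8.0, 46.0), limit=10))
```

In a data space, a request of this kind would typically be issued only once the access and usage conditions covered by the Data Sovereignty and Trust building blocks are met.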
Table A20 reports the mapping of standards to the ’Metadata Publication and discovery’ building block, defined as: "The purpose of the publication and discovery building block is to provision and discover metadata of data, services and offerings in a data space".
Table A20. Standards and solutions for the Metadata Publication and discovery building block.
Ref Specification / implementation(s) recommended
W3C ?
OGC OGC Catalogue Service (https://www.ogc.org/standard/cat/), OGC API Records, Cat:ebRIM App Profile: Earth Observation Products (https://www.ogc.org/standard/cat2eoext4ebrim/)
OASC MIM1 - Context, MIM3 - Contracts
IDSA IDS Reference Architecture – Functional Layer – Ecosystem of Data – Brokering
Gaia-X Federated Catalogue – Catalogue Management Functions
Others (reported by DS4SSCC) ICT Innovation Network reference architecture, DCAT-AP, JSON-LD
SIMPL Catalogues of Data/Application/Infrastructure (ID SECAV-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/catalogues-dataapplicationinfrastructure)
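To make the metadata publication entries in Table A20 more concrete, the sketch below assembles a small dataset description using DCAT and Dublin Core terms serialised as JSON-LD. It is a minimal illustration rather than a complete or validated DCAT-AP record: the identifier, title, access URL and the helper `dcat_dataset` are hypothetical, and only a handful of properties are shown.

```python
import json


def dcat_dataset(identifier, title, description, access_url, keywords=()):
    """Return a minimal DCAT dataset description as a JSON-LD dictionary.

    `dcat_dataset` is a hypothetical helper written for this sketch.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
        # One distribution pointing at where the data can be accessed.
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": access_url},
            }
        ],
    }


# Hypothetical values, for illustration only.
record = dcat_dataset(
    identifier="https://example.org/dataset/buildings-2024",
    title="Building footprints",
    description="Footprints of buildings in a fictional municipality.",
    access_url="https://example.org/ogcapi/collections/buildings/items",
    keywords=["buildings", "cadastre"],
)
print(json.dumps(record, indent=2))
```

A catalogue service could harvest records of this kind to support discovery, which is the function addressed by FAIR principle F4.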
Table A21. Principles addressed by the Metadata Publication and discovery building block.
FAIR principles DMP EIF
F4: (Meta)data are registered or indexed in a searchable resource DMP-1a (Discoverability). Data and all associated metadata will be discoverable through catalogues and search engines
A2: Metadata should be accessible even when the data is no longer available
A1: (Meta)data are retrievable by their identifier using a standardised communication protocol
A1.1: The protocol is open, free and universally implementable
DMP-2 (Accessibility). Data will be accessible via online services, including, at minimum, direct download but preferably user-customizable services for visualization and computation.
Recommendation 5 (Transparency): Ensure internal visibility and provide external interfaces for European public services.
Table A22 reports the mapping of standards to the ‘Value added services’ building block, defined as: "any kind of processing, as service, is included" (https://dssc.eu/space/BVE/357076468/Value-Added+Services).
Table A22. Standards and solutions for the Value added services building block.
Ref Specification / implementation(s) recommended
W3C ADMS
ISA/ISA2/SEMIC ADMS-AP
OGC Coordinate Transformation Service, GeoAPI, Location Services (OpenLS), Open Modelling Interface (OpenMI), RAINBOW, Filter Encoding, Styled Layer Descriptor, Symbology Encoding, Geospatial User Feedback (GUF)
OASC MIM3 - Basic Data Marketplace Enablers (SynchroniCity_D2.4.pdf)
IDSA RA – functional layer – Data markets – Clearing and billing
SIMPL UI and API for defining data quality rules (ID SEGOA-FUNC-012) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/ui-and-api-defining-data-quality-rules), Data quality assessment (ID SEARE-FUNC-017) (https://futurium.ec.europa.eu/en/simpl/l1-high-level-requirement/data-quality-assessment)
Table A23. Principles addressed by the Value added services building block.
EIF
Recommendation 6 (Reusability): Reuse and share solutions, and cooperate in the development of joint solutions when implementing European public services.
In the latest version of the DSSC Blueprint [20], several kinds of services serving rather different purposes are gathered into the ‘Value added services’ building block, including the former ‘Marketplaces’, which no longer appears as a building block in its own right. From the DSSC definitions [20], it seems to include any kind of software or service intended to use the data (data processing, management and analysis software, services and apps) (https://dssc.eu/space/BVE/357076468/Value-Added+Services?attachment=/download/attachments/357076468/image-20240301-090319.png&type=image&filename=image-20240301-090319.png). Such services are also foreseen in the functional layer of the IDSA Reference Architecture, under the ‘value adding apps’ group, which includes: data processing and transformation; data app implementation; providing data apps; installing and supporting data apps.

Appendix A.4. Technical Building Blocks – Services FAIRness

Table A24 reports the mapping of standards to the ‘Services Description (metadata)’ building block, defined as: "Metadata, technical guidance and schemas for describing services chosen as components of a data space architecture."
Table A24. Standards and solutions for the Services Description (metadata) building block.
Ref Specification / implementation(s) recommended
OGC OGC API Processes, Web Processing Service, Web Coverage Processing Service
Gaia-X Gaia-X Labels
SIMPL Attributes of a self-description for an application (ID SECAV-FUNC-002-FUNC-001) (https://futurium.ec.europa.eu/en/simpl/l2-detailed-requirement/attributes-self-description-application)

References

  1. Heath, T.; Bizer, C. Linked data: Evolving the web into a global data space (Vol. 1), 3rd ed.; Morgan & Claypool Publishers, 2011.
  2. Halevy, A.; Franklin, M.; Maier, D. Principles of dataspace systems. In Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 2006; pp. 1–9.
  3. Kasamani, S.B.; Lukandu, A.I.; Gregory, W. Modelling Dataspace Entity Association Using Set Theorems. Computer Technology and Application 2012, 3.
  4. Zuiderwijk, A.; Janssen, M.T. Open data policies, their implementation and impact: A framework for comparison. Government information quarterly 2014, 31, 17–29.
  5. Braunschweig, K.; Eberius, J.; Thiele, M.; Lehner, W. The state of open data. Limits of current open data platforms 2012, 1, 72.
  6. Huijboom, N.; Van den Broek, T. Open data: an international comparison of strategies. European journal of ePractice 2011, 12, 4–16.
  7. Tandy, J.; van den Brink, L.; Barnaghi, P.; Homburg, T. Spatial Data on the Web Best Practices; OGC, W3C, 2023. Available online: https://www.w3.org/TR/2023/DNOTE-sdw-bp-20230919/ (accessed on 8 July 2024).
  8. Abhayaratna, J.; Daemen, E.; Janowicz, K.; Parsons, E.; Smith, R.; Verschoor, F. The Responsible Use of Spatial Data; OGC, W3C, 2021. Available online: https://www.w3.org/TR/responsible-use-spatial/ (accessed on 8 July 2024).
  9. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Mons, B. The FAIR Guiding Principles for scientific data management and stewardship. Scientific data 2016, 3, 1–9.
  10. EU. New European Interoperability Framework – Promoting seamless services and data flows for European public administrations; ISBN 978-92-79-63756-8.
  11. Lin, D.; Crabtree, J.; Dillo, I.; Downs, R.R.; Edmunds, R.; Giaretta, D.; De Giusti, M.; L’Hours, H.; Hugo, W.; Jenkyns, R.; Khodiyar, V.; Martone, M.E.; Mokrane, M.; Navale, V.; Petters, J.; Sierman, B.; Sokolova, D.V.; Stockhause, M.; Westbrook, J. The TRUST Principles for digital repositories. Sci Data 2020, 7, 144.
  12. Otto, B.; Steinbuß, S.; Teuscher, A.; Lohmann, S.; et al. Reference architecture model; Version 3; International Data Spaces Association, 2019; Available online: https://internationaldataspaces.org/wp-content/uploads/IDS-Reference-Architecture-Model-3.0-2019.pdf (accessed on 8 July 2024).
  13. Grothe, M. Exploring data space initiatives; Geonovum, 2023. Available online: https://www.geonovum.nl/uploads/documents/Exploring%20data%20space%20initiatives%20v0.52_EN%20publication%20version%20Geonovum.pdf (accessed on 8 July 2024).
  14. Ulrich, A.; et al. Design principles for Data Spaces, Version 1.0; OpenDEI, International Data Spaces Association, 2021 (accessed on 8 July 2024).
  15. Farrell, E.; Minghini, M.; Kotsev, A.; Soler-Garrido, J.; Tapsall, B.; Micheli, M.; Posada, M.; Signorelli, S.; Tartaro, A.; Bernal, J.; Vespe, M.; Di Leo, M.; Carballa-Smichowski, B.; Smith, R.; Schade, S.; Pogorzelska, K.; Gabrielli, L.; De Marchi, D. European Data Spaces: Scientific insights into data sharing and utilisation at scale, 3rd ed.; Publications Office of the European Union: Luxembourg, 2023; JRC129900.
  16. Gaia-X. Gaia-X - Architecture document. 2022. Available online: https://gaia-x.eu/wp-content/uploads/2022/06/Gaia-x-Architecture-Document-22.04-Release.pdf (accessed on 8 July 2024).
  17. Gronlier, P.; Hierro, J.; Steinbuss, S. Technical convergence – Discussion Document; Data Spaces Business Alliance, 2023. Available online: https://data-spaces-business-alliance.eu/wp-content/uploads/dlm_uploads/Data-Spaces-Business-Alliance-Technical-Convergence-V2.pdf (accessed on 8 July 2024).
  18. DSSC. Blueprint, Version 0.5; 2023. Available online: https://dssc.eu/space/BPE/179175433/Data+Spaces+Blueprint+%7C+Version+0.5+%7C+September+2023 (accessed on 8 July 2024).
  19. Santoro, M.; Mazzetti, P. GREAT D3.2 Final Blueprint of the GDDS Reference Architecture. 2024. Available online: https://www.greatproject.eu/wp-content/uploads/2024/04/D3.2-Final-Blueprint-of-the-GDDS-Reference-Architecture.pdf (accessed on 8 July 2024).
  20. DSSC. Blueprint, Version 1.0; 2024. Available online: https://dssc.eu/space/BVE/357073006/Data+Spaces+Blueprint+v1.0 (accessed on 8 July 2024).
  21. Noardo, F.; Atkinson, R.; Simonis, I.; Villar, A.; Zaborowski, P. OGC Data Exchange Toolkit: Interoperable and Reusable 3D Data at the End of the OGC Rainbow. In Recent Advances in 3D Geoinformation Science - Proceedings of the 3D Geoinfo Conference; Kolbe, T.H., Donaubauer, A., Beil, C., Eds.; Springer Nature Switzerland, 2024; pp. 761–779.
  22. Chignard, S. A brief history of open Data. ParisTech Review. 2013. Available online: http://www.paristechreview.com/2013/03/29/brief-history-open-data/ (accessed on 8 July 2024).
  23. Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE). Available online: https://knowledge-base.inspire.ec.europa.eu/publications/directive-20072ec-european-parliament-and-council-14-march-2007-establishing-infrastructure-spatial_en (accessed on 8 July 2024).
  24. Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information. Available online: https://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2003:345:0090:0096:en:PDF (accessed on 8 July 2024).
  25. WIS Timelines. Available online: https://community.wmo.int/en/activity-areas/wis/wis-timelines (accessed on 8 July 2024).
  26. World Wide Web Consortium - W3C. Available online: https://www.w3.org (accessed on 8 May 2024).
  27. Open Geospatial Consortium - OGC. Available online: https://en.wikipedia.org/wiki/Open_Geospatial_Consortium; https://www.ogc.org (accessed on 8 May 2024).
  28. Group on Earth Observations. Available online: https://earthobservations.org; https://en.wikipedia.org/wiki/Group_on_Earth_Observations#:~:text=GEO%20was%20established%20formally%20in,Observation%20Summit%20in%20Washington%2C%20DC (accessed on 8 July 2024).
  29. International Data Spaces Association - IDSA. Available online: https://internationaldataspaces.org (accessed on 9 May 2024).
  30. USAGE Deliverable 3.2 - Data space prototype and report - first version. Available online: https://drive.google.com/file/d/1FyvNkWkAWkuKKHh-3l529VUu5LIKyu5E/view?usp=drive_link (accessed on 8 July 2024).
  31. Urban Data Space for Green Deal - USAGE project, funded within the Horizon Europe programme (GA 101059950). Available online: https://www.usage-project.eu (accessed on 9 May 2024).
  32. Common European Data Spaces. Available online: https://digital-strategy.ec.europa.eu/en/policies/data-spaces (accessed on 8 July 2024).
  33. International Organization for Standardization - ISO. Available online: https://www.iso.org/home.html (accessed on 8 July 2024).
  34. W3C-OGC Spatial Data on the Web Working Group Charter. Available online: https://www.w3.org/2021/10/sdw-charter.html (accessed on 8 July 2024).
  35. Global Earth Observation System of Systems - GEOSS. Available online: https://old.earthobservations.org/geoss.php (accessed on 8 July 2024).
  36. FAIR principles. Available online: https://www.go-fair.org/fair-principles/ (accessed on 8 July 2024).
  37. GEOSS Data Management Principles. Available online: https://old.earthobservations.org/documents/dswg/201504_data_management_principles_long_final.pdf (accessed on 8 July 2024).
  38. CARE principles. Available online: https://en.wikipedia.org/wiki/CARE_Principles_for_Indigenous_Data_Governance (accessed on 8 July 2024).
  39. ISO 25012. Available online: https://iso25000.com/index.php/en/iso-25000-standards/iso-25012 (accessed on 8 July 2024).
  40. ISO 25010. Available online: https://www.iso25000.com/index.php/en/iso-25000-standards/iso-25010 (accessed on 8 July 2024).
  41. OSI model. Available online: https://en.wikipedia.org/wiki/OSI_model (accessed on 8 July 2024).
  42. DSSC Glossary. Available online: https://dssc.eu/space/Glossary/55443460/DSSC+Glossary+%7C+Version+1.0+%7C+March+2023 (accessed on 8 July 2024).
  43. DSSC starter Kit. Available online: https://dssc.eu/space/SK/29523973/Starter+Kit+for+Data+Space+Designers+%7C+Version+1.0+%7C+March+2023 (accessed on 8 July 2024).
  44. OpenDEI project. Available online: https://www.opendei.eu (accessed on 8 July 2024).
  45. Support Centre for Data Sharing deliverables. Available online: https://dssc.eu/space/DC/408059908/Support+Centre+for+Data+Sharing (accessed on 8 July 2024).
  46. Data Sharing Canvas - A stepping stone towards cross-domain data sharing at scale. Available online: https://coe-dsc.nl/wp-content/uploads/2024/02/data-sharing-canvas.pdf (accessed on 8 July 2024).
  47. Gaia-X. Available online: https://gaia-x.eu/ (accessed on 8 July 2024).
  48. FIWARE. Available online: https://www.fiware.org/about-us/ (accessed on 8 July 2024).
  49. Big Data Value Association - BDVA. Available online: https://www.bdva.eu (accessed on 8 July 2024).
  50. DSSC Endorsement. Available online: https://dssc.eu/page/Endorsements (accessed on 8 July 2024).
  51. Data Spaces Support Centre - DSSC. Available online: https://dssc.eu; https://www.egi.eu/project/dssc/; https://dssc.eu/space/DDP/117211137/DSSC+Delivery+Plan+-+Summary+of+assets+publication (accessed on 8 July 2024).
  52. Data Space for Smart and Sustainable Cities and Communities - DS4SSCC. Available online: https://inventory.ds4sscc.eu (accessed on 8 July 2024).
  53. GREAT project. Available online: https://www.greatproject.eu; https://www.epos-eu.org/great (accessed on 8 July 2024).
  54. GREAT technical blueprint. Available online: https://www.greatproject.eu/wp-content/uploads/2023/10/D3.1-Initial-Blueprint-of-the-GDDS-Reference-Architecture_web.pdf (accessed on 8 July 2024).
  55. A European Strategy for Data. Available online: https://digital-strategy.ec.europa.eu/en/policies/strategy-data (accessed on 8 July 2024).
  56. Simpl. Available online: https://digital-strategy.ec.europa.eu/en/policies/simpl (accessed on 8 July 2024).
  57. Simpl requirements. Available online: https://futurium.ec.europa.eu/en/simpl/pages/about (accessed on 8 July 2024).
  58. Open and Agile Smart Cities - OASC. Available online: https://oascities.org (accessed on 8 July 2024).
  59. DSSC Data Sovereignty and Trust. Available online: https://dssc.eu/space/BVE/357075333/Data+Sovereignty+and+Trust (accessed on 8 July 2024).
  60. DSSC Data Exchange. Available online: https://dssc.eu/space/BVE/357075193/Data+Exchange (accessed on 8 July 2024).
  61. DSSC Access and Usage Policies Enforcement. Available online: https://dssc.eu/space/BVE/357075567/Access+%26+Usage+Policies+Enforcement (accessed on 8 July 2024).
  62. DSSC Identity and Attestation Management. Available online: https://dssc.eu/space/BVE/357075352/Identity+and+Attestation+Management (accessed on 8 July 2024).
  63. DSSC Data Services and Offerings Descriptions. Available online: https://dssc.eu/space/BVE/357075789/Data%2C+Services+and+Offerings+Descriptions (accessed on 8 July 2024).
  64. DSSC Publication and Discovery. Available online: https://dssc.eu/space/BVE/357076320/Publication+and+Discovery (accessed on 8 July 2024).
  65. DSSC Value Added Services. Available online: https://dssc.eu/space/BVE/357076468/Value-Added+Services (accessed on 8 July 2024).
  66. OGC Location building Blocks. Available online: https://blocks.ogc.org (accessed on 8 July 2024).
  67. Common Assessment Methods Standards and Specifications - CAMSS. Available online: https://joinup.ec.europa.eu/collection/common-assessment-method-standards-and-specifications-camss (accessed on 8 July 2024).
  68. The Open Data Institute - ODI. Available online: https://theodi.org/about-the-odi/our-vision/ (accessed on 8 July 2024).
  69. ODI - How to choose an Open Standard. Available online: https://docs.google.com/document/d/1E5uARrZf5AJUIF_DJz-42_793EY_Dwk7n7B3bMn3x5A/edit#heading=h.xbuzggui7nk0; https://standards.theodi.org/find-existing-standards/how-to-choose-an-open-standard/ (accessed on 8 July 2024).
  70. Reference Model of Open Distributed Processing (RM-ODP). Available online: https://en.wikipedia.org/wiki/RM-ODP (accessed on 19 July 2024).
  71. ISO/IEC10746 Information technology - Open Distributed Processing (RM-ODP). Available online: https://committee.iso.org/sites/jtc1sc7/home/projects/flagship-standards/isoiec-10746.html (accessed on 19 July 2024).
Figure 1. Quality characteristics defined by the ISO/IEC25010 [40].
Figure 2. Methodology workflow
Figure 3. Timeline of the different initiatives, projects and documents related to the data spaces conceptualisation and implementation.
Table 1. Findability Accessibility Interoperability Reusability (FAIR) principles [9]
Findable
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
Accessible
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1. The protocol is open, free, and universally implementable
A1.2. The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available
Interoperable
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
Reusable
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage licence
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
Table 2. Definitions of each building block and the possible changes with respect to the DSSC Blueprint.
DSSC blueprint aspect Extension Reason for extension Definition
Data Interoperability category Data Specification enabling FAIRness Building blocks support different FAIR aspects. All of these contribute to an effective description of data characteristics (be it published or required) A complete and standards compliant specification of all aspects of data FAIRness.
Data Sovereignty and Trust category Unchanged - Technical enablers to guarantee reliability and authenticity of participants’ information, to establish trust among them when interacting and performing data transactions. Common standards and agreed policies should prevent lock-in effects for users, and support FAIR principles, verification and authentication mechanisms, ensuring interoperability and security.
Data Value Creation Enablers Data Value enhancement The intrinsic value of data is not created; rather, the tools contained in this category unlock the value of those data by making them available to a wider audience and facilitating their use. To leverage the value of data, users must be supported in the retrieval and access to the data they need, and have the possibility to apply the necessary processing to adapt them to specific needs (e.g. data transformation, data visualisation, etc.).
- New Services FAIRness category Although data spaces are focused on data, specialised Services may be required to enable applications to exploit that data. Therefore, similar challenges apply to identify the necessary services and their possible use. Symmetrically to the category “Data specification enabling FAIRness”, this category supports a good description of services to facilitate services retrieval according to the use cases workflows needs.
Data models Unchanged - The model provides semantics and a shared vocabulary, as well as a structure for the data (hierarchies and relationships).
Data Exchange or Data Models Data Exchange - Encodings In previous versions of the DSSC building blocks stack, the encodings of data, or formats were included in the Data Models building block and, in the current version, they also have relationships with the Data Exchange building block. However, it is important to specify them separately, because dedicated standards are available, and they are independent from the options for representing data models or data exchange mechanisms. Encoding is the format in which the data are encoded.
Data Exchange Data Exchange (APIs) Minor change in the title. Moved from the “Data interoperability” to the “Data value enhancement” category The data exchange building block focuses on data transmission once the conditions for interchange authorisation are met.
Provenance and Traceability Data Provenance model The original building block is intended to address both (a) the standards to be used to represent and document provenance in the data and (b) mechanisms to track the data throughout their lifecycle. However, two different kinds of standards and services are available for the two scopes. Moreover, one complements the data description, while the other is intended to support data sovereignty. It is therefore reasonable to address them separately. Standards intended to represent and document provenance and lineage of data.
Provenance and Traceability Sharing Traceability See row above. Moved from the “Data Interoperability” to the “Data Sovereignty and Trust” category Standards and services intended to keep track of the data processing and sources along their lifecycle.
Access and usage policies and enforcement Data Policies The DSSC building block aims to specify how to define and enforce access and usage policies within a data space and how participants define their policies in data spaces. However, we consider that the policies specifying access and usage conditions and policies schemas follow rules and definitions which are in practice separated from the generic mechanisms used to ensure that such conditions are respected. Therefore in this proposal, it is split in two respective building blocks. A policy is defined by the DSSC Blueprint v1.0 as “either an offer from the data provider or an agreement between the data provider and the data recipient. A policy is comprised of rules that specify the rights and duties of the parties: Access Rules: whether access to a resource is allowed or not; Usage Rules: how a resource might or may not be used; Consent Rules: whether usage of a resource, for which consent might be required from third parties, is allowed or not.”. Policies include licence details.
Access and usage policies and enforcement Access and usage control and management The DSSC building block aims to specify how to define and enforce access and usage policies within a data space and how participants define their policies in data spaces. However, we consider that the policies specifying access and usage conditions and policies schemas follow rules and definitions which should be separated from the mechanisms used to ensure that such conditions are respected. Therefore in this proposal, it is split in two respective building blocks. Mechanisms in place to ensure access and usage policies related to certain data are respected.
Identity and attestation management Unchanged - Information provided on the relevant entities must be verifiable to enable the onboarding and offboarding processes. The trustworthiness of information is linked to the trustworthiness of the Trust Anchors (or Trust Service Providers, specifically for identities), who are entitled to issue the respective attestations.
Trust Framework Unchanged - Mechanisms and standards enabling a trust environment to be implemented within which data can be securely exchanged.
Data, services and offering descriptions Data descriptions (metadata) Separated from the original building block, because specific schemas are needed to describe datasets (metadata). Moved from the “Data Value Creation Enablers” to the “Data Specification enabling FAIRness” category Metadata, technical guidance and schemas for describing datasets, exchanged within a data space between providers and recipients. Metadata allow consistent data retrieval, generation or reuse, ensuring reliability in the results for which data are used as input and in the related decision-making process.
Data, services and offering descriptions Services descriptions (metadata) Separated from the original building block, because specific schemas are needed to describe services. Moved from the “Data Value Creation Enablers” to the “Services FAIRness” category Metadata, technical guidance and schemas for describing services chosen as components of a data space architecture.
Data, services and offering descriptions Offerings descriptions? Separated from the original building block as the part not covered by existing or planned technical descriptions of data and services. We remain unsure about the category under which it could fall, since several elements from different categories are useful to define the offering. Offerings refer to a combination of descriptions and conditions attached to the data made available in the data space. However, a sharper definition is hard to find in the DSSC building block description.
Publication and discovery Metadata publication and discovery Minor change in the title. Checking the DSSC blueprint, it refers to metadata publication rather than to data publication itself. Therefore “metadata” was added to the title, to avoid any confusion with data publication systems (e.g. by means of APIs).
Value added services Unchanged - Included in the Blueprint v1.0, any kind of processing, as service, is included
- New Data Requirements and quality Schemas In “Data Specification enabling FAIRness” category Data requirements specification is essential for a successful data retrieval or generation, leading to reliable results for use cases. Referring to standardised schemas to specify data requirements allows interoperability between systems.
Data models New Vocabularies Data models are often described as "vocabularies"; however, there are many forms of terminology needed to describe the structure and semantics of data, as well as to constrain the values that are used to convey information within these structures. Data spaces will face issues of abstraction, where one domain may model a concept in detail, whereas for another domain this is simply a classification attribute - e.g. a "cat" or an "animal, type=cat". Vocabularies are defined sets of terms. In the case of data models, these terms may be formally described in terms of relationships to other terms, as ontologies or taxonomies, but they may also exist as lists of terms with definitions.
- New Vocabularies services In “Data Value Enhancement” category. Services may be used to cross-reference or match related terms from different domains to enhance "findability" in particular. Publishing cross-walks with expert curation can add significant value to data for domains needing this semantic clarity. Services intended to manage or augment vocabularies for the related functionalities.
- New data requirements and quality services (definition+validation) In “Data Value Enhancement” category Services intended to support specification of standard-based data requirements (based on the building blocks in the category “Data Specifications for FAIRness”) as well as data validation against such requirements.
- New Licenses for services In “Services FAIRness” category The kinds of licenses available for services (to be known when planning and implementing a data space architecture).
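The abstraction issue mentioned in the Vocabularies row of Table 2 (a "cat" in one domain versus an "animal, type=cat" in another) can be illustrated with a small crosswalk of the kind a vocabulary service might publish. The sketch below is purely illustrative: the `CROSSWALK` table, the record layout and the helper `to_generic` are hypothetical and do not correspond to any standard or implementation reviewed in this paper.

```python
# Hypothetical crosswalk: a detailed-domain class is mapped to a generic
# class plus the classification attribute that preserves the original meaning.
CROSSWALK = {
    "cat": ("animal", {"type": "cat"}),
    "dog": ("animal", {"type": "dog"}),
}


def to_generic(record):
    """Translate a record from the detailed model to the generic one.

    Records whose class has no crosswalk entry are returned unchanged (as a copy).
    """
    concept = record.get("class")
    if concept not in CROSSWALK:
        return dict(record)
    generic_class, attributes = CROSSWALK[concept]
    translated = {key: value for key, value in record.items() if key != "class"}
    translated["class"] = generic_class
    translated.update(attributes)
    return translated


# A record from the domain that models "cat" as a class of its own.
print(to_generic({"class": "cat", "name": "Tom"}))
```

In practice such mappings would more likely be expressed with curated, published vocabularies than hard-coded in application code, as suggested by the expert curation mentioned in the Vocabularies services row.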
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2024 MDPI (Basel, Switzerland) unless otherwise stated