Computer Science and Mathematics

Sort by

Article

Information Systems

MongoDB Aggregation Pipeline Performance: Analysis of Query Plan Selection and Optimizer Behavior Across Versions and Collection Scales

Rosen Ivanov

Abstract: This article examines how MongoDB optimizes aggregation pipeline queries, focusing on two mechanisms: a trial-based plan selection process that runs candidate execution plans in parallel and picks the one returning the most results for the least work, and rule-based operator rewriting by the Pipeline Optimizer. The study tests nine aggregation query types on a synthetic e-commerce dataset with 50K documents. It uses MongoDB versions 6.0.3 and 8.2.5 under identical conditions. For each query, it evaluates all valid operator orderings. It also examines the physical execution plan and the Pipeline Optimizer output. Each test runs 20 times. The system clears the plan cache before every run. The study also tests scalability with datasets of 150K and 250K documents. Three cases are identified where the rule-based optimizer falls short: IXSCAN preference bias at low selectivity, where the suboptimal plan is up to 9 times slower than the optimal (80ms vs. 699ms at 250K under MongoDB 8.2.5); unbounded document multiplication after $unwind; and failure to account for $group output cardinality. MongoDB 8.2.5 improves performance in most cases compared to 6.0.3. $match + $group queries run up to 28% faster. Queries that rely on IXSCAN improve by up to 18%. Unbounded projection operations run slower in MongoDB 8.2.5 at all tested sizes. The slowdown is +23% at 50K, +3% at 150K, and +14% at 250K pointing to a change in the projection execution path between versions.

Posted: 31 March 2026

https://doi.org/10.20944/preprints202603.2514.v1

Article

Computer Science and Mathematics

Information Systems

Content-Aware Adaptive Dense Communication Network: Enhancing Multi-LLM Collaboration with Dynamic Information Flow

Anthony White

Joshua Allen

Abstract: Multi-agent systems built upon large language models (LLMs) are hindered by the limitations of traditional token-based communication, which suffers from information bottlenecks, redundant processing, and a lack of end-to-end differentiability. While dense vector communication offers improvements, existing methods lack adaptivity to diverse task requirements. To address this, we propose the Content-Aware Adaptive Dense Communication Network (CADCN), a novel architecture that empowers LLM agents to communicate through dense vectors by dynamically perceiving content and adaptively selecting routing and transformation strategies. CADCN introduces the Content-Aware Communication Unit (CACU), which integrates Content-Aware Routing via a lightweight network and Adaptive Dense Transformation using a Mixture-of-Experts structure, ensuring full differentiability. Our experiments, conducted under a highly constrained training token budget, demonstrate that CADCN consistently achieves superior performance across diverse general knowledge, reasoning, mathematical, and coding benchmarks compared to prior dense communication approaches. Furthermore, CADCN significantly outperforms vanilla LLMs trained on vastly larger datasets, highlighting its remarkable data efficiency and capability expansion. Ablation studies confirm the synergistic contributions of CADCN's adaptive components, while analysis of communication dynamics reveals learned expert specialization. Our findings establish CADCN as a highly efficient and intelligent paradigm for robust multi-LLM collaboration.

Posted: 31 March 2026

https://doi.org/10.20944/preprints202603.2356.v1

Article

Computer Science and Mathematics

Information Systems

Semantic Core for Sensor Telemetry Ingestion for Digital Twins

Oleksandr Osolinskyi

Khrystyna Lipianina-Honcharenko

Myroslav Komar

Abstract: Digital twin platforms for smart cities must continuously receive different types of data from sensors, gateways, and services, but in real situations these data is heterogeneous in terms of indicator names, measurement units, time rules, and object identification, which makes integrations expensive and fragile, while second verification becomes complicated. In this paper, a minimal semantic core for "first-stage" telemetry receiving of the DTwin platform, where semantics are used as operational rules during data ingestion. The core includes a machine-readable model of entities and relation-ships, dictionaries of metrics and measurement units, a unified event format with sep-aration into a stable envelope and payload, formal validation against data schemas, a mapping table for transforming raw fields into standardized measurements [name, value, unit], as well as an ingestion service with canonicalization of the event record and integrity control through the SHA-256 cryptographic hash. The implementation ensures ingestion of correct events, rejection of incorrect ones without recording, and reproducible verification through control examples, a testing protocol, and evidence snapshots. In smart city settings, such a telemetry ingestion foundation can support reliable monitoring of municipal buildings and infrastructure, including energy efficiency, indoor environmental quality, and data-driven operational decision-making. The proposed approach creates a core for stable integration of different sensor data into digital twins and further scaling of the platform.

Posted: 28 March 2026

https://doi.org/10.20944/preprints202603.2186.v1

Article

Computer Science and Mathematics

Information Systems

Design Insights for Exploring Identity Bubbles with Alternate Reality Gameplay

Guilherme Almeida

Mariana Seiça

Licínio Roque

Abstract: To activate conscious reflection regarding personal identity and identity-building processes in our daily lives is an increasing social concern. With this aim, we designed an Alternate Reality Game that invites participants to collectively explore these themes. Participants played with a prototype, reflecting on identity through emergent dynamics from gameplay and interpersonal interactions. We analysed participants’ appropriation of the prototype through logged activity, direct observation and interviews. The patterns identified enabled iterative redesign and further exploration of interaction dynamics. From this process of discovery, we synthesised the following design insights that may guide further research in the field: 1) how to explore design-play-reflect as a co-design process supported on individual appropriation, 2) how ARG designs can become generative social theories, 3) how ARGs generate reflective social phenomena, such as varied social identity and power narratives, and 4) how ARG Design can open doors to balance power dynamics.

Posted: 27 March 2026

https://doi.org/10.20944/preprints202603.2197.v1

Article

Computer Science and Mathematics

Information Systems

Theory of Epistemic Abductive Geometry (TEAG): A Unified Theory of Admissibility-Driven Inference Across Dynamical Systems, Measure Theory, and Language

Moriba Kemessia Jah

Abstract:

We introduce the Theory of Epistemic Abductive Geometry (TEAG), a framework for non-Bayesian inference grounded in admissible-support contraction under possibility theory. The central object is the TEAG quintuple $ \mathcal{E} = (H, \pi, \{H_\alpha\}_{\alpha\in(0,1]}, C, A) $, where evidence acts by contracting the geometry of admissible hypotheses rather than redistributing probabilistic belief mass. Falsification has two-stage structure. Under the log-admissibility transformation $ \Phi(h) = -\log\pi(h) $, the canonical TEAG conjunctive update becomes tropical addition in the max-plus semiring: $ \Phi^+(h) = \Phi^-(h) \oplus \psi(h) = \max\!\bigl(\Phi^-(h),\,\psi(h)\bigr), $ where $ \psi(h) = -\log\kappa(y\mid h) $ is the surprisal of hypothesis h under observation y. The tropical variety of this polynomial, $ \mathcal{B}_{\mathrm{active}} = \bigl\{h \in H : \Phi^-(h) = \psi(h)\bigr\}. $ is the active deformation front: the exact locus where incoming evidence first matches prior impossibility and begins to deform the posterior field. This is a necessary condition for falsification but not sufficient. Sufficient falsification requires exit from the PCRB admissible basin $ \mathcal{A}_k = \{h : \Phi^+_k(h) \leq c_k^\star\} $, where $ c_k^\star $ is the equipotential threshold determined by the PCRB at step k. Popper's criterion thus receives a two-stage algebraic formulation: the tropical variety marks where falsification becomes possible; the PCRB basin boundary marks where falsification is complete. Within the class of possibility-theoretic recursive inference systems, this is, to the best of our knowledge, the first exact formulation of this distinction. Main results. 1. Epistemic Contraction Theorem. Contraction is tropical addition: $ \Phi^+ = \Phi^- \oplus \psi $. Posterior α-cuts satisfy $ H_\alpha^+ = H_\alpha^- \cap E_\alpha(y) $: geometric intersection, not belief redistribution. The active deformation front is the tropical variety $ \mathcal{B}_{\mathrm{active}} $; the falsification boundary is the PCRB admissible basin boundary $ \mathcal{B}_{\mathrm{adm}} $. 2. Possibilistic Cramér–Rao Bound (PCRB} For any filter in the class $ \mathcal{F} $ of epistemically admissible, contraction-based recursive estimators satisfying Axioms 2.1–2.5: $ \mathcal{E}_{\pi,k|k} \geq \mathcal{E}_{\pi,k|k-1} + \tfrac{n}{2}\log(1-I_k) $, where $ I_k $ is the Choquet integral of per-hypothesis surprisal against the prior possibility capacity. Within this class, the ESPF [28] is the unique filter achieving this bound with equality, and is therefore the unique minimax-entropy-optimal set-based recursive estimator under bounded epistemic uncertainty. 3. Tropical Hamilton–Jacobi structure (summary). The TEAG update is structurally consistent with a tropical Lagrangian $ L = T - V $, Legendre transform to a tropical Hamiltonian equal to the surprisal field, and a Hamilton–Jacobi equation whose solution is the tropical addition rule. The Euler–Lagrange equations on the epistemic manifold yield geodesic motion with explicit Levi–Civita connection and Christoffel symbols. This structure is interpretive and consistent with the axioms; full derivations are in the companion paper [31]. Taken together, this structure admits a precise interpretation: the TEAG update rule is a max-plus dynamical system whose governing equations have the same algebraic form as the Hamilton–Jacobi equations of classical mechanics, instantiated on hypothesis space rather than physical space. 4. Gaussian collapse. Probability theory is the collapse limit of TEAG as epistemic width $ W \to 0 $: Choquet converges to Lebesgue, the ESPF recovers the Kalman filter, and $ \mathcal{E}_\pi \to \tfrac{1}{2}\log\det\Sigma + \mathrm{const}(n) $. Probability is earned by evidence, not assumed. Epistemic neutrality and knowledge-system synthesis. Because TEAG's axioms require only a hypothesis space, a possibility field, and a contraction operator — not a probability measure, a likelihood function, or a frequentist grounding — heterogeneous knowledge systems can each instantiate the TEAG quintuple independently. Their joint admissible support intersection is the locus of coherence: the set of hypotheses neither system has falsified. No transformation of one system into the other's representational primitives is required. The composition theory (Section 6) formalizes the coupling architecture. Four instantiations provide the unifying structure: the ESPF [28] for recursive state estimation; the Geometry of Knowing [29] for measure-theoretic collapse; the minimax-entropy optimality proof [30]; and the Possibilistic Language Model (PLM, forthcoming [32]).

Abstract:

Posted: 26 March 2026

https://doi.org/10.20944/preprints202603.2010.v2

Article

Computer Science and Mathematics

Information Systems

A Formal Semantics of Governance History Validity in Encrypted Storage

Jesús F. Rodríguez-Aragón

Carolina Zato

Fernando De la Prieta

Abstract: Encrypted storage systems increasingly rely on governance mechanisms such as delegation, revocation, key updates, and policy evolution. While existing approaches provide strong guarantees for access enforcement, integrity, and transparency, they do not address a fundamental question: under which conditions can an observed sequence of governance events be accepted as a semantically valid evolution of authorization state? This work introduces a formal semantic framework for governance validity based on observable evidence. Governance is modeled as an admissibility-constrained state transition system in which events are accepted only if they satisfy explicit authorization, reference, temporal, revocation, and evidence conditions. The framework defines valid governance histories as sequences of admissible events, characterizes the conditions for deterministic state reconstruction, and establishes invariants capturing correctness properties such as revocation soundness, policy-constrained evolution, evidence completeness, non-equivocation, and temporal coherence. It also defines event-specific evidence obligations that support independent verification. The proposed approach is architecture-independent and does not prescribe specific enforcement or logging mechanisms, focusing instead on the semantic conditions required for accepting governance histories as valid from observable evidence.

Posted: 26 March 2026

https://doi.org/10.20944/preprints202603.2102.v1

Article

Computer Science and Mathematics

Information Systems

DS-GBT: Proactive Safety Integration for Dynamic Agent Decision Policies

Yao-Tian Chian

Yuxin Zhai

Abstract: The rise of large language models (LLMs) enables autonomous agents in complex environments, yet their opaque decision logic challenges safety, robustness, and interpretability. Existing reactive or static policy methods often fall short in dynamic scenarios, and dynamic policy generation typically lacks inherent safety. We identify a critical gap: proactively integrating verifiable safety policies into autonomous agents' dynamic decision generation. To address this, we propose Dynamic Safe-GBT Generation (DS-GBT), a novel tripartite framework generating intrinsically safe Gated Behavior Trees (GBTs). This architecture ensures safety by formalizing requirements, dynamically generating GBTs with embedded safety gates and risk awareness, and verifying behaviors pre-execution. Evaluated in an Agentic City Simulator, DS-GBT significantly outperforms baselines, demonstrating superior task success and remarkably low safety violations. Comprehensive experiments, including ablation and robustness assessments, confirm DS-GBT's 'safety by generation' paradigm delivers demonstrably safer, more robust, and interpretable agent behaviors in dynamic, safety-critical environments, while maintaining competitive efficiency.

Posted: 25 March 2026

https://doi.org/10.20944/preprints202603.1992.v1

Article

Computer Science and Mathematics

Information Systems

A Theoretically‐Grounded Federated Attribution Framework with Adaptive Differential Privacy Budgets for Cross‐Device Social Commerce Advertising Systems

Xiongsheng Yi

Abstract: In social electronic commerce advertising systems involving multiple devices, user behavior data is scattered across various devices and is highly sensitive regarding privacy. How to achieve ad attribution of high quality while protecting user privacy has become a key issue. This paper proposes a theoretically supported Federated Attribution Framework, which innovates on the basis of existing Shapley value attribution methods and federated learning mechanisms. By integrating user behavior graph modeling across multiple devices, introducing graph neural networks for local temporal encoding, and implementing a federated alignment mechanism consisting of two stages, it achieves collaborative user representation and attribution optimization across devices. Additionally, an adaptive differential privacy budget allocation strategy is proposed, which can dynamically adjust privacy budget allocation based on device attribution sensitivity and training rounds, achieving a personalized balance between privacy protection and attribution performance. Experimental results show that the proposed method improves attribution accuracy by an average of 8.3% on a social electronic commerce advertising dataset compared to existing methods.

Posted: 25 March 2026

https://doi.org/10.20944/preprints202603.2000.v1

Concept Paper

Computer Science and Mathematics

Information Systems

Astrophysical Constraints on the Simulation Hypothesis for This Universe from a Biological Point of View

Alessandro Perrella

Antonio D'Amore

Ada Maffettone

Abstract: We have examined the Simulation Hypothesis through an interdisciplinary lens, combining insights from physics, biology, and psychology. Building on recent evidences on quantitative critique, we propose an alternative computational paradigm inspired by biological information processing. Rather than a continuous "brute-force" simulation, we suggest an event-driven, observer-dependent model that dramatically reduces computational requirements. Using the adaptive immune system as an analogy, we demonstrate how sparse computation can manage vast potential information spaces efficiently through a quantitative framework distinguishing potential from active entropy. This framework aligns with relational interpretations of quantum mechanics and offers a resolution to the physical impossibilities identified in continuous simulation models. Our aim is not to provide a constructive simulation design within known physics, but to show that alternative, biologically inspired paradigms re-open conceptual space that previous analysis, by construction, leaves unexamined. Additionally, we explore the psychosocial dimensions of the Simulation Hypothesis, examining why observer-dependent ontologies resonate in contemporary discourse and how they intersect with broader cultural patterns of meaning-making.

Posted: 24 March 2026

https://doi.org/10.20944/preprints202603.1865.v1

Review

Computer Science and Mathematics

Information Systems

Antecedents, Decisions, and Outcomes for ICT Governance Adoption in the African Public Sector: A Systematic Review Based on McClelland Theory and ADO Framework

Samuel Simbarashe Furusa

Mampilo Pahlane

Abstract: Digital systems now serve as the foundation of institutional planning and service delivery. The public sector operates under various influential factors that complicate its ability to fulfil its mandates effectively. It is therefore important for these entities to adopt ICT governance to integrate ICT into business operations for effectiveness. This review identifies antecedents, decisions and outcomes that influence ICT governance and maps them to McClleland's theory of Needs. It also analyses the geographical and publication trends influencing ICT governance adoption in the African public sector. A systematic review was conducted using the SPAR-4-SLR protocol and the ADO framework. Peer-reviewed publications and grey literature from 2010 to 2025 were retrieved from Scopus and Web of Science. The selected studies were coded and thematically analysed. The study's findings were categorised into structural, relational, and process pillars. The main finding suggests that relational mechanisms must be addressed for effective ICT governance. The findings, when mapped into McClelland’s framework, indicate that power needs to affect control processes, achievement needs to impact performance processes, and affiliation to provide the necessary conditions for effective ICT governance. It is imperative that leaders consider the factors to avoid symbolic representation of ICT governance in public sector.

Posted: 20 March 2026

https://doi.org/10.20944/preprints202603.1653.v1

Article

Computer Science and Mathematics

Information Systems

Speech-Adaptive Detection of Unnatural Intra-Sentential Pauses Using Contextual Anomaly Modeling for Interpreter Training

Hyoeun Kang

Jin-Dong Kim

Juriae Lee

Hee-Jo Nam

Kon Woo Kim

Joowon Lim

Hyun-Seok Park

Abstract: Detecting unnatural pauses is a critical component of automated quality assessment (AQA) in interpreter training, as pause patterns directly reflect an interpreter's cognitive load and fluency. Traditional pause detection methods rely on static temporal thresholds (e.g., 1.0 second), which often fail to account for segment-specific speech rate variability and individual speaking styles. This study proposes a context-adaptive pause detection framework that integrates unsupervised anomaly detection using Isolation Forest (iForest) with a sliding window technique. To enhance pedagogical validity, we specifically focused on intra-sentential pauses by delineating sentence boundaries using a specialized segmentation model. The proposed model was evaluated against ground-truth labels annotated by professional interpreting experts. Our results demonstrate that the sliding window–based contextual anomaly detection model significantly outperforms the conventional static baseline, particularly in terms of recall and Cohen’s kappa. Furthermore, by applying a weighted F3-score and the "Recognition-over-Recall" principle, we confirmed that the proposed model substantially reduces the instructor's total operational burden by shifting the workload from de novo annotation creation to more efficient corrective pruning. These findings suggest that speech-adaptive modeling provides a more reliable and labor-saving framework for automated interpreting assessment and feedback.

Posted: 18 March 2026

https://doi.org/10.20944/preprints202603.1454.v1

Article

Computer Science and Mathematics

Information Systems

HieMaGT: Hierarchical Multi-Scale Graph Transformer for Brain Disorder Diagnosis

Yutian Qi

Bowen Xun

Abstract: Brain disorder diagnosis using functional magnetic resonance imaging (fMRI) is crucial but challenging, largely due to the brain's complex, multi-scale, and dynamic nature. Existing models often fall short by focusing on single-scale or pairwise connections, or by requiring predefined higher-order interactions. To address these limitations, we propose HieMaGT (Hierarchical Multi-scale Graph Transformer), a novel end-to-end framework designed to adaptively learn dynamic, higher-order, and multi-scale functional connectivity directly from fMRI time series. HieMaGT integrates parallel Multi-scale Graph Transformer layers to capture interactions across various granularities, a Hierarchical Pooling module for progressive feature abstraction, and a Robustness Enhancer based on contrastive learning to ensure stable and generalizable disease biomarkers. Comprehensive experiments on three real-world fMRI datasets for conditions like schizophrenia, Alzheimer's Disease, and various brain states demonstrate that HieMaGT consistently achieves superior diagnostic performance. HieMaGT significantly outperforms state-of-the-art methods, showing substantial improvements across all datasets. These results highlight HieMaGT's advanced capability in leveraging complex brain functional interactions for accurate and robust brain disorder diagnosis.

Posted: 18 March 2026

https://doi.org/10.20944/preprints202603.1463.v1

Article

Computer Science and Mathematics

Information Systems

Mapping Research Trends with the CoLiRa Framework: A Computational Review of Semantic Enrichment of Tabular Data

Luis Omar Colombo-Mendoza

Julieta del Carmen Villalobos-Espinosa

María Elisa Espinosa-Valdés

Elías Beltrán-Naturi

Abstract: This article proposes a novel and replicable computational methodology named CoLiRa (Computational Literature Review & Analysis) Framework to quantitatively analyze and map the evolution of a scientific field. As a multi-stage approach, the CoLiRa Framework first uses Latent Dirichlet Allocation (LDA) to identify core research topics from a body of literature. Second, it applies cluster analysis (K-Means and Multidimensional Scaling) to map the conceptual structure of the field’s key terms. Finally, it uses linear regression analysis to quantitatively assess the development trends of these topics over time. We demonstrate our proposal through a semi-systematic literature review on the semantic enrichment of tabular data, which covers studies (up to 2024) that utilize Semantic Web ontologies, Linked Data, and knowledge graphs. The analysis of this case study revealed three core research topics and found no statistically significant evidence of a shift in topic prevalence, indicating a stable research ecosystem. This work thus offers a validated computational approach for conducting literature reviews and mapping research trends.

Posted: 17 March 2026

https://doi.org/10.20944/preprints202603.1353.v1

Article

Computer Science and Mathematics

Information Systems

Digital Items as Information System Artifacts: Toward a Typology of Valuation-to-Monetization Mechanisms in Southeast Asian Free-to-Play Economies

Laurence Maningo

Abstract: Free-to-play games have become one of the dominant economic models in the digital games industry, especially in mobile-first markets. Although access to these games is free, revenue is generated through digital items, premium currencies, randomized reward systems, seasonal passes, and other monetization mechanisms. Existing studies have examined many of these features separately, but there is still a lack of a design-oriented framework that explains how player-perceived value is systematically converted into monetization. This paper proposes a Design Science Research approach for developing a typology of valuation-to-monetization mechanisms in Southeast Asian free-to-play game economies. The study reframes digital items as information system artifacts whose design shapes both player value and revenue capture. Drawing from literature in information systems, game studies, virtual goods, and monetization research, the paper outlines a three-phase research design consisting of systematic evidence synthesis, typology development, and empirical evaluation through structured observation. The intended contribution is a practical and theoretically grounded typology that helps explain how digital items are designed to generate willingness to spend. The paper also argues that Southeast Asia provides a meaningful regional context because of its mobile-heavy market, hybrid monetization portfolios, and strong presence of free-to-play titles. This preprint presents the conceptual foundation and methodological design of the study.

Posted: 17 March 2026

https://doi.org/10.20944/preprints202603.1286.v1

Article

Computer Science and Mathematics

Information Systems

Intelligent Sensor-Driven Integration Framework for IoT-Enabled Public Transportation Using an Extended CAMS Architecture

Nelson Herrera-Herrera

Estevan Ricardo Gómez-Torres

Abstract: The rapid proliferation of heterogeneous IoT sensor networks in urban public transporta-tion systems generates large volumes of real-time data that are often fragmented across in-dependent platforms, thereby limiting interoperability, scalability, and coordinated intel-ligence. Existing architectures typically treat sensing, edge processing, and artificial intel-ligence as loosely coupled components, lacking unified frameworks that support real-time adaptive decision-making in complex transportation environments. To address this gap, this study proposes a sensor-centric extension of the CAMS architec-ture that integrates semantic sensor interoperability, edge-enabled distributed processing, and embedded AI-driven coordination within a unified framework. The sensor-centric ex-tended CAMS framework introduces a distributed sensor integration layer combined with a native intelligent coordination module that enables real-time multi-sensor fusion and predictive analytics. A functional prototype is evaluated using hybrid real-world and simulated datasets representing vehicle telemetry, infrastructure sensing, and passenger demand across diverse operational scenarios. Experimental results demonstrate signifi-cant improvements in interoperability efficiency, predictive accuracy, scalability, and end-to-end latency compared with conventional centralized architectures. The results indicate that tightly integrating distributed sensing with embedded intelli-gence enhances robustness and scalability in smart transportation ecosystems. The pro-posed architecture provides a practical and extensible foundation for next-generation in-telligent urban mobility systems and advances the integration of IoT sensing and AI-driven decision-making in large-scale cyber–physical environments.

Posted: 11 March 2026

https://doi.org/10.20944/preprints202603.0892.v1

Article

Computer Science and Mathematics

Information Systems

Rethinking Spatial Data Quality: A Socio-Technical and Lifecycle Framework

Tomaž Podobnikar

Abstract: Spatial data quality (SDQ) is commonly assessed through technical verification. However, empirical evidence demonstrates that perceived data quality often diverges from objectively measured quality due to cognitive, institutional, and lifecycle-related factors. This paper proposes a multi-layered SDQ framework that integrates technical admissibility, process and lifecycle stewardship, visual and interpretive diagnostics, and governance indicators to enable holistic quality assessment within a socio-technical system. Rather than treating quality elements in isolation, the framework supports the diagnosis of emergent quality states and associated risk patterns. The framework is demonstrated through two empirical cases: validation of planned land use data using the OPIAvalid toolkit, and semantic conflation of multiple digital elevation models (DEMs) with heterogeneous lineage. Results show that governance failures, specification misuse, and degradation of lineage can undermine trust and decision-making even when datasets formally comply with ISO-based indicators. Visual spatial forensics and lineage-aware integration proved essential for detecting undocumented methodological shortcuts and restoring justified trust in authoritative data. Artificial intelligence is positioned as a diagnostic and explanatory support, assisting in anomaly detection, prioritization, and communication of quality risks, while deterministic validation and expert judgment remain mandatory. Overall, the framework shifts SDQ management from isolated technical validation toward lifecycle-oriented, transparent, and sustainable data governance.

Posted: 11 March 2026

https://doi.org/10.20944/preprints202603.0798.v1

Article

Computer Science and Mathematics

Information Systems

Ontology-Based Validation of Enterprise Architecture Principles

Devid Montecchiari

Abstract: Enterprise architecture (EA) principles provide normative guidance for architectural evolution, yet validating whether EA models comply with such principles is typically performed manually and does not scale to continuous governance. This paper presents an ontology-based validation approach that enables automated compliance checking of ArchiMate models against EA principles. The approach (i) semantically lifts ArchiMate models into RDF/OWL as ontology instances grounded in ArchiMEO, (ii) structures natural-language principles using SBVR Structured English to reduce ambiguity and support traceability, (iii) enriches the resulting knowledge graph with inferred architectural relations through derivation rules, and (iv) operationalizes validation using SHACL constraints and SPARQL queries that produce explainable violation reports linked to concrete model elements. The approach is developed following Design Science Research and evaluated in three case studies (two real-world organizational settings and one controlled educational setting).The evaluation demonstrates that the approach supports repeatable execution of principle checks on evolving models, improves traceability of violations for architecture review and decision-making, and reduces manual effort by shifting substantial parts of compliance checking from human interpretation to automated constraint validation.

Posted: 04 March 2026

https://doi.org/10.20944/preprints202603.0361.v1

Article

Computer Science and Mathematics

Information Systems

Integration of Physical and Probabilistic Measures in Stochastic Measurements of Manufacturing Processes

Artur Zaporozhets

Vitalii Babak

Valerij Zvaritch

Svitlana Kovtun

Yurii Gyzhko

Vladyslav Khaidurov

Vladyslav Verpeta

Abstract: Deterministic and probabilistic models of measured quantities, processes and fields in production process control systems, as well as physical and probabilistic measures, make it possible to form the measurement result and give it the properties of objectivity and reliability. The issue of improving and developing models and measures in measurement methodology plays an increasingly important role in achieving high measurement accuracy in control systems and the reliability of decision-making by expert systems in production processes. The measure is formed by many factors, most of which are random in nature. The stochastic approach in measurement theory is of particular importance in the measurement of physical quantities that are probabilistic in nature and in the construction of decision rules for expert systems. Probabilistic measures play a key role both in the process of measuring physical quantities and in the construction of decision rules when using a stochastic approach. The main idea of the article is to show the peculiarities of the transition from the well-known triad "model → algorithm → program" to a more meaningful methodology "model → measures → algorithm → program" and to show an example of using this approach. The methodology allows to increase the accuracy and reliability of the results obtained from measuring control systems and decision-making by expert systems of production processes. Examples of the use of this approach are considered in this work.

Posted: 02 March 2026

https://doi.org/10.20944/preprints202603.0092.v1

Article

Computer Science and Mathematics

Information Systems

A Federated and Differentially Private Incentive–Marketing Framework for Privacy-Preserving Cross-Channel Measurement in AI-Powered Digital Commerce

Xiongsheng Yi

Abstract: In the U.S. digital economy, small and medium-sized businesses (SMBs) and creators in remote regions face structural disadvantages in access to integrated advertising and incentive platforms, largely due to accelerating privacy regulations and the fragmentation of cross channel datasets. This paper proposes a unified federated and differentially private measurement framework that integrates Topics/Protected Audience, Attribution Reporting, and SKAdNetwork, aiming to achieve privacy-preserving incentive optimization and cross-channel effectiveness measurement for web and mobile environments. The framework prioritizes compliant data usage, resolves data silos across ad ecosystems, and supports privacy-preserving recommendation and incentive allocation. Technically, we design a hybrid architecture that combines federated learning, differential privacy, and low-latency attribution aggregation, while ensuring end-to-end consistency across uplift modeling, multi-touch attribution (MTA), and event-level reporting. Empirical analysis compares the proposed model with state-of-the-art privacy-preserving baselines (e.g., last-touch attribution with DP aggregation), demonstrating substantial gains in accuracy, robustness, and reporting fidelity under strict privacy constraints.

Posted: 28 February 2026

https://doi.org/10.20944/preprints202602.1929.v1

Communication

Computer Science and Mathematics

Information Systems

Shadow AI in Organisations: A Practical Framework for Detection, Risk Classification, and Governance

Ayokunle Ojowa

Monteiro Marques

Antonio Goncalves

Abstract: Shadow AI—the unsanctioned use of artificial intelligence tools, models, or services within organisational processes—introduces governance, security, and privacy risks that extend beyond traditional shadow IT. This communication proposes a practical framework to (i) define and classify shadow AI use cases, (ii) detect shadow AI activity through multi-layer technical signals, and (iii) govern risk through an obligations-to-evidence mapping that supports compliance and auditability. The framework aims to balance innovation and productivity with proportionate controls, offering clear remediation paths (block, replace, or regularise with evidence). We also outline a validation plan based on a PRISMA-informed literature review and triangulation (expert feedback, case studies, and survey) to support subsequent empirical evaluation.

Posted: 28 February 2026

https://doi.org/10.20944/preprints202602.1924.v1

of 36