Computer Science and Mathematics

Review
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Guoxiu He, Jinquan Zheng, Fangqing Han

Abstract: Selection bias in Large Language Models has emerged as a fundamental obstacle to reliability, fairness, and robustness. Defined operationally as systematic decision changes under equivalence-preserving input perturbations, including option permutation, label renaming, candidate-order swapping, and evidence relocation, the phenomenon is examined across four representative task families: multiple-choice question answering, in-context classification, LLM-as-a-Judge evaluation, and long-context or retrieval-augmented generation. Selection bias is first analyzed through a causal chain that links biased behavior to training-data priors, architectural asymmetries, and post-training amplification. Existing mitigation methods are then synthesized through an intervention-level taxonomy spanning inference-time calibration and prompt optimization, architecture-level modification, and training-level debiasing. The evaluation landscape is unified by summarizing commonly used metrics, benchmark families, and application settings, with the lack of standardized and cross-task-comparable protocols identified as a central bottleneck. Selection bias is best understood as a failure of invariance under non-semantic reformatting, and mitigating it is essential for trustworthy, robust, and selection-invariant language models.
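The operational definition above suggests a simple invariance test. The sketch below is a minimal Python illustration rather than a protocol from the surveyed literature: it permutes the options of a multiple-choice item and checks whether the model keeps selecting the same underlying option. The query_model function and the prompt format are hypothetical placeholders.

import itertools

def format_prompt(question, options):
    # Hypothetical prompt format: label options A, B, C, ... in the given order.
    labels = "ABCDEFGH"
    lines = [question] + [f"{labels[i]}. {opt}" for i, opt in enumerate(options)]
    return "\n".join(lines) + "\nAnswer with a single letter."

def permutation_sensitivity(query_model, question, options):
    # query_model(prompt) -> chosen letter; assumed to be supplied by the caller.
    labels = "ABCDEFGH"
    chosen_texts = set()
    for perm in itertools.permutations(options):
        letter = query_model(format_prompt(question, list(perm)))
        idx = labels.index(letter.strip().upper()[0])
        chosen_texts.add(perm[idx])  # map the chosen letter back to the option text
    # A selection-invariant model always picks the same underlying option.
    return len(chosen_texts) == 1, chosen_texts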

Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Thaer Thaher, Alaa Sheta, Huthaifa I. Ashqar, Hamouda Chantar, Salim Surani

Abstract: Background/Objectives: Obstructive sleep apnea (OSA) is a common and serious sleep-related disorder that causes repeated interruptions in breathing during sleep. Traditional diagnostic methods, such as polysomnography, are accurate but costly, time-consuming, and unsuitable for large-scale screening. This study proposes and evaluates a lightweight diagnostic framework based on an Extreme Learning Machine (ELM) optimized by a set of basic and advanced metaheuristic optimizers (GA, RUN, MEO, CL-PSO, HI-WOA, GWO, HGS, HHO, SeaHO, MGO, and the hybrid GWO–WOA). The model aims to improve early detection of OSA using demographic and clinical data. Methods: Two real datasets were employed to train and evaluate the proposed framework: (i) a clinical OSA dataset with 274 subjects and 31 demographic/anthropometric and sleep-related predictors, and (ii) a public strongly imbalanced Sleep-Disordered Breathing (SDB) dataset with 500 subjects and 10 structured predictors. Metaheuristic algorithms are used to optimize ELM weights and biases, addressing the instability of random initialization and improving model generalization. The optimized models are evaluated against eight baseline classifiers, including Logistic Regression (LR), k-nearest neighbours (KNN), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP), XGBoost (XGB), and a standard ELM classifier. Results: Results show that metaheuristic optimization improves ELM on the OSA dataset, increasing ROC-AUC from 0.6527 to about 0.73 and accuracy from 0.6573 to about 0.69–0.70, while on the highly imbalanced SDB dataset, it yields modest ROC-AUC gains (from 0.5132 to about 0.544–0.548) with small decreases in accuracy and F1-score. Conclusions: The proposed framework provides a fast, lightweight, and cost-effective screening tool for large-scale, resource-limited healthcare settings, enabling early OSA detection and preventive intervention.
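To make the ELM component concrete, here is a minimal NumPy sketch under stated assumptions: the sigmoid activation, the hidden-layer width, and the random initialization are illustrative choices, and in the study the hidden weights W and biases b are tuned by the listed metaheuristics rather than left random.

import numpy as np

def elm_fit(X, y, W, b):
    # X: (n_samples, n_features); W: (n_features, n_hidden); b: (n_hidden,)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden-layer outputs
    beta = np.linalg.pinv(H) @ y             # closed-form output weights
    return beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Random hidden parameters give the baseline ELM; a metaheuristic would instead
# search over the entries of W and b to maximize a validation score such as ROC-AUC.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(274, 31)), rng.integers(0, 2, size=274).astype(float)
W, b = rng.normal(size=(31, 50)), rng.normal(size=50)
beta = elm_fit(X, y, W, b)
pred = elm_predict(X, W, b, beta) > 0.5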

Article
Computer Science and Mathematics
Algebra and Number Theory

Ibar Federico Anderson

Abstract: This paper, which is entirely unconditional, proves a sharpened almost-all theorem with fully explicit effective constants for the restricted weighted Goldbach sum R_{a,q}(N) := sum over p1+p2=N, p1 = a (mod q), of (log p1)(log p2), with q >= 1 and gcd(a,q) = 1, whose expected main term is M_{a,q}(N) = C_2 * S(N) * N / phi(q), where C_2 = 0.6601618... is the twin-prime constant and S(N) is the binary singular series. The results are organised around four pillars. (I) A complete character-pair decomposition of the second moment of the error E(N) := R_{a,q}(N) - M_{a,q}(N), extracting the exact diagonal constant G/(2*phi(q)), where G = prod_{p>2}(1 + (p-1)^{-2}) in [1.41320886, 1.41320899] is the Gallagher-Goldston constant. (II) A uniform minor-arc L^4 bound: integral over minor arcs of |S(alpha)|^4 dalpha <= kappa_safe * 2^A * X^3 / (log X)^A, with kappa_safe = 4.40, obtained by combining the complete Vaughan identity with the Bombieri-Vinogradov theorem in integral form, with an explicit derivation of kappa_explicit = C_V^2 * c_{L^2} = 4.004 before applying a rigorous 10% safety margin. (III) The effective almost-all theorem: #{N <= X even : |R_{a,q}(N) - M_{a,q}(N)| > C(A,q) * N * (log N)^{-3}} << X * (log X)^{-A}, with the explicit constant K := 2*C(1,4) <= 3.3624, obtained from C(1,4) <= 1.6812 via a Stechkin-type optimisation. (IV) A Pintz-type exceptional-set bound on {N <= X : R_{a,q}(N) = 0}. Every statement in the main body carries the tag [PROVED]. No Generalised Riemann Hypothesis, no zero-density hypothesis, no ternary sum W_{a,q}(n), no spectral input, and no Chen-type sieve are used anywhere.
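For readability, the central quantities above can be restated in LaTeX; this is a direct transcription of the ASCII formulas in the abstract, not an additional claim:

R_{a,q}(N) \;=\; \sum_{\substack{p_1 + p_2 = N \\ p_1 \equiv a \ (\mathrm{mod}\ q)}} (\log p_1)(\log p_2),
\qquad
M_{a,q}(N) \;=\; C_2\, \mathfrak{S}(N)\, \frac{N}{\varphi(q)}, \qquad C_2 = 0.6601618\ldots,

\int_{\mathfrak{m}} |S(\alpha)|^4 \, d\alpha \;\le\; \kappa_{\mathrm{safe}}\, \frac{2^A X^3}{(\log X)^A}, \qquad \kappa_{\mathrm{safe}} = 4.40,

\#\bigl\{\, N \le X \ \text{even} : |R_{a,q}(N) - M_{a,q}(N)| > C(A,q)\, N (\log N)^{-3} \,\bigr\} \;\ll\; X (\log X)^{-A}.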

Article
Computer Science and Mathematics
Information Systems

Noël Crescenzo, David Arnaud, Peiman Fallahian Sichani, Johan Winther Kristensen, Nikolaos Partarakis, Xenophon Zabulis

Abstract: This article investigates how an e‑learning platform and a virtual reality (VR) workshop simulator can be integrated into a traditional craft apprenticeship without displacing workshop‑based learning. Drawing on the Craeft glassblowing Pilot 1 at CERFAV, it reports a two‑phase mixed‑methods study contrasting a Traditional Augmented (TA) group, which used a Craeft e‑learning platform and a VR glassblowing simulator, with a Traditional (T) control group following the standard Certificate of Professional Competence (CPC) programme. Quantitative data from formative assessments and CPC examination results are combined with qualitative feedback, satisfaction surveys, self‑assessment questionnaires, and interviews with apprentices and trainers. In Phase 1, where digital tools were deployed in a separated mode alongside existing instruction, the e‑learning platform was perceived as pedagogically valuable, but effects on assessment outcomes were limited and uneven, with greater score dispersion in the TA group. In Phase 2, redesigned hybrid usage scenarios assigned distinct and complementary roles to the e‑learning platform, VR, and workshop practice within an iterative learning cycle, yielding more consistent advantages for the TA group in cross‑cutting theoretical subjects and reducing variance in their scores. Qualitative analyses show that apprentices adopt a pragmatic stance towards digital tools, using the e‑learning platform primarily for revision and exam preparation and VR for workshop discovery and tool recognition, while maintaining a strong attachment to material practice. The study concludes that, in small, high‑stakes craft VET programmes, the impact of virtual learning environments depends less on their intrinsic properties than on their orchestration within coherent hybrid designs and on trainers’ capacity to align them with authentic tasks and assessment regimes.

Article
Computer Science and Mathematics
Computer Networks and Communications

Robert Campbell

Abstract: Artificial intelligence (AI) systems increasingly depend on multi-stage supply chains that incorporate pre-trained models, third-party datasets, open-source libraries, and automated pipelines, creating an expanding attack surface in which model poisoning, dependency compromise, and provenance manipulation can undermine integrity before deployment. Existing AI governance frameworks—including the NIST AI Risk Management Framework and Secure Software Development Framework—acknowledge supply chain risks but do not define verifiable model provenance or cryptographically durable integrity guarantees. The transition to post-quantum cryptography (PQC) compounds this gap: classical digital signatures used to verify model lineage, dataset integrity, and pipeline attestation will become vulnerable to quantum-enabled forgery within the operational lifetime of many AI systems. This paper synthesizes evidence from policy, standards, and incident sources to characterize the AI supply chain threat landscape and the cryptographic dependencies that the PQC transition disrupts. It proposes three integrated design-science artifacts: a Model Bill of Materials with PQC-safe extensions (MBOM-PQC) defining a verifiable provenance schema; a unified signing and attestation pipeline integrating ML-DSA and hybrid signature modes; and a five-level Supply Chain Assurance Maturity Model (SCAMM) for repeatable organizational evaluation. These contributions provide a structured foundation for AI supply chain integrity in cloud-connected, mission-critical smart systems, ensuring verifiable lineage, authenticity, and trustworthiness through the PQC transition. Empirical validation is deferred to future work.

Article
Computer Science and Mathematics
Security Systems

Yerlan Tursynbek, Nurtay Albanbay, Djamel Djenouri, Shahid Latif, Ainur Akhmediyarova, Zhibek Alibiyeva, Janna Alimkulova, Dina Oralbekova

Abstract: Federated learning (FL) enables distributed model training in IoT environments while keeping raw data on local devices. However, protecting model-update exchange is difficult on microcontroller-class devices due to strict latency, memory, and energy constraints. Existing studies often evaluate lightweight cryptography outside complete FL pipelines or on more powerful hardware, leaving its practical overhead on MCU-class devices insufficiently explored. This paper presents an end-to-end, hardware-validated secure framework for exchanging model updates in federated learning on resource-constrained IoT microcontrollers. Implemented on ESP32-based edge devices, the framework combines lightweight block ciphers (SPECK, SIMON, and PRESENT), HMAC-SHA256 for integrity verification, and ECDH-HKDF for session-key establishment. The evaluation assessed latency, throughput, RAM/ROM footprint, and energy consumption. Results show that SPECK provides the lowest overhead (0.13 µs/byte, 8.68 MB/s, 138.3 mJ), SIMON offers intermediate performance (0.41 µs/byte, 1.96 MB/s, 184.9 mJ), and PRESENT incurs the highest computational cost (89.37 µs/byte, 0.011 MB/s, 446.2 mJ). In the CICIoT2023 federated intrusion detection evaluation, the secure model maintained stable convergence and achieved 85.43% accuracy after 20 rounds, remaining close to the centralized baseline. These findings demonstrate the practical feasibility of secure model-update exchange in FL on real IoT microcontrollers and provide hardware-grounded guidance for cipher selection under tight resource budgets.
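The host-side half of the key-establishment and integrity chain named above can be sketched with the Python cryptography package. This is a minimal illustration of ECDH-HKDF session-key derivation and HMAC-SHA256 tagging; the P-256 curve and the info label are assumptions, and the SPECK/SIMON/PRESENT payload encryption that runs on the ESP32 devices is omitted because those ciphers are not part of this library.

from cryptography.hazmat.primitives import hashes, hmac
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each party generates an ephemeral EC key pair (curve choice is an assumption).
server_priv = ec.generate_private_key(ec.SECP256R1())
device_priv = ec.generate_private_key(ec.SECP256R1())

# ECDH: both sides derive the same shared secret from the peer's public key.
shared = server_priv.exchange(ec.ECDH(), device_priv.public_key())

# HKDF expands the raw shared secret into a fixed-length session key.
session_key = HKDF(algorithm=hashes.SHA256(), length=32,
                   salt=None, info=b"fl-model-update").derive(shared)

# HMAC-SHA256 tag over the (already encrypted) model-update bytes.
ciphertext = b"...encrypted model update..."   # placeholder payload
tag = hmac.HMAC(session_key, hashes.SHA256())
tag.update(ciphertext)
mac = tag.finalize()                           # transmitted alongside the ciphertext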

Article
Computer Science and Mathematics
Analysis

Helal Mohamed, Mohammed Rabih

Abstract: This research explores the existence of solutions for a class of random fractional differential equations characterized by bounded delay, specifically within the context of Fréchet spaces. By integrating the properties of noncompactness measures with a generalized Darbo fixed point approach, we establish existence results for the associated Darboux problem. To illustrate the practical utility of these analytical results, a representative example is provided.

Article
Computer Science and Mathematics
Software

Elton Boshnjaku, Galia Marinova, Edmond Hajrizi, Besnik Qehaja

Abstract: Smart microgrids combining photovoltaic arrays, wind turbines, and battery storage generate telemetry that existing open-source monitoring tools cannot process with per-mechanism energy loss visibility in real time. This paper presents the design, implementation, and evaluation of an open-source IoT Monitoring Framework. The framework incorporates a physics-based microgrid simulator, a hierarchical MQTT communication architecture, and a React-based web user interface that supports WebSocket-based real-time data visualization. The open-source framework consists of twelve containerized microservices that can be started with a single command: docker compose up -d. The code has been released under the permissive MIT license. All stack performance testing was conducted using a simulated 1-hour test case based on a campus microgrid with a 100 kWp PV system, a 10 kW wind turbine, and a 50 kWh battery. Average P50 end-to-end latency was 27.2 ms and P99 end-to-end latency was 48.3 ms with 100% message delivery across 5,840 test messages, with per-topic analysis revealing a 25 ms serialization-order effect in sequential MQTT publishing. Comparative analysis against ten existing platforms including OpenEMS, VOLTTRON, Eclipse Ditto, and pymgrid confirms that no prior open-source framework unifies physics-based loss telemetry, IoT communication, time-series storage, and real-time visualization in a single reproducible deployment.

Article
Computer Science and Mathematics
Other

Huayou Si, Mengyang Li, Yuanyuan Qi, Neal N. Xiong, Wei Chen, Loc Nguyen The, Shichong Wang

Abstract: This paper proposes a decentralized data trading approach based on the Automated Market Maker (AMM) mechanism, aiming to overcome bottlenecks in data trading efficiency and fairness by combining market-oriented pricing mechanisms with automated trading processes. We focus on constructing an automatic pricing and matching mechanism based on liquidity pools. Subsequently, mathematical modeling and simulations reveal slippage generation mechanisms in data liquidity pools under trading shocks and imbalances. To address these issues, a novel dual slippage optimization mechanism integrating dynamic trade splitting and alternating order sorting is proposed, which decomposes orders into sub-orders and reorganizes their timing, establishing a dynamic equilibrium model. Experiments show the method reduces average slippage amplitude from 2.1% to 0.5%, a 76.2% reduction, significantly enhancing price stability and market efficiency. This research explores the mechanism of applying AMM to data asset trading and overcomes AMM's limitations, providing a theoretical and empirical foundation for building low-cost, high-fairness data trading systems through mechanism innovation and technical optimization.
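To make the slippage notion concrete, the sketch below computes the execution-price deviation of a single large swap against a liquidity pool. The constant-product (x·y = k) curve is an assumption for illustration, since the abstract does not fix the AMM invariant, and the paper's mechanism additionally splits orders and interleaves counter-directional sub-orders rather than executing one block trade.

def constant_product_swap(x_reserve, y_reserve, dx):
    # Swap dx of asset X into the pool; the invariant x * y = k is preserved.
    k = x_reserve * y_reserve
    new_x = x_reserve + dx
    dy = y_reserve - k / new_x            # amount of Y paid out
    spot_price = y_reserve / x_reserve    # marginal price before the trade
    exec_price = dy / dx                  # realized average price
    slippage = (spot_price - exec_price) / spot_price
    return dy, slippage

# A trade worth 5% of the X reserve in a 1,000,000 : 1,000,000 data-token pool.
dy, slip = constant_product_swap(1_000_000, 1_000_000, 50_000)
print(f"received {dy:.0f} Y, slippage {slip:.2%}")   # roughly 4.8% slippage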

Article
Computer Science and Mathematics
Information Systems

Antonio Rocca, Luigi Laura, Marco Parrillo

Abstract: The generation of synthetic seismic accelerograms is a critical problem in earthquake engineering, where the scarcity of strong-motion records, particularly for high-magnitude and near-fault scenarios, limits the reliability of structural analyses and probabilistic seismic hazard assessments. This paper proposes a wavelet-decomposed conditional Generative Adversarial Network (WD-cGAN) for the synthesis of seismic accelerograms that faithfully reproduce the physical and statistical properties of real ground-motion records. Unlike prior GAN-based approaches that rely on Fourier-domain decomposition, the proposed architecture decomposes each training signal into N wavelet sub-bands (experimentally N ∈ {5, 6}) using the Daubechies-4 (db4) discrete wavelet transform (DWT), assigning each sub-band to a dedicated discriminator. A novel energy-based weighting scheme α_i modulates the relative contribution of each discriminator to the total generator loss, ensuring that physically dominant, low-frequency bands, which carry the bulk of seismic energy, receive proportionally higher training emphasis. Seismic moment magnitude Mw serves as the primary conditioning variable, enabling targeted synthesis for specific hazard scenarios. The model is implemented in Python using PyTorch and trained on accelerograms drawn from the Italian INGV/ITACA v4.0 archive. Qualitative evaluation confirms that the proposed wavelet-domain multi-discriminator scheme improves the realism and physical consistency of synthetic accelerograms relative to a single-discriminator baseline; full quantitative validation on a larger corpus is identified as the principal avenue for future work.
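The wavelet decomposition and energy-based discriminator weighting can be illustrated with PyWavelets. Splitting a signal into six sub-bands via a level-5 db4 DWT and normalizing sub-band energies to obtain the weights α_i is one plausible reading of the scheme described above, not the authors' exact implementation.

import numpy as np
import pywt

def subbands_and_weights(signal, wavelet="db4", level=5):
    # wavedec returns [cA_L, cD_L, ..., cD_1]: level + 1 sub-bands (here N = 6).
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    alphas = energies / energies.sum()    # energy-based weights, summing to 1
    return coeffs, alphas

# Synthetic stand-in for an accelerogram (the real data come from INGV/ITACA).
t = np.linspace(0, 20, 4000)
acc = np.sin(2 * np.pi * 1.5 * t) * np.exp(-0.2 * t) + 0.05 * np.random.randn(t.size)
coeffs, alphas = subbands_and_weights(acc)
# Each coeffs[i] would feed a dedicated discriminator D_i whose loss is scaled by
# alphas[i]; low-frequency bands carry most energy and receive the largest weights.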

Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Robert Campbell

Abstract: Anthropic’s April 2026 Claude Mythos Preview release established a new operational category of frontier AI systems—Mythos-class—whose capability profile (extended-context reasoning over codebases, recursive self-correction, native system-tool integration, and agentic scaffolding at deployable scale) renders the dominant AI safety paradigms insufficient as sole controls. Reinforcement learning from human feedback, post-generation output filtering, contractual access vetting, and human-in-the-loop supervision were each calibrated to a generation of systems that did not exhibit autonomous cyber capability at the levels Mythos-class systems now demonstrate, and each is insufficient as a sole control against the new category under the threat assumptions specified here. This paper develops a defense-in-depth reference architecture for detecting and mitigating Mythos-class capability across enterprise and federal deployment surfaces. Detection is structured as a three-tier framework spanning pre-deployment evaluation, deployment-time access and telemetry, and runtime behavioral signatures. Mitigation is structured as four concentric layers: governance, cryptographic enforcement, architectural isolation, and operational monitoring. The cryptographic enforcement layer specifies an authority-binding architecture using post-quantum-attested provenance to bind output release to a verifiable authority chain. The architecture is mapped to the NIST AI Risk Management Framework, the NIST Cybersecurity Framework (CSF) 2.0, and the CISA Zero Trust Maturity Model, and is demonstrated against three application cases: post-quantum cryptography migration, federal AI supply-chain assurance, and critical-infrastructure operational technology defense. Limitations and a research agenda for empirical calibration are stated explicitly.

Article
Computer Science and Mathematics
Applied Mathematics

Bichitra Kumar Lenka

Abstract: We develop a constructive fractional Lyapunov direct method for Caputo-type incommensurate non-autonomous fractional-order systems whose orders lie in (0,1]. We prove new fractional Lyapunov theorems using a fractional generalized Gronwall inequality and establish strengthened versions of Lyapunov theorems that give sufficient conditions for the stability of equilibrium points of such systems. We demonstrate the significance of the method with five mathematical examples from stability theory.
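For reference, the standard Caputo derivative and the incommensurate system class referred to above take the following form (standard definitions; the case q = 1 is understood as the ordinary derivative):

{}^{C}D^{q}_{t_0,t}\, x(t) \;=\; \frac{1}{\Gamma(1-q)} \int_{t_0}^{t} (t-s)^{-q}\, x'(s)\, ds, \qquad 0 < q < 1,

{}^{C}D^{q_i}_{t_0,t}\, x_i(t) \;=\; f_i\bigl(t, x_1(t), \ldots, x_n(t)\bigr), \qquad q_i \in (0,1], \quad i = 1, \ldots, n.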

Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

G. H. B. A. de Silva

Abstract: Artificial Intelligence (AI) systems are increasingly embedded in development contexts across the Global South, yet limited evidence explains how individuals within marginalized communities behaviorally adapt to these technologies beyond structural access and governance conditions. Building on prior framework-based analysis, this study examines the micro-level processes through which users internalize and operationalize AI-enabled systems in everyday livelihood and learning activities. A mixed-method sequential explanatory design was employed using the same population across urban, peri-urban, and rural settings, integrating structured surveys with ethnographic observations, digital usage tracing, and behavioral mapping. The findings identify three dominant adaptation pathways: instrumental adoption driven by efficiency gains, socially negotiated use shaped by contextual constraints, and reflexive adaptation linked to learning and trust formation. Quantitative analysis indicates that user agency significantly mediates the relationship between access and effective utilization, while qualitative insights reveal that learning styles and socio-cultural conditions influence the depth and sustainability of engagement. The study concludes that inclusive AI outcomes depend not only on infrastructure and governance but also on dynamic human–technology interactions, where cognitive engagement and iterative feedback mechanisms play a central role. These findings extend existing models by introducing a behavioral adaptation dimension critical for designing context-sensitive and sustainable AI interventions.

Review
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Yiwen Zhu, Lihe Liu, Jiaqian Yu, Di Zhang

Abstract: The proliferation of large language model (LLM) agents has enabled increasingly complex multi-step automation; however, composing multiple agents into coherent systems introduces significant orchestration challenges that remain poorly documented. This survey examines LLM-based multi-agent orchestration from 2023 through early 2026 (literature cutoff: March 2026). We propose a three-topology, one-adaptivity taxonomy—centralized, decentralized, and hierarchical coordination topologies, each optionally augmented with a dynamic/adaptive control axis—grounded in classical multi-agent systems theory and recent empirical evidence. We compare four leading frameworks (LangGraph, CrewAI, AutoGen/Microsoft Agent Framework, and OpenAI Agents SDK) along axes directly relevant to practitioners: state-management granularity, token cost structure, failure-recovery options, and design philosophy. The emerging protocol stack is examined in terms of why MCP (agent-to-tool) and A2A (agent-to-agent) occupy complementary layers, how the ACP–A2A merger signals protocol convergence, and where ANP's decentralized-discovery design fits. Production design considerations—state management, task planning, error handling, scalability, and security—are evaluated with reference to published benchmarks. We close by identifying five open challenges and proposing a six-dimension evaluation framework for multi-agent coordination quality. This paper provides practitioners with a decision framework spanning taxonomy, framework selection, protocol adoption, and production deployment.

Concept Paper
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Gabriel Axel Montes

Abstract: AGI is often framed as a problem of aligning model objectives with human values or constraining agent behavior. That framing becomes incomplete once AI systems move into the infrastructures through which people and institutions perceive, evaluate, remember, and decide. Cognitive integrity is introduced as the first infrastructure of intelligence, in humans and AGI-mediated systems alike: the evolving capacity of a bounded system to maintain calibrated attention, trust, contestability, and decision under pressure. The central risk is not boundary change as such, but maladaptive boundary reorganization: transitions that leave persons or institutions unable to reform a viable, reality-linked, self-directing boundary after coupling with AI. This reframing surfaces a conceptual vocabulary for AGI governance centered on integrity boundaries and health, failed reintegration, cognitive rails, and successor-safe continuity.

Review
Computer Science and Mathematics
Computer Science

Matthew P. Dube, Brendan P. Hall, T. Tyler Thibeau

Abstract: The big data revolution transformed how we think of data analytics in many ways. Critical amongst them are the somewhat interconnected ideas of volunteered geographic information, crowdsourcing, and the big data property of variety. The robust literature concerning conceptual neighborhood graphs in two of these cases considers objects whose datatypes are held stable between the relations under consideration. This, however, is a limiting factor in these three application spaces due to the unknown form that data will take. This paper considers two avenues for the conceptual neighborhood graph to take as directions for future research: discretization conceptual neighborhood graphs (changing between corresponding vector and raster spaces) and cartographic generalization conceptual neighborhood graphs (changing the form of the objects in question). This paper provides insights into what should be considered when embarking upon this idea and demonstrates these concepts applied to prior conceptual neighborhood graphs.

Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Ali Kia, Batuhan Uzunoglu, Silvana Andreescu, Masudul H Imtiaz

Abstract: Wearable electrochemical biosensors often produce voltammetric signals that are corrupted by noise and long-term drift. Effective on-device denoising is critical to improve signal quality and detect anomalies due to sensor drift or interference. This paper explores lightweight TinyML models for denoising and drift detection in wearable sensor voltammograms under the strict memory constraints of microcontrollers. We apply compact 1D convolutional and dense autoencoder networks, as well as a PCA-based reconstruction, to remove noise and identify drifting signals. Using a public NIST dataset of cyclic voltammograms with added synthetic noise and artifacts, we evaluate each model's denoising performance (signal reconstruction MSE) and drift/anomaly detection capability (ROC-AUC) versus its memory footprint (quantized int8 model size). Results show that a small Conv1D autoencoder (8 KB weights) can reduce noise by 75% and achieve 0.89 AUC for drift detection, approaching the performance of a larger dense autoencoder (35 KB) and outperforming PCA. We observe a trade-off between model size and generalization: the larger autoencoder nearly perfectly flagged anomalies (AUC 1.0) but smaller models remain competitive while using 4–6× less memory. These findings demonstrate that drift-resilient signal enhancement can be achieved on-device with minimal resource usage, enabling more robust wearable electrochemical sensing.
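A minimal PyTorch sketch of the kind of compact Conv1D autoencoder described above follows. The filter counts, kernel sizes, signal length, and the reconstruction-error anomaly score are assumptions chosen to keep the weights within a few kilobytes, not the exact architecture used in the paper.

import torch
import torch.nn as nn

class TinyConv1dAE(nn.Module):
    # Encoder/decoder kept to a handful of small filters so that int8-quantized
    # weights fit comfortably in MCU flash (on the order of a few KB).
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(8, 4, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(4, 8, kernel_size=5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, 1, kernel_size=5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x):               # x: (batch, 1, n_points)
        return self.decoder(self.encoder(x))

model = TinyConv1dAE()
voltammogram = torch.randn(16, 1, 256)                  # synthetic stand-in batch
recon = model(voltammogram)
mse = ((recon - voltammogram) ** 2).mean(dim=(1, 2))    # per-sample reconstruction error
# Denoising: train with MSE against clean signals; drift/anomaly detection:
# flag samples whose reconstruction error exceeds a threshold set on clean data.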

Article
Computer Science and Mathematics
Analysis

Minghua Shi, Jianbing Su, Kang Wang

Abstract: This paper investigates weighted composition-differentiation operators acting between Bers-type spaces defined on generalized Hua domains of the first kind. By establishing a key norm inequality for functions in these spaces, we derive necessary and sufficient conditions for the boundedness and compactness of such operators.
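For orientation, in the one-variable model case the operator and norm involved are usually written as below; the paper works instead on generalized Hua domains of the first kind, where the weight is adapted to the domain, so these formulas only indicate the general shape of the objects.

(D_{\psi,\varphi} f)(z) \;=\; \psi(z)\, f'(\varphi(z)),
\qquad
\|f\|_{\mathcal{A}^{-p}} \;=\; \sup_{z} \bigl(1 - |z|^2\bigr)^{p} |f(z)|.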

Article
Computer Science and Mathematics
Computer Vision and Graphics

Zichen Zhang, Chengjun Guo

Abstract: Existing vision-based defect detection algorithms built upon YOLOv11 often exhibit unstable performance in complex building environments, where varying illumination conditions and partial occlusions caused by debris or vegetation can severely degrade detection accuracy. More importantly, most existing methods rely solely on visual features while neglecting domain-specific prior knowledge from civil engineering, particularly the geometric continuity of structural damages and the physical stress distribution around defect regions. As a result, these approaches remain vulnerable to background interference, show limited capability in extracting features of small-scale defects, and may generate detections that are inconsistent with the actual physical characteristics of structures. To overcome these limitations, this paper proposes an enhanced detection framework, termed PIA-YOLO, which integrates a Physical Information Attention (PIA) module and a Residual Efficient Channel Attention (RECA) module as dual attention branches. Specifically, the PIA module incorporates civil engineering priors by embedding physically inspired gradient operators into the attention mechanism, rather than directly solving physical equations, thereby enhancing structural feature perception and suppressing physically unreasonable detections. Meanwhile, the RECA module adaptively recalibrates channel-wise feature responses through learnable residual coefficients, enabling more effective representation of subtle defects such as cracks and spalling that are characterized by small targets and weak pixel contrast. Extensive experiments on both public datasets and a self-built crack dataset demonstrate the effectiveness of the proposed method. Compared with the baseline YOLOv11, PIA-YOLO improves mAP@0.5 by 2.2% and 15.9%, respectively, while increasing recall by 4.6% and 34.0%, without significantly sacrificing inference speed or increasing computational cost. These results indicate that PIA-YOLO provides an efficient and accurate solution for intelligent building defect detection, with promising applications in structural inspection, environmental monitoring, traffic infrastructure management, and post-disaster assessment.
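One plausible reading of "embedding physically inspired gradient operators into the attention mechanism" is sketched below in PyTorch: fixed Sobel kernels produce a gradient-magnitude map that gates the feature maps spatially. The kernel choice and gating form are assumptions for illustration, not the authors' PIA module.

import torch
import torch.nn.functional as F

def gradient_gated_attention(feat):
    # feat: (batch, channels, H, W). Fixed Sobel kernels approximate spatial
    # gradients of the channel-averaged feature map.
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    sobel_y = sobel_x.transpose(2, 3)
    mean_map = feat.mean(dim=1, keepdim=True)                 # (batch, 1, H, W)
    gx = F.conv2d(mean_map, sobel_x, padding=1)
    gy = F.conv2d(mean_map, sobel_y, padding=1)
    grad_mag = torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)
    attn = torch.sigmoid(grad_mag)                            # spatial attention in (0, 1)
    return feat * attn                                        # emphasize edge/crack-like structure

x = torch.randn(2, 64, 80, 80)        # feature map of hypothetical shape
out = gradient_gated_attention(x)     # same shape, spatially reweighted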

Review
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Numan Saeed, Salma Hassan, Shadab Khan, Mohammad Areeb Qazi, Klaus H. Maier-Hein, Salman Khan, Mohammad Yaqub

Abstract: Clinical care is interventional. Physicians must decide how a patient's trajectory is likely to change under competing actions, not only estimate risk under the status quo. Most deployed medical artificial intelligence, however, remains optimized for classification or passive forecasting. We argue that the useful next abstraction is the medical world model, a learned system that represents patient state, models how that state evolves over time, accepts interventions such as drugs, doses, and procedures, and rolls trajectories forward under those interventions. Progress toward this goal is currently fragmented across digital twins, disease-trajectory models, surgical simulators, and generative electronic health record forecasting, with each community addressing a subset of the necessary ingredients. We organize the field with a capability ladder spanning representation, forecasting, single-arm projection, comparative treatment evaluation, and planning. Across imaging, physiology, longitudinal electronic health records, and surgical simulation, a consistent maturity pattern emerges. Representation and forecasting are widespread, narrow treatment-conditioned simulators are appearing, credible counterfactual comparison remains scarce, and validated treatment planners are absent. Once a model simulates what would happen under alternative treatments, causal validity becomes the binding constraint. Scaling data and generative modeling alone will not solve this. Credible medical world models also require explicit action definitions, causal design, and staged clinical validation with regulatory oversight. In this paper, the medical world model is a claims-to-evidence framework for simulation that can inform clinical decisions.
