Computer Science and Mathematics

Article
Computer Science and Mathematics
Robotics

Linchao Zhang, Lei Hang, Keke Zu, Yi Wang

Abstract: Environmental factors and electronic interference often disrupt communication between UAV swarms and ground control centers, requiring UAVs to complete missions autonomously in offline conditions. However, current coordination schemes for UAV swarms heavily depend on ground control, lacking robust mechanisms for offline task allocation and coordination, which compromises efficiency and security in disconnected settings. This limitation is especially critical for complex missions, such as rescue or attack operations, underscoring the need for a solution that ensures both mission continuity and communication security. To address these challenges, this paper proposes an offline task coordination algorithm based on blockchain smart contracts. This algorithm integrates task allocation, resource scheduling, and coordination strategies directly into smart contracts, allowing UAV swarms to autonomously make decisions and coordinate tasks while offline. Experimental simulations confirm that the proposed algorithm effectively coordinates tasks and maintains communication security in offline states, significantly enhancing the swarm’s autonomous performance in complex, dynamic scenarios.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Satendra Ch Pandey, Vasanthi Kumari Kumari P

Abstract: Efficient toll processing is essential for reducing traffic congestion and enhancing transportation network operations at toll stations. This study examines the Neelamangala Toll Plaza on India's National Highway 48, focusing on the potential of artificial intelligence (AI) to optimize toll processing. A detailed case study of the toll plaza was conducted, with machine learning algorithms used to analyze data and predict traffic patterns as vehicles approached the toll station. The system integrated AI models—specifically, a Supervised Learning (SL) time series model for traffic prediction and Reinforcement Learning (RL) based on a Markov Decision Process (MDP)—alongside a randomized algorithm to dynamically adjust to real-time traffic conditions. The randomized algorithm facilitated equitable task distribution, preventing system overload during peak hours. System performance was assessed using key metrics: Average Processing Time (APT), Queue Length Reduction (QLR), and Throughput (TP), which measured the system's ability to manage high traffic volumes and mitigate congestion. The AI-powered model demonstrated significant improvements in processing times, queue length reduction, and overall vehicle flow, outperforming traditional methods in both speed and scalability. AI-driven toll management techniques reduced processing times by approximately 35%, decreased queue lengths by 28%, and increased throughput by 40% compared to traditional toll processing systems. These findings suggest a robust, adaptive solution for modern toll systems, with broader implications for efficient and sustainable transportation infrastructure.
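The randomized, overload-preventing task distribution described above can be sketched in a few lines. The "power of two choices" heuristic used here, along with the booth counts, is an illustrative stand-in, not the paper's actual algorithm:

```python
import random

def assign_booth(loads, rng=random.Random(0)):
    """Pick two booths at random and send the vehicle to the less loaded one
    (the 'power of two choices' heuristic), a simple randomized strategy that
    keeps booth queues balanced without central coordination."""
    i, j = rng.sample(range(len(loads)), 2)
    booth = i if loads[i] <= loads[j] else j
    loads[booth] += 1
    return booth

loads = [0, 0, 0, 0]          # vehicles assigned to each of 4 booths
for _ in range(1000):
    assign_booth(loads)
print(loads)                  # counts stay close to balanced
```

Sampling two booths and choosing the lighter one is a classic way to get near-balanced loads with very little state, which matches the abstract's goal of equitable distribution during peaks.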
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Kehan Xu, Kun Zhang, Jingyuan Li, Wei Huang, Yuanzhuo Wang

Abstract: The Retrieval-Augmented Generation (RAG) framework enhances Large Language Models (LLMs) by retrieving relevant knowledge to broaden their knowledge boundaries and mitigate factual hallucinations stemming from knowledge gaps. However, the RAG framework faces challenges in effective knowledge retrieval and utilization; invalid or misused knowledge interferes with LLM generation, reducing reasoning efficiency and answer quality. Existing RAG methods address these issues by decomposing and expanding queries, introducing special knowledge structures, and using reasoning-process evaluation and feedback. However, their linear reasoning structures limit complex thought transformations and reasoning over intricate queries. Additionally, knowledge retrieval and utilization are decoupled from reasoning and answer generation, hindering effective knowledge support during answer generation. To address these limitations, we propose the CRP-RAG framework, which employs reasoning graphs to model complex query reasoning processes more comprehensively and accurately. CRP-RAG guides knowledge retrieval, aggregation, and evaluation through reasoning graphs, dynamically adjusting the reasoning path based on evaluation results and selecting knowledge-sufficient paths for answer generation. Experimental results show that CRP-RAG significantly outperforms state-of-the-art LLM and RAG baselines on three reasoning and question answering tasks, demonstrating superior factual consistency and robustness compared to existing RAG methods.
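The idea of selecting a knowledge-sufficient path through a reasoning graph can be illustrated with a toy example. The max-min "sufficiency" criterion, the graph shape, and the scores below are illustrative assumptions, not CRP-RAG's actual scoring:

```python
# Toy reasoning graph: each node holds a knowledge-sufficiency score in [0, 1].
# Path selection is approximated as: among root-to-leaf paths, keep the one
# whose weakest node is strongest (max-min sufficiency).
graph = {"q": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
score = {"q": 1.0, "a": 0.9, "b": 0.4, "c": 0.8}

def best_path(node):
    """Return (path, sufficiency) of the strongest root-to-leaf path from node."""
    if not graph[node]:
        return [node], score[node]
    candidates = []
    for child in graph[node]:
        path, suff = best_path(child)
        candidates.append(([node] + path, min(score[node], suff)))
    return max(candidates, key=lambda c: c[1])

path, sufficiency = best_path("q")
print(path, sufficiency)  # ['q', 'a', 'c'] 0.8
```

The weak branch through "b" (score 0.4) is pruned from answer generation, mirroring the abstract's point that evaluation results steer the reasoning path toward well-supported knowledge.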
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Yijin Wei, Jingchao Fan

Abstract: Increased wheat production is crucial for addressing food security concerns caused by limited resources, extreme weather, and population expansion. However, breeders face challenges due to fragmented information scattered across multiple research articles, which slows progress in generating high-yield, stress-resistant, and high-quality wheat. This study presents WGIE (Wheat Germplasm Information Extraction), a data extraction workflow specific to wheat research article abstracts, based on conversational large language models (LLMs) and prompt engineering. WGIE employs zero-shot learning, multi-response polling to reduce hallucinations, and a calibration component to ensure optimal outcomes. Validation on 443 abstracts yielded 0.8010 precision, 0.9969 recall, 0.8883 F1 score, and 0.8171 accuracy, demonstrating the ability to extract data with little human effort. Analysis found that irrelevant text increases the chance of hallucinations, emphasizing the necessity of matching prompts to the input language. While WGIE efficiently harvests wheat germplasm information, its effectiveness depends on the consistency between prompts and text. Resolving such conflicts and enhancing prompt design can further improve LLM performance on downstream tasks.
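The validation metrics reported above follow the standard confusion-matrix definitions. The counts below are hypothetical, chosen only to show the arithmetic, and are not the paper's actual extraction tallies:

```python
# Hypothetical confusion-matrix counts for an extraction run.
tp, fp, fn, tn = 80, 20, 1, 10

precision = tp / (tp + fp)                        # correct extractions / all extractions
recall = tp / (tp + fn)                           # correct extractions / all true items
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(round(precision, 4), round(recall, 4), round(f1, 4), round(accuracy, 4))
```

A high recall with lower precision, as in the reported 0.9969 vs. 0.8010, indicates the workflow rarely misses germplasm mentions but over-extracts, which is consistent with the paper's observation that irrelevant text induces hallucinated outputs.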

Article
Computer Science and Mathematics
Information Systems

Shoumen Datta

Abstract: Laissez-faire interpretation of what constitutes a digital twin may catalyze a broader diffusion of the principles (ideas) and perhaps even accelerate adoption of digital representations of physical entities, albeit in select parts of the affluent world (nations with a significant amount of disposable income, per capita). The limits of efficiency and efficacy of digital proxies will affect the value of actionable (bidirectional) information which may be extracted/shared/exchanged from data and analytics (contextually connected causal relationships, Figure 33). Applications are easier in the mechanical context (manufacturing, automotive, buildings). Digital duplicates of natural systems (environment, health, agriculture) are beguiling. Representation in the form of “twins” suggests exact/identical twinning (of data) which may be difficult to duplicate between the physical and digital. Hence, digital cousins of tiny sub-segments of systems may be useful if we grasp the science of the data and avoid the less understood cognitive processes (cognition refers to the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses). If parameters are well understood (e.g., causality), if the acquired data is rigorous, mathematically robust (e.g., proportionality, rate, ratio) and informative (e.g., blood glucose levels and type II diabetes mellitus), then digital cousins may be less irrational as an aspirational goal. Directly or indirectly, knowingly or unknowingly, in astronomical events or in infinitesimal instances, all tools, technologies and techniques (e.g., statistical, operations research [OR], mathematical) converge to catalyze our need to be data-informed, to make sense of data before the value of the data perishes, and extract actionable information (e.g., process optimization in OR).
At the core of almost any system with a popular “buzz” (digital twins, internet of things, cyberphysical systems, cloud, machine learning, smart cities, “big data”, “DL”, “AI”, “Industry X.O”) we commence with data to extract meaningful information of value. Relevant semantics or “meaning” must arise from the anastomosis of causality with context as well as metrics and measurements. Value is related to “performance” depending on the context and actions (feed-back, feed-forward) which could become a highly complex decision process (e.g., explosion of state space when synthesizing or analyzing data from percepts, environment, actuators, and sensors, referred to as PEAS, the superset of the OODA loop: the cycle of observe-orient-decide-act). The underlying glue that permeates the fabric of continuum between meaning and value is causality. Almost every “thing” (made of atoms) or process or system we dissect, deconstruct and reconstruct is made significant when and if associated with data (bits). The continuum of meaning and value is in dynamic interaction with the continuum between atoms and bits. The constructs of this multi-string, multi-dimensional continuum are analytics, connectivity, data and context (ACDC). In this chapter, we explore examples of this “electricity” which powers the engines of science, decision science, and data-informed systems across a broad and diverse spectrum of verticals and applications. However, the economics of technology could make or break digital representation. It may remain prohibitive for decades, if not centuries, in resource-constrained communities, which comprise ~80% of the global population of ~8 billion. Therefore, one must ask: how suitable are digital twins?
Article
Computer Science and Mathematics
Other

Kunguang Wu, Yucong Duan

Abstract: We propose the DIKWP-TRIZ framework, an innovative extension of the traditional Theory of Inventive Problem Solving (TRIZ) designed to address the complexities of cognitive processes and artificial consciousness. By integrating the elements of Data, Information, Knowledge, Wisdom, and Purpose (DIKWP) into the TRIZ methodology, the proposed framework emphasizes a value-oriented approach to innovation, enhancing the ability to tackle problems characterized by incompleteness, inconsistency, and imprecision. Through a systematic mapping of TRIZ principles to DIKWP transformations, we identify potential overlaps and redundancies, providing a refined set of guidelines that optimize the application of TRIZ principles in complex scenarios. The study further demonstrates the framework’s capacity to support advanced decision-making and cognitive processes, paving the way for the development of AI systems capable of sophisticated, human-like reasoning. Future research will focus on comparing the implementation paths of DIKWP-TRIZ and traditional TRIZ, analyzing the complexities inherent in DIKWP-TRIZ-based innovation, and exploring its potential in constructing artificial consciousness systems.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Muhammad Zahid Iqbal, Nitish Garg, Saad Bin Ahmed

Abstract: In recent years, significant progress has been achieved in understanding and processing tabular data. However, existing approaches often rely on task-specific features and model architectures, posing challenges in accurately extracting table structures amidst diverse layouts, styles, and noise contamination. This study introduces a comprehensive deep learning methodology tailored for precise identification and extraction of rows and columns from document images containing tables. The proposed model employs table detection and structure recognition to delineate table and column areas, followed by semantic rule-based approaches for row extraction within tabular sub-regions. Evaluation on the publicly available Marmot table dataset demonstrates state-of-the-art performance. Additionally, transfer learning using VGG-19 is employed to fine-tune the model, further enhancing its capability. Furthermore, this work fills a void in the Marmot dataset by providing additional table-structure annotations, expanding its scope to encompass column detection in addition to table identification.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Carl Chalmers, Paul Fergus, Serge Wich, Steven N Longmore, Naomi Davies Walsh, Lee Oliver, James Warrington, Julieanne Quinlan, Katie Appleby

Abstract: Effective monitoring of wildlife is critical for assessing biodiversity and ecosystem health, as declines in key species often signal significant environmental changes. Birds, particularly ground-nesting species, serve as important ecological indicators due to their sensitivity to environmental pressures. Camera traps have become indispensable tools for monitoring nesting bird populations, enabling data collection across diverse habitats. However, the manual processing and analysis of such data are resource-intensive, often delaying the delivery of actionable conservation insights. This study presents an AI-driven approach for real-time species detection, focusing on the curlew (Numenius arquata), a ground-nesting bird experiencing significant population declines. A custom-trained YOLOv10 model was developed to detect and classify curlews and their chicks using 3/4G-enabled cameras linked to the Conservation AI platform. The system processes camera trap data in real-time, significantly enhancing monitoring efficiency. Across 11 nesting sites in Wales, the model achieved high performance, with a sensitivity of 90.56%, specificity of 100%, and F1-score of 95.05% for curlew detections, and a sensitivity of 92.35%, specificity of 100%, and F1-score of 96.03% for curlew chick detections. These results demonstrate the capability of AI-driven monitoring systems to deliver accurate, timely data for biodiversity assessments, facilitating early conservation interventions and advancing the use of technology in ecological research.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Ethan Parker, Nia Harper, Jannat Roy

Abstract: Multimodal Conversation is a sophisticated vision-language task where an AI agent must engage in meaningful dialogues grounded in visual content. This requires a deep understanding of not only the presented question but also the dialog history and the associated image context. However, existing methods primarily focus on single-hop or single-path reasoning, which often fall short in capturing the nuanced multimodal relationships essential for generating accurate and contextually relevant responses. In this paper, we propose a novel and powerful model, the Multi-path Contextual Reasoning Model (MC-Net), which employs multi-path reasoning and multi-hop mechanisms to process complex multimodal information comprehensively. MC-Net integrates dialog history and image context in parallel, iteratively enriching the semantic representation of the input question through both paths. Specifically, MC-Net adopts a multi-path framework to simultaneously derive question-aware image features and question-enhanced dialog history features, effectively leveraging iterative reasoning processes within each path. Furthermore, we design an enhanced multimodal attention mechanism to optimize the decoder, enabling it to generate highly precise responses. Experimental results on the VisDial v0.9 and v1.0 datasets demonstrate that MC-Net significantly outperforms existing methods, showcasing its efficacy in advancing multimodal conversational AI.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Kristina Zovko, Yann Sadowski, Toni Perković, Petar Šolić, Ivana Pavlinac Dodig, Renata Pecotic, Zoran Đogaš

Abstract: Obstructive Sleep Apnea (OSA) is a prevalent condition that disrupts sleep quality and contributes to significant health risks, necessitating accurate and efficient diagnostic methods. This study introduces a machine learning-based framework aimed at detecting and predicting apnea events through analysis of polysomnographic (PSG) and oximetry data. The core component is a Long Short-Term Memory (LSTM) network, which is particularly suited to processing sequential time-series data, capturing complex temporal relationships within physiological signals such as oxygen saturation, heart rate, and airflow. Through extensive feature engineering and preprocessing, the framework optimizes data representation by normalizing, scaling, and encoding input features to enhance computational efficiency and model performance. Key results demonstrate the model’s effectiveness, achieving an accuracy of 79%, precision of 68%, and recall of 76% on the test dataset, with validation set metrics similarly high, supporting the model’s ability to generalize effectively. Comprehensive hyperparameter tuning further contributed to a stable, robust architecture capable of accurately identifying apnea events, providing clinicians with a valuable tool for early detection and tailored management of OSA. This data-driven framework offers an efficient, reliable solution for OSA diagnostics, with the potential to improve clinical decision-making and patient outcomes.
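A minimal sketch of the normalization and windowing step that typically precedes an LSTM over physiological signals; the SpO2 values, window length, and stride below are illustrative choices, not the study's actual settings:

```python
def minmax_scale(signal):
    """Min-max scale a signal to [0, 1], a common LSTM preprocessing step."""
    lo, hi = min(signal), max(signal)
    return [(x - lo) / (hi - lo) for x in signal]

def windows(signal, length=4, stride=2):
    """Cut a scaled signal into fixed-length, overlapping windows
    suitable for sequential (time-series) model input."""
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, stride)]

spo2 = [97, 96, 95, 90, 88, 92, 96, 97]  # toy oxygen-saturation trace (%)
scaled = minmax_scale(spo2)
batch = windows(scaled, length=4, stride=2)
print(len(batch), batch[0])
```

Each window becomes one sequence fed to the recurrent network, so the desaturation dip around samples 4 and 5 is visible to the model in context rather than as isolated readings.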

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated