Computer Science and Mathematics


Article
Computer Science and Mathematics
Security Systems

Fatma Yasmine Loumachi,

Mohamed Chahine Ghanem,

Mohamed Amine Ferrag

Abstract: Cyber timeline analysis, or forensic timeline analysis, is critical in Digital Forensics and Incident Response (DFIR) investigations. It involves examining artefacts and events (particularly their timestamps and associated metadata) to detect anomalies, establish correlations, and reconstruct a detailed sequence of the incident. Traditional approaches rely on processing structured artefacts, such as logs and filesystem metadata, using multiple specialised tools for evidence identification, feature extraction, and timeline reconstruction. This paper introduces GenDFIR, an innovative context-specific framework powered by the capabilities of large language models (LLMs). Specifically, it proposes the use of Llama 3.1 8B in a zero-shot setting, selected for its ability to understand cyber threat nuances, integrated with a Retrieval-Augmented Generation (RAG) agent. Our approach comprises two main stages: (1) Data preprocessing and structuring: incident events, represented as textual data, are transformed into a well-structured document, forming a comprehensive knowledge base of the incident. (2) Context retrieval and semantic enrichment: a RAG agent retrieves relevant incident events from the knowledge base based on user prompts, and the LLM processes the retrieved context, enabling detailed interpretation and semantic enhancement. The proposed framework was tested on synthetic cyber incident events in a controlled environment, with results assessed using DFIR-tailored, context-specific metrics designed to evaluate the framework's performance, reliability, and robustness, supported by human evaluation to validate the accuracy and reliability of the outcomes. Our findings demonstrate the potential of LLMs in DFIR and in automating the timeline analysis process. This approach highlights the power of generative AI, particularly LLMs, and opens new possibilities for advanced threat detection and incident reconstruction.
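The retrieval stage described in this abstract can be illustrated with a minimal sketch, substituting a toy bag-of-words similarity for the neural embeddings a real RAG agent would use (the event texts and function names below are invented for illustration, not taken from the paper):

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a real RAG agent would use a neural encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(prompt, events, k=2):
    """Return the k incident events most relevant to the user prompt."""
    q = embed(prompt)
    return sorted(events, key=lambda e: cosine(q, embed(e)), reverse=True)[:k]

# Hypothetical incident knowledge base (event texts are invented).
events = [
    "2024-03-01 08:12 failed ssh login from 10.0.0.5",
    "2024-03-01 08:13 successful ssh login from 10.0.0.5",
    "2024-03-01 09:00 scheduled backup completed",
]
top = retrieve("ssh login attempts", events)
```

The retrieved context would then be handed to the LLM for interpretation and semantic enrichment.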
Article
Computer Science and Mathematics
Computer Science

Hao Yan,

Zixiang Wang,

Yi Zhao,

Yang Zhang,

Ranran Lyu

Abstract: Image generation optimization is an important research direction in deep learning that aims to improve the performance of image generation models and the quality of the images they produce. In recent years, researchers have made significant progress in this area with the development of deep generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs), which can generate high-quality, realistic images by learning the distribution of image data. In this study, a deep learning-based image generation optimization model is adopted that combines the advantages of GANs and VAEs. The architecture consists of a generator and a discriminator: the generator produces images, and the discriminator judges their authenticity. The model also introduces an attention mechanism and a self-supervised learning strategy to further improve the quality and diversity of the generated images. During training, a large-scale image dataset and a variety of optimization algorithms are used to improve the stability and efficiency of the model. Evaluation of the generative model on several indicators, including image quality, generation speed, and convergence, shows that the introduced attention mechanism and self-supervised learning strategy significantly improve the model's performance.
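The adversarial interplay the abstract describes can be reduced to a toy sketch: a one-parameter generator and a logistic discriminator trained against each other on a 1-D "real" distribution. All values and update rules here are illustrative simplifications, not the paper's model:

```python
import math
import random

random.seed(0)

def generator(z, w, b):
    """Maps latent noise to a sample; real models use deep networks."""
    return w * z + b

def discriminator(x, v, c):
    """Logistic unit giving the probability that x is a real sample."""
    return 1.0 / (1.0 + math.exp(-(v * x + c)))

real = [random.gauss(3.0, 0.1) for _ in range(64)]   # toy "real" data
gw, gb, dv, dc, lr = 1.0, 0.0, 0.1, 0.0, 0.05

for _ in range(200):
    z = random.gauss(0.0, 1.0)
    x_fake, x_real = generator(z, gw, gb), random.choice(real)
    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    for x, y in ((x_real, 1.0), (x_fake, 0.0)):
        p = discriminator(x, dv, dc)
        dv += lr * (y - p) * x
        dc += lr * (y - p)
    # Generator step: ascend log D(fake) (non-saturating GAN loss).
    p = discriminator(x_fake, dv, dc)
    gw += lr * (1.0 - p) * dv * z
    gb += lr * (1.0 - p) * dv
```

The generator's offset drifts toward the real data as it learns to fool the discriminator; the VAE, attention, and self-supervised components of the paper's hybrid are beyond this sketch.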
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Muhammed Abnas,

Muhammed Imkan K M,

Ajmal J S,

Abhiram P Vasudevan,

Shereena Thampi,

Rosy K Philip

Abstract: This literature survey focuses on various methodologies and frameworks that have been developed for speech recognition, accent recognition, and dialect translation. The research is aimed at building a comprehensive understanding of how existing technologies like Wav2Vec, SpeechBrain, and hybrid CTC/Attention models can be applied to create a speech converter API that translates colloquial Malayalam speech into standard Malayalam text and further into English. The goal of this API is to bridge communication gaps caused by the linguistic diversity within Malayalam dialects, thus improving accessibility in fields such as education, social media, and public services.
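Of the components the survey covers, CTC decoding is the easiest to illustrate: greedy CTC output collapses repeated frame labels and removes blanks. The frame labels below are invented, and a real model (e.g. fine-tuned Wav2Vec) would emit per-frame probability distributions rather than characters:

```python
def ctc_collapse(frames, blank="_"):
    """Greedy CTC decoding: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for label in frames:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)

# Per-frame argmax labels for a hypothetical utterance.
frames = list("hh_ee_ll_ll_oo")
decoded = ctc_collapse(frames)
```

The blank symbol is what lets CTC distinguish a genuinely doubled letter (separated by a blank) from one letter held across several frames.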
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Mathan Kumar Mounagurusamy,

Thiyagarajan V S,

Abdur Rahman,

Shravan Chandak,

D Balaji,

Venkateswara Rao Jallepalli

Abstract: Early management and better clinical outcomes for epileptic patients depend on seizure prediction. The accuracy and false alarm rates of existing systems are often compromised by their dependence on static thresholds and basic electroencephalogram (EEG) properties. A novel Recurrent Neural Network (RNN)-based method for seizure onset prediction is proposed in this article to overcome these limitations. In contrast to conventional techniques, the proposed system uses Long Short-Term Memory (LSTM) networks to extract temporal correlations from unprocessed EEG data, enabling it to adapt dynamically to the unique EEG patterns of each patient and improving prediction accuracy. The methodology comprises thorough data collection, preprocessing, and LSTM-based feature extraction; annotated EEG datasets are then used for model training and validation. Results show a considerable reduction in false alarm rates (an average of 6.8%) and an improvement in prediction accuracy (90.2% sensitivity, 88.9% specificity, and an AUC-ROC of 93%). Additionally, computational efficiency is significantly higher than that of existing systems (12 ms processing time, 45 MB memory consumption). These results demonstrate the effectiveness of the proposed RNN-based strategy in improving seizure prediction reliability, opening up possibilities for its practical application in epilepsy treatment.
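The LSTM gating that lets such a model carry temporal context across an EEG sequence can be sketched with a single scalar cell; the weights below are illustrative constants, not trained values, and the input sequence stands in for a windowed EEG amplitude:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    """One LSTM cell step with scalar state, showing the gating that lets
    the network carry temporal context; weights are illustrative, not trained."""
    i = sigmoid(W["wi"] * x + W["ui"] * h + W["bi"])    # input gate
    f = sigmoid(W["wf"] * x + W["uf"] * h + W["bf"])    # forget gate
    o = sigmoid(W["wo"] * x + W["uo"] * h + W["bo"])    # output gate
    g = math.tanh(W["wg"] * x + W["ug"] * h + W["bg"])  # candidate memory
    c = f * c + i * g          # memory blends old state with new input
    h = o * math.tanh(c)       # hidden state exposed to the next layer
    return h, c

W = {k: 0.5 for k in ("wi", "ui", "bi", "wf", "uf", "bf",
                      "wo", "uo", "bo", "wg", "ug", "bg")}
h = c = 0.0
for x in [0.1, 0.9, 0.4]:      # a toy EEG amplitude sequence
    h, c = lstm_step(x, h, c, W)
```

Because the forget gate scales the previous memory rather than overwriting it, information from early samples can persist across the sequence, which is what a static threshold on raw EEG cannot do.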
Article
Computer Science and Mathematics
Computer Networks and Communications

Isadora Rezende Lopes,

Paulo Rodolfo da Silva Leite Coelho,

Rafael Pasquini,

Rodrigo Sanches Miani

Abstract: The development of technologies using the Internet of Things (IoT) concept evolves daily. These numerous technologies, such as LoRa (Long Range) transceivers, find applications in various domains, including monitoring natural disasters and those caused by human error. Security vulnerabilities arise concurrently with the advancement of these new technologies. Cyberattacks seeking to disrupt device availability, such as Denial of Service (DoS) attacks, can effectively exploit vulnerabilities in LoRa devices, hindering disaster monitoring efforts. Therefore, our goal is to assess the network parameters that impact the development of a disaster monitoring environment using LoRaWAN. Specifically, we aim to identify the parameters that could result in network availability issues, whether caused by malicious actors or configuration errors. Our results indicate that certain LoRa network parameters (collision checks, packet size, and the number of nodes) can significantly affect network performance, potentially rendering this technology unsuitable for building robust disaster monitoring systems.
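The effect of node count and packet size on availability can be illustrated with an ALOHA-style toy simulation. This is a deliberate simplification of LoRa channel access (it ignores spreading-factor orthogonality and the capture effect), and all parameter values are invented:

```python
import random

random.seed(1)

def collision_rate(n_nodes, airtime, period=100.0, trials=2000):
    """Estimate the fraction of transmissions lost when n_nodes each send
    one packet of the given airtime at a random instant in [0, period)."""
    lost = 0
    for _ in range(trials):
        starts = sorted(random.uniform(0.0, period) for _ in range(n_nodes))
        for i, s in enumerate(starts):
            # With sorted starts, checking both neighbours detects any overlap.
            if (i > 0 and s - starts[i - 1] < airtime) or \
               (i + 1 < n_nodes and starts[i + 1] - s < airtime):
                lost += 1
    return lost / (trials * n_nodes)

rate_few = collision_rate(5, 2.0)     # sparse network
rate_many = collision_rate(50, 2.0)   # dense network, same packet size
```

Even this crude model reproduces the qualitative finding: loss rates climb steeply with node density and with airtime (i.e. packet size), which is exactly what a DoS attacker flooding the channel exploits.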
Article
Computer Science and Mathematics
Computer Science

Rahil Ashtari Mahini,

Maryam Safaripour,

Achiya Khanam,

Gerardo M. Casanola-Martin,

Dean C. Webster,

Simone A. Ludwig,

Bakhtiyor Rasulev

Abstract: The Quantitative Structure-Activity Relationship (QSAR) approach for predicting the biological activity and physicochemical properties of mixtures is gaining prominence, driven by the growing demand for highly engineered materials designed for specific functions. Developing mixture descriptors that effectively capture the intricacies of multi-component materials presents a significant challenge due to their structural complexity. We implemented a series of existing and new mixing rules to derive mixture descriptors and develop mixture-based QSAR (mxb-QSAR) models. We evaluated 12 additive mixture descriptors and a novel non-additive combinatorial descriptor derived from the Cartesian product. These descriptors were used to model the fouling release (FR) property of 18 silicone oil-infused PDMS coating polymers, characterized by the removal of Ulva linza. Various linear and nonlinear mxb-QSAR models were obtained using these 13 mixture descriptors. The best model, derived from the newly proposed Cartesian-based combinatorial mixture descriptors, employed a decision tree in combination with two-stage feature-importance-based feature selection. This model achieved a coefficient of determination R² of 0.987 for both training and test sets, along with a cross-validation Q²(LOO) of 0.791. The success of the nonlinear model and combinatorial descriptors underscores the significance of complex relationships among variables, as well as the synergistic effects of the components on fouling release properties.
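The contrast between an additive mixing rule and a pair-based combinatorial descriptor can be sketched as follows. The `cartesian_descriptor` formula here is a hypothetical stand-in built on ordered component pairs; the paper's actual Cartesian-product rule may differ, and the component values are invented:

```python
from itertools import product

def additive_descriptor(values, fractions):
    """Classic additive mixing rule: the mole-fraction-weighted mean of a
    per-component descriptor (one of the simple rules the paper evaluates)."""
    return sum(x * d for x, d in zip(fractions, values))

def cartesian_descriptor(values, fractions):
    """A hypothetical non-additive rule in the spirit of the paper's
    Cartesian-product descriptor: aggregate over all ordered component
    pairs, so cross-component interactions enter explicitly."""
    return sum(xa * xb * (da * db) ** 0.5
               for (xa, da), (xb, db) in product(zip(fractions, values), repeat=2))

fractions, props = [0.25, 0.75], [2.0, 4.0]   # invented two-component mixture
add_val = additive_descriptor(props, fractions)
cart_val = cartesian_descriptor(props, fractions)
```

The pairwise form is symmetric in the components but is not a linear function of the individual descriptors, which is what lets it encode synergistic effects an additive rule cannot.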

Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Vincenzo Manca

Abstract: The formalism of Functional Language Logic (FLL) is presented, an extension of the logical formalism introduced in Information (MDPI, 15, 1, 64, 2024) for representing sentences in natural languages. In the FLL framework, a sentence is represented by aggregating primitive predicates corresponding to words of a fixed language (English in the given examples). The FLL formalism constitutes a bridge between mathematical logic (higher-order predicate logic) and the classical logical analysis of discourse rooted in the Western linguistic tradition. Namely, FLL representations reformulate on a rigorous logical basis many fundamental classical concepts (complementation, modification, determination, distribution, …), becoming, at the same time, a natural way of introducing mathematical logic through natural language representations, where the logic of linguistic phenomena is analyzed independently of the syntactical and semantical choices of particular languages. In FLL, twenty logical operators express the mechanisms of logical aggregation underlying meaning construction. The relevance of FLL to chatbot interaction is considered, and the relationship between the embedding vectors of large language model (LLM) transformers and FLL representations is outlined.
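The core idea of representing a sentence by aggregating word-level predicates can be sketched with ordinary functions. The operators below are illustrative inventions, not the paper's twenty FLL operators:

```python
# Primitive one-place predicates for the words of a toy sentence.
def ball(x):  return x.get("kind") == "ball"
def red(x):   return x.get("color") == "red"
def rolls(x): return x.get("moving") is True

def modify(p, q):
    """Modification as predicate conjunction: 'red ball' = red AND ball."""
    return lambda x: p(x) and q(x)

def exists(pred, domain):
    """Existential determination over a small domain of individuals."""
    return any(pred(e) for e in domain)

# "A red ball rolls" as an aggregation of primitive word predicates.
sentence = modify(modify(red, ball), rolls)
world = [{"kind": "ball", "color": "red", "moving": True},
         {"kind": "cube", "color": "red", "moving": False}]
holds = exists(sentence, world)
```

Each word contributes a predicate, and the sentence's meaning is the aggregation of those predicates, evaluated against a domain of individuals.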
Article
Computer Science and Mathematics
Computer Networks and Communications

XiaoZong Qiu,

Guo Hua Yan,

LiHua Yin

Abstract: The identification and classification of network traffic is of great significance for maintaining network security, optimizing network management, and providing reliable quality of service. These functions not only help prevent malicious activities such as network attacks and illegal intrusions, but also support the reasonable allocation of network resources and improve user experience. However, although the wide adoption of traffic encryption enhances the security of data transmission, it also makes traffic content difficult to analyze directly, so existing identification techniques are inefficient on encrypted traffic and struggle to classify it accurately. This affects not only the maintenance of network security but also the further improvement of network service quality; developing efficient and accurate encrypted traffic identification methods has therefore become an urgent problem. Existing work still has three main limitations: (1) the potential relationship between flow payload features and sequence features is ignored during feature extraction; (2) it is difficult to adapt to the characteristics of different protocols while ensuring the accuracy and robustness of encrypted traffic identification; (3) training effective deep learning models requires large amounts of manually labeled data. This study proposes an encrypted traffic recognition method based on CLSTM (a combination of a 2-conv CNN and a BiLSTM) and Mean Teacher collaborative learning. By detecting fused payload and sequence features, the accuracy and robustness of encrypted traffic identification are improved, and the model's dependence on labeled data is reduced. Experimental results show that the proposed CLSTM-MT collaborative learning method not only outperforms traditional methods on encrypted traffic identification and classification, but also improves model performance using only a small amount of labeled data when labeling costs are high.
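The Mean Teacher half of the method rests on a simple rule: the teacher's weights track an exponential moving average (EMA) of the student's. A minimal sketch on flat weight lists, with invented values (in the paper both models are CLSTMs):

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean Teacher rule: teacher weights are an exponential moving
    average of student weights, giving a smoother, more stable model
    whose predictions supervise the student on unlabeled traffic."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

teacher = [0.0, 0.0]
student = [1.0, 2.0]     # pretend the student has converged to these weights
for _ in range(100):     # student held fixed here for illustration
    teacher = ema_update(teacher, student)
```

After k updates against a fixed student s, each teacher weight equals s·(1 − 0.99^k), so the teacher converges smoothly toward the student; this smoothing is what makes its pseudo-labels on unlabeled flows reliable.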
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Jineng Ren

Abstract: Since the beginning of modern computing, the Turing machine has been the dominant architecture for most computational devices. It consists of three essential components: an infinite tape for input, a read/write head, and a finite control. In this structure, what the head reads (i.e., bits) is the same as what it has written. This is different from the way humans think or carry out thought and tool experiments: what humans imagine or write on paper are images or texts, not the abstract concepts those images and texts represent in the brain. This difference is neglected by the Turing machine, yet it plays an important role in abstraction, analogy, and generalization, which are crucial in artificial intelligence. In contrast, the proposed architecture uses two different types of heads and tapes: one for traditional abstract bit inputs/outputs and the other for specific visual ones (more like a screen, or a workspace with a camera observing it). The mapping rules between the abstract bits and the specific images/texts can be realized with high accuracy by neural networks such as Convolutional Neural Networks, YOLO, and Large Language Models. As an example, this paper presents how the new architecture (called "Ren's machine" here for simplicity) autonomously learns a distributive property of multiplication in the specific domain and then uses that rule to generate a general method, mixed between the abstract and specific domains, to compute the product of any positive integers based on images/texts.
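The distributive rule in the worked example can be written out symbolically. This sketch operates on digit strings rather than on the images the paper's architecture uses, so it shows only the arithmetic the machine is said to learn, not its actual image-based procedure:

```python
def multiply_distributive(a, b):
    """Multiply positive integers using only single-digit products and the
    distributive rule a*(x + y) = a*x + a*y, applied digit by digit."""
    total, place_b = 0, 1
    for db in reversed(str(b)):            # b = sum of digit_b * 10^k
        partial, place_a = 0, 1
        for da in reversed(str(a)):        # a = sum of digit_a * 10^j
            partial += int(da) * int(db) * place_a   # single-digit product
            place_a *= 10
        total += partial * place_b
        place_b *= 10
    return total

result = multiply_distributive(23, 45)
```

Each inner step uses only a product of two single digits and a power of ten, so the general method is assembled entirely from the distributive rule plus a memorized single-digit table.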
Article
Computer Science and Mathematics
Applied Mathematics

Kejia Hu,

Hongyi Li,

Di Zhao,

Yuan Jiang

Abstract: In this paper, we propose a new geometric-shaping design for golden angle modulation (GAM), based on the complex geometric properties of the open symmetrized bidisc and termed Bd-GAM, for future-generation wireless communication systems. Inspired by the circularly symmetric structure of GAM, we construct two modulation schemes, Bd-GAM1 and Bd-GAM2. Since the optimization problem is hard to solve analytically, we consider it in the domain of the symmetrized bidisc. Specifically, we combine the complex geometric properties of the symmetrized bidisc with an MI-optimized probabilistic modulation scheme. The symmetrized bidisc is a bounded domain on which the Carathéodory distance and the Kobayashi distance coincide; we use the pseudo-metric m_D on the unit disc D to obtain the Kobayashi and Carathéodory pseudo-distances. With minimum-SNR and entropy constraints, Bd-GAM1 and Bd-GAM2 can overcome the shaping loss. This study finds that each constellation point of Bd-GAM has a unique phase and gain. Compared with the existing golden angle modulation, the new design improves the mutual information (MI) and the distance between adjacent constellation points.
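The starting point, a plain GAM layout in which every constellation point has a unique phase and gain, can be generated directly; the Bd-GAM reshaping on the symmetrized bidisc is not reproduced in this sketch:

```python
import cmath
import math

GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))   # ≈ 2.39996 rad

def gam_constellation(n_points):
    """Plain golden angle modulation constellation: point n sits at phase
    n * golden-angle with radius proportional to sqrt(n), so every point
    has a unique phase and a unique gain."""
    return [cmath.rect(math.sqrt(n), n * GOLDEN_ANGLE)
            for n in range(1, n_points + 1)]

points = gam_constellation(16)
```

Because the golden angle is an irrational fraction of a full turn, no two points share a phase, and the sqrt(n) radius spreads them with near-uniform density over the disc.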


Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2024 MDPI (Basel, Switzerland) unless otherwise stated