1. Introduction
Aspect-Based Sentiment Analysis (ABSA) [1,2,3] represents a refined subdivision of sentiment analysis that seeks to comprehend the sentiment directed towards specific attributes or aspects within a text. Originating from the broader discipline of Natural Language Processing (NLP), ABSA has evolved significantly over the past decade [4]. Traditional sentiment analysis was limited to discerning the overall sentiment of a piece of text, be it positive, negative, or neutral. However, such an approach overlooked the nuanced sentiments consumers often express towards different facets of a product or service in a single review. The development of ABSA was motivated by this need for granularity, allowing for the extraction of more detailed insights from textual data. ABSA’s relevance extends across various domains [10,11,12,13], from enhancing customer service to refining product features based on consumer feedback, thereby playing a pivotal role in data-driven decision-making processes.
Despite its considerable advancements, ABSA faces several challenges, primarily stemming from the intricacies of human language. These include, but are not limited to, the handling of sarcasm, idiomatic expressions, and the contextual significance of words, which can drastically alter the sentiment when applied to different aspects. The advent of deep learning and machine learning techniques has propelled the field forward, offering sophisticated models capable of understanding complex language patterns. Recently, the integration of contextual embeddings, such as those from BERT (Bidirectional Encoder Representations from Transformers), has further enhanced the ability of ABSA models to grasp nuanced, context-dependent meanings. Furthermore, the exploration of adversarial training methods in ABSA, as evidenced by the development of models like AdvSentiNet, represents a recent trend that aims to fortify models against manipulative or misleading inputs. These advancements signify a move towards more resilient, accurate, and context-aware ABSA systems capable of tackling the multifaceted challenges presented by natural language.
The proliferation of online platforms has led to an exponential increase in consumer reviews, offering a rich source of data for businesses and individuals. However, the sheer volume of available reviews renders manual analysis impractical, necessitating automated sentiment analysis tools [18,19,20,21]. This research focuses on Aspect-Based Sentiment Analysis (ABSA), a nuanced form of sentiment analysis that assesses sentiments towards specific aspects within a sentence. For example, the sentence "The soup was delicious but we sat in a poorly-lit room" expresses a positive sentiment towards "Food quality" but a negative sentiment towards "Ambience". Previous studies have tackled ABSA using various hybrid models, combining rule-based approaches with neural networks for enhanced accuracy. The LCR-Rot-hop model [22,23], an extension of the LCR-Rot neural network, incorporates representation iterations for better sentiment prediction towards specific aspects. The subsequent evolution, LCR-Rot-hop++, integrates contextual word embeddings and hierarchical attention, setting a new benchmark in ABSA performance.
The realm of neural network research is burgeoning, with Generative Adversarial Networks (GANs), introduced by Goodfellow et al. [24], emerging as a promising area. GANs involve training two networks in tandem: a generator that creates new input samples and a discriminator that distinguishes between real and generated samples. This setup not only enhances the generator’s ability to produce realistic samples but also strengthens the discriminator’s (or classifier’s) robustness [25]. GANs represent a groundbreaking development in the field of machine learning [29]. The generator’s objective is to create data samples indistinguishable from genuine data, while the discriminator’s role is to accurately distinguish between the generator’s fabricated data and real data. This adversarial process is akin to a game, where both networks continuously improve through competition with each other, leading to the generation of highly realistic synthetic data [30]. The versatility of GANs has led to their widespread application across various domains, including, but not limited to, image and voice generation, style transfer, and more recently, natural language processing tasks. In the context of Aspect-Based Sentiment Analysis (ABSA), GANs introduce an innovative approach to generating synthetic data samples or adversarial examples, which can be used to train more robust and sophisticated sentiment analysis models. This capability to augment training datasets with realistic, complex samples promises to address some of the enduring challenges in ABSA, such as dealing with nuanced and context-dependent sentiment expressions.
Although GANs have primarily influenced image generation, their application to text analysis poses challenges due to the variable length of sentences [23,34]. An earlier application of adversarial training to ABSA produced adversarial samples by perturbing existing data rather than by generating entirely new samples. That method, applied to the BERT Encoder, improved accuracy over baseline models. Our research, AdvSentiNet, takes the other route: it uses a GAN to generate new adversarial samples for the HAABSA++ model, which is designed specifically for ABSA. This study explores the efficacy of adversarial training in enhancing ABSA accuracy, contributing a novel perspective to the field.
The paper is organized as follows: Section 2 reviews related works on ABSA and adversarial training. Section 3 describes the datasets utilized in our study. In Section 4, we detail the AdvSentiNet framework and our adversarial training methodology. Section 5 discusses the empirical results, and Section 6 concludes the paper with a summary of findings and directions for future research.
2. Related Work
The comprehensive review by Schouten and Frasincar [1] categorizes Aspect-Based Sentiment Analysis (ABSA) methodologies into knowledge-based, machine learning, and hybrid frameworks. ABSA aims to discern the sentiment directed towards specific aspects within a text, a critical task for understanding nuanced consumer feedback in reviews. This analysis is not limited to identifying sentiment polarity towards predefined aspects in sentences; it also encompasses the intricate processes of aspect extraction and detection. The sentiment towards "Food Quality" in the sentence "The steak was mouth-watering, yet the ambiance left much to be desired," illustrates the complex nature of consumer reviews, where multiple aspects can elicit varying sentiments within a single sentence.
Hybrid approaches, such as those developed by Wallaart et al. [2] and further enhanced by Trusca et al. [23], represent a significant leap forward in ABSA. These methodologies employ a combination of rule-based and neural network strategies to accurately classify sentiment towards targeted aspects when explicit references are made. Particularly, the incorporation of contextual word embeddings and hierarchical attention mechanisms in the HAABSA++ framework of [23] marks a pivotal enhancement, enabling state-of-the-art performance on benchmark datasets like SemEval 2015 [37] and SemEval 2016 [38].
GANs, as proposed by Goodfellow et al. [24], have ignited significant interest for their unique structure consisting of a generator and discriminator duo. This structure facilitates the generation of new, realistic samples through a competitive minimax game, advancing the field of synthetic data creation. The versatility of GANs extends beyond image and voice synthesis, offering novel solutions to longstanding challenges in text-based applications, including ABSA.
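For reference, the minimax game mentioned above is commonly written in the GAN literature as the value function below. Here G denotes the generator, D the discriminator, p_data the distribution of real samples, and p_z the noise prior fed to the generator. This is the standard textbook formulation, shown only as a reminder; it is not necessarily the exact (regularized, multi-class) objective optimized in this work.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to generated ones, while the generator minimizes V by producing samples that the discriminator scores as real.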
The review by Han et al. [25] highlights four principal advantages of integrating adversarial training in sentiment analysis: natural emotion generation, mitigation of sparse labeled data issues, robust learning across varied contexts, and automated quality evaluation of synthetic samples. These advantages underscore the transformative potential of GANs in enhancing sentiment analysis methodologies.
Odena et al.’s proposal of a semi-supervised GAN [40], where the discriminator also functions as a classifier, opens new avenues for ABSA application. A closely related model, the Categorical Generative Adversarial Network (CatGAN), has demonstrated strong performance in various settings, notably in semi-supervised learning environments. The application of CatGANs to ABSA introduces a new method for enhancing model training through adversarial techniques.
A novel implementation of adversarial training in ABSA is presented by Karimi et al. [29], utilizing a modified approach with the BERT Encoder. Instead of generating new samples, this method perturbs real data to create challenging samples for the model, thereby enhancing its learning process. This technique, although diverging from traditional GAN structures, exemplifies the innovative application of adversarial training in refining ABSA models and provides a reference point against which the AdvSentiNet model can be compared.
In summary, the integration of adversarial training and GANs into ABSA methodologies represents a significant advancement in sentiment analysis [47]. The development and application of the AdvSentiNet model underscore the potential of these techniques to improve the accuracy and robustness of sentiment analysis tools, paving the way for more nuanced and context-aware analysis in consumer feedback interpretation [52,53].
3. Method of AdvSentiNet
This section delineates the methodology adopted for sentiment classification within restaurant reviews, focusing on the introduction and application of the AdvSentiNet model. We first describe the foundational algorithm, then detail the CatGAN methodology and explain how the neural network is adapted to function as the discriminator within this setup. Finally, the section outlines the training procedure.
3.1. Framework
AdvSentiNet leverages a hybrid approach for aspect-based sentiment analysis, initially employing an ontology for sentiment determination towards a specific aspect. Failing conclusive results from the ontology, a backup neural network, significantly enhanced from its predecessor in [23], steps in.
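The two-stage control flow described above can be sketched as follows. This is a minimal illustration only: the toy lexicon, `ontology_predict`, and `hybrid_predict` are hypothetical names introduced here, and the real ontology is considerably richer than a word-to-polarity lookup.

```python
from typing import Callable, List, Optional

# Hypothetical toy lexicon standing in for the ontology (an assumption made
# for illustration; the actual ontology is not reproduced here).
TOY_LEXICON = {"delicious": "positive", "poorly-lit": "negative"}


def ontology_predict(context_words: List[str]) -> Optional[str]:
    """Return a polarity only when the lexicon evidence is unambiguous."""
    found = {TOY_LEXICON[w] for w in context_words if w in TOY_LEXICON}
    # Conclusive only if exactly one polarity was found; mixed or missing
    # evidence counts as inconclusive.
    return found.pop() if len(found) == 1 else None


def hybrid_predict(context_words: List[str],
                   neural_backup: Callable[[List[str]], str]) -> str:
    """Try the ontology first; defer to the neural model when inconclusive."""
    label = ontology_predict(context_words)
    return label if label is not None else neural_backup(context_words)
```

In this sketch the backup model is passed in as a plain callable, so any classifier that maps a list of words to a polarity label can be plugged in.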
Derived from [54], the ontology underpinning AdvSentiNet encompasses three pivotal classes. The SentimentMention class categorizes sentiment expressions into subclasses according to whether their sentiment value is aspect-independent, aspect-dependent, or varies with the associated aspect. The AspectMention and SentimentValue classes respectively manage the aspect association and the sentiment polarity (positive or negative) of words. This structured ontological approach ensures a rigorous yet flexible framework for initial sentiment classification, accommodating words like 'expensive' within specific sentiment and aspect subclasses. Should the ontology-based analysis yield ambiguous or incomplete results due to mixed sentiments or unaccounted lexicalizations, the model defers to its neural network component.
Building on the work of [23], the neural network component of AdvSentiNet incorporates BERT contextual embeddings and hierarchical attention mechanisms, optimizing sentiment classification efficacy. This evolution marks a significant advancement from the original LCR-Rot-hop model, selectively applying the most effective methods for contextual embedding and hierarchical attention as identified in their findings. Notably, the neural network employs a Transformer Encoder for embedding computation, benefiting from BERT’s pre-trained contextual understanding. The embedded sentence is split into Left, Target, and Right segments, each of which is passed through a bi-directional LSTM layer, so that attention can be computed relative to the target’s position in the sentence. A two-step rotary attention mechanism is then applied iteratively to refine these representations before classification.
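The Left, Target, and Right segmentation can be illustrated with a short sketch. It operates on plain tokens for readability; in the model the same index-based split would apply to the sequence of embeddings. The function name `split_around_target` and the half-open span convention are assumptions made for this example.

```python
from typing import List, Tuple


def split_around_target(tokens: List[str], start: int, end: int
                        ) -> Tuple[List[str], List[str], List[str]]:
    """Split a sentence into (left, target, right) around tokens[start:end]."""
    if not 0 <= start < end <= len(tokens):
        raise ValueError("target span must be a non-empty range inside the sentence")
    # Left and right contexts may be empty when the target sits at an edge.
    return tokens[:start], tokens[start:end], tokens[end:]
```

For the sentence "the soup was delicious" with the target "soup" at span (1, 2), the left context is `["the"]` and the right context is `["was", "delicious"]`.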
The integration of Generative Adversarial Network (GAN) principles introduces a novel dimension to the AdvSentiNet model, allowing for the simultaneous training of a generative model and a discriminative (or classifying) model. This section expounds on the adaptation of the neural network component to function as both a sentiment classifier and a discriminator, embodying the essence of a Categorical GAN (CatGAN).
Following the insights from [55], the discriminator within the GAN framework is adapted to concurrently act as a multi-class classifier, extending beyond binary fake-real distinctions to include sentiment classifications. This methodological innovation enriches the model’s analytical capabilities, enhancing its discrimination and classification robustness. The optimization challenge presented in the GAN framework encapsulates the dueling dynamics between the generative and discriminative components, aiming for equilibrium where generated samples are indistinguishable from real data. The formulation integrates regularization terms to mitigate overfitting, ensuring model generalization.
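One common way to let a discriminator double as a classifier is to give it K+1 outputs: K sentiment classes plus one extra class for generated samples. The sketch below illustrates that reading of the output layer. The K+1 layout, the three sentiment labels, and the function names are assumptions made for illustration and are not confirmed details of the model.

```python
import math
from typing import Dict, List, Sequence

SENTIMENTS = ("positive", "neutral", "negative")  # assumed K = 3 real classes
FAKE_INDEX = len(SENTIMENTS)                      # assumed extra class: generated


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits: Sequence[float]) -> Dict[str, object]:
    """Read K+1 logits as a real/fake probability plus a sentiment label."""
    if len(logits) != len(SENTIMENTS) + 1:
        raise ValueError("expected K+1 logits")
    probs = softmax(logits)
    p_fake = probs[FAKE_INDEX]
    # The sentiment label is chosen among the K real classes only.
    real_probs = probs[:FAKE_INDEX]
    sentiment = SENTIMENTS[real_probs.index(max(real_probs))]
    return {"p_real": 1.0 - p_fake, "p_fake": p_fake, "sentiment": sentiment}
```

Under this layout, the probability that a sample is real is simply the total mass assigned to the K sentiment classes, so a single softmax serves both the adversarial and the classification objective.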
3.2. Implementation and Training
The generative component, conceptualized as a fully connected Multi-Layer Perceptron (MLP), generates representation vectors mimicking those processed by the enhanced LCR-Rot-hop++ network, adapting output dimensions to accommodate sentence variability. The discriminative component, leveraging the final MLP layer of the neural network, is fine-tuned to classify sentiment effectively while discerning between generated and real samples.
Training this sophisticated model involves intricate balancing, ensuring the generative component’s outputs evolve to be increasingly realistic, enhancing the overall model’s performance and resilience. Hyperparameter optimization, guided by empirical testing and validation, further refines the model’s effectiveness. In sum, the AdvSentiNet framework, augmented with CatGAN methodology, represents a pioneering approach in ABSA, advancing the field through its sophisticated integration of ontological analysis, neural network enhancements, and adversarial training principles.
Following the methodologies outlined in [2] and [23], we set AdvSentiNet’s training regimen to 200 epochs, striking a balance between computational efficiency and model performance. In each epoch, we select a batch comprising 20 real samples alongside an equal number of synthetically generated samples. Model weights are initialized from a uniform distribution U(-0.01,0.01), giving a randomized yet controlled starting point, while biases are initialized to zero.
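The initialization and batch composition described above can be sketched in a few lines of standard-library Python. The numeric settings come from the text; the helper names `init_layer` and `make_batch`, the list-of-lists weight layout, and the "real"/"fake" labels are assumptions made for this illustration.

```python
import random
from typing import Callable, List, Tuple

EPOCHS = 200          # number of training epochs stated in the text
REAL_PER_BATCH = 20   # real samples per batch stated in the text
FAKE_PER_BATCH = 20   # an equal number of generated samples


def init_layer(n_in: int, n_out: int, rng: random.Random
               ) -> Tuple[List[List[float]], List[float]]:
    """Draw weights from U(-0.01, 0.01) and start all biases at zero."""
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_out)]
               for _ in range(n_in)]
    biases = [0.0] * n_out
    return weights, biases


def make_batch(real_pool: List, generate: Callable[[], object],
               rng: random.Random) -> List[Tuple[object, str]]:
    """Combine 20 sampled real items with 20 freshly generated items."""
    real = rng.sample(real_pool, REAL_PER_BATCH)
    fake = [generate() for _ in range(FAKE_PER_BATCH)]
    return [(x, "real") for x in real] + [(x, "fake") for x in fake]
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; a seeded generator yields the same initial weights and the same batch selection on every run.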
The inclusion of all model parameters in the regularization terms is crucial for combating overfitting and ensuring model robustness. Specifically, the discriminator’s regularization term encompasses the parameters within the final MLP layer in addition to those throughout the rest of the LCR-Rot-hop++ architecture, while the generator’s regularization term consolidates the parameters attributed to the generative component of the model. The generator’s input follows a uniform distribution U(0,1), and its dimensionality is set in line with the embedding dimension recommended by [23]. It is important to acknowledge that the entire training set is used for adversarial network training, mirroring the approaches of [2,23], despite potential biases when testing.
AdvSentiNet’s training mechanism, inspired by the framework presented in [24], incorporates a strategic update schedule wherein the discriminator’s parameters are refined each iteration, and the generator’s parameters are adjusted every k iterations, allowing for a nuanced balance between generation and discrimination capabilities.
The theoretical foundation for achieving an optimum in the adversarial network, as delineated in [24], mandates sufficient capacity within both the generator and discriminator to model and differentiate complex sample distributions accurately. Nonetheless, determining the exact capacity requisite remains an open question. The alternating training schedule, modulated by the parameter k, is designed to prevent premature optimization of either network component, thereby facilitating a more balanced and effective learning process.
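The alternating schedule governed by k can be sketched as a simple loop. The update functions are passed in as callables because the actual gradient steps are not specified here; `run_schedule` and its parameter names are hypothetical and serve only to show when each component is updated.

```python
from typing import Callable


def run_schedule(num_iterations: int, k: int,
                 update_discriminator: Callable[[int], None],
                 update_generator: Callable[[int], None]) -> None:
    """Update the discriminator every iteration, the generator every k-th."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    for it in range(1, num_iterations + 1):
        update_discriminator(it)
        # The generator lags behind: it is only updated on multiples of k.
        if it % k == 0:
            update_generator(it)
```

With 10 iterations and k = 3, the discriminator is updated on iterations 1 through 10 and the generator on iterations 3, 6, and 9. Setting k = 1 updates both components at every iteration.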
Convergence in the context of GANs, as highlighted in [55,56], remains a challenging aspect due to the inherent conflict in the optimization objectives of the generator and discriminator. This dynamic can potentially lead to oscillations rather than convergence, emphasizing the need for careful monitoring and adjustment of training parameters. Specifically, maintaining a delicate balance where neither component outpaces the other significantly is crucial for fostering a productive adversarial learning environment. Such equilibrium ensures that both the generative and discriminative aspects of the model evolve in tandem, enhancing the overall capability of the AdvSentiNet framework to classify and generate realistic sentiment-laden textual data.
5. Concluding Remarks and Future Directions
This research advances the domain of ABSA by integrating adversarial training techniques into the neural network segment of the framework previously established by [23]; we refer to the resulting model as AdvSentiNet. Our exploration into adversarial dynamics revealed that by adopting distinct learning rates and momentum parameters for the generator and discriminator components, we could substantially mitigate convergence challenges commonly associated with adversarial networks. This methodological refinement led to notable enhancements in model performance, elevating the accuracy from 81.7% to 82.5% for the SemEval 2015 dataset and from 84.4% to 87.3% for the SemEval 2016 dataset. When deploying only the neural network mechanism, adversarial training raised accuracy from 80.6% to 88.2% for the 2016 dataset, thereby surpassing the hybrid model that included an ontological component. For the 2015 dataset, we observed an accuracy improvement from 80.7% to 82.2%. These improvements exceed those documented in [29], the earlier application of adversarial training to ABSA, where an adversarial perturbation strategy was employed instead of our approach, which leverages a classic GAN framework with distinct generator and discriminator entities.
As we look forward, our agenda is set on investigating a spectrum of strategies aimed at further stabilizing GAN training within the ABSA context. This includes delving into techniques such as feature matching [60], which could enhance the generator’s ability to produce more realistic outputs by aligning features of the generated samples with those of real data. Additionally, minibatch discrimination [55] offers a promising avenue for improving model diversity by preventing the collapse of generated samples into a singular mode. Historical averaging [55], by encouraging parameter consistency over iterations, may also serve to stabilize training dynamics.
Moreover, the potential for the generator to create complete textual sentences, rather than merely attention vectors, beckons further exploration. This adjustment could significantly enrich the adversarial training process, offering a more holistic approach to generating and evaluating sentiment-laden content. Experimentation with alternative architectures for the generator, possibly incorporating state-of-the-art language models, stands as another exciting frontier that could elevate the AdvSentiNet model’s capabilities.
Lastly, a more granular comparison with the adversarial perturbation methodology presented in [29] is warranted. By aligning datasets and benchmark models, we aim to conduct a direct assessment that elucidates the comparative advantages of traditional GAN frameworks over adversarial perturbation techniques in the realm of ABSA.
In conclusion, the AdvSentiNet model represents a pioneering step towards harnessing the nuanced potential of adversarial training in sentiment analysis. By continually pushing the boundaries of innovation and exploring novel methodologies, we aim to further enhance the accuracy and reliability of ABSA models, thereby contributing to the vast tapestry of natural language processing research.