1. Introduction
Artificial intelligence (AI) has become increasingly integrated into modern healthcare, playing an ever more important role in medical diagnosis and treatment [1,2]. The development of algorithms and programs has empowered medical AI technologies to analyze symbolic models of diseases and their correlations with patient symptoms [3,4,5]. Yet in discussions of AI applications across the healthcare landscape, one crucial aspect, human communication differences and disorders, has lagged behind and been overlooked despite significant progress in other areas [6].
Human communication disorders (HCD) stand out as one of the greatest barriers to effective health communication, posing significant challenges in the interaction between healthcare professionals and patients [6]. The profound impact of HCD extends beyond the immediate challenges in doctor-patient communication; it also has far-reaching consequences for a person's mental well-being and overall quality of life.
HCD manifest in various forms and encompass a wide range of conditions, including challenges in speech, language, voice, and hearing. Individuals with these disorders may experience mild to severe difficulties in comparison to their peers within the same cultural and linguistic group. These conditions can be categorized, based on the key physiological systems involved, as “hearing disorders” (resulting from impaired sensory-perceptual hearing mechanisms), “speech disorders” (stemming from impaired sensory-motor speech mechanisms), and “language disorders” (arising from impaired higher cortical functions controlling language use) [7]. They can also be classified, by the age at which behavioral differences are first observed, either in childhood or adulthood, as “congenital communication disorders” (evident at birth), “developmental communication disorders” (manifesting behaviorally during childhood), and “adult-onset communication disorders” (disrupting communication after the establishment of mature skills) [6].
Delving into the specifics of these disorders, the umbrella of HCD is extensive, encompassing developmental communication disorders such as cleft lip and palate and other craniofacial disorders, developmental dysarthria, developmental verbal dyspraxia, developmental phonological disorder, specific language impairment, developmental dyslexia, and language and communication challenges associated with intellectual disability, emotional disturbance, and autism spectrum disorders. Acquired communication disorders include acquired dysarthria, apraxia of speech, aphasia, and communication problems resulting from head and neck cancer, right hemisphere damage, dementia, traumatic brain injury, and psychiatric disorders. Additionally, there are voice, fluency, and hearing disorders, which cover functional and organic voice disorders, stuttering and cluttering, as well as hearing disorders [8].
HCD are prevalent worldwide, affecting people across various cultures and languages, from children to adults. They impair the quality of life of the individuals experiencing them and pose a significant burden on society. Current estimates suggest that approximately 17% of the overall U.S. population experiences some form of communication disorder [9], including speech or hearing disorders of varying severity. Among these, hearing loss affects approximately 11%, while speech, voice, or language disorders affect around 6% [10,11]. In Australia, approximately 2.7 million people have communication disorders, and in the United Kingdom, an estimated 2.5 million people are affected by speech and language problems, with 800,000 experiencing disorders so severe that they are nearly unintelligible to anyone but those closest to them. Globally, approximately 25% of children have communication disorders, though this prevalence decreases with age, possibly due to natural resolution or intervention [11].
Traditionally, the diagnosis of HCD has relied on clinical observations, standardized assessments, and subjective evaluations by speech-language pathologists (SLPs) [12]. These assessments involve interviews, norm-referenced tests, and observational measures to evaluate speech production, language comprehension, and other communication skills [9]. Conventional treatment approaches often rely on behavioral interventions, including specific programs, repetitive drills, and therapeutic exercises delivered individually or in groups of varying sizes [8]. Despite the commitment of clinicians and the proven effectiveness of these traditional methods for many individuals, they face challenges [13]. The inherent subjectivity of assessments, resource-intensive one-on-one therapy sessions, and limited integration of technological aids hinder the scalability and efficiency of traditional diagnosis and treatment [14,15,16]. Additionally, the shortage of qualified therapists in remote or underserved areas accentuates these challenges [17,18,19], underscoring the critical need to integrate new and innovative technologies into communication sciences and disorders (CSD) research and clinical practice.
The emergence of AI has opened up transformative possibilities for research. Integrating AI techniques promises numerous advantages for patients, researchers, and therapists [20,21]. Indeed, these techniques have demonstrated their effectiveness in enhancing diagnostic accuracy [22,23], offering personalized interventions [24,25], and improving recovery outcomes [26]. With these compelling benefits, the integration of AI into the study of HCD is gaining momentum, drawing an increasing number of researchers and practitioners towards this innovative and promising trend [13]. This paradigm shift holds promise for overcoming practical limitations and enhancing the overall effectiveness of diagnosis and treatment approaches. For instance, the 2023 American Speech-Language-Hearing Association (ASHA) Convention featured a research symposium, coordinated by Jordan Green, PhD (MGH Institute of Health Professions), to showcase groundbreaking developments with a focus on the transformative impact of AI. The symposium, funded by the National Institute on Deafness and Other Communication Disorders (NIDCD), aimed to highlight experimental approaches, data tools, and educational paradigms to advance AI in CSD. It delved into various AI applications developed by multidisciplinary teams of clinical practitioners, bioengineers, computer scientists, and data scientists, featuring Google's AI programs, large-scale databases and speech analytics, speech biomarkers for neurological and mental health monitoring, personalized speech recognition for individuals with impairments, and updates on brain-computer interfaces.
A basic search using the term “artificial intelligence” on ASHA journals through ASHAWIRE (https://pubs.asha.org/) reveals a total of 390 publications, with 103 articles published between 2018 and 2023. As the literature indicates, there is a clearly growing interest among SLPs and audiologists in the application of AI in CSD research and practice. For instance, school-based SLPs face challenges in creating suitable activities, within limited time, for diverse students with varying disorders; AI could be a valuable solution here, leveraging data to enhance clinical focus and client well-being.
Considering the rapid emergence and promising future of this field, it is important to thoroughly explore the current research landscape and development structure to guide future studies in this research domain. Bibliometric analysis serves as a robust and quantitative methodology for assessing academic articles through various metrics, such as citation counts, collaboration networks, and keyword analyses, to map the intellectual structure of a specific field. Its applicability spans diverse disciplines, as evidenced by its extensive use in fields such as health care [1], human brain research [27], and speech, language, and hearing sciences [28]. By employing bibliometric analysis, researchers can systematically explore knowledge structures and identify emerging trends within a research domain [29]. Given the rapidly expanding literature on AI-based HCD research, a bibliometric analysis is crucial to assist researchers in this field in gaining an informative understanding of the evolving research landscape and fostering international and interdisciplinary collaboration [30]. With knowledge of the current landscape and future directions, researchers can effectively leverage AI techniques to address communication challenges faced by affected populations, contributing to the advancement of both research and practical applications in the field.
2. Methods
We conducted the current bibliometric review following the method and workflow adopted by Aria and Cuccurullo [31] and Munim et al. [32]. The approach (Figure 1) started with (1) bibliometric data collection through a systematic literature search, followed by a set of data analyses. A comprehensive evaluation of the research area was carried out through (2a) descriptive bibliometric analysis, which identifies publication trends, leading journals, studies, authors, and institutions; (2b) network analysis, which visualizes collaborations and relationships; and (2c) co-word analysis, which identifies research themes and evolution trends. Based on the results of these analyses, we (3) discussed current research features, limitations, and potential research directions.
2.1. Search Strategy
The literature was retrieved from the Web of Science (WoS) Core Collection and Scopus, two mainstream multidisciplinary databases that provide a complete set of metadata types for bibliometric analyses. The search keywords were identified from a preliminary literature review, focusing on two areas: (1) AI technologies and (2) HCD research. The search queries are outlined in Table 1. Queries (1) and (2) were connected with the “AND” operator (see Supplemental Tables S1 and S2 for the detailed keyword search process using Boolean functions). The final search was conducted on December 31, 2023.
2.2. Screening Strategy
The search results were filtered by document type and language using the databases’ built-in Refine Results panels. Since AI technology is a leading-edge research area that is updating rapidly, we included a diversity of document types: journal articles, conference papers, review articles, editorial materials, and early-access articles. Letters, notes, books, book chapters, meeting abstracts, corrections, and retracted articles were excluded, as were non-English publications and publications without author information. A total of 9,324 and 5,711 publications were retrieved from the Scopus and WoS databases, respectively. The two datasets were merged using the method proposed by Caputo and Kargina [33]. Of the 15,035 merged records, 4,805 (31.96%) were identified as duplicates based on titles and digital object identifiers (DOIs). After the duplicates were removed, 10,230 publications remained for manual screening.
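For illustration only, the Python sketch below shows one way such a merge-and-deduplicate step can work. The actual merge followed Caputo and Kargina's method within the R-based workflow described in Section 2.4; the file names, column names, and title-fallback rule here are assumptions for the example.

```python
import pandas as pd

# Hypothetical exports; the real study merged 9,324 Scopus and 5,711 WoS records.
scopus = pd.read_csv("scopus_export.csv")  # columns assumed: "Title", "DOI", ...
wos = pd.read_csv("wos_export.csv")

merged = pd.concat([scopus, wos], ignore_index=True)

# Normalize the matching keys: DOIs are case-insensitive; titles are compared
# after lowercasing and stripping non-alphanumeric characters.
merged["doi_key"] = merged["DOI"].str.lower().str.strip()
merged["title_key"] = (merged["Title"].str.lower()
                       .str.replace(r"[^a-z0-9]", "", regex=True))

# Keep the first occurrence of each DOI (records without a DOI pass through),
# then drop any remaining records whose normalized title was already seen.
deduped = merged[~merged["doi_key"].duplicated() | merged["doi_key"].isna()]
deduped = deduped[~deduped["title_key"].duplicated()]
print(f"{len(merged)} merged -> {len(deduped)} after deduplication")
```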
The publications that passed manual screening were restricted to those (1) focusing on HCD and (2) involving AI technologies. A pilot screening of 200 articles was conducted by two authors (M.Z. and E.T.), with each publication categorized as include, exclude, or unsure. Articles marked as unsure were screened by a third author (H.D.), and the three authors discussed together whether these articles should be included. The practical inclusion and exclusion criteria were finalized once the pilot screening was complete and consensus had been reached on each of the 200 publications. Then, following these criteria, two authors (M.Z. and E.T.) independently screened the remaining papers based on their titles and abstracts, reading full texts when necessary. An inter-rater reliability analysis using Byrt et al.’s kappa [34] assessed the consistency of the two raters. The result indicated substantial agreement on the inclusion and exclusion of the screened studies, κ = 0.79 [35]. Any disagreements due to individual differences were resolved through follow-up discussions. Finally, a total of 4,375 papers were included in the bibliometric analysis: 2,239 journal articles, 1,980 conference papers, 152 review articles, and four editorial materials.
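Byrt et al.'s statistic [34] is commonly read as the prevalence- and bias-adjusted kappa (PABAK), which for two raters reduces to a simple function of observed agreement. The Python sketch below follows that reading, with hypothetical screening labels rather than the study's data.

```python
def pabak(rater_a, rater_b, k=2):
    """Prevalence- and bias-adjusted kappa: (k * p_o - 1) / (k - 1),
    where p_o is the observed proportion of agreement and k the number
    of screening categories (2 for include/exclude)."""
    agree = sum(a == b for a, b in zip(rater_a, rater_b))
    p_o = agree / len(rater_a)
    return (k * p_o - 1) / (k - 1)

# Hypothetical screening decisions for five abstracts (not the study's data).
a = ["include", "exclude", "include", "include", "exclude"]
b = ["include", "exclude", "exclude", "include", "exclude"]
print(pabak(a, b))  # 0.6: agreement on 4 of 5 abstracts
```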
2.3. Data Extraction
The bibliometric information of each relevant publication was extracted, including authors, affiliations, article title, document type, journal or conference name, abstract, keywords, times cited, and cited references. A manual check ensured that any missing items were accurately filled in.
2.4. Data Analysis
We used the open-source bibliometrix R-package to analyze the bibliographic data [31]. For descriptive bibliometric analysis, we used the numbers of publications and citations to compute Lotka’s Law coefficients, display the publication trend, and identify the most relevant publication outlets and institutes as well as the most impactful articles.
Lotka’s Law is a bibliometric measure of scientific productivity and authorship concentration. It assumes that in a research field, a few authors are highly productive, whereas a considerable number of authors publish only one article, and that the number of the former is a fixed ratio to that of the latter [36]. The general formula of Lotka’s Law relates the number of authors y making x contributions as y = C / x^β, where C is a constant specific to a scientific discipline. A higher β value thus indicates a higher degree of authorship concentration, while a lower one implies the absence of a dedicated group of authors in a research field [32].
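bibliometrix estimates these coefficients internally; purely as an illustration, β and C can be obtained by an ordinary least-squares fit of the log-log form log(y) = log(C) - β·log(x). The Python sketch below uses made-up productivity counts, not the study's observed distribution.

```python
import numpy as np

# Hypothetical distribution: number of articles published by each author.
articles_per_author = np.array([1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4, 6])

# x = number of contributions, y = number of authors with exactly x articles.
x, y = np.unique(articles_per_author, return_counts=True)

# OLS fit of log(y) = log(C) - beta * log(x); the slope is -beta.
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
beta, C = -slope, np.exp(intercept)
print(f"beta = {beta:.2f}, C = {C:.2f}")
```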
Two citation metrics were used to quantify the impact of publications: global citations and local citations. Global citations count how many times an article has been cited by any article indexed in the databases from which the collection was downloaded; local citations count how many times it has been cited by the other 4,374 studies in the analyzed collection.
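A minimal Python sketch of the distinction, assuming each record carries a database citation count (here called TC) and a list of cited-reference DOIs; the records and field names are illustrative.

```python
def citation_metrics(records):
    """Global citations come from the database's citation count; local
    citations count only references made by other records in the collection."""
    in_collection = {r["doi"] for r in records}
    local = dict.fromkeys(in_collection, 0)
    for r in records:
        for ref in r["references"]:
            if ref in local and ref != r["doi"]:
                local[ref] += 1  # cited by another record in the set
    return {r["doi"]: (r["TC"], local[r["doi"]]) for r in records}

# Toy collection: "a" has 120 database citations but only 2 from within the set.
records = [
    {"doi": "10.1/a", "TC": 120, "references": []},
    {"doi": "10.1/b", "TC": 15, "references": ["10.1/a"]},
    {"doi": "10.1/c", "TC": 8, "references": ["10.1/a", "10.1/b"]},
]
print(citation_metrics(records))  # {'10.1/a': (120, 2), ...}
```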
For the network analysis exploring country collaboration, we chose the Walktrap clustering algorithm for its superior partition quality and running time on large networks [32,37]. To examine the interconnections among publication outlets, research topics, and countries, we created a three-field plot based on a Sankey diagram, which visualizes the relationships among top publication sources, keywords, and countries.
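As an illustration of the clustering step (the study itself used bibliometrix in R), the Python sketch below runs Walktrap from the python-igraph package on a toy weighted collaboration network; the countries and weights are invented.

```python
import igraph as ig

# Hypothetical country collaboration edges weighted by co-authored papers.
edges = [("USA", "UK"), ("USA", "China"), ("UK", "China"),
         ("Germany", "Spain"), ("Spain", "Netherlands"),
         ("USA", "Germany"), ("India", "Saudi Arabia")]
weights = [40, 35, 30, 20, 18, 5, 15]

g = ig.Graph.TupleList(edges, directed=False)
g.es["weight"] = weights

# Walktrap groups vertices that short random walks tend to stay among.
clustering = g.community_walktrap(weights="weight", steps=4).as_clustering()
for community in clustering:
    print([g.vs[v]["name"] for v in community])
```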
We also performed a co-word analysis, which counts the publications in which two keywords occur together [38]. To visualize the conceptual structure of the research field, we employed the Fruchterman-Reingold layout [39], and to identify clusters conveying common concepts, the Louvain clustering algorithm [40]. To normalize the co-occurrences, we utilized the association strength measure [41].
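The Python sketch below strings these pieces together on toy data: co-occurrence counting, association strength normalization (a_ij = c_ij / (s_i · s_j), with s_i the total occurrences of keyword i), Louvain clustering, and a Fruchterman-Reingold layout via networkx's spring_layout. It only illustrates the computations; the study ran them in bibliometrix.

```python
from collections import Counter
from itertools import combinations
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Hypothetical author-keyword lists, one per document.
docs = [["machine learning", "dementia", "aphasia"],
        ["machine learning", "dementia"],
        ["deep learning", "sign language"],
        ["dysarthria", "speech recognition"],
        ["deep learning", "sign language", "machine learning"]]

occ = Counter(k for d in docs for k in set(d))                      # s_i
cooc = Counter(frozenset(p) for d in docs for p in combinations(set(d), 2))

G = nx.Graph()
for pair, c_ij in cooc.items():
    i, j = tuple(pair)
    # Association strength normalization: a_ij = c_ij / (s_i * s_j).
    G.add_edge(i, j, weight=c_ij / (occ[i] * occ[j]))

clusters = louvain_communities(G, weight="weight", seed=42)
pos = nx.spring_layout(G, weight="weight", seed=42)  # Fruchterman-Reingold
print(clusters)
```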
A series of strategic maps was then plotted, in which each cluster/theme was placed in a specific quadrant according to its Callon’s centrality and Callon’s density values [42]. In a strategic diagram, the horizontal axis represents centrality and the vertical axis represents density. The centrality value estimates the degree of a cluster’s connections with other clusters in the network [43] and can be read as the importance of the research theme in the field. The density value, a measure of the internal strength of the cluster, describes how strong the links are that tie together the words making up the cluster [44]; it reflects the capacity of a research theme to maintain itself and to develop over time [45]. The two axes divide the strategic diagram into four quadrants. Research themes that fall in the upper right quadrant (Quadrant I) are called motor themes, which have high values in both centrality and density. Themes positioned in the upper left quadrant (Quadrant II) are highly developed but relatively isolated, i.e., not closely tied to other themes. Themes in the lower left quadrant (Quadrant III) are not well developed and are likely to be emerging or declining in the research field, while the research themes in the lower right quadrant (Quadrant IV), having low density, are weakly structured but have the potential to evolve and become important because of their high centrality [46,47].
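A minimal Python sketch of these two measures, assuming the common definitions in which centrality aggregates a cluster's external link strengths and density averages its internal ones; the scaling constants vary across tools, so the values here are purely illustrative.

```python
import networkx as nx

def callon_metrics(G, cluster):
    """Centrality sums link strengths crossing the cluster boundary; density
    averages link strengths inside it (scaling factors vary across tools)."""
    cluster = set(cluster)
    internal = [d["weight"] for u, v, d in G.edges(data=True)
                if u in cluster and v in cluster]
    external = [d["weight"] for u, v, d in G.edges(data=True)
                if (u in cluster) != (v in cluster)]  # boundary edges
    centrality = 10 * sum(external)
    density = 100 * sum(internal) / len(cluster)
    return centrality, density

G = nx.Graph()
G.add_weighted_edges_from([("machine learning", "dementia", 0.5),
                           ("machine learning", "aphasia", 0.2),
                           ("dementia", "aphasia", 0.1),
                           ("machine learning", "dysarthria", 0.3)])
c, d = callon_metrics(G, {"machine learning", "dementia", "aphasia"})
print(f"centrality = {c:.1f}, density = {d:.1f}")
# Quadrants follow from comparing each theme against the median centrality
# and density of all themes on the map.
```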
A longitudinal thematic map analysis was conducted to explore the thematic evolution of HCD research using AI techniques. We divided the time span into several slices based on important points in the publication trend (e.g., a burst in the number of articles or the publication of an impactful article). Strategic maps for the different periods were then compared to see how research themes merge, split, develop, or decline over time.
3. Results
This study conducted a comprehensive review of articles focusing on AI-based approaches to HCD from 1985 to 2023. The analysis included 4,375 relevant studies published across 1,621 outlets over the past 39 years. These studies involved 13,072 authors, with an average of 7.90 citations per document. The overwhelming majority of authors, 98.84% (12,920 authors), engaged in collaborative work, while only 1.16% (152 authors) contributed single-authored studies.
3.1. Standalone Research Domain
In the domain of AI-based research on HCD, analysis of scientific output reveals a significant pattern: 10,091 authors contributed only one article each, while the most prolific author contributed as many as 66 published studies. According to Pao (1986), Lotka's Law typically yields β values ranging from 1.78 to 3.78 for most disciplines [48]. Our estimated β value of 2.51 falls within this range. Thus, AI application in HCD emerges as an independent research domain, marked by notable authorship concentration.
3.2. Publication Trend
Figure 2 illustrates the annual trends in publications concerning AI in HCD. Over the period from 1985 to 2023, the average growth rate for scientific research papers on AI-based HCD research was 16.51%. Notably, the growth rate varied across distinct time intervals: 8.50% from 1985 to 2002, 27.10% from 2002 to 2012, 19.01% from 2012 to 2018, and 21.65% from 2018 to 2023. The surge in publications was particularly pronounced between 2012 and 2023, constituting 88.14% (3,856 out of 4,375) of all included papers. The output during the initial 28 years (1985–2012) amounted to 607 publications, considerably lower than the 3,768 observed in the subsequent 11 years (2013–2023). These findings underscore a burgeoning interest in AI-based research within the domain of HCD.
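As a worked example of the arithmetic, assuming the reported growth rates are compound annual rates between the first and last yearly counts of each interval (the counts below are illustrative, not the actual data behind Figure 2):

```python
def annual_growth_rate(n_start, n_end, years):
    """Compound annual growth rate between two yearly publication counts."""
    return (n_end / n_start) ** (1 / years) - 1

# Illustrative counts only: 50 papers in the first year, 600 eleven years later.
print(f"{annual_growth_rate(n_start=50, n_end=600, years=11):.2%}")  # 25.34%
```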
3.3. Publication Patterns
Among the included papers, 72.39% (3,167 out of 4,375) were authored by scholars from 10 countries (listed in Table 2, based on the corresponding authors' countries). The United States accounted for 31.89% (1,395 out of 4,375) of the studies, establishing itself as the most significant contributor, followed by India (17.99%) and China (12.78%). These papers were disseminated across 1,621 different journals and conferences, with INTERSPEECH leading the count at 6.80%, followed by ICASSP (2.61%) and Lecture Notes in Computer Science (1.76%), as outlined in Table 3. To outline the disciplines encompassed in AI-based HCD research, we referred to the WoS subject categories of the included publications. Table 4 highlights the top 10 most prolific and impactful research domains, where computer science (60.62%), engineering (39.91%), and neurosciences (8.90%) emerge as the predominant disciplines.
We applied two bibliometric measures, total global citations (TGC) and total local citations (TLC), to discern the most impactful articles. Table 5 presents the most cited papers ranked by TLC and TGC per year. Considering both the TGC/y and TLC/y criteria, Fraser et al. (2016) [49], with a TGC/y of 44.44 and a TLC/y of 4.89, stands out as the most influential publication in the realm of AI-based investigations into HCD. Other noteworthy contributions include Tao et al. (2017) [50], with a TGC/y of 51.63, and Luz et al. (2020) [51], with a TLC/y of 4.60.
3.4. Characteristics of Research Activities
Scientific collaborations at the country/region level were systematically reported and visualized. Figure 3a illustrates the collaborative partnerships among the contributing countries, with line thickness indicating the intensity of collaboration. Notably, scholars from the United States, Western Europe, Canada, China, and Australia demonstrated robust cooperation and exchange. Figure 3b delves into collaborations among the top 40 most productive countries/regions, showing a concentration in the Americas, Asia, and Europe. The USA, depicted as the largest node in green and centrally positioned in the collaboration network, exhibited the highest number of collaborations. The network reveals three distinct clusters: Cluster 1 (green) includes countries such as the USA, UK, and China; Cluster 2 (blue) features Germany, Spain, the Netherlands, and others; and Cluster 3 (red) comprises countries such as India, Saudi Arabia, and Indonesia. Figure 3c contrasts Single-Country Partnership (SCP) and Multi-Country Partnership (MCP) within the top 10 most productive countries. It is evident that, regardless of the country, MCP constitutes only a small proportion of collaborations.
Keywords are the essential terms researchers select to represent their studies. Table 6 presents the frequencies of the most relevant keywords, which can be categorized into four groups: health condition, technology, element, and function. This classification illustrates the specific AI techniques employed by researchers, the disorders targeted, the content analyzed, and the objectives pursued.
The co-word network depicted in Figure 4 illustrates keyword co-occurrences. Each node represents a keyword; node size corresponds to the keyword's frequency, while line thickness is proportional to co-occurrence strength. The analysis reveals four distinct clusters, each represented by its most central term: machine learning, deep learning, dysarthria, and support vector machine.
The machine learning cluster, highlighted in red, stands out prominently, emphasizing the high frequency of the term “machine learning” and underscoring its significance. This observation suggests that machine learning plays a pivotal role in AI-based research on HCD. Associations with terms such as “autism,” “dementia,” “Alzheimer's disease,” “aphasia,” and “rehabilitation” underscore the critical role of machine learning techniques in both research and rehabilitation addressing communication challenges associated with these disorders.
The second most relevant cluster, presented in blue, depicts the relationship between the terms “dysarthria,” “(automatic) speech recognition,” “speech therapy,” and “assessment.” This cluster suggests that the evaluation and treatment of dysarthria often involve the integration of automatic speech recognition technology.
The purple-themed cluster is associated with keywords like “support vector machine,” “classification,” “Parkinson's disease,” and “MFCC.” This cluster focuses primarily on the application of support vector machines for classifying Parkinson's disease based on features such as MFCC.
The green cluster, centered around “deep learning,” encompasses terms like “convolutional neural network,” “transfer learning,” and “sign language.” This cluster implies a research focus on utilizing deep learning, transfer learning, and convolutional neural network techniques for sign language recognition.
Figure 5 presents a three-field plot illustrating the interplay among primary publication outlets (left), author keywords (middle), and countries (right) within research on HCD facilitated by AI techniques. In this Sankey diagram, box size corresponds to the frequency of occurrences and link width to the degree of linkage. The findings reveal that the majority of publication outlets cover all popular keywords, with notable contributions from IEEE Access, Sensors, INTERSPEECH, and ICASSP. Diverse countries contribute significantly across these thematic areas. The USA shows a notably higher research volume in autism and machine learning than in other themes, while India stands out with substantial contributions in autism, machine learning, dysarthria, and deep learning. Other countries exhibit a relatively balanced effort across thematic areas. Overall, the visualization portrays a thriving, interconnected research landscape in which various themes are actively explored by researchers from different countries and disseminated through diverse publication outlets.
3.5. Research Hotspot Tendencies
Figure 6 illustrates the evolution of the top 10 keywords employed by authors in (a) cumulative occurrences and (b) occurrences per year from 1995 to 2023. All keywords exhibit an ascending trend, with “machine learning” reaching the highest occurrence, followed by “autism” and “dysarthria.” The occurrences of “autism” experienced a notable acceleration since 2015, while those of “machine learning” and “deep learning” saw rapid increases since 2017 and 2019, respectively. The growth of other keywords remained relatively stable. “Autism” claimed the position of the most frequently used keyword starting from 2016 but was surpassed by “machine learning” in 2022. Overall, the annual occurrences of these keywords also reveal an upward trend, indicating the increasing prominence of various popular topics within the domain of AI-based HCD research.
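The counts behind such a trend plot are straightforward to derive; the Python sketch below, with illustrative data and assumed field names, computes per-year keyword occurrences and their cumulative sums.

```python
import pandas as pd

# One row per (publication year, author keyword) pair; illustrative data.
df = pd.DataFrame({"year": [2015, 2015, 2016, 2017, 2017, 2017],
                   "keyword": ["autism", "machine learning", "autism",
                               "machine learning", "machine learning",
                               "dysarthria"]})

per_year = (df.groupby(["keyword", "year"]).size()
              .unstack(fill_value=0))   # rows: keywords; columns: years
cumulative = per_year.cumsum(axis=1)    # running total per keyword
print(cumulative)
```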
For the thematic analysis exploring the importance of topics based on their density and centrality, author keywords were selected as the unit of analysis. The analysis was conducted on 1,000 author keywords with a threshold of 10 words per thousand documents. This resulted in the identification of seven clusters of research themes on the thematic map (Figure 7a), each labeled with three descriptors. Specifically, two clusters were situated within the motor themes area, three within the basic themes, and two within the emerging or declining themes. Notably, no clusters corresponding to niche themes (highly developed but isolated themes) were discovered, suggesting that the research field is in a dynamic and progressive stage of development.
The themes located in the upper-right quadrant, distinguished by their robust interconnectivity with other clusters and substantial internal connections, are classified as motor themes. Key terms within this quadrant, including “feature extraction” and “machine learning,” are firmly established, offering vital insights for HCD research. The “machine learning” motor theme, which has the highest combined centrality (25.06) and density (41.23), encompasses principal keywords such as “dementia” and “Alzheimer’s disease.”
Themes in the lower-right quadrant are considered promising, attracting attention from scientists and becoming more connected to other topics, yet still in the initial stages of development. Primary themes in this quadrant include “support vector machine,” “autism,” and “dysarthria.” The “support vector machine” theme has gained significant attention from researchers with a centrality of 21.27 and a density of 36.59, featuring keywords such as “classification” and “Parkinson’s disease.” The “dysarthria” theme is also becoming a focal point for researchers, incorporating keywords like “ASR” and “intelligibility.” These foundational themes are crucial to AI-facilitated HCD research and have the potential to evolve into motor themes as they acquire complexity and become more internally structured.
Emerging or declining themes are located in the lower-left quadrant. Due to their low centrality and minimal influence, themes in this quadrant may be in a state of development or decline. Themes within this quadrant encompass “deep learning” and “aphasia.” “Deep learning” includes keywords like “convolutional neural network” and “sign language,” while the “aphasia” theme involves keywords such as “stroke” and “speech therapy.” These themes represent potential study fields that could benefit from unique and creative ideas.
The thematic evolution spanning the years 1985 to 2023, along with the thematic divergences and integrations, is visually elucidated through the Sankey diagram (Figure 7b). Our temporal segmentation categorizes the period into four distinct zones: 1985–2002, 2003–2012, 2013–2018, and 2019–2023. A discernible pattern emerges, revealing the sustained research focus on specific disorders such as aphasia, dysarthria, and autism, as well as on key AI techniques like machine learning and support vector machine. This continuity is observed across the three periods since their emergence around 2003, underscoring their persistent centrality in the landscape of research interests.
The evolution of themes over time mirrors the developmental trajectory of AI technologies, progressing from early machine learning methods, initially artificial neural networks and subsequently support vector machines, to the adoption of deep learning methodologies. Additionally, the temporal shifts in thematic emphasis underscore the evolving landscape of AI-based HCD research. This evolution transcends a singular focus on speech and language disorders; it extends to those whose speech and communication abilities are affected by a range of other conditions, such as mental illness. This broadening scope reflects a nuanced awareness within the research community, expanding beyond traditional boundaries to address the diverse population impacted by speech- and communication-related challenges.
The dynamic thematic map (Figure 8) provides a comprehensive overview of the evolving research themes within each temporal period, shedding light on the consolidation or divergence of study foci. In the initial phase, 1985–2002, eight thematic clusters emerged, predominantly situated in the lower-left quadrant with relatively low density and centrality; this early period signifies a nascent stage of HCD research leveraging AI technologies. The years 2003 to 2012 witnessed a notable expansion of clusters, marked by increased centrality [66]. During this period, keywords such as human-robot interaction, decision support system, and hidden Markov model gained prominence among scientists, establishing connections with other thematic elements.
In both the 2013–2018 and 2019–2023 periods, the density or centrality of some keyword clusters diminished. For instance, the clusters around autism and support vector machines, initially categorized as motor themes during 2003–2012, both lost density during 2013–2018, relocating to the basic theme region. Similarly, deep learning, situated in the basic theme region during 2013–2018, saw reduced centrality and shifted to the emerging theme region in 2019–2023. This phenomenon can be attributed to their integration with new keywords and the resulting formation of novel clusters, indicating that research on these topics has experienced renewed focus and important breakthroughs over the past decade. This dynamic evolution reflects the innovative exploration and development of fresh research directions within AI-based HCD research.
4. Discussion
This study examined 4,375 research articles on AI-based HCD research published between 1985 and 2023. Using bibliometric data, we analyzed publication trends and patterns, characteristics of research activities, and research hotspot tendencies. We also reviewed the benefits of using AI in HCD research, along with the challenges and future developments of the field.
4.1. Main Findings of Bibliometric Analysis
Since the first publication in 1985, AI-based HCD research grew gradually for about 27 years; since 2012, however, the field has undergone a notable acceleration, and in the past five years the number of publications has surged. This recent growth can be attributed to several factors. Technological breakthroughs in AI during this period significantly contributed to the explosive adoption of AI in HCD research [67]. Additionally, the successful application of AI technology in healthcare has further fueled advancements in using AI for HCD research [68]. Given the observed exponential growth pattern, publications in this field can be expected to continue expanding.
Based on both output and citation counts, AI-based HCD research is frequently featured in conferences within the domains of speech and language processing, computer science, and biomedical engineering. This inclination likely arises from the expeditious dissemination of research findings through conference papers, as researchers strive to promptly share their results. AI-based HCD research is characterized by its interdisciplinary nature, leveraging advancements in computer science and engineering that have paved the way for AI development. Clinical industries directly associated with HCD, such as audiology and speech-language pathology, alongside rehabilitation, are integral components of this research landscape. Acoustics, language, and linguistics play important roles as speech and language are key components influenced by HCD. In addition, neurosciences and psychology contribute valuable insights into its underlying mechanisms. The application of AI technology in the HCD field relies on substantial contributions from each of these domains.
The field of AI-based HCD research has attracted global attention, with collaborations initiated among researchers from various countries. Despite this, the majority of research remains confined to single-country publications. This tendency may stem from the cultural and linguistic attributes inherent in speech, language, and communication, making cross-national and cross-cultural collaboration more challenging. However, conducting research across nations, cultures, and languages is crucial in practice when applying AI technology to HCD studies. This is essential for gaining a deeper understanding of patient needs in diverse cultural and linguistic contexts, facilitating the sharing of best practices globally, and providing more effective solutions for a broader population. Simultaneously, it is advantageous for developing AI algorithms applied to HCD, as it enables access to larger and more diverse datasets.
Examining the top keywords within the identified categories reveals that the primary disorder domains in AI research are autism, dysarthria, dementia, Parkinson’s disease, and aphasia. The research hotspot tendency analysis shows that autism in particular has captured the most attention in recent years, possibly due to the elusive nature of its exact causes and the diagnostic challenges it presents [69,70]. Given the unique characteristics of each individual with autism and the absence of direct diagnostic tests or one-size-fits-all treatment plans, AI algorithms emerge as helpful tools for analyzing behavioral assessment data. This facilitates the identification of autism subtypes, paving the way for more personalized and targeted therapeutic interventions. AI technologies, including machine learning, are actively being developed to support individuals with autism, enhancing their communication abilities and aiding the development of social skills [71]. Furthermore, AI holds the potential to use speech and language data as biomarkers for the diagnosis of ASD [72]. This is particularly beneficial for children with autism, as early detection and intervention can profoundly affect their future lives.
4.2. Benefits of AI in the HCD Field
AI plays a crucial role in revolutionizing the landscape of communication disorders, offering a multitude of benefits for affected individuals, researchers, and SLP professionals.
For individuals dealing with communication disorders, early diagnosis and treatment are particularly important, especially in children, as early intervention plays a pivotal role in development, preventing long-term negative outcomes. AI systems, adept at detecting subtle patterns in speech or language development, contribute to this by enabling earlier and more targeted interventions, significantly improving treatment effectiveness [73]. Furthermore, AI's benefits extend to personalized treatment approaches for clients. Using objective assessment data and client characteristics, AI facilitates the development of personalized algorithms that tailor interventions to individual needs [74]. This approach, supported by adaptive learning, ensures efficient progress and aids SLPs in making informed decisions, optimizing treatment effectiveness for individuals with communication disorders.
In everyday life, AI technology provides significant assistance through applications, wearable devices, and platforms. Applications such as ELSA Speak and Speakprose, both utilizing AI, facilitate language skill and pronunciation practice with instant feedback, offering accessible communication options for users with diverse abilities [75]. AI voice generators like Murf convert text to speech, creating personalized synthetic voices for those unable to speak [76]. Wearable technology equipped with AI capabilities tracks speech patterns, enabling continuous monitoring and real-time information collection [77,78]. These features offer valuable data for long-term treatment and early identification of changes in communication abilities. AI-powered teletherapy platforms connect clients with SLPs remotely, and virtual reality platforms simulate real-life communication scenarios, offering frequent evaluations [79,80]. These advancements enhance accessibility to healthcare services, especially for individuals facing geographical barriers or difficulty accessing in-person care.
For researchers, the proficiency of AI in analyzing health data and medical images accelerates research activities, leading to improved diagnoses and uncovering patterns that may elude manual analysis [81]. This capability proves particularly advantageous in the field of communication disorders, where the analysis of speech and voice data, including measures such as pitch, intensity, and voice quality, is especially well suited to AI algorithms. These biometrics offer quantitative data for researchers, facilitating a deeper understanding of the acoustic and physiological aspects of speech and voice in individuals with communication disorders.
For SLP professionals, AI techniques are instrumental in their work, as they reduce workload and offer valuable references for comparing and contrasting with conventional diagnostic results [82]. Additionally, the integration of AI, particularly through automatic speech recognition (ASR), transforms clinical documentation by automating the conversion of verbal inputs into precise form fields [83]. This not only enhances efficiency but also enables SLPs to reclaim time from mundane paperwork tasks, time that can then be directed towards more meaningful aspects of patient care. Furthermore, AI can play an important role in the education of SLP professionals, for example by assisting SLP students in practicing clinical decision-making through evidence-based recommendations [84]. This integration of AI into educational settings contributes to the ongoing development and enhancement of clinical skills among aspiring speech-language pathologists.
4.3. Challenges and Future Developments
However, the integration of AI into the field of HCD also presents several issues and challenges that require careful navigation. One prominent challenge is the ethical dilemma surrounding the application of AI in diagnosing and treating communication disorders [24,85]. The deployment of these techniques must be grounded in the principles of protecting patient welfare and upholding rigorous professional standards. Achieving ethical AI use requires a clear understanding of the extent to which it can be employed in research and clinical applications for communication disorders. In the future, it is important to establish ethical boundaries to ensure that the integration of AI remains within acceptable limits, preventing unintended consequences and potential harm to patients. Regular updates to ethical guidelines are also necessary to keep pace with the evolving landscape of AI technologies.
Privacy and data protection emerge as another critical issue, given the involvement of sensitive speech and voice data [24]. Safeguarding patient information and complying with privacy regulations becomes a complex challenge, necessitating the development of robust data protection measures. The potential for speech data misuse to manipulate or impersonate an individual's identity introduces an additional layer of concern. Given the distinctive identifying characteristics inherent in speech data, addressing the risks of identity theft or fraudulent use becomes imperative, urging the future implementation of robust data protection measures against potential unauthorized access and misuse.
The seamless integration of AI tools into the clinical workflow may be a practical challenge, requiring clinicians to acquire new competencies and address the potential for AI to disrupt established workflows [86]. Professionals accustomed to traditional methods may also exhibit resistance to change. To ensure successful adoption, it is crucial that AI enhances rather than disrupts established clinical practices [87]. It is therefore important to strike the right balance between human expertise and AI autonomy by determining the extent of AI's influence without compromising the human touch. While using AI as a supportive tool, SLPs must maintain control over clinical decisions [86]. Future training and education for SLPs should explicitly include this dimension, ensuring that continuous professional development keeps practitioners updated on AI advancements and the evolving guidelines for its usage.
The current landscape of AI-based HCD research and practice reveals a notable absence of standardized protocols and regulatory frameworks governing the integration of AI techniques [88]. Many studies on AI in communication disorders are small, inconsistent, and non-standardized, leading to gaps between research, clinical potential, and actual clinical applications [88]. The absence of uniform guidelines may result in inconsistencies in assessments and treatments. To bridge these gaps and promote the robust adoption of AI tools in this field, standardized practices and regulatory frameworks must be established to guarantee the reliability of AI tools and facilitate their widespread acceptance in clinical settings.
Another issue that cannot be overlooked is algorithmic bias. The efficacy of AI systems relies heavily on the quality and diversity of the data on which they are trained [89]. Where training datasets lack diversity and fail to be representative, the performance of AI systems may be compromised, leading to inaccurate diagnoses or suboptimal treatment recommendations [90]. Biases within AI tools may manifest as distorted assessments and interventions, particularly affecting individuals from underrepresented populations. Moving forward, concerted efforts are essential to address this issue effectively. This involves the curation of diverse and representative datasets, ensuring that AI algorithms are exposed to a broad spectrum of cases during model training. Regular auditing and evaluation of algorithms should also be systematically instituted to identify and rectify biases. Crucially, collaborative endeavors involving researchers, developers, and SLP professionals are pivotal in establishing protocols that mitigate bias and promote fairness in AI applications for communication disorders. Looking ahead, development can unfold on two fronts: computer scientists will continue to resolve issues and advance the technology at its core, while researchers and professionals engage in continuous research, education, and active involvement with affected communities. This comprehensive approach can enable individuals with communication disorders from diverse demographic groups to benefit equally from AI applications.
Moreover, the interpretation of AI outputs can be challenging, especially with complex models. In clinical settings where clear explanations of diagnoses and treatment recommendations are crucial, the challenge of interpreting AI-generated results becomes pronounced [89]. Clinicians, who are essential in the decision-making process, must understand and trust the outcomes produced by AI algorithms. Unfortunately, AI models, particularly those rooted in deep learning, are often perceived as “black boxes” due to the opacity of their internal workings [89]. This lack of transparency or explainability can instill hesitation among clinicians, impeding the adoption of AI-based tools. Thus, ensuring that AI-generated insights are not only accurate but also interpretable and aligned with clinical expertise is essential for establishing trust among SLP professionals and patients alike. Moving forward, the collaborative efforts between AI experts and clinicians in designing and refining these models will prove instrumental. Researchers and developers can concurrently work towards creating AI systems with more transparent decision-making processes, facilitating clinicians in understanding and validating the rationale behind the generated recommendations.
In addition to the previously mentioned challenges, including ethical considerations, data privacy, and methodological improvements, ecological validity stands out as another critical concern in AI-based HCD research [88]. The development and testing of many AI models in controlled research settings, while successful in that context, may not translate to optimal performance when deployed in real-world clinical practice [88]. The absence of robust external validation for these AI models underscores a gap between laboratory-based development and practical implementation that cannot be overlooked. To guarantee tangible benefits for patients, it is vital to promote cooperation between computer scientists and speech therapists and strengthen industry-academic partnerships [88]. By fostering collaboration across disciplines and maintaining a commitment to ongoing improvement, the integration of AI in the field of communication disorders can progress ethically, effectively, and inclusively.
4.4. Limitations of the Study
This study has some limitations. Firstly, as the primary goal of this research is to offer a general and comprehensive overview of the current research landscape regarding the application of AI in the field of CSD, the article covers the broad umbrella of HCD, and therefore, it does not conduct a more in-depth analysis of a specific disorder. Future studies can focus more narrowly and, building upon this study, delve into specific disorders under HCD. Secondly, we did not include non-English papers in our analysis. As a result, there is a chance that we might have missed relevant studies published in languages other than English, particularly those conducted in non-English-speaking countries. Future research could broaden the scope of included languages in the literature, seeking and incorporating studies published in various languages to ensure a more comprehensive and globally representative analysis of AI-facilitated HCD research.
5. Conclusions
This study represents the first comprehensive bibliometric examination of academic articles on communication sciences and disorders research utilizing AI techniques. We gathered bibliographic data from 4,375 studies spanning the WoS and Scopus databases, involving 13,072 scholars and published in 1,621 academic outlets. By scrutinizing publication trends, conducting network analysis, and employing co-word analysis, the study reveals the current landscape and developmental structure of AI-based HCD research and provides insights for future investigations in this field. Over the past decade, AI-driven HCD research has experienced rapid growth, demonstrating a trend of continuous expansion. Disorders such as autism, dysarthria, dementia, Parkinson's disease, and aphasia are among those most frequently approached using AI techniques, with machine learning, ASR, support vector machine, and deep learning being the prominent AI techniques in the HCD domain. While AI technology offers substantial benefits for HCD research and clinical practice, it also poses certain challenges. Addressing these requires collaborative efforts across nations, cultures, languages, and disciplines, as well as among the technological, research, and clinical realms.