Submitted: 04 November 2024
Posted: 05 November 2024
A peer-reviewed article of this preprint also exists.

Abstract
Background/Objectives: Adult hearing-impaired patients qualifying for cochlear implants typically exhibit less than 60% sentence recognition in their best hearing aid condition, in either quiet or noisy environments, with speech and noise presented through a single loudspeaker. This study examines the influence of deep neural network (DNN)-based noise reduction on the cochlear implant evaluation. Methods: Speech perception was assessed using AzBio sentences in both quiet and noisy conditions (multi-talker babble) at 5 and 10 dB signal-to-noise ratio (SNR) through one loudspeaker. Sentence recognition scores were measured for ten hearing-impaired patients (five bilateral, five unilateral) using three hearing aid programs: calm situation, speech in noise, and spheric speech in loud noise (DNN-based noise reduction). Speech perception results were compared to bench analyses comprising the phase inversion technique, employed to predict SNR improvement, and the Hearing-Aid Speech Perception Index (HASPI v2), utilized to predict speech intelligibility. Results: The spheric speech in loud noise program improved speech perception by 20 to 32 percentage points compared to the calm situation program. This improvement may make it difficult for some patients to meet the sentence recognition criterion (below 60% correct) that currently qualifies adults for cochlear implantation.
Subject: Medicine and Pharmacology – Otolaryngology

1. Introduction

Cochlear implants (CIs) are highly effective medical devices for restoring hearing in patients with varying degrees of hearing loss. The Centers for Medicare & Medicaid Services (CMS) and private insurers provide guidelines for CI eligibility based on speech perception criteria. CMS requires patients to have bilateral moderate-to-profound hearing loss with speech perception scores below 60% in their best aided condition, as measured by sentence testing [1]. However, these guidelines lack specificity regarding speech presentation level, background noise, and sentence material, which can all impact scores.
Many clinics assess CI candidacy using AzBio sentences [2] in both quiet and noisy listening conditions [3,4,5,6]. The sentences and noise are typically presented from a single loudspeaker positioned in front of the listener, with a signal-to-noise ratio (SNR) of 10 or 5 dB often used to qualify patients for cochlear implantation. Individuals qualified at this SNR level often demonstrate significant listening improvements in both quiet and noisy settings following implantation [6]. This assessment approach aims to expand candidacy to include patients with significant residual hearing and poor speech comprehension.
Modern hearing aids employ noise reduction signal processing and directional microphones to reduce background noise and enhance speech perception. Traditional noise reduction signal processing is most beneficial in steady-state noise, where the noise envelope can be predicted and attenuated. However, benefits are more limited in less predictable, fluctuating noises such as multi-talker babble [7,8]. Directional microphones are effective when speech and noise sources are spatially separated [9]. However, because the single-loudspeaker test condition used in CI candidacy assessments co-locates speech and multi-talker babble, noise reduction signal processing and directional microphones are unlikely to significantly influence CI evaluation outcomes.
With advances in artificial intelligence, modern hearing aids are now capable of deep neural network (DNN)-based sound cleaning. DNN-based features are likely to provide significant and measurable improvements in speech perception in fluctuating noise, such as multi-talker babble, even when speech and noise are co-located, as is commonly the case in CI evaluations. In this study, we evaluated the impact of DNN-based noise reduction on CI evaluation outcomes in patients with significant hearing loss.
In addition to speech intelligibility results, this article presents bench evaluation data demonstrating improvements in SNR and speech intelligibility achieved through DNN-based noise reduction. Modeling speech intelligibility and SNR can significantly enhance our understanding of potential hearing device benefits and complement traditional behavioral testing. To assess SNR improvements from DNN-based noise reduction, the phase inversion technique developed by Hagerman & Olofsson was employed [11]. For predicting speech intelligibility, the Hearing-Aid Speech Perception Index (HASPI) was utilized to determine whether modeling can accurately predict behavioral outcomes observed in participants. HASPI v2 incorporates an advanced auditory framework that simulates the effects of hearing loss, including reduced audibility, diminished non-linear compression, widened cochlear filtering, and inner hair cell synapse modeling [12,13,14]. This framework enables the prediction of speech intelligibility for a given speech signal.

2. Methods

2.1. Patient Evaluations

This study presents speech perception scores for ten patients with significant hearing loss. Five of these patients had bilateral hearing loss and were referred to our clinic for CI evaluation. The other five patients had unilateral hearing loss with a CI in one ear. These patients were evaluated without their CI but with a hearing aid in the contralateral ear, while the bilateral hearing-impaired patients were fitted bilaterally. All participants underwent speech perception testing in both quiet and multi-talker babble conditions. These measurements were collected during their clinical visits to determine their eligibility for cochlear implantation. The retrospective review of test results from patient medical records was approved by the Mayo Clinic Institutional Review Board (24-010370).
Pure tone air- and bone-conduction audiometry was performed using an Otometrics Madsen Astera audiometer, ER-3A insert earphones, and a B-71 bone vibrator. The audiometric thresholds were used to program a pair of Phonak Audéo Sphere Infinio 90 receiver-in-the-canal hearing aids with power domes attached. All hearing aids were fit by matching the hearing aid output to the target gain recommended by DSL-v5 for adults [15].
Speech perception was evaluated using AzBio sentences presented at 60 dB SPL in both quiet and multi-talker babble conditions at SNRs of 5 and 10 dB. Sentence testing was performed using a single loudspeaker located in front of the patient. Multi-talker babble was presented continuously while measuring sentence recognition. One list of 20 sentences was used to measure speech perception for each listening condition.
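For a fixed 60 dB SPL sentence level, the babble level follows directly from the target SNR (e.g., 55 dB SPL at 5 dB SNR). As a minimal sketch, assuming digitally calibrated signals, the babble can be scaled relative to the speech to realize a target SNR; the array names below are placeholders rather than the actual test materials.

```python
import numpy as np

def scale_noise_to_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Return `noise` rescaled so that the speech-to-noise RMS ratio equals `snr_db` dB."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    current_snr_db = 20 * np.log10(rms(speech) / rms(noise))
    gain_db = current_snr_db - snr_db      # gain/attenuation to apply to the noise
    return noise * 10 ** (gain_db / 20)

# Placeholder waveforms standing in for an AzBio sentence and multi-talker babble
fs = 16000
speech = np.random.randn(2 * fs)
babble = np.random.randn(2 * fs)
babble_5db = scale_noise_to_snr(speech, babble, snr_db=5.0)   # 5 dB SNR condition
```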
Sentence recognition scores were measured for three manual hearing aid programs: (1) calm situation, (2) speech in noise, and (3) spheric speech in loud noise, each linked to the associated AutoSense settings. The calm situation program is the most sensitive to sound arriving from all around the hearing aid wearer. The speech in noise program utilizes a noise reduction signal processing algorithm and adaptive directional microphone technology. The spheric speech in loud noise program incorporates DNN-based noise reduction and adaptive directional microphone technology.

2.2. Hearing Aid Recordings and Experimental Setup

Hearing aid recordings were conducted in a sound-treated, semi-anechoic laboratory using a KEMAR manikin equipped with average adult pinna replicas and G.R.A.S. RA0045 ear simulators. A Genelec 8331A speaker was positioned one meter from the KEMAR to mimic the experimental setup used for the behavioral component of the study.
The electrical signal from a G.R.A.S. 12AK power module was captured using an RME Fireface 800 soundcard interfaced with an Optiplex 7000 PC. Adobe Audition CC was used to record the input signal. The entire setup was calibrated using a G.R.A.S. 42AA pistonphone and an NTi-XL2 sound level meter.
Measurements were performed using the same speech and noise materials employed in the behavioral component of the study. Phonak Audéo Sphere Infinio 90 hearing aids were fitted with titanium custom slim tips (no venting), programmed for either moderate or severe hearing loss (N4 or N5 standard audiograms, respectively) [16], and fitted to DSL-v5 adult gain targets [15].
A total of 18 conditions were recorded, combining two standard audiogram configurations (N4, N5), three program settings (calm situation, speech in noise, spheric speech in loud noise), and three SNR levels of 0, 5 and 10 dB.
SNR analysis was conducted using the phase inversion technique [11]. This technique involves presenting a speech-and-noise sample in phase-inverted versions: in one presentation, the speech and noise are in their original phase; in another, the phase of the speech signal is inverted. The KEMAR outputs recorded for these presentations are combined to calculate the levels of both speech and noise, from which the SNR associated with each hearing aid program can be determined.
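As a minimal sketch of the phase inversion principle, not the exact analysis pipeline used here, two KEMAR recordings (speech in original phase versus speech phase-inverted, with identical noise) can be combined to recover the aided speech and noise and hence the output SNR; variable names are illustrative.

```python
import numpy as np

def phase_inversion_snr(rec_orig: np.ndarray, rec_inv: np.ndarray) -> float:
    """Estimate the output SNR (dB) from two time-aligned hearing aid recordings:
    rec_orig = aid(speech + noise) and rec_inv = aid(-speech + noise).
    Assumes approximately linear, time-invariant processing across presentations,
    as required by the phase inversion method."""
    speech_est = (rec_orig - rec_inv) / 2   # noise cancels, speech remains
    noise_est = (rec_orig + rec_inv) / 2    # speech cancels, noise remains
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(rms(speech_est) / rms(noise_est))
```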
For predicting speech perception and intelligibility, the recorded signals were analyzed using the Hearing-Aid Speech Perception Index (HASPI v2), a MATLAB-based tool acquired from the author (Kates, personal communication, November 25, 2022). HASPI v2 simulates the human auditory system to predict speech perception and intelligibility. HASPI v2 analyzes each recording as follows: (1) Input: the tool requires the recorded test signal, a reference signal consisting of the clean, unprocessed sentences, their sampling rates, the listener's hearing thresholds at 0.25, 0.5, 1, 2, 4, and 6 kHz, and the reference sound pressure level (SPL) corresponding to a root mean square (RMS) value of 1. (2) Processing: both the recorded and reference signals are passed through the simulated auditory system. (3) Output: the tool compares the outputs of the simulated system for the recorded and reference signals to predict speech perception and intelligibility.
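HASPI v2 itself is a MATLAB tool, and its exact interface is not reproduced here. The sketch below only illustrates how the inputs listed above might be organized before being handed to the model; the `HaspiInputs` container and the `run_haspi_v2` stub are hypothetical placeholders, not part of the published tool.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HaspiInputs:
    """Hypothetical container for the inputs HASPI v2 expects (see text)."""
    processed: np.ndarray         # hearing aid output recorded on KEMAR
    reference: np.ndarray         # clean, unprocessed sentences
    fs_processed: int             # sampling rate of the recording (Hz)
    fs_reference: int             # sampling rate of the reference (Hz)
    thresholds_db_hl: np.ndarray  # thresholds at 0.25, 0.5, 1, 2, 4, and 6 kHz
    spl_at_rms_1: float           # reference SPL corresponding to an RMS value of 1.0

def run_haspi_v2(inputs: HaspiInputs) -> float:
    """Placeholder standing in for the MATLAB HASPI v2 routine; it would return a
    predicted intelligibility score between 0 and 1."""
    raise NotImplementedError("Invoke the MATLAB HASPI v2 implementation here.")
```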

3. Results

3.1. Patient Evaluation

Figure 1 shows the left and right audiometric air-conduction thresholds for the five patients with bilateral hearing loss (B1 to B5) and the five patients with unilateral hearing loss (U6 to U10). All patients had a moderate-to-profound degree of hearing loss. These audiometric thresholds were used to fit the new hearing aids either unilaterally or bilaterally.
Figure 2 depicts the speech perception scores of the ten participants for the three hearing aid programs: calm situation, speech in noise, and spheric speech in loud noise. On average, participants achieved an 87% sentence recognition score in quiet conditions. When exposed to multi-talker babble at a 10 dB SNR, scores decreased to 59% for the calm situation program, 63% for the speech in noise program, and 82% for the spheric speech in loud noise program (see first panel in Figure 4). With a 5 dB SNR, scores dropped further to 36%, 44%, and 67%, respectively.
Compared to the calm situation program, the spheric speech in loud noise program showed average improvements of 23 and 32 percentage points at 10 and 5 dB SNR, respectively (see first panel of Figure 4). Similarly, compared to the speech in noise program, the spheric speech in loud noise program showed average improvements of 20 and 23 percentage points at 10 and 5 dB SNR, respectively.
The Shapiro-Wilk test confirmed the normal distribution of the data. A one-way analysis of variance (ANOVA) was conducted to compare speech perception scores across the three programs for the ten participants, with the expectation that intelligibility would be highest in the spheric speech in loud noise program. The ANOVA revealed statistically significant differences in sentence recognition scores among the programs under both the 10 dB (F(2, 10) = 4.604, p < 0.05) and 5 dB (F(2, 9) = 12.104, p < 0.001) multi-talker babble conditions.
Pairwise comparisons using the Bonferroni t-test indicated significant differences (p < 0.05) between the spheric speech in loud noise program and the calm situation and speech in noise programs for both noise levels. However, no significant differences (p ≥ 0.05) were found between the calm situation and speech in noise programs under either multi-talker babble condition.
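For readers who wish to reproduce this style of analysis, a minimal sketch follows. The score arrays are illustrative placeholders rather than the patients' data, the one-way ANOVA here treats the programs as independent groups (a repeated-measures ANOVA would account for the within-subject design), and the pairwise paired t-tests are Bonferroni-corrected as in the article.

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Illustrative placeholder scores (%) for ten listeners in one babble condition;
# these are NOT the patient data reported in this article.
scores = {
    "calm situation":               np.array([30, 42, 55, 28, 61, 47, 35, 52, 40, 38]),
    "speech in noise":              np.array([38, 45, 60, 33, 64, 50, 41, 58, 46, 44]),
    "spheric speech in loud noise": np.array([62, 70, 81, 58, 85, 74, 66, 79, 71, 69]),
}

# One-way ANOVA across the three programs
f_stat, p_value = stats.f_oneway(*scores.values())
print(f"ANOVA: F = {f_stat:.3f}, p = {p_value:.4f}")

# Pairwise paired t-tests with Bonferroni correction (three comparisons)
pairs = list(combinations(scores, 2))
for a, b in pairs:
    t, p = stats.ttest_rel(scores[a], scores[b])
    print(f"{a} vs. {b}: corrected p = {min(p * len(pairs), 1.0):.4f}")
```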

3.2. Bench Evaluation

Figure 3 demonstrates the SNR improvement provided by the spheric speech in loud noise program compared to the calm situation and speech in noise programs at 0, 5, and 10 dB SNR. The results indicate that the spheric speech in loud noise program consistently increases the SNR of the aided signal. On average, the spheric speech in loud noise program outperformed the calm situation program by 4.8 dB for the N4 audiogram and 5.0 dB for the N5 audiogram across SNR levels. Similarly, when compared to the speech in noise program, the spheric speech in loud noise program offered an average benefit of 4.5 dB for the N4 and 4.6 dB for the N5 audiogram across SNRs. It is worth noting that the advantage of the spheric speech in loud noise program over the speech in noise program was smaller in magnitude at 10 dB SNR than at 0 and 5 dB SNR, owing to differences between the calm situation and speech in noise programs.
Figure 4 shows average sentence recognition scores for the ten patients using the three hearing aid programs at 5 and 10 dB SNR. These results can be compared with the HASPI v2 model's predicted scores, which are illustrated in the other two panels for the N4 and N5 audiograms. Note that patient scores were not measured at 0 dB SNR. The results show that HASPI v2 predicted scores are approximately 10 to 15% higher than those measured for the five bilateral and five unilateral hearing-impaired patients.

4. Discussion

This study evaluated speech recognition for ten hearing-impaired individuals: five with bilateral hearing loss and five with unilateral hearing loss and a CI in the contralateral ear that was not used during speech intelligibility measurements. The hearing-impaired ear(s) were fitted with Phonak Audéo Sphere Infinio 90 hearing aids programmed with three different settings: calm situation, speech in noise, and spheric speech in loud noise. Given that many participants scored below 60% with conventional listening programs, they would qualify for cochlear implantation. However, the spheric speech in loud noise program enabled most participants to achieve scores above 60% correct. These findings raise two crucial questions:
First, given the advancements in DNN-based noise reduction technology, how should patients be qualified for cochlear implantation? Lower SNRs, such as 0 dB, may be required to identify hearing-impaired patients in need of cochlear implantation, particularly those with asymmetrical hearing loss and one exceptionally well-performing ear. Even a 0 dB SNR stimulus presentation may not be sufficient to qualify some hearing-impaired listeners for cochlear implantation, as DNN-based noise reduction can provide an additional 5 dB of SNR improvement (see Figure 3). A limitation of this study is the lack of speech perception data at 0 dB SNR. SNR analysis and HASPI modeling suggest that some patients might achieve scores below 60% at this level. Alternatively, CMS criteria could be revised to include ear-specific evaluations using single-word stimuli in quiet. This change would account for the potential of advanced hearing aid technologies, such as DNN-based noise reduction, to mitigate the need for cochlear implantation in certain cases.
Second, should cochlear implantation no longer be considered for individuals with audiograms similar to those of our study participants, given the significant improvement achieved with DNN? SNR analysis shows that the current implementation of DNN-based noise reduction technology appears to offer a significant improvement of 4-5 dB over traditional speech-in-noise programs. This improvement can be complemented by incorporating directional microphones, particularly in real-world scenarios where speech and noise sources are spatially separated. The advancement of noise reduction technologies in hearing aids has the potential to delay or even eliminate the need for cochlear implantation in some hearing-impaired individuals.
Eventually, CIs may also derive the benefit of DNN-based noise reduction technology by incorporating it into their speech processors. While most CI users perform well in quiet listening conditions [17], many struggle in noisy environments, even at relatively favorable SNRs of 10 or 5 dB. The combination of DNN-based noise reduction and directional microphones could significantly improve speech perception for CI users in challenging listening situations.
While the initial results comparing DNN-based noise reduction with traditional algorithms are encouraging, a few challenges remain for widespread adoption of DNN-based technologies. First, DNNs are computationally intensive, which can significantly increase battery power consumption; this could limit the wear time of smaller hearing aids, such as receiver-in-canal or behind-the-ear devices. Second, the effectiveness of DNN-based noise reduction may be compromised in open-fit hearing aid configurations [18], because the direct sound from the noisy environment mixes with the denoised signal from the hearing aid in the ear canal, potentially reducing the perceived benefit of the noise reduction technology. To fully realize the potential of future DNN-based noise reduction improvements, it is crucial to address these challenges through advancements in hardware and software, as well as innovative hearing aid designs.
Bench evaluation using SNR analysis [11] quantified the SNR improvement provided by the hearing aid and its DNN-based noise reduction algorithm. Slight differences in SNR benefit between the N4 and N5 audiograms may be due to variations in gain and automatic gain compression. While SNR analysis provides valuable insights into noise reduction, it cannot predict speech intelligibility. HASPI v2 is a sophisticated computational model that can be used to predict how well a hearing aid will perform in various listening environments [14]. By modeling the complex interactions between hearing loss, hearing aid processing, and various acoustic environments, HASPI v2 can be used to predict speech intelligibility. Our results show that the HASPI v2 predicted speech perception scores for the N4 and N5 audiograms were slightly higher than the actual speech perception scores measured for our patients. It is possible that individuals with overall poorer speech perception, relative to peers with similar audiograms, are more likely to be referred for cochlear implantation, because speech perception scores vary considerably among individuals with comparable audiograms. Given the impracticality of measuring speech perception for all types of audiograms and in all listening conditions and acoustic environments, tools such as SNR analysis and HASPI v2 provide valuable methods to predict the noise reduction capability of hearing aids and to provide a rough estimate of speech perception for patients.

5. Conclusions

Traditional noise reduction features in hearing aids have shown limited improvements in SNR and speech perception, partly due to the signal distortions they can introduce [8,9]. Patients with aided speech understanding below the threshold that defines cochlear implant candidacy are therefore referred for implantation. DNN-based noise reduction algorithms enable some of these patients to achieve speech recognition scores above the 60% threshold. Widespread adoption of DNN technology should encourage discussion of whether cochlear implant candidacy criteria need to be revisited. Future research should explore the full potential of DNN-based noise reduction in larger samples, by testing various levels of noise reduction and evaluating the maximum achievable improvement, especially for cochlear implant users, who are highly susceptible to background noise interference.

Funding

Hearing assessments with patients were supported by internal departmental funding, without commercial sponsorship or support. Technical measurements and model predictions were performed by the Sonova Innovation Centre, Toronto.

Institutional Review Board Approval

This study was approved by the Mayo Clinic Institutional Review Board (24-010370).

Statement of Originality

This material has not been previously published and is not currently under consideration for publication elsewhere.

Conflicts of Interest

A.A.S. is a consultant for, or has research support from, Sonova AG, Cochlear Americas, and Envoy Medical. B.A.S., J.M.V., V.K., S.C.V., and J.Q. are employees of Sonova AG, the manufacturer of the hearing aids used in this study.

References

  1. Zwolan TA, Kallogjeri D, Firszt JB, Buchman CA. Assessment of Cochlear Implants for Adult Medicare Beneficiaries Aged 65 Years or Older Who Meet Expanded Indications of Open-Set Sentence Recognition: A Multicenter Nonrandomized Clinical Trial. JAMA Otolaryngol Head Neck Surg. 2020 Oct 1;146(10):933-941. [CrossRef] [PubMed]
  2. Spahr AJ, Dorman MF, Litvak LM, Van Wie S, Gifford RH, Loizou PC, Loiselle LM, Oakes T, Cook S. Development and validation of the AzBio sentence lists. Ear Hear. 2012 Jan-Feb;33(1):112-7. [CrossRef] [PubMed]
  3. Dunn C, Miller SE, Schafer EC, Silva C, Gifford RH, Grisel JJ. Benefits of a Hearing Registry: Cochlear Implant Candidacy in Quiet Versus Noise in 1,611 Patients. Am J Audiol. 2020 Dec 9;29(4):851-861. [CrossRef] [PubMed]
  4. Mudery JA, Francis R, McCrary H, Jacob A. Older Individuals Meeting Medicare Cochlear Implant Candidacy Criteria in Noise but Not in Quiet: Are These Patients Improved by Surgery? Otol Neurotol. 2017 Feb;38(2):187-191. [CrossRef] [PubMed]
  5. Thai A, Tran E, Swanson A, Fitzgerald M, Blevins N, Ma Y, Smith M, Larky J, Alyono J. Outcomes in Patients Meeting Cochlear Implant Criteria in Noise but Not in Quiet. Otol Neurotol. 2022 Jan;43(1):56-63. [CrossRef]
  6. Schauwecker N, Patro A, Holder JT, Bennett ML, Perkins E, Moberly AC. Cochlear Implant Qualification in Noise Versus Quiet: Do Patients Demonstrate Similar Postoperative Benefits? Otolaryngol Head Neck Surg. 2024 May;170(5):1411-1420. [CrossRef] [PubMed]
  7. Alcántara JL, Moore BC, Kühnel V, Launer S. Evaluation of the noise reduction system in a commercial digital hearing aid. Int J Audiol. 2003 Jan;42(1):34-42. [CrossRef] [PubMed]
  8. Brons I, Houben R, Dreschler WA. Acoustical and Perceptual Comparison of Noise Reduction and Compression in Hearing Aids. J Speech Lang Hear Res. 2015 Aug 1;58(4):1363-76. [CrossRef] [PubMed]
  9. McCreery RW, Venediktov RA, Coleman JJ, Leech HM. An evidence-based systematic review of directional microphones and digital noise reduction hearing aids in school-age children with hearing loss. Am J Audiol. 2012 Dec;21(2):295-312. [CrossRef] [PubMed]
  10. Andersen AH, Santurette S, Pedersen MS, Alickovic E, Fiedler L, Jensen J, Behrens T. Creating clarity in noisy environments by using deep learning in hearing aids. Semin Hear. 2021;42(3):260-281. [CrossRef]
  11. Hagerman B, Olofsson A. A method to measure the effect of noise reduction algorithms using simultaneous speech and noise. Acta Acust United Acust. 2004;90(2):356-361.
  12. Kates JM. An auditory model for intelligibility and quality predictions. Proc Mtgs Acoust. 2013;19(1):050184. [CrossRef]
  13. Kates JM, Arehart KH. The Hearing-Aid Speech Perception Index (HASPI). Speech Commun. 2014;65:75-93. [CrossRef]
  14. Kates JM, Arehart KH. The Hearing-Aid Speech Perception Index (HASPI) Version 2. Speech Commun. 2021;131:35-46. [CrossRef]
  15. Scollie S, Seewald R, Cornelisse L, Moodie S, Bagatto M, Laurnagaray D, Beaulac S, Pumford J. The Desired Sensation Level Multistage Input/Output Algorithm. Trends Amplif. 2005;9(4):159-197. [CrossRef]
  16. Bisgaard N, Vlaming MS, Dahlquist M. Standard audiograms for the IEC 60118-15 measurement procedure. Trends Amplif. 2010;14(2):113-120. [CrossRef]
  17. Carlson ML. Cochlear Implantation in Adults. N Engl J Med. 2020 Apr 16;382(16):1531-1542. [PubMed]
  18. Bentler R, Wu YH, Jeon J. Effectiveness of directional technology in open-canal hearing instruments. Hear J. 2006;59(11):40-47. [CrossRef]
Figure 1. Air-conduction thresholds for five patients with bilateral (B1 to B5) and five patients with unilateral (U6 to U10) hearing loss.
Figure 2. Sentence recognition scores in quiet and in the presence of multi-talker babble with signal-to-noise ratios of 10 and 5 dB for three manual programs: calm situation, speech in noise, and spheric speech in loud noise.
Figure 3. Signal-to-noise ratio improvement (dB) with the spheric speech in loud noise program relative to the calm situation (squares) and speech in noise (circles) programs for N4 (left) and N5 (right) audiograms across 0, 5, and 10 dB SNRs.
Figure 4. The left panel shows average sentence recognition scores for 10 hearing-impaired patients using three different hearing aid programs in 5 and 10 dB signal-to-noise ratio (SNR) multi-talker babble. The middle (N4) and right (N5) panels display the predicted speech scores, as calculated by HASPI v2, for the three programs in multi-talker babble conditions with SNRs of 0, 5, and 10 dB. Error bars represent the standard error of the mean.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.