Challenges in voice recognition: an African American man speaks into a microphone while a real-time digital display analyzes his speech, highlighting the need for improved system accuracy.

Voice Recognition Technology Struggles with African American English

By Darius Spearman (africanelements)



Linguistic Discrimination in Speech Technology

Recent research shows that automatic speech recognition (ASR) systems often fail to accurately transcribe African American English (AAE) speakers (NSF PAR). These errors can amount to linguistic discrimination, with negative consequences for AAE speakers who use voice-enabled technologies.

Higher Error Rates for AAE Speakers

Bar chart showing the average word error rate (WER) for different demographic groups across major automated speech recognition (ASR) systems. Black men have the highest error rate at 0.41, followed by Black women at 0.30 and white men at 0.21.

Studies have shown that ASR error rates are significantly higher for AAE speakers than for speakers of mainstream American English (NSF PAR). The findings point to a concerning bias in these systems (Stanford News).

“One group commonly misunderstood by voice technology are individuals who speak African American English, or AAE. Since the rate of automatic speech recognition errors can be higher for AAE speakers, downstream effects of linguistic discrimination in technology may result.” (NSF PAR)
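For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the correct one, divided by the number of words in the correct transcript. The following Python sketch shows one way to compute it and to average it by speaker group. It is an illustration only, not the method used in the cited studies. The function names, the lowercase-and-split-on-whitespace normalization, and the per-utterance averaging are all assumptions made for this example.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / words in reference.

    Assumption for this sketch: words are compared after lowercasing and
    splitting on whitespace. Real evaluations often normalize text further.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def mean_wer_by_group(samples):
    """Average per-utterance WER for each speaker group.

    `samples` is an iterable of (group, reference, hypothesis) tuples.
    The group labels are whatever the caller supplies (hypothetical here).
    """
    scores = defaultdict(list)
    for group, reference, hypothesis in samples:
        scores[group].append(word_error_rate(reference, hypothesis))
    return {group: sum(vals) / len(vals) for group, vals in scores.items()}
```

For example, `word_error_rate("she is going home", "she going home")` returns 0.25, because one of the four reference words was dropped. A WER of 0.41 therefore means roughly four errors for every ten words spoken.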

Distinct Features of African American English

AAE is a major dialect with its own unique characteristics (Eastern Michigan University). These stem from its historical origins and cultural identity. Some key features include:

  • Vowel mergers
  • Consonant cluster reductions
  • Omission of certain grammatical markers

The linguistic differences between AAE and mainstream American English likely contribute to the higher ASR error rates (Eastern Michigan University). These errors arise when systems lack adequate training on AAE speech data.

Pie chart showing that 93% of African American participants reported modifying their speech to be better understood by ASR technology, while only 7% did not, indicating a need for these systems to accommodate diverse dialects.

Addressing the Bias in ASR Systems

To mitigate this bias, researchers emphasize the need to diversify the training data used for ASR systems (NSF PAR). Better representation of AAE and other underrepresented varieties of English can improve performance (New York Times), which would benefit all speaker groups.

In summary, as voice technologies become increasingly prevalent in daily life, it is crucial to promote equitable access and to prevent marginalization in their development. Accounting for linguistic diversity, particularly AAE, is an essential step (NSF PAR) toward reducing bias and ensuring fair, accurate performance for all speakers.

About the author

Darius Spearman is a professor of Black Studies at San Diego City College, where he has been pursuing his love of teaching since 2007. He is the author of several books, including Between The Color Lines: A History of African Americans on the California Frontier Through 1890. You can visit Darius online at africanelements.org.