Ahem. . . Some Privacy Please: AI and Privacy Concerns For Persons with Speech, Language, & Hearing Disorders

Dates to remember. . .

  • In 1813, Jane Austen published her heart-wrenching novel Pride and Prejudice.

  • Commanding the ships Vostok and Mirny, explorer Fabian Gottlieb von Bellingshausen discovered Antarctica in 1820.

  • The 1958 patent of the Lego brick by the Lego company fueled our childhood dreams of building a tower to the moon using Legos.

  • Canadian singer, songwriter, pianist and producer Sarah McLachlan was born in 1968.

  • The 1985 recording of We Are the World went on to sell over 20 million copies, benefiting the USA for Africa charity.

  • Sadly, 1986 was the year that the space shuttle “Challenger” exploded 73 seconds after taking off, killing seven astronauts.

The significance of these events is that they all took place on the same day. . . January 28th.

Apropos of our discussion, January 28th has been distinguished as a day for observing data privacy, aka Data Privacy Day. Data privacy, what?!?!? If you’re like me, this is the first time you’re hearing about Data Privacy Day. No need to feign ignorance if you’re already familiar with it though. Kudos to you. For everyone else, here’s a quick explanation.

Data Privacy Day came about as an initiative after a European poll found that citizens did not have a clear understanding of their data protection rights. To remedy this technical malady, January 28th, 2007, was designated as the first Data Protection Day, marking the anniversary of the day in 1981 when the Council of Europe opened for signature the Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data.

Not to be left behind, the United States House of Representatives passed House Resolution 31 in support of designating January 28th as National Data Privacy Day, with the resolution passing unanimously in a 402–0 vote. The resolution encouraged:

  1. Promoting data privacy awareness at the local and state level.

  2. Integrating topical discussions on data privacy and protections by educators and industry professionals in high schools.

  3. Increasing consumer awareness of data privacy issues while encouraging individuals to exercise their agency in enacting safeguards to protect their privacy.

And thus, we celebrate National Data Privacy Day. Let the festivities begin! I’ll bring the guacamole if you bring the chips. . . oh, and don’t forget the salsa! Really, we should celebrate it every day, as our digital data is exposed every second of every day and then some. Privacy Day could easily turn into Privacy Year, Lifetime, and Beyond, as even the grave is no guarantee that our data will be protected. . . but let’s not get morbid about things.

Now let’s consider how data privacy impacts an important group in our population. . . persons with disabilities. According to 2021 United States Census Bureau data, there are at least 42.5 million Americans presenting with a disability relating to hearing, vision, cognition, walking, self-care or independent-living difficulties. Many individuals also present with comorbidities, that is, diagnoses of two or more impairments. The 2021 U.S. Census Bureau data further show:

  • 46% of adults ages 75 and older reported a disability.

  • 24% of adults between the ages of 65 and 74 reported a disability.

  • 12% of adults between the ages of 35 and 64 reported a disability.

  • 8% of adults under the age of 35 reported a disability.

  • Walking, independent living and cognition were the most commonly reported disabilities.

  • Disabled Americans had lower rates of technology adoption for some devices, i.e. desktops, laptops, and smartphones.

  • The number of U.S. public school students who receive special education or related services has increased over the last decade, from 6.4 million to 7.3 million.

Now let’s dig a little deeper. . . Within this population, there is a smaller subset of persons who present with speech, language, and hearing deficits. According to the American Speech-Language-Hearing Association (ASHA, 2023):

  • Nearly 1 in 12 U.S. children ages 3–17 has had a disorder related to voice, speech, language, or swallowing.

  • Nearly half of U.S. children ages 3–17 with a voice, speech, language, or swallowing disorder have not received intervention services in the past year.

  • 3 million+ Americans stutter.

  • Approximately 9.4 million adults report having a problem using their voice that lasted one week or longer.

  • Approximately 2 million Americans suffer from aphasia.

  • Approximately 37.5 million Americans report having some trouble hearing.

  • More than half (51%) of all adults report having hearing problems, but only 11% have sought treatment.

  • An estimated 12.5%, approximately 5.2 million, of children in the U.S. ages 6 to 19 show evidence of noise-induced hearing loss.

  • Approximately 26 million Americans, ages 20–69, have a hearing loss.

Whoa, whoa, whoa! Yes, I know. I’ve thrown a lot of numbers at you, not to discombobulate you but to better ground your understanding of how many people may be impacted by data privacy concerns when considering how artificial intelligence may affect this demographic. Out of all the above-outlined information, these are key to our discussion:

  • 37.5 million Americans report having some trouble hearing.

  • 35 million 3- to 21-year-olds have a speech or language disorder.

  • 3 million+ Americans stutter.

  • 2 million Americans suffer from aphasia.

  • Disabled Americans have lower rates of technology adoption for some devices, i.e. desktops, laptops, and smartphones.

  • Approximately 18.5 million individuals have a speech, voice, or language disorder (National Institute on Deafness and Other Communication Disorders, 2010).

Now, let’s bring our friend, artificial intelligence (AI), into the discussion. As defined, AI combines computer science and robust datasets to enable problem-solving. To acquire robust datasets, one needs data, large amounts of data. These datasets consist of images, text, audio, video, numerical data points, etc., and are used to solve various AI challenges such as:

  • Image or video classification

  • Object detection

  • Face recognition

  • Emotion classification

  • Speech analytics

  • Sentiment analysis

Now, let’s marry the two: AI and persons with communication disorders, that is.

Persons with a speech, language, and/or hearing deficit represent a smaller subset of the individuals with a disability who are exposed to or use AI technologies in their daily routines. This may leave AI researchers and developers with a less than robust dataset for developing assistive technologies for this target group.

“We can not have an Artificial Intelligence system without data. Deep Learning models are data-hungry and require a lot of data to create the best model or a system with high fidelity. The quality of data is as important as the quantity even if you have implemented great algorithms for machine learning models. The following quote best explains the working of a machine learning model.”

“Garbage In Garbage Out (GIGO): If we feed low-quality data to ML Model it will deliver a similar result (Khan, R., 2022).”

In a compilation of interviews for the publication “Shrinking the ‘data desert’: Inside efforts to make AI systems more inclusive of people with disabilities” (Langston, 2020), the principal innovation architect lead at Microsoft, who oversees the AI for Accessibility program, stated:

“We are in a data desert. There’s a lot of passion and energy around doing really cool things with AI and people with disabilities, but we don’t have enough data.”

“It’s like we have the car and the car is packed and ready to go, but there’s no gas in it. We don’t have enough data to power these ideas.”

Some of the challenges to collecting good datasets include, but are not limited to:

  • Insufficient Data — Non-availability of large samples of data points required by Machine Learning algorithms.

  • Quality — The real-world datasets are unorganized and complex. They are of low quality almost by default.

  • Privacy and Compliance — Most sources do not share their data due to some privacy and compliance regulations. For example medical, national security, etc.

For persons presenting with a disability, privacy is something that may be more elusive when it comes to artificial intelligence technologies. Microsoft researcher Meredith Morris noted that persons presenting with rare disabilities may be at a higher risk of privacy exposure “if contributing data to AI systems or participating in research studies evaluating AI technologies” due to difficulty “truly anonymizing data”.

“These increased privacy risks to people with disabilities are amplified by the aforementioned bias issues that people with disabilities may face if their disability status is exposed, and creates an incentive for people with disabilities to withhold their data from research studies, an issue that can further amplify the inclusivity problem of AI systems. Hence, reflection on how current research practices may impact risks of deductive disclosure (e.g, [1]), as well as development of stronger technical and legal privacy frameworks are critical to creating accessible AI technologies.” (Morris, M. 2020, p.2).
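
To make the anonymization problem a little more concrete, here is a minimal sketch in Python. It is not Morris’s method or anything taken from the cited studies; it simply counts how many people in a dataset share the same combination of attributes, an idea commonly known as k-anonymity. The records, attribute values, and function names below are made up for illustration. The point is that once a rare condition appears only once, removing the name does not hide the person.

```python
from collections import Counter

# Hypothetical, made-up records: (age band, state, reported condition).
# Names are already removed; none of these values come from a real dataset.
records = [
    ("35-64", "TX", "hearing loss"),
    ("35-64", "TX", "hearing loss"),
    ("35-64", "TX", "hearing loss"),
    ("65-74", "TX", "hearing loss"),
    ("65-74", "TX", "hearing loss"),
    ("35-64", "TX", "aphasia"),
]


def smallest_group_size(rows):
    """Return the size of the smallest group of identical rows (the 'k' in k-anonymity)."""
    return min(Counter(rows).values())


def unique_rows(rows):
    """Return the rows that appear exactly once, i.e. rows that single out one person."""
    counts = Counter(rows)
    return [row for row, n in counts.items() if n == 1]


if __name__ == "__main__":
    print("k =", smallest_group_size(records))
    print("rows that single out one person:", unique_rows(records))
```

In this toy dataset, the five hearing loss records blend into groups of two or three, while the single aphasia record stands alone. Anyone who knows that a person in that age band and state took part in the study could tie that record back to them, which is the kind of exposure the quote above warns about.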

In Detecting Neurodegenerative Disorders from Web Search Signals (White, Doraiswamy, & Horvitz, 2018), AI researchers were able to determine whether someone was disabled or not based on their keystrokes or mouse usage through online data traces. For instance, researchers were able to detect persons with Parkinson’s disease through their mouse movements.

We are at a point of disequilibrium, which is good. For persons presenting with a communication deficit, there is (a) an increased need for larger datasets to improve AI technologies for this population and (b) an increased risk of privacy exposure for this group as well. What to do? Well, that will not be solved in this article. I just wanted to bring a little light to the issue. What we do know is that we all value our privacy and it should be available to all without discrimination.

This professional series on artificial intelligence in speech therapy will focus on ethical considerations as related to (a) privacy, (b) errors, (c) expectation setting, (d) simulated data, (e) inclusivity, (f) bias, and (g) social acceptability.

REFERENCES

American Speech-Language-Hearing Association (2023). Facts about The American Speech-Language-Hearing Association. Retrieved October 5, 2023, from https://www.asha.org/about/press-room/quick-facts/.

Khan, R. (2022). Importance of Datasets in Machine Learning and AI Research. DataToBiz: Simplifying the Complex. Retrieved October 5, 2023, from https://www.datatobiz.com/blog/datasets-in-machine-learning/.

Langston, J. (2020). Shrinking the ‘data desert’: Inside efforts to make AI systems more inclusive of people with disabilities. Retrieved October 5, 2023, from https://news.microsoft.com/source/features/ai/shrinking-the-data-desert/.

Morris, M. (2020). AI and accessibility: A discussion of ethical considerations. Microsoft Research. Retrieved August 29, 2023, from https://arxiv.org/pdf/1908.08939.pdf

National Institute on Deafness and Other Communication Disorders (2010).

White, R. W., Doraiswamy, P. M., & Horvitz, E. (2018). Detecting Neurodegenerative Disorders from Web Search Signals. npj Digital Medicine, 1(8), 1–4.

Nikosi Darnell