Many assistive technologies are available today to help people who have some sort of speech impairment. Collectively, they make up the field of Augmentative and Alternative Communication (AAC). These technologies usually involve some type of computer-generated speech, or recordings of common sayings performed by an actor.
Voice banking is one option that lets those who can no longer speak effectively continue communicating in their own natural voice.
This is done by having the person record a list of words and phrases while they still have the ability to do so. As the name implies, this “bank” of spoken words is then used to generate language using a speech generation device (SGD), such as a computer or tablet.
If the bank is large and varied enough, it can produce a virtually unlimited number of words and sentences.
The obvious drawback of voice banking is that the user must still be able to record clear speech. The typical candidate is therefore someone who has been diagnosed with a condition known to lead to loss of speech, such as motor neuron disease (MND), and who has the time and wherewithal to make recordings while still able.
But the benefit is equally obvious: the person can keep communicating verbally, in their own voice, in perpetuity.
“For a person who is losing their voice, having their own voice in their speech generation device would perhaps encourage them to use it more than some sterile packaged voice,” said Craig Burns, an Assistive Technology Specialist with Easterseals Crossroads with 21 years of experience in AAC.
And as with many things, technological advancement is making the process cheaper and more accessible. Voice banking used to be cost-prohibitive for most people, according to Burns. “With the recent web-based options, I believe it will become more mainstream,” he said.
Making bank “deposits”
Generally, creating a voice bank involves recording around 1,600 or more sentences. That takes a minimum of six to eight hours of recording, usually spread over weeks or months, and can take longer if the candidate needs frequent breaks.
The recorded speech must be fully intelligible and clear: what you “deposit” in the voice bank is exactly what you will get out. Luckily, withdrawals are unlimited!
Voice banking is different from “digital legacy” services, which are used to record important messages for posterity.
One of the challenges in voice banking is that the best time to do it is while speech is not yet fully affected. The candidate may have just received a diagnosis, and may not yet have come to terms with the possibility of losing their voice. So awareness and education are key.
There are two versions of voice banking: digitized speech and synthesized speech. Digitized speech involves directly recording each anticipated phrase, name and so on: “How are you,” “I want to go to the store,” etc. These recordings are usually stored on a computer as sound files, which the user can play back using a touch screen.
Synthesized speech involves creating a computer file of all the sounds a voice makes, based on the alphabet and combinations of letters. Those sounds are then combined to create words. This method is more complex, but it is more versatile and can produce a much larger range of speech.
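For readers who like to see the idea in code, here is a minimal Python sketch of the difference between the two approaches. It is only an illustration under simplifying assumptions: the file names, the small set of letter-based units and both function names are made up for this example, and real speech generation devices use far more sophisticated sound units and blending.

```python
# Hypothetical data: clip names and the unit inventory are invented for illustration.

# Digitized speech: one recording per whole phrase.
PHRASE_CLIPS = {
    "how are you": "how_are_you.wav",
    "i want to go to the store": "go_to_store.wav",
}

# Synthesized speech: one recording per small unit (single letters plus a few
# letter combinations), which can be combined into words that were never recorded.
UNITS = ["sh", "th", "oo", "ee", *"abcdefghijklmnopqrstuvwxyz"]
UNIT_CLIPS = {unit: f"{unit}.wav" for unit in UNITS}


def digitized_lookup(phrase):
    """Return the clip for a pre-recorded phrase, or None if it was never recorded."""
    key = phrase.lower().strip(" ?!.")
    return PHRASE_CLIPS.get(key)


def synthesize(text):
    """Split each word into recorded units (longest match first) and list their clips."""
    clips = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            for size in (2, 1):  # try two-letter units before single letters
                unit = word[i:i + size]
                if unit in UNIT_CLIPS:
                    clips.append(UNIT_CLIPS[unit])
                    i += len(unit)
                    break
            else:
                i += 1  # no recorded unit for this character (e.g. punctuation)
    return clips


print(digitized_lookup("How are you?"))   # a phrase that was recorded
print(digitized_lookup("Pass the shoe"))  # a phrase that was not recorded
print(synthesize("shoe"))                 # built from units instead
```

The digitized bank can only play back phrases that were recorded in advance, while the unit-based bank can assemble a clip sequence for any word its units cover. That is the trade-off between simplicity and versatility.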
Much of the recording can even be done by the user on their own, according to Burns. “The person recording a voice should have a microphone that is of higher quality than most people have as well as a place to record that has very little, if any, background noise.”