How can I replace the screen reader audio with a prerecorded audio file?

Question

I work on a multilingual website that will contain many languages that are not normally written, and I wonder if there are any ways to get this working for people using screen readers? Is it possible to give a text an attribute to make the screen reader play a prerecorded sound instead of trying to read the text by itself?

The whole menu system will be translated into the languages that are not supported by any screen readers.

score 2 · Answer 1 · edited May 23 '17 at 12:18

Two popular screen readers are JAWS and NVDA. JAWS supports 28 languages in total; NVDA supports 43 (I couldn't find a list).

I wonder if there are any ways to get this working for people using screen readers

A few things come to mind that you could do:

Declare the language of the page with the lang attribute, ideally on the root element (<html lang="">), so that a screen reader that knows the language will use it.
Put links to common language translations near the top of the page, so that somebody who lands on a random page from a search engine hit can change languages quickly.
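
A minimal sketch of both suggestions, assuming a site whose common translations are English, French, and Spanish. The URLs and the choice of languages are hypothetical placeholders, not taken from the question:

```html
<!DOCTYPE html>
<!-- lang on the root element declares the default language of the page -->
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Hypothetical language switcher, placed before the main content -->
  <nav aria-label="Language">
    <ul>
      <!-- lang describes the link text; hreflang describes the target page -->
      <li><a href="/en/" lang="en" hreflang="en">English</a></li>
      <li><a href="/fr/" lang="fr" hreflang="fr">Français</a></li>
      <li><a href="/es/" lang="es" hreflang="es">Español</a></li>
    </ul>
  </nav>
  <main>
    <h1>Page heading</h1>
  </main>
</body>
</html>
```

Writing each language name in its own language, with a matching lang attribute, gives a screen reader that supports that language a chance to pronounce the link correctly.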

Is it possible to give a text an attribute to make the screen reader play a prerecorded sound instead of trying to read the text by itself?

The lang attribute makes the screen reader switch to another language if it understands it. You can provide links to audio files for people to listen to, but I would be a little cautious about providing your own audio player. Not all audio players are accessible; the two common issues are unlabeled controls and focus traps.

Unlabeled controls make the assistive technology say "unlabeled" or something similar, so you cannot tell the buttons apart. A focus trap affects people who navigate a page with the keyboard, usually via the Tab key: instead of leaving the audio player, focus cycles back to the first element of the player.
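
One way to sidestep a custom player, sketched under the assumption that the browser's built-in controls are acceptable for your design: use the native audio element and also offer a plain link to the file. The file path and caption text are hypothetical, and how well the native controls work with a given screen reader should still be tested.

```html
<figure>
  <figcaption id="sample-caption">Audio sample: greeting</figcaption>
  <!-- Native controls are supplied by the browser rather than by custom script -->
  <audio controls aria-labelledby="sample-caption" src="/audio/greeting.mp3"></audio>
  <!-- A plain link lets the user open the file in a player they already know -->
  <p><a href="/audio/greeting.mp3">Download the greeting recording (MP3)</a></p>
</figure>
```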

From Comments

How I can make the screen reader play these files instead of trying to read the text.

The only thing you can do is use ARIA to hide the content via the aria-hidden='true' attribute. You can check my answer about aria-hidden for more details. Essentially you would do something like:

<article aria-hidden="true">
 <h1>Some Really cool language</h1>
 <p>Blah blah blah</p>
 <section aria-hidden="false">
  <h2>Audio of language</h2>
  <p>below is an audio sample of ____. Blah blah blah</p>
  <p class="offScreen"><!-- it may be a good idea to put additional 
  info for people using assistive tech --></p>
  <p>audio stuff</p>
 </section>
</article>

CSS

.offScreen{
 position: absolute;
 top: 0;
 left: -999px;
 }

Thanks for the answer. I'm very aware of the lang attribute, but it will not help, as the screen reader will not be able to read a lot of the texts on the website (like I said: many of them are not normally even written!). We're going to add audio files for the text anyway, and the question is how I can make the screen reader play these files instead of trying to read the text. — user11448, May 19 '13 at 20:39
I like the idea of using the off-screen text to provide additional context, but disagree with using aria-hidden="true" to hide the original content; this hides it from the user outright, and prevents them from, e.g., copying and pasting it to send to a colleague, or stepping through it letter by letter. Also, note that aria-hidden doesn't nest this way; an outer aria-hidden="true" will hide the entire subtree under it, so the inner section will not be accessible, even though it has aria-hidden="false" (similar to how display:none and visibility:hidden work in CSS, and HTML5's hidden attribute). See my answer for more... — BrendanMcK, May 29 '13 at 09:06

score 2 · Answer 2 · answered May 20 '13 at 23:44

Ryan, I've seen this question asked elsewhere about "click" languages, such as those of southwestern Africa. So far as I know, there is no written alphabet that is intrinsic to these languages. Scholars might record the languages phonetically, but more common techniques involve adding exclamation points and perhaps other basic keyboard characters to indicate the vocalizations that cannot be conveyed by European alphabets. The Kx'a family of languages is one such group.

If you look for RFC 1766 on sourceforge.net, you'll find a list of 122 languages or variants of languages that map to specific values of the lang attribute. And RFC 1766 itself shows how to add Klingon and other "experimental" languages to the mix.

So there are several issues, it seems:

If a language has not yet been mapped, how does one create a mapping of its characters and groups of characters (its graphemes) to its sounds (its phonemes)?
Assuming that's all that is required, how does one get that mapping associated with a new value for the lang attribute? (To get that new value, RFC 1766 says to create, complete, and submit a simple form. But, given that the document called RFC 1766 is 18 years old, how reliable is that information? And just where does the mapping of symbols to sounds fit into the picture?)
Ultimately, how does one get a screen reader to recognize that mapping and the corresponding value of the lang attribute?

BrendanMcK · Answer 3 · 2013-05-30T07:10:01.363

My somewhat contrarian take: don't try to automatically replace the text with prerecorded content; instead, focus on ensuring that the user is aware that both are available, and can access whichever is most appropriate for them based on the tools they have at their disposal.

Some more background context might help: from your description, it sounds like this is perhaps an academic or research site that has fragments of text in these languages, with audio, but where the remainder of the site structure (headings and supporting narrative text) is in some 'well-supported' language (English, etc.)? (What is the encoding system used for this text?)

If so...

Be aware that a screenreader user does not typically read an entire page top-to-bottom in a completely linear fashion; they can browse the page using the heading structure. In a well-marked-up page, the user has the freedom to skip over the portions that they are not interested in or that are not relevant to them. Focus on providing this flexibility rather than making (well-meaning, but potentially incorrect) policy decisions on behalf of the user.

Don't assume that a screenreader user is using speech in the first place; they could be using Braille, whether due to the fact that speech output is not an option for them, or simply because Braille is their preferred form of output.

Finally, don't assume that because a screenreader user can't hear the text properly (due to text-to-speech limitations), the textual form of the content should be hidden from them entirely; they may still want the ability to cut-and-paste the characters that represent the text so that they can send them to a colleague, for example. Or, depending on the writing scheme used, a screenreader user may still be able to step through the text and have the words spelled out letter-by-letter; many screenreaders can call out non-Latin characters by their Unicode name.
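
A rough sketch of what "both are available" could look like in markup. The heading, file path, and bracketed phrase are hypothetical, and lang="und" ("undetermined") is only a placeholder to be replaced with the language's actual tag if one exists:

```html
<section aria-labelledby="phrase-heading">
  <!-- A heading lets the user jump to this fragment or skip past it -->
  <h2 id="phrase-heading">Greeting phrase</h2>
  <!-- The text stays in the page so it can be copied, spelled out, or sent to a Braille display -->
  <p lang="und">[phrase text in the target language]</p>
  <!-- The recording is offered next to the text rather than instead of it -->
  <p><a href="/audio/greeting.mp3">Listen to a recording of this phrase (MP3)</a></p>
</section>
```

Nothing here is hidden with aria-hidden, so the user decides whether to read the text, listen to the recording, or skip the section.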

I assumed your first paragraph was a given. Second paragraph is a comment, third is as well, but I had a feeling OP may know that. Fourth, while yes that is true, refreshable Braille displays recognize the `lang` attribute — Ryan B, May 29 '13 at 12:40
I like to (over)clarify my assumptions; have seen too many cases on SO where a question is read in a way subtly but importantly different than OP's intent. My concern is that the OP appears to be trying to do something 'magically' on behalf of the user, which is rarely a good idea; also screenreader voice output is fundamentally a different thing than recorded speech, so conflating the two can often indicate a naive understanding of how screenreaders work. — BrendanMcK, May 29 '13 at 19:41
The whole menu system and site structure will be localized into the languages that are not possible to read with a screen reader. — user11448, Jun 05 '13 at 17:36

score 0 · Answer 4 · answered Jun 05 '13 at 17:49

The issue here is less about JAWS and more about having a synthesizer that speaks the language and can communicate with JAWS through a driver such as SAPI 5. Developing these languages can be costly for the various synthesizer companies, especially if there is no good business case driving it, such as GPS, ATMs, or call centers.

There are open source solutions such as eSpeak which you might look into as well. It is not the highest quality but could be an approach if you have access to developers willing to work on such a project.

As for the question of an API or method for communicating information to JAWS via prerecorded sound files of the web site: this is not really going to meet the needs of the screen reader user, who would have no way to navigate the information or interact with it using links or form field elements. Unfortunately, I really think synthesizer development is the only solution.
