Technique G226: Providing audio descriptions by incorporating narration in the soundtrack
About this Technique
This technique relates to:
- 1.2.3: Audio Description or Media Alternative (Prerecorded) (Sufficient when used with G173: Providing a version of a movie with audio descriptions)
- 1.2.5: Audio Description (Prerecorded) (Sufficient when used with G173: Providing a version of a movie with audio descriptions)
This technique applies to any technology that supports audio and video.
Description
The objective of this technique is to provide audio descriptions through narrative incorporated into the soundtrack of the synchronized video, so that people who cannot see are able to understand important visual material.
Because most user agents today cannot merge multiple soundtracks, this technique adds context by revising the draft or pre-existing soundtrack so that the narrative includes audio description via a single audio track. This additional information may address actions, characters, scene changes, and on-screen text (not captions) that are important to understanding the content.
The existing narrative is either revised or new narrative is added during pauses in existing dialogue (which potentially limits the amount of supplementary narration that can be added).
This technique is most appropriate in instructional, marketing, and other videos where the narrative is intended to be informational. In such cases, a soundtrack which reinforces the visual "takeaways" in the video will be vital to blind people and people with low vision, and may be of use to many users, including some users with cognitive disabilities.
Examples
An instructional video is scripted with narration that describes important visual content
Someone creating an instructional video demonstrating the features of an application prepares a script where what is being shown visually is reinforced through the narration, to arrive at an efficient and cost-effective means of making a more accessible video.
Several key strategies are followed to improve the video's narration. Each is described more fully in the following subsections:
- Describe any pertinent and meaningful text on the screen
- Avoid saying only “this” or “here” to describe UI components
- For better context, describe elements by sensory perceptions as well as by label
- Fully describe sequences of action, including any dynamic content that appears
- When a main page or dialog appears, say its title and describe its features
- When using a mouse to show something (such as to hover, select, scroll, or open), say what you are doing
Describe any pertinent and meaningful text on the screen
When referring to URLs, dialogs, labels, and headings, read out the text. Sometimes presenters (narrators) just highlight or point to text; speaking the visible text ensures this meaningful text is made accessible to everyone including blind users or those with low vision. When describing actions a user can do, be sure to specifically state the button names to improve the context (for example, "choose the green 'Go' button").
Avoid saying only “this” or “here” to describe UI components
This goes hand in hand with the first rule to announce text on the screen. When presenters (narrators) point out “this button” or say “you'll see this”, they are typically referring to a visual cue they are providing on the screen. Someone who can't see the screen lacks the context to understand what is being referenced. Replace or augment “this” and “here” with the labels/titles to provide context: “Choose the blue Save button”, “The Profile Settings dialog appears, with several options.”
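A draft narration script can be screened for these words before recording. The following is a minimal sketch, not part of the technique itself: the function name `find_vague_references` and the choice to flag every occurrence of "this" or "here" are assumptions made for illustration. It cannot tell whether a label follows the word, so each hit is only a prompt for a human reviewer.

```python
import re

# Assumption: flag every standalone "this" or "here" for manual review.
# The pattern does not check whether a label or title follows the word.
VAGUE_WORDS = re.compile(r"\b(this|here)\b", re.IGNORECASE)


def find_vague_references(script_lines):
    """Return (line_number, word, line_text) for each flagged word.

    script_lines is a list of narration lines; numbering starts at 1.
    """
    findings = []
    for number, line in enumerate(script_lines, start=1):
        for match in VAGUE_WORDS.finditer(line):
            findings.append((number, match.group(0).lower(), line.strip()))
    return findings


if __name__ == "__main__":
    draft = [
        "Now click this button.",
        "Choose the blue Save button.",
        "You'll see the results here.",
    ]
    for number, word, text in find_vague_references(draft):
        print(f"line {number}: '{word}' may need a label: {text}")
```

A flagged line such as "Now click this button" can then be revised to name the control, for example "Choose the blue Save button".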
For better context, describe elements by sensory perceptions as well as by label
Including position and other sensory qualities such as color can help some users with low vision and users with cognitive disabilities. However, position alone is usually not very helpful to blind users, so include other context as well, such as structural headings. For components with visible labels, always read out the label when referring to the component. Where a visible label is absent, but you are aware of other programmatic labeling that will be read by the screen reader (for example, the aria-label property or page regions), use that text. Also include placement and structure (headings) on the page (for example, “the red ‘Cancel’ button at the bottom right of the dialog”, “Select the ‘online only’ radio button in the Settings options”).
Fully describe sequences of action, including any dynamic content that appears
When you are demonstrating a process, be sure to describe all steps you are carrying out. Also announce when status messages appear, such as “loading”, and when other content appears or disappears on the screen.
When a main page or dialog appears, say its title and describe its features
When a dialog or page appears, read out its title. For a new page, also describe its purpose or any distinguishing characteristics. Practice a natural storytelling style that does not simply read the text on the screen.
When using a mouse to show something (such as to hover, select, scroll, or open), say what you are doing
When performing complex interactions, especially by mouse, it is sometimes helpful to announce what you plan to do before doing it, then narrate while you are interacting with it, and finally summarize what you just did.
Additional narration is added to gaps in the existing soundtrack
A marketing video's important visuals can be mainly inferred from the audio soundtrack. However, it uses only on-screen text to identify new speakers and to provide a URL at the end of the video where people can go for more information. In post-production, a new narrator announces the on-screen text in gaps in the dialogue.
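Finding gaps that are long enough for added narration can be partly automated when the timing of the existing dialogue is known. The sketch below is illustrative only: it assumes the dialogue timings are available as (start, end) pairs in seconds (for example, taken from a caption file or manual notes), and the function name `find_narration_gaps` and the 2-second default threshold are hypothetical choices, not values from this technique.

```python
def find_narration_gaps(dialogue_cues, video_duration, min_gap=2.0):
    """Return (start, end) intervals with no dialogue lasting >= min_gap seconds.

    dialogue_cues: iterable of (start, end) times in seconds for existing speech.
    video_duration: total length of the video in seconds.
    min_gap: assumed minimum length for a usable gap (hypothetical default).
    """
    gaps = []
    cursor = 0.0  # end time of the latest dialogue seen so far
    for start, end in sorted(dialogue_cues):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        # max() handles overlapping cues, such as two speakers at once
        cursor = max(cursor, end)
    if video_duration - cursor >= min_gap:
        gaps.append((cursor, video_duration))
    return gaps


if __name__ == "__main__":
    cues = [(1.0, 5.0), (5.5, 9.0), (12.0, 20.0)]
    for start, end in find_narration_gaps(cues, video_duration=25.0):
        print(f"gap from {start:.1f}s to {end:.1f}s")
```

The resulting intervals only show where narration could fit. Whether a gap is long enough for a given announcement, such as a speaker's name or a URL, still needs to be judged by listening.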
Tests
Procedure
1. Open the synchronized media that includes audio description.
2. Listen to the movie.
3. Check to see if the main narration is used to convey important information in the visual content, such as new speakers and on-screen text.
4. Where important visual information is not conveyed through the soundtrack or addressed in the original narration, check to see if additional narration has been added in available gaps in the dialog.
Expected Results
- #3 and #4 are true.