
Technique G226: Providing audio descriptions by incorporating narration in the soundtrack

About this Technique

This technique relates to:

This technique applies to any technology that supports audio and video.

Description

The objective of this technique is to provide audio descriptions through narrative incorporated into the soundtrack of the synchronized video, so that people who cannot see are able to understand important visual material.

Since most user agents today cannot merge multiple audio tracks, this technique adds the needed context by revising the draft or pre-existing soundtrack so that the narration incorporates the audio description in a single audio track. This additional information may address actions, characters, scene changes, and on-screen text (not captions) that are important to understanding the content.
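
For example, assuming the described narration has already been mixed into the video's single soundtrack, the finished video can be delivered as one ordinary media file, with no separate audio track for the description (the file name below is purely illustrative):

    <!-- The audio description is part of the soundtrack itself,
         so a single media file is sufficient and no additional
         audio track needs to be merged by the user agent. -->
    <video controls src="product-tour-described.mp4"></video>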

Either the existing narration is revised, or new narration is added during pauses in the existing dialogue (which potentially limits the amount of supplementary narration that can be added).

This technique is most appropriate in instructional, marketing, and other videos where the narrative is intended to be informational. In such cases, a soundtrack which reinforces the visual "takeaways" in the video will be vital to blind people and people with low vision, and may be of use to many users, including some users with cognitive disabilities.

Examples

An instructional video is scripted with narration that describes important visual content

Someone creating an instructional video that demonstrates the features of an application prepares a script in which the narration reinforces what is shown visually. This is an efficient and cost-effective way to make the video more accessible.

Several key strategies are followed to improve the video's narration. Each is described more fully in the following subsections:

  1. Describe any pertinent and meaningful text on the screen
  2. Avoid saying only “this” or “here” to describe UI components
  3. For better context, describe elements by sensory perceptions as well as by label
  4. Fully describe sequences of action, including any dynamic content that appears
  5. When a main page or dialog appears, say its title and describe its features
  6. When using a mouse to show something (such as to hover, select, scroll, and open), say what you are doing

Describe any pertinent and meaningful text on the screen

When referring to URLs, dialogs, labels, and headings, read out the text. Sometimes presenters (narrators) merely highlight or point to text; speaking the visible text aloud ensures this meaningful text is accessible to everyone, including blind users and users with low vision. When describing actions a user can take, be sure to state the button names specifically to improve the context (for example, "choose the green 'Go' button").

Avoid saying only “this” or “here” to describe UI components

This goes hand in hand with the first rule to announce text on the screen. When presenters (narrators) point out “this button” or say “you'll see this”, they are typically referring to a visual cue they are providing on the screen. Someone who can't see the screen lacks the context to understand what is being referenced. Replace or augment “this” and “here” with the labels/titles to provide context: “Choose the blue Save button”, “The Profile Settings dialog appears, with several options.”

For better context, describe elements by sensory perceptions as well as by label

Including position and other sensory qualities such as color can help some low vision users and users with cognitive disabilities. However, include other context as well, such as structural headings, in addition to position (which is usually not very helpful to blind users). For components with visible labels, always read out the label when referring to the component. Where a visible label is absent but you are aware of other programmatic labeling that will be read by the screen reader (for example, the aria-label property or page regions), use that text. Also include placement and structure (headings) on the page (for example, "the red 'Cancel' button at the bottom right of the dialog", "select the 'online only' radio button in the Settings options").
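
As a minimal sketch (the button markup below is hypothetical), a control may have a visible label the narrator can read aloud, or only a programmatic label such as aria-label that a screen reader announces; in the latter case the narrator should use that same accessible name:

    <!-- Visible label: the narrator refers to "the Save draft button". -->
    <button type="button">Save draft</button>

    <!-- Icon-only control: no visible text, but the accessible name
         "Save" supplied by aria-label is what the screen reader reads,
         so the narrator should call it "the Save button" as well. -->
    <button type="button" aria-label="Save">
      <img src="disk-icon.svg" alt="">
    </button>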

Fully describe sequences of action, including any dynamic content that appears

When you are demonstrating a process, be sure to describe all the steps you are carrying out. Also announce when status messages appear, such as "loading", and when other content appears or disappears on the screen.

When a main page or dialog appears, say its title and describe its features

When a dialog or page appears, read out its title. For a new page, also describe its purpose or any distinguishing characteristics. Practice a natural storytelling style that does not simply read the text on the screen.

When using a mouse to show something (such as to hover, select, scroll, or open), say what you are doing

When performing complex interactions, especially by mouse, it is sometimes helpful to announce what you plan to do before doing it, narrate as you carry out the interaction, and finally summarize what you just did.

Additional narration is added to gaps in the existing soundtrack

Most of a marketing video's important visual content can be inferred from its audio soundtrack. However, the video uses on-screen text to identify new speakers, as well as to provide a URL at the end where people can go for more information. In post-production, a new narrator announces the on-screen text during gaps in the dialogue.

Tests

Procedure

  1. Open the synchronized media that includes audio description.
  2. Listen to the movie.
  3. Check to see if the main narration is used to convey important information in the visual content, such as new speakers and on-screen text.
  4. Where important visual information is not conveyed through the soundtrack or addressed in the original narration, check to see if additional narration has been added in available gaps in the dialogue.

Expected Results

  • #3 and #4 are true.