Glossary of Accessibility Terms | Supplementary Resources | MediaScribe Academy

You don't have to memorize every accessibility term to do this work well—but having a reliable reference nearby makes a real difference. When you're mid-article and encounter a term like "synchronized media" or "CEA-608," you shouldn't have to leave the Academy to figure out what it means.

This glossary collects the key terms used across all MediaScribe Academy articles and learning paths. Definitions are written in plain language for government staff—not engineers, not lawyers. If you want the formal technical or legal definition of any term, always refer to the official WCAG documentation at w3.org or consult your agency's legal counsel.

Terms are organized by topic and listed alphabetically within each category. Use the section headings below to jump to the area most relevant to what you're working on.

In this glossary

Accessibility standards and law
Captions and transcripts
Audio descriptions
Visual presentation
Video player accessibility
MediaScribe platform
General accessibility concepts

Accessibility standards and law

ADA (Americans with Disabilities Act) A federal civil rights law that prohibits discrimination against people with disabilities. Title II of the ADA applies to state and local governments, requiring them to make programs and services—including digital content like meeting videos—accessible to everyone.

ADA Title II The section of the ADA that covers public entities, including city and county governments, school districts, and other state and local agencies. Under Title II, government agencies must make their communications and public meetings accessible to people with disabilities.

Assistive technology Hardware or software that helps people with disabilities use digital content more effectively. Examples include screen readers for people who are blind, captioning displays for people who are deaf, and keyboard navigation tools for people with limited motor control.

Conformance The degree to which a piece of content meets WCAG requirements. WCAG defines three conformance levels: Level A (minimum), Level AA (standard), and Level AAA (enhanced). Most accessibility laws and policies require Level AA conformance.

Level A The minimum WCAG conformance level. Content must meet all Level A criteria to address the most significant accessibility barriers. Level A alone is generally not sufficient to meet legal requirements.

Level AA The standard WCAG conformance level required by most accessibility laws and policies, including ADA Title II regulations. It includes all Level A criteria plus additional requirements for live captions, audio descriptions, contrast, and more.

Level AAA The highest WCAG conformance level. These criteria address the needs of users with the most significant barriers. Level AAA is not required by most laws but represents best practice in some areas.

Section 508 A section of the federal Rehabilitation Act of 1973 requiring that federal agencies make their electronic and information technology accessible. While Section 508 applies directly to federal agencies, it often influences accessibility expectations for state and local governments.

Success Criterion 1.2.2 (captions — prerecorded) A WCAG Level A requirement stating that captions must be provided for all prerecorded synchronized media that includes an audio track—such as recorded city council meetings posted to a website. See also: synchronized media, prerecorded content.

Success Criterion 1.2.4 (captions — live) A WCAG Level AA requirement stating that captions must be provided for all live audio content in synchronized media, such as a city council meeting streamed in real time.

Success Criterion 1.2.5 (audio description — prerecorded) A WCAG Level AA requirement stating that audio descriptions must be provided for all prerecorded video content in synchronized media where the video track conveys information not available in the audio. See also: audio description.

Success Criterion 1.4.3 (contrast — minimum) A WCAG Level AA requirement stating that text and images of text must have a contrast ratio of at least 4.5:1 against the background. Large text (18pt or 14pt bold) requires a ratio of at least 3:1.

Success Criterion 1.4.11 (non-text contrast) A WCAG Level AA requirement stating that user interface components and graphical elements must have a contrast ratio of at least 3:1 against adjacent colors.

Success Criterion 2.1.1 (keyboard) A WCAG Level A requirement stating that all functionality must be operable using a keyboard alone, without requiring a mouse.

Success Criterion 2.3.1 (three flashes or below threshold) A WCAG Level A requirement stating that content must not flash more than three times in any one-second period, unless the flashing stays below the general flash and red flash thresholds. Content that flashes rapidly can trigger photosensitive seizures in some people.

WCAG (Web Content Accessibility Guidelines) An internationally recognized set of guidelines for making digital content accessible to people with disabilities. Published by the World Wide Web Consortium (W3C). Government agencies typically aim to meet WCAG 2.1 Level AA. Note: WCAG applies to video content, not just websites.

WCAG 2.1 The version of WCAG most commonly referenced by accessibility policies and regulations as of 2026. Building on WCAG 2.0, it added criteria addressing mobile accessibility, low vision, and cognitive and learning disabilities.

WCAG success criterion An individual, testable requirement within WCAG. Each success criterion is assigned a level (A, AA, or AAA). For example, Success Criterion 1.2.2 addresses captions for prerecorded video content. When referencing WCAG in your agency's documentation, always cite a specific success criterion rather than making broad claims about compliance.

Captions and transcripts

Caption accuracy The percentage of words in a caption that correctly match what was spoken. The FCC's caption quality rules for television require captions to be accurate, synchronous, complete, and properly placed, but do not set a numeric threshold; 99% accuracy is a widely cited industry benchmark. Government web content should aim for the highest accuracy possible, with 95%+ as a practical standard when using automatic speech recognition with custom vocabulary.
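For readers who want to see how a word-level accuracy figure could be calculated, here is a minimal sketch. It compares a caption against a trusted reference transcript and counts the word insertions, deletions, and substitutions needed to turn one into the other. The function name and the cleanup rules (lowercasing, ignoring punctuation) are assumptions made for illustration; formal caption quality reviews may weight errors differently.

```python
import re


def word_accuracy(reference: str, caption: str) -> float:
    """Return the share of reference words the caption got right (0.0 to 1.0).

    Computed as 1 - (word-level edit distance / reference word count),
    floored at 0.
    """
    def tokenize(text: str) -> list[str]:
        # Lowercase and keep only letters, digits, and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    ref, hyp = tokenize(reference), tokenize(caption)
    if not ref:
        raise ValueError("reference transcript is empty")

    # Dynamic-programming edit distance, counted in words rather than letters.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current

    errors = previous[-1]
    return max(0.0, 1.0 - errors / len(ref))
```

With this approach, a seven-word sentence captioned with one wrong word scores 6/7, or roughly 86%.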

Captions Synchronized text that displays spoken words and relevant non-speech audio (such as "[applause]" or "[alarm sounds]") in a video. Captions differ from subtitles, which typically only display dialogue and assume the viewer can hear the audio.

CEA-608 / CEA-708 Industry standards for closed caption encoding used in broadcast television. CEA-608 is the older standard developed for analog broadcasts; CEA-708 is the standard for digital television broadcasts, including high definition.

Closed captions Captions that viewers can turn on or off. They are delivered as a separate data track alongside the video, rather than being burned into the video image. Most streaming platforms and broadcast systems support closed captions.

Custom vocabulary A list of words, names, and phrases added to a speech recognition system to improve accuracy for terms that the system might otherwise mishear. Useful for local place names, official names, department names, and specialized terminology specific to your agency.

Non-speech audio Sounds in a video that are not spoken words but that carry meaning for the viewer. Captions should include descriptions of relevant non-speech audio, such as "[gavel strikes]," "[applause]," or "[fire alarm]."

Open captions Captions that are permanently embedded in the video image. All viewers see them, and they cannot be turned off. Open captions are useful for social media, digital signage, and situations where closed caption support cannot be guaranteed.

Prerecorded content Video or audio that was recorded before being made available to viewers. WCAG treats prerecorded and live content differently—for example, captions for prerecorded video are a Level A requirement (1.2.2), while captions for live content are Level AA (1.2.4). Because prerecorded content can be reviewed before publishing, there is more opportunity to correct accessibility issues before release.

Speaker identification The practice of labeling captions with the name or role of the person speaking when it is not otherwise clear from context. For example: "[MAYOR SMITH]: I now call the meeting to order." This helps viewers who rely on captions follow conversations involving multiple speakers.

SRT (SubRip Text) A common caption file format. SRT files contain caption text paired with start and end timecodes. Most video platforms, including YouTube, accept SRT files.

Synchronized media Audio or video that is synchronized with another format for presenting information, such as video with an audio track, so that both must be experienced together to understand the content. A recorded city council meeting with video and audio is synchronized media. WCAG's time-based media requirements (Guideline 1.2) apply to synchronized media.

Transcript A text document containing the words spoken in an audio or video file. Unlike captions, transcripts are not synchronized to the video timeline—they are read independently. Transcripts support searchability, public records requests, and meeting minutes preparation. For audio-only content (such as a podcast), a transcript can serve as the text alternative. For video content, synchronized captions are also required.

VTT / WebVTT A web-native caption file format designed for HTML5 video players and streaming platforms. Like SRT, VTT files include caption text with timecodes.
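The two formats are close enough that a basic SRT file can often be converted to WebVTT with two changes: adding the WEBVTT header line and switching the comma before the milliseconds in each timecode to a period. The sketch below illustrates that idea. The function name is hypothetical, and it assumes a simple SRT file with no styling tags or positioning data.

```python
import re

# Matches an SRT timecode such as 00:01:02,500 (comma before milliseconds).
_SRT_TIMECODE = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT caption text to WebVTT text."""
    lines = srt_text.replace("\r\n", "\n").strip().split("\n")
    converted = []
    for line in lines:
        if "-->" in line:
            # WebVTT uses a period, not a comma, before the milliseconds.
            line = _SRT_TIMECODE.sub(r"\1.\2", line)
        converted.append(line)
    # A WebVTT file begins with the WEBVTT header followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(converted) + "\n"
```

The cue numbers from the SRT file are left in place; WebVTT treats them as optional cue identifiers.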

Audio descriptions

Audio description Narration added to a video during natural pauses in dialogue that describes essential visual content—such as on-screen text, slides, graphs, and significant actions—so that people who are blind or have low vision can fully understand what is happening. Supported by WCAG Success Criterion 1.2.5 for prerecorded synchronized media.

Extended audio description A form of audio description used when natural pauses in the video are too short to include a complete description. The video is paused, the description is delivered, and then playback resumes. Addressed by WCAG Success Criterion 1.2.7 (Level AAA).

Gap detection A process that identifies natural pauses in a video's audio track where audio description narration can be inserted without overlapping dialogue or other important sounds.

Narration pacing The speed at which audio description text is read aloud. The industry standard for comfortable listening is approximately 2.5 words per second (about 150 words per minute). Descriptions that are too fast are difficult to follow; descriptions that are too slow may not fit within the available pause.

SSML (Speech Synthesis Markup Language) A markup language that gives developers control over how text-to-speech systems pronounce words, including emphasis, pacing, and pauses. SSML is used to improve the naturalness of audio description narration.
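To make this concrete, the sketch below builds a short SSML snippet that pauses briefly and then reads a description at a chosen speaking rate. The speak, break, and prosody elements come from the W3C SSML specification; the function name and default values are assumptions for illustration, and individual text-to-speech services may support only part of the specification or require extra attributes on the speak element.

```python
import xml.etree.ElementTree as ET


def build_description_ssml(text: str, pause_ms: int = 300, rate: str = "medium") -> str:
    """Wrap description text in SSML with a leading pause and a speaking rate."""
    speak = ET.Element("speak")
    # <break> inserts a pause before the narration begins.
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    # <prosody rate="..."> controls how quickly the text is read.
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    prosody.text = text
    # ElementTree escapes characters such as "&" so the markup stays valid.
    return ET.tostring(speak, encoding="unicode")
```

Building the markup with an XML library rather than by joining strings avoids broken output when a description contains characters like "&" or "<".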

Text-to-speech (TTS) Technology that converts written text into spoken audio using a synthesized voice. Modern neural TTS systems produce natural-sounding narration suitable for audio description.

Two-pass processing An audio description workflow in which a system first generates descriptions, then checks whether any descriptions are too long to fit within their designated gap, and condenses them as needed to prevent overlap with dialogue.
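The checking step in this workflow is simple arithmetic, which the sketch below illustrates. It estimates how long each draft description takes to read at 2.5 words per second, compares that with the length of the pause it must fit into, and reports a word budget for any description that runs long. All names here are hypothetical, and the sketch only flags overlong descriptions; it does not rewrite them and does not represent how any particular product performs this step.

```python
import math
from dataclasses import dataclass

WORDS_PER_SECOND = 2.5  # pacing figure used in this glossary


@dataclass
class Description:
    gap_seconds: float  # length of the pause found by gap detection
    text: str           # draft narration from the first pass


def estimated_seconds(text: str) -> float:
    """Estimate read-aloud time from the word count."""
    return len(text.split()) / WORDS_PER_SECOND


def word_budget(gap_seconds: float) -> int:
    """Largest whole number of words that fits in the gap."""
    return math.floor(gap_seconds * WORDS_PER_SECOND)


def needs_condensing(descriptions: list[Description]) -> list[tuple[Description, int]]:
    """Second pass: return each overlong description with its word budget."""
    flagged = []
    for description in descriptions:
        if estimated_seconds(description.text) > description.gap_seconds:
            flagged.append((description, word_budget(description.gap_seconds)))
    return flagged
```

For example, a nine-word description takes about 3.6 seconds to read, so it would be flagged for a 2-second gap and given a budget of five words.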

Visual presentation

Contrast ratio A mathematical measure of the difference in perceived brightness between two colors, expressed as a ratio such as 4.5:1. A higher ratio means greater contrast. WCAG Success Criterion 1.4.3 requires a minimum contrast ratio of 4.5:1 for standard-sized text and 3:1 for large text.

Flashing content Video or animation that flashes or blinks rapidly. Content that flashes more than three times per second can trigger photosensitive seizures in some people. WCAG Success Criterion 2.3.1 sets the threshold for safe flashing rates.

Focus indicator A visible outline or highlight that shows which element on a page currently has keyboard focus. Focus indicators are essential for people who navigate using a keyboard, since they need to see where they are on the page at all times.

Luminance The perceived brightness of a color, used in contrast ratio calculations. WCAG computes contrast ratio from the relative luminance of two colors as (L1 + 0.05) / (L2 + 0.05), where L1 is the relative luminance of the lighter color and L2 that of the darker. Luminance is distinct from hue (the color itself) and saturation (how vivid the color is).
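For readers curious about the math behind a contrast checker, the sketch below follows the relative luminance and contrast ratio formulas published in WCAG 2.x. It takes two colors as six-digit hex codes and returns their contrast ratio. The function names are chosen for illustration, and the sketch assumes opaque sRGB colors; for real audits, an established contrast checking tool is the safer choice.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color given as '#RRGGBB'."""
    hex_color = hex_color.lstrip("#")
    # Scale each 8-bit channel to the range 0..1.
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve, using the threshold given in WCAG 2.x.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, from 1.0 (none) to 21.0 (maximum)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)
```

Checking caption text against its background then becomes a comparison such as contrast_ratio("#FFFFFF", "#000000") >= 4.5 for standard-sized text, or >= 3.0 for large text.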

Non-text contrast The contrast requirement for user interface components and graphical objects—such as buttons, icons, and chart lines—that convey information. WCAG Success Criterion 1.4.11 requires a minimum contrast ratio of 3:1 for these elements.

Video player accessibility

Keyboard accessibility The ability to use all features of a video player or website using only a keyboard—no mouse required. This is essential for people with motor disabilities and for some assistive technology users. WCAG Success Criterion 2.1.1 requires that all functionality be operable via keyboard.

Keyboard trap A situation where a person navigating by keyboard can move into a page element but cannot move out of it using standard keyboard keys (such as Tab or Escape). WCAG Success Criterion 2.1.2 (Level A) requires that video players and other interactive elements avoid keyboard traps.

Pause, stop, hide A WCAG Level A requirement (Success Criterion 2.2.2) that users be able to pause, stop, or hide content that moves, blinks, or scrolls automatically, including auto-playing video. This gives users control over content that may be distracting or disorienting.

MediaScribe platform

Archive processing The ability to add captions and other accessibility elements to previously recorded video content. This helps agencies work through a backlog of uncaptioned meeting recordings.

ASR (automatic speech recognition) Technology that converts spoken words into text in near real time. Accuracy improves when ASR is paired with a custom vocabulary that includes names and terms specific to your agency.

Compliance documentation Automatically generated records showing what accessibility measures were applied to content—including what was captioned, when, and in what languages. Useful for audits and demonstrating good-faith accessibility efforts.

Configuration preset A saved group of settings that operators can apply with a single click. Agencies can create different presets for different meeting types—such as city council versus planning commission—to reduce setup time and human error.

Files The section of the MediaScribe interface that serves as the asset library, where uploaded or recorded video files are stored and managed.

Gateway appliance The on-premise hardware device installed in an agency's A/V rack. The gateway captures video and audio via SDI input, sends audio to the cloud for processing, and delivers captioned output back to displays and broadcast systems. Required for MediaScribe Live.

MediaScribe Live MediaScribe's real-time captioning service for live meetings and broadcasts. Requires the Gateway appliance hardware. Captions are generated in near real time as speech is captured.

MediaScribe Narrate MediaScribe's cloud-based audio description service. Allows agencies to generate, review, and publish audio descriptions for prerecorded video content without requiring on-premise hardware.

Mobile branding The section of the MediaScribe interface where agencies configure the appearance of captions on viewers' personal devices—including fonts, colors, and contrast settings.

Mobile captions Live or recorded captions delivered to viewers' personal smartphones or tablets via a QR code. Viewers open captions in a mobile browser with no app download required. Viewers can also select from 72+ languages for real-time translation.

QR code A scannable square barcode displayed during a meeting that links attendees to the mobile caption viewer. Attendees scan the code with their smartphone camera and captions open in their mobile browser.

SDI (Serial Digital Interface) A professional broadcast video standard used to connect cameras, mixers, and other A/V equipment. MediaScribe connects to an agency's existing A/V infrastructure via SDI.

TV branding The section of the MediaScribe interface where agencies configure the appearance of in-room and broadcast caption displays—including font size, color, and background style.

General accessibility concepts

Alternative text (alt text) A written description of an image that screen readers can convey to people who are blind or have low vision. Alt text must be provided for all images that convey information. Decorative images that add no content should use an empty alt attribute so screen readers skip them.

Cognitive accessibility Design practices that make content easier to understand for people who have cognitive or learning disabilities. This includes plain language, clear structure, consistent navigation, and avoiding unnecessary complexity.

Hard of hearing A term describing a person who has some degree of hearing loss but is not fully deaf. People who are hard of hearing may rely on captions, hearing loops, or other accommodations to access audio content.

Identity-first language A language approach that leads with a person's disability—such as "autistic person" or "deaf person"—because some communities prefer this framing as a matter of identity. This contrasts with person-first language. When in doubt, follow the preference of the individual or community you are serving.

LEP (limited English proficiency) Describes people who do not speak English as their primary language and have limited ability to read, speak, or understand English. Real-time caption translation helps government agencies serve community members with limited English proficiency.

Person-first language A language approach that leads with the person rather than the disability—such as "person who is deaf" or "person with a visual impairment"—to emphasize personhood. MediaScribe Academy uses person-first language by default, while acknowledging that some communities prefer identity-first language.

Screen reader Assistive technology that reads digital content aloud for people who are blind or have low vision. Screen readers work with properly structured HTML, accessible documents, keyboard-accessible video players, and other well-designed digital interfaces.

Synchronized captions Captions that are timed to appear in alignment with the spoken audio in a video. Captions that appear significantly before or after the corresponding speech do not meet WCAG standards and create a confusing experience for viewers who rely on them.

Visual impairment A range of conditions that affect a person's ability to see, from partial vision loss to complete blindness. People with visual impairments may use screen readers, enlarged text, high-contrast display settings, or audio descriptions to access video content.

What's next

Now that you have a reference for key terms, you're ready to dig into the curriculum. If you're new to video accessibility, start with the Foundation modules to build your grounding in WCAG and the people you're serving. If you're working on a specific challenge—captioning, audio descriptions, visual presentation, or player accessibility—jump directly to the relevant learning path. This glossary will be here whenever you need it.

Definitions reflect plain-language interpretations intended for government staff. Always consult the official WCAG documentation at w3.org and your agency's legal counsel for binding guidance.

Last updated: February 2026