Closed Captioning

Annosoft’s speech technology is used extensively in commercial Text-Audio Alignment applications. Given an audio file and a text transcript, the Text-Audio Alignment technology produces millisecond-accurate word timings. Customers can use either The Lipsync Tool or the programmer SDK for this purpose.
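As a rough illustration of how word timings become captions, the sketch below groups aligned words into SubRip (SRT) cues. It does not use the Annosoft SDK: the WordTiming record, the words_to_srt function, and the grouping thresholds are hypothetical names and values chosen for this example, and the timings are assumed to arrive as start and end offsets in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    """One aligned word (hypothetical record, not an SDK type)."""
    word: str
    start_ms: int
    end_ms: int


def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words, max_words=7, max_gap_ms=600):
    """Group word timings into caption cues and render them as SRT text.

    A new cue starts when the current cue already holds max_words words
    or when the pause before the next word exceeds max_gap_ms.
    Both thresholds are illustrative assumptions.
    """
    cues, current = [], []
    for w in words:
        pause_too_long = current and w.start_ms - current[-1].end_ms > max_gap_ms
        if current and (len(current) >= max_words or pause_too_long):
            cues.append(current)
            current = []
        current.append(w)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_timestamp(cue[0].start_ms)
        end = format_timestamp(cue[-1].end_ms)
        text = " ".join(w.word for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        WordTiming("Hello", 0, 400),
        WordTiming("world", 450, 900),
        WordTiming("again", 2000, 2500),
    ]
    print(words_to_srt(sample))
```

Breaking cues on pauses rather than on a fixed word count alone tends to keep captions aligned with natural phrasing; production captioning would also need line-length and reading-speed limits.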


This application is one of our best-selling commercial products. We are very proud to help companies achieve 529 compliance and provide commercial content for the hearing impaired. Customers include Disney, The National Danish Library, and many e-learning companies.

From “bouncing ball” cartoons, e-books, podcasts, e-learning applications, closed captions in games, and YouTube videos to the alignment of music with vocals, the technology makes the work easier and produces better results. There are some limitations. In quiet environments, the text-audio alignment is very robust for both long and short audio files, but noisy environments can require some manual work. Settings such as lectures are difficult to fully automate, but we provide a convenient workaround that lets customers produce content quickly, making accessible content and 529 compliance easier to achieve than ever before.


Converting a text transcript into a set of phonemes for speech alignment is non-trivial. Unlike Textless Phoneme Recognition, each language requires special processing. Currently, we support:

  • Chinese
  • Czech
  • Danish
  • Dutch
  • English
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Norwegian
  • Portuguese
  • Russian
  • Spanish
  • Swedish
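To give a sense of why the transcript-to-phoneme step is language-specific, here is a minimal dictionary-lookup sketch. It is not Annosoft's method: TOY_LEXICONS, transcript_to_phonemes, and the two-word English lexicon are hypothetical and exist only for illustration, with pronunciations written in ARPAbet-style symbols without stress marks.

```python
import re

# Hypothetical toy lexicon keyed by language code. A real system would
# need a full pronunciation dictionary plus rules for unknown words.
TOY_LEXICONS = {
    "en": {
        "closed": ["K", "L", "OW", "Z", "D"],
        "captions": ["K", "AE", "P", "SH", "AH", "N", "Z"],
    },
}

# Runs of letters, optionally with one internal apostrophe (e.g. "don't").
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def transcript_to_phonemes(text, language):
    """Return (word, phonemes) pairs; phonemes is None for unknown words."""
    lexicon = TOY_LEXICONS.get(language)
    if lexicon is None:
        raise ValueError(f"no lexicon for language {language!r}")
    return [(word, lexicon.get(word)) for word in WORD_PATTERN.findall(text.lower())]


if __name__ == "__main__":
    for word, phonemes in transcript_to_phonemes("Closed captions!", "en"):
        print(word, phonemes)
```

Even this toy version shows where per-language work comes in: each language needs its own lexicon and fallback rules, and the word-splitting pattern above would not segment languages such as Chinese or Japanese, which do not separate words with spaces.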

Platforms: Win32, MacOS, Linux

This technology is provided with the Text Based Lipsync SDK. For prices and license information, click here.

We are interested in helping you solve your unique audio problems. Please send us an e-mail.