Text Based Lipsync SDK (Text Audio Alignment SDK)

The Text Based Lipsync SDK is the best lipsync technology in the world. Given the audio and a text script, it produces perfect or near-perfect lipsync on both short and very long files.

This technology has been in production use for 5 years, across thousands of hours of audio. Our customer list is a testament to the quality of this software.

The SDK outputs lipsync data in a flexible format. If you have an existing character animation implementation, the lipsync data can be used in a straightforward way.

Uses include:

1. ultra high-quality lipsync
2. automatic subtitling/closed captioning
3. non-linear, script-driven animation
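
As an illustration of the subtitling use, here is a minimal Python sketch that groups time-stamped words into caption cues and renders them as SubRip (SRT) text. The `TimedWord` record, the function names, and the grouping thresholds are assumptions made for this example; they are not the SDK's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    """Hypothetical record for one aligned word (not the SDK's real format)."""
    text: str
    start_ms: int
    end_ms: int


def _close(words):
    # Collapse a run of words into one cue: (text, start_ms, end_ms).
    return (" ".join(w.text for w in words), words[0].start_ms, words[-1].end_ms)


def group_into_cues(words, max_chars=32, max_gap_ms=600):
    """Start a new cue when the line gets too long or the speaker pauses."""
    cues, current = [], []
    for word in words:
        if current:
            candidate = " ".join(w.text for w in current) + " " + word.text
            gap = word.start_ms - current[-1].end_ms
            if len(candidate) > max_chars or gap > max_gap_ms:
                cues.append(_close(current))
                current = []
        current.append(word)
    if current:
        cues.append(_close(current))
    return cues


def format_timestamp(ms):
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, (text, start, end) in enumerate(cues, 1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"
```

Because the word timings come from alignment against the audio, no manual timing step is needed; only the grouping policy (line length and pause threshold) has to be chosen by the application.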

Why is text-based lipsync/annotation valuable?

Although the textless system is easier for the end user, the text-based system offers a few significant advantages. In addition to perfectly accurate phoneme timings, it accurately times words and user data. Accurately time-stamped words let developers automatically build page-turning applications that exactly match the source audio. The text-based version also recognizes and timestamps arbitrary XML embedded in the text transcription. Take this example transcription:

After you have run the demonstration. I need to ask you a question.. <animate name="point-2-user"/>. After you have answered the question. Click here <animate name="point-2-button"/>. Thank you.
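
To make the structure of such a transcription concrete, the following Python sketch separates the spoken words from the embedded markers in the example above. It only illustrates the idea of markers positioned between words; the function name, the regular expressions, and the returned tuple layout are assumptions for this example, and the SDK performs its own parsing and timing internally.

```python
import re
import xml.etree.ElementTree as ET

TRANSCRIPT = (
    'After you have run the demonstration. I need to ask you a question.. '
    '<animate name="point-2-user"/>. After you have answered the question. '
    'Click here <animate name="point-2-button"/>. Thank you.'
)

# Matches a self-closing tag such as <animate name="..."/>.
# The capturing group makes re.split keep the tags in its output.
TAG_PATTERN = re.compile(r"(<[^<>]+/>)")


def split_transcript(transcript):
    """Return (words, markers).

    Each marker is (word_index, tag, attributes), where word_index is the
    number of spoken words that precede the marker in the transcript.
    """
    words, markers = [], []
    for piece in TAG_PATTERN.split(transcript):
        if TAG_PATTERN.fullmatch(piece):
            element = ET.fromstring(piece)  # parse the tag and its attributes
            markers.append((len(words), element.tag, dict(element.attrib)))
        else:
            words.extend(re.findall(r"[A-Za-z0-9']+", piece))
    return words, markers


words, markers = split_transcript(TRANSCRIPT)
for word_index, tag, attributes in markers:
    print(f"<{tag}> {attributes} follows the word '{words[word_index - 1]}'")
```

Once the words are aligned to the audio, each marker can be given the timestamp of the position it occupies in the word sequence, which is what lets it drive an animation without hand timing.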

The power is that, with an appropriate animation architecture, scenes can be built by adding application-specific markers to the source audio transcription. Arbitrary markers allow applications to build a production process that doesn't require hand-timing anything to audio files. The text scripts define the presentation and rely on canned animation sequences to run the scene. Even the actual audio recording can be changed with very little additional production work.
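
One way such an animation architecture might consume time-stamped markers is sketched below in Python. It assumes the application has already obtained a list of (time, name) markers from the alignment step; the `TimedMarker` and `MarkerPlayer` names, the handler table, and the timestamps in the usage example are all hypothetical and do not describe the SDK's API.

```python
from dataclasses import dataclass


@dataclass
class TimedMarker:
    """Hypothetical time-stamped marker, e.g. from <animate name="..."/>."""
    time_ms: int
    name: str


class MarkerPlayer:
    """Fires canned animations as audio playback passes each marker."""

    def __init__(self, markers, animations):
        # Sort once so advance() only ever looks at the next pending marker.
        self._pending = sorted(markers, key=lambda m: m.time_ms)
        self._animations = animations  # marker name -> zero-argument callable
        self._next = 0

    def advance(self, playback_ms):
        """Run every marker that is due at playback_ms; return their names."""
        fired = []
        while (self._next < len(self._pending)
               and self._pending[self._next].time_ms <= playback_ms):
            marker = self._pending[self._next]
            handler = self._animations.get(marker.name)
            if handler is not None:  # markers without a handler are skipped
                handler()
            fired.append(marker.name)
            self._next += 1
        return fired


# Usage with made-up timestamps: call advance() from the playback loop.
player = MarkerPlayer(
    [TimedMarker(3100, "point-2-user"), TimedMarker(5200, "point-2-button")],
    {
        "point-2-user": lambda: print("play: point at user"),
        "point-2-button": lambda: print("play: point at button"),
    },
)
player.advance(4000)  # fires only the marker at 3100 ms
```

With this kind of design, re-recording the audio only requires re-running the alignment to refresh the marker times; the script and the canned animations stay unchanged.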

Languages

Converting a text transcription into a set of phonemes for speech alignment is non-trivial. Unlike the Textless Lipsync SDK, each language requires special processing. Currently, we support:

1. English
2. German
3. French
4. Spanish
5. Italian
6. Russian
7. Polish
8. Norwegian
9. Portuguese
10. Swedish
11. Danish

Because of the special power of the Text Based Lipsync SDK, supporting new languages is important. We are actively working to broaden the multilingual support of this product.

Platforms: Win32, MacOS

demo page

About the SDKs

Annosoft licenses multimedia speech SDKs. Written in C++ and assembly language, the SDKs are painless to integrate into any C++ application or platform. Additionally, a scriptable ActiveX Control is available for use in Visual Basic or other Microsoft technologies.

Annosoft SDKs are extremely flexible because speech models are not hard-coded into the SDKs. This allows our clients to choose from various "stock" speech models and pick the one that best fits their application. Our stock models give our clients the ability to tune their application (at any time) for (1) recognition speed, (2) recognition accuracy, and (3) application footprint. In addition, custom speech models can be trained on the audio characteristics and the speaker, producing a model that is optimal in terms of speed and accuracy.

 

info@annosoft.com

 

Copyright (C) 2002-2007 Annosoft, LLC. All rights reserved.