Text Based Lipsync SDK
Text Audio Alignment SDK
The Text Based Lipsync SDK is the best lipsync technology in the world. Given an
audio file and its text script, this technology produces perfect or near-perfect
lipsync on both short and very long files.
This technology has been production quality for 5 years, across thousands of
hours of audio. Our customer list is a testament to the quality of this software.
The lipsync data output from the SDK is in a flexible format. If you have an
existing character animation implementation, the lipsync data will be usable
in a straightforward way.
Why is text-based lipsync/annotation valuable?
Although the text-less system is easier for the end user, the text-based system
offers a few significant advantages. In addition to perfectly accurate
phoneme timings, it accurately times words and user data. Having accurately
time-stamped words lets developers automatically build page-turning
applications that exactly match the source audio. The text-based version also
recognizes and timestamps arbitrary XML embedded in the text transcription.
Take this example transcription:
After you have run the demonstration. I need to ask you a
question.. <animate name="point-2-user"/>. After you have answered the question.
Click here <animate name="point-2-button"/>. Thank you.
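To make the structure of such a transcription concrete, here is a minimal Python sketch that separates the spoken words from the embedded markers. It is not part of the SDK and does not use its API; the SDK performs its own parsing and supplies the actual timestamps. The function name split_transcription and the after_word field are hypothetical, and the sketch assumes markers are self-closing tags like the ones in the example.

```python
import re

# Assumption: markers are self-closing tags such as <animate name="point-2-user"/>.
TAG = re.compile(r'<(\w+)((?:\s+[\w-]+="[^"]*")*)\s*/>')
ATTR = re.compile(r'([\w-]+)="([^"]*)"')
WORD = re.compile(r"[A-Za-z0-9']+")

def split_transcription(text):
    """Return (words, markers); each marker records how many words precede it."""
    words, markers = [], []
    pos = 0
    for match in TAG.finditer(text):
        # Collect the spoken words between the previous marker and this one.
        words.extend(WORD.findall(text[pos:match.start()]))
        markers.append({
            "tag": match.group(1),
            "attrs": dict(ATTR.findall(match.group(2))),
            "after_word": len(words),  # hypothetical field: words spoken so far
        })
        pos = match.end()
    # Words after the final marker.
    words.extend(WORD.findall(text[pos:]))
    return words, markers

transcription = (
    'After you have run the demonstration. I need to ask you a question.. '
    '<animate name="point-2-user"/>. After you have answered the question. '
    'Click here <animate name="point-2-button"/>. Thank you.'
)
words, markers = split_transcription(transcription)
for marker in markers:
    print(marker["tag"], marker["attrs"]["name"], "after word", marker["after_word"])
```

Once word timings are available, a marker's position between two words is enough to place it on the audio timeline, which is the role the aligned output plays in a real application.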
The power of this approach is that, with an appropriate animation architecture,
scenes can be built by adding application-specific markers to the source audio
transcription. Arbitrary markers let applications build a production process
that doesn't require hand-timing anything to audio files. The text scripts
define the presentation and rely on canned animation sequences to run the scene.
Even the actual audio recording can be changed, and very little production work
will be required.
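As an illustration of such an animation architecture, the following Python sketch fires canned animations as audio playback reaches each timestamped marker. It assumes the application has already obtained a list of (time in milliseconds, marker name) pairs from the alignment output. The MarkerTimeline class and the timestamps shown are hypothetical and are not taken from the SDK.

```python
import bisect

class MarkerTimeline:
    """Hands out marker names once playback reaches their timestamps."""

    def __init__(self, timed_markers):
        # timed_markers: iterable of (time_ms, name) pairs, in any order.
        self._events = sorted(timed_markers)
        self._times = [time_ms for time_ms, _ in self._events]
        self._next = 0  # index of the first marker that has not fired yet

    def due(self, playback_ms):
        """Return the markers reached since the previous call, in time order."""
        end = bisect.bisect_right(self._times, playback_ms)
        fired = [name for _, name in self._events[self._next:end]]
        self._next = max(self._next, end)
        return fired

# Hypothetical timings for the two markers in the example transcription.
timeline = MarkerTimeline([(7400, "point-2-button"), (4200, "point-2-user")])

# Canned animation sequences, keyed by marker name.
played = []
animations = {
    "point-2-user": lambda: played.append("character points at user"),
    "point-2-button": lambda: played.append("character points at button"),
}

# Simulated playback clock; a real application would poll its audio position.
for playback_ms in (1000, 4300, 6000, 8000):
    for name in timeline.due(playback_ms):
        animations[name]()

print(played)
```

Because the scene is driven only by marker names and timestamps, re-recording the audio means re-running the alignment, not re-timing the animation by hand.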
Languages
Converting a text transcription into a set of phonemes for speech alignment is
non-trivial. Unlike with the Textless Lipsync SDK, each language requires
special processing. Currently, we support:
1. English
2. German
3. French
4. Spanish
5. Italian
6. Russian
7. Polish
8. Norwegian
9. Portuguese
10. Swedish
11. Danish
Because of the special power of the Text Based Lipsync
SDK, supporting new languages is important. We are actively working
to broaden the multilingual support of this product.
Platforms: Win32, MacOS
About the SDKs
Annosoft licenses multimedia speech SDKs. Written in C++ and assembly language,
the SDKs are painless to integrate into any C++ application or platform.
Additionally, a scriptable ActiveX Control is available for use in Visual Basic
or other Microsoft technologies.
Annosoft SDKs are extremely flexible because speech models are not hard-coded
into the SDKs. This allows our clients to choose from various "stock" speech
models that are the best fit for their application. Our stock models give our
clients the ability to tune their application (at any time) for 1)
recognition speed, 2) recognition accuracy, and 3) application footprint. Also,
custom speech models can be trained based on the audio characteristics and
speaker, producing an optimal model in terms of speed and accuracy.