
Why did Annosoft release this software?

Annosoft develops high-end lipsync software, so it is reasonable to ask why Annosoft would offer a free solution.

It started as a need to evaluate the capabilities of SAPI 5.1 as a competitive offering. We put our best foot forward to develop the best system we could using SAPI.

From our evaluation, where Annosoft wins is in the amount of time our commercial software will save you on large projects. Consider:

  • How much can you depend on the results being correct?
  • How much time are you willing to spend on manual lipsyncing when things go wrong?
  • How much time are you willing to spend training the speech system for all your voice audio and switching the Speech control panel every time you want to lipsync?
  • Do you need to support languages other than English?

In large-scale production use, there will be a higher production cost associated with SAPI than with our commercial offerings. The team will have to work more closely with their animations and their system components to get good quality.

All that said, it's free upfront. We hope that it can be useful.

The SAPI textless lipsync generates reasonable quality and is more reliable than SAPI text-based lipsync. But it still generates many local misalignments, and because results are used to train the system models, those models will become confused when faced with new speakers or different recording characteristics.

On files over about 10 seconds, performance drops in SAPI. SAPI Text Based Lipsync (in my tests) failed to fully align audio files over 10 seconds, and in many cases it generated only partial results. Cutting up the audio files manually into smaller pieces is the work-around, albeit a painful one. In my tests, it also failed completely on most telephonic audio. This probably relates to the order in which I performed my tests, because of an inner model adaptation process.
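As a rough illustration of that work-around, here is a minimal Python sketch that cuts a PCM WAV file into fixed-length pieces using only the standard `wave` module. It is not part of this tool or of SAPI. The function name `split_wav`, its parameters, and the 8 second default (chosen only to stay under the roughly 10 second limit seen in my tests) are assumptions for illustration. Fixed-length cuts can land in the middle of a word, so cutting by hand at pauses will usually give better results.

```python
import wave


def split_wav(path, max_seconds=8.0, prefix="chunk"):
    """Split a PCM WAV file into consecutive pieces of at most max_seconds.

    Hypothetical helper: the name, arguments, and default length are
    illustrative assumptions, not part of any Annosoft or SAPI API.
    Returns the list of output file paths, in order.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * max_seconds)
        if frames_per_chunk < 1:
            raise ValueError("max_seconds is too small for this sample rate")
        index = 0
        while True:
            # Read the next block of audio frames; an empty result means EOF.
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = f"{prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channel count, sample width, and rate from the source.
                # The frame count in the header is corrected when dst closes.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

Each resulting piece can then be lipsynced on its own, and the timings offset by the piece's starting position in the original file.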

We haven't done production work with this software. I know that speech training and managing speech profiles are probably going to be an issue. We're very interested in war stories, so if you have one, let us know!

Our Conclusions

  • Annosoft's Text Based Lipsync is an order of magnitude better. It can align short or long files equally well. The alignment results are consistently better than SAPI's. This doesn't mean that Microsoft speech software is lame, it's just not designed for this purpose.

  • Annosoft has broader uses because it's more accurate.

  • Closed captioning and lipsync on very long files. We work with 10 minute files.

  • Support for all languages in our textless lipsync. Support for English, Spanish, French, German and Italian in text based lipsync.

  • Stable codebase with years of production use, fully supported by a senior engineer.

  • A proprietary "smoothing" system that makes output much better than just using raw phonemes.

  • For big projects, the fiddle factor of using SAPI will become very expensive. Annosoft is more reliable and fully supported. We wrote every line of the code, and you're always talking to the engineer.


Copyright (C) 2002-2005 Annosoft LLC. All Rights Reserved.
Visit us at www.annosoft.com