sapi_textbased_lipsync::lipsync

The lipsync process runs asynchronously. The application will need to poll for completion or decide when it is time to bail.
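
As a rough illustration of the polling described above, here is a minimal sketch of a wait loop with a timeout. The helper name `wait_for_lipsync` and the `isDone` callback are hypothetical and not part of this class. The sketch assumes the caller can supply some way to ask whether the lipsync has finished, and it uses a fixed sleep between checks.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of sapi_textbased_lipsync): polls `isDone`
// until it returns true or until `timeout` has elapsed.
// Returns true if the work finished in time, false if the caller should bail.
bool wait_for_lipsync(const std::function<bool()>& isDone,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds pollInterval = std::chrono::milliseconds(50))
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!isDone())
    {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;                       // timed out, give up
        std::this_thread::sleep_for(pollInterval);
    }
    return true;                                // completed before the deadline
}
```

A caller would pass a lambda that wraps whatever completion check the application has available, together with a timeout it considers reasonable for the length of the audio. If recognition events arrive through a window message loop, a blocking sleep like this may need to be replaced by pumping messages between checks.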

We use (and arguably abuse) the command and control grammar: we create a top-level grammar rule and then add the source text to it as a lexical word transition. I believe that internally the command and control grammar creates a transition for each word; at least, that is what the hypotheses seem to show.

We also preprocess the input text, removing punctuation and other "dirty" characters that seem to negatively impact performance.
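
The body of preprocess_text is not shown here, so the following is only a sketch of what such a cleanup step might look like, not the actual implementation. The function name `clean_transcript` is hypothetical. It assumes that "dirty" characters means anything other than letters, digits, and apostrophes, and that runs of such characters should collapse into a single space.

```cpp
#include <cwctype>
#include <string>

// Hypothetical stand-in for preprocess_text: keeps letters, digits and
// apostrophes, and replaces every run of other characters with one space.
// Leading and trailing separators are dropped.
std::wstring clean_transcript(const std::wstring& text)
{
    std::wstring out;
    bool pendingSpace = false;
    for (wchar_t ch : text)
    {
        if (std::iswalnum(static_cast<wint_t>(ch)) || ch == L'\'')
        {
            // Emit at most one space between words, never at the start.
            if (pendingSpace && !out.empty())
                out += L' ';
            out += ch;
            pendingSpace = false;
        }
        else
        {
            pendingSpace = true;    // punctuation or whitespace: remember a word break
        }
    }
    return out;
}
```

Keeping apostrophes is a design choice in this sketch so that contractions such as "it's" stay one word; whether the real preprocessing does the same is not stated in the source.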

{
    HRESULT hr;
    try
    {
        m_strInputText = strText;
        if (!this->initializeObjects())
            throw (HRESULT(E_FAIL));

        if (!this->loadAudio(strAudioFile))
            throw (HRESULT(E_FAIL));

        // initialize the grammar

        SPSTATEHANDLE hLipsyncRule;
        hr = this->m_grammar->GetRule(L"TextLipsync", NULL,
                            SPRAF_TopLevel | SPRAF_Active, TRUE,
                            &hLipsyncRule);

        if (hr != S_OK)
        {
            m_err = L"Failed to create grammar rule for text based lipsync";
            throw (hr);
        }

        // prepare text for text based lipsync. Tokenize out formatting, punctuation
        std::wstring strIn = preprocess_text(strText);
        // create the phrase inside the rule
        hr = m_grammar->AddWordTransition(hLipsyncRule, NULL, strIn.c_str(),
            L" ", SPWT_LEXICAL, 1, NULL);

        if (hr != S_OK)
        {
            m_err = L"Failed to create lipsync rule for specified text transcription";
            throw (hr);
        }

        // finalize the grammar
        hr = m_grammar->Commit(0);
        if (hr != S_OK)
        {
            m_err = L"Failed to commit lipsync text rule for specified text transcription.";
            throw (hr);
        }

        // turn the grammar on
        hr = m_grammar->SetGrammarState(SPGS_ENABLED);
        if (hr != S_OK)
        {
            m_err = L"Error: Failed to enable the grammar.";
            throw (hr);
        }
        // start up recognition
        m_recog->SetRecoState(SPRST_ACTIVE);
        // enable the rule
        m_grammar->SetRuleState(NULL, NULL, SPRS_ACTIVE);

        // now we should be running!

    }
    catch (HRESULT _hr)
    {
        // any failure above lands here; report it through the return value
        hr = _hr;
    }
    return (hr == S_OK);
}