.anno file format

The .anno file format is a simple newline-delimited list of markers. For this implementation, there are 4 markers that readers need to support.

phn marker

The phn marker describes a phoneme event. It will look like this:

phn 229 318 75 h
phn 318 498 75 EY
phn 498 817 0 x

phn milli-start-time milli-end-time morph-value phoneme-label

The "phn" marker at the beginning of a line indicates a phoneme label. The milli-start-time, milli-end-time, and phoneme-label are the results from the SAPI engine. The morph-value is something the Annosoft SDKs generate but the SAPI engine does not. It is 0 for 'x' (silence) and 75 for every other phoneme.
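As a minimal sketch of reading this marker, assuming the five fields are separated by whitespace as in the sample above, a phn line could be parsed like this in Python. The Phoneme class and parse_phn function are hypothetical names chosen for illustration; they are not part of any Annosoft SDK.

```python
from dataclasses import dataclass


@dataclass
class Phoneme:
    """One phn marker (hypothetical container, not an Annosoft type)."""
    start_ms: int
    end_ms: int
    morph: int
    label: str


def parse_phn(line: str) -> Phoneme:
    # Expected layout: phn <start> <end> <morph-value> <phoneme-label>
    fields = line.split()
    if len(fields) != 5 or fields[0] != "phn":
        raise ValueError(f"not a phn marker: {line!r}")
    return Phoneme(int(fields[1]), int(fields[2]), int(fields[3]), fields[4])


p = parse_phn("phn 318 498 75 EY")
print(p.label, p.end_ms - p.start_ms)  # EY 180
```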

word marker

The word marker describes a word recognition event, with timing for the word.

word 229 498 Hey
phn 229 318 75 h
phn 318 498 75 EY
phn 498 817 0 x

word milli-start-time milli-end-time text-of-word

Word markers may or may not be useful to your reader. The Lipsync Tool displays them in the timeline.
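A word line can be read in much the same way. The sketch below assumes whitespace-separated fields and, as a precaution, keeps everything after the two times as the word text (the samples only show single words, so embedded spaces are an assumption). Word and parse_word are hypothetical names.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """One word marker (hypothetical container, not an Annosoft type)."""
    start_ms: int
    end_ms: int
    text: str


def parse_word(line: str) -> Word:
    # Expected layout: word <start> <end> <text-of-word>
    # Split at most 3 times so the text field is kept in one piece.
    fields = line.split(None, 3)
    if len(fields) != 4 or fields[0] != "word":
        raise ValueError(f"not a word marker: {line!r}")
    return Word(int(fields[1]), int(fields[2]), fields[3].strip())


w = parse_word("word 229 498 Hey")
print(w.text, w.start_ms, w.end_ms)  # Hey 229 498
```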

audio marker

This is the first marker in a file generated by SAPI. It indicates the audio file used in the alignment. Readers can skip it. It is written out so that the Lipsync Tool can locate the audio file when loading the .anno file.

audio C:\wavs\01cleaned.wav
phn 0 229 0 x
word 229 498 Hey

audio path-to-audio-file
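If a reader does want the audio path, one possible approach is sketched below. It assumes the path is everything after the "audio" keyword (so paths containing spaces survive) and that the path is Windows-style, as in the sample. parse_audio is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def parse_audio(line: str) -> str:
    # Expected layout: audio <path-to-audio-file>
    # Split only once so spaces inside the path are preserved.
    fields = line.strip().split(None, 1)
    if len(fields) != 2 or fields[0] != "audio":
        raise ValueError(f"not an audio marker: {line!r}")
    return fields[1]


path = parse_audio(r"audio C:\wavs\01cleaned.wav")
# PureWindowsPath parses Windows paths on any operating system.
print(PureWindowsPath(path).name)  # 01cleaned.wav
```

Taking only the file name is handy when the .anno file has been moved to a machine where the original directory does not exist.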

%%-begin-anno-text-%%

This is always the last item in the file generated by this software.

%%-begin-anno-text-%% 
Hey There, My name is lane. and it's my please to guide you through this online 
admissions information session. My role here is to recommend students into our
program who are committed to graduate and be successful in their career.

So one of the first things I need to know is whether you've got the commitment
to change your future by seeking higher education.

From the drop down menu below, choose one answer to the following question
and don't forget to click submit when your done. 

Why is a college education important to you?
%%-end-anno-text-%%

If text-based lipsync is used, the original text is also written to the output. This is the only multiline marker. If you want to read it, when you see the "%%-begin-anno-text-%%" string, read until you find a "%%-end-anno-text-%%" marker. The text between them may span multiple lines.
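The read-until-end-marker rule described above can be sketched in Python as follows. The function name read_anno_text is hypothetical, and the choice to ignore whitespace around the marker strings is an assumption made for robustness.

```python
BEGIN = "%%-begin-anno-text-%%"
END = "%%-end-anno-text-%%"


def read_anno_text(lines):
    """Return the text between the begin and end markers, or None.

    None is returned when there is no begin marker, or when a begin
    marker is never followed by an end marker.
    """
    collected = None  # stays None until the begin marker is seen
    for raw in lines:
        line = raw.rstrip("\r\n")
        if collected is None:
            # Compare after strip() to tolerate stray spaces on the marker line.
            if line.strip() == BEGIN:
                collected = []
        elif line.strip() == END:
            return "\n".join(collected)
        else:
            collected.append(line)
    return None


sample = [
    "phn 0 229 0 x\n",
    "%%-begin-anno-text-%% \n",
    "Hey There,\n",
    "\n",
    "Why is a college education important to you?\n",
    "%%-end-anno-text-%%\n",
]
print(read_anno_text(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly.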

The Lipsync Tool uses this to load the transcription, so that the Annosoft version of text-based lipsync can be run on the same text. (Compare for yourself!)

That's it. It is a pretty parsable format.


Copyright (C) 2002-2005 Annosoft LLC. All Rights Reserved.
Visit us at www.annosoft.com