Crow did the animation manually for each speech. He created around 2000 Flash frame tags (for each talking sprite) along the played sound, at exactly 24 frames per second.
Then he placed script objects in these frames, for example an AS1/AS2 Flash script object like this:
tellTarget("_parent.head_rig.head_roll.head_set.head_motion.head.jaw")
{
gotoAndStop("o");
}
adjusting it manually, frame by frame, so that the mouth appears to move in sync with the sound.
Doing it this way must have taken him an incredible amount of time.
No way in hell am I doing it that way. Even the easy option, with the mouth animation controlled by a string of letters in an XML file, is 10x faster and easier to use. Still, it's not as cool as an automatic speech-recognizing sprite would be.
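To make the "string of letters" idea a bit more concrete, here is a rough sketch of how such a track could be read and sampled. It is written in Python rather than ActionScript, purely to show the timing logic. The XML layout (a `speech` element with a `mouth` attribute), the one-letter-per-frame convention, the `_` letter for a closed mouth, and the function names are all my assumptions, not anything taken from Crow's files.

```python
import xml.etree.ElementTree as ET

FPS = 24  # frame rate of the original animation

# Hypothetical XML layout: one letter per frame, '_' means a closed mouth.
SPEECH_XML = '<speech sound="hello.mp3" mouth="__mmaaooee__"/>'

def load_mouth_track(xml_text):
    """Return the per-frame mouth-shape string from the (assumed) XML element."""
    return ET.fromstring(xml_text).get("mouth", "")

def mouth_at(track, position_ms, fps=FPS, closed="_"):
    """Pick the mouth-shape letter for a sound position given in milliseconds."""
    # Integer arithmetic avoids float rounding at frame boundaries.
    frame = position_ms * fps // 1000
    if 0 <= frame < len(track):
        return track[frame]
    return closed  # before the start or past the end: keep the mouth closed

track = load_mouth_track(SPEECH_XML)
print(mouth_at(track, 250))  # letter for the frame playing at 0.25 s
```

In Flash, the returned letter would be the frame label handed to `gotoAndStop` on the jaw clip, so one short string replaces hundreds of hand-placed script objects.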