text to speech

25-10-2004, 04:22 PM
For my 3D Adventure Studio project I am looking for a way to add speech.

- Record the speech from the microphone and store it as ogg, but that makes lip-sync hard. MAYBE.
- Looking at SAPI from Microsoft leaves me with one or two voices and needs a redistributable of 130MB. NO. Windows only.
- I remembered voicemaker that came with storymaker: http://www.billybear4kids.com/story/about-storymaker.html But I cannot get it to run anymore (and I do not remember how it sounded). NO. But I like the concept of creating your own voices.
- http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=906&lngWId=7 features a German text-to-phonetics-to-speech demo with Delphi source code. Sounds bad. MAYBE. Combined with a rewritten voicemaker for Windows?

Do you people here know of other (better) ways of adding speech to a Delphi (game) application? Or, for that matter, how to get lip-sync using recorded voices?

Harry Hunt
25-10-2004, 05:02 PM
At this moment, there's no publicly available text-to-speech engine that sounds anything like a human voice. I believe what comes closest to the real thing is AT&T's text-to-speech engine:
(this can export WAV files, so it might be what you're looking for)

The engine is available for purchase here:

In general I would discourage using text-to-speech in an adventure game. It might qualify as a computer voice but definitely not as a human one.

If you can, use real pre-recorded voices. The lip-synching can be achieved through frequency analysis. If you've watched Japanese anime, you will know that you can get away with very basic mouth shapes. The most important part is to make sure the mouth is shut whenever it's supposed to be shut. This can be achieved by checking whether the volume is below a certain level.
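
To make the volume check concrete, here is a minimal sketch in Python (not Delphi, and not taken from any particular engine). It assumes the speech has already been decoded to mono float samples in the range -1.0 to 1.0. The function name, the 25 fps frame rate and the 0.05 threshold are all made-up values you would tune by ear.

```python
import math

def mouth_open_flags(samples, sample_rate, frame_rate=25, threshold=0.05):
    """Return one boolean per animation frame: True means "mouth open".

    samples     -- mono audio as floats in [-1.0, 1.0] (assumed already decoded)
    sample_rate -- audio samples per second
    frame_rate  -- animation frames per second (assumed value)
    threshold   -- RMS level below which the mouth is treated as shut (assumed value)
    """
    window = max(1, sample_rate // frame_rate)  # audio samples per animation frame
    flags = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        # Root-mean-square amplitude is a simple measure of loudness.
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        flags.append(rms >= threshold)
    return flags


# Synthetic test signal: 0.2 s silence, 0.2 s of a 220 Hz tone, 0.2 s silence.
rate = 8000
silence = [0.0] * (rate // 5)
tone = [0.5 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate // 5)]
flags = mouth_open_flags(silence + tone + silence, rate)
print(flags)
```

With real recordings the flags will flicker on quiet consonants, so you would probably want to smooth them (for example, keep the mouth open for a minimum number of frames) before driving the mesh.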

Of course you can get really fancy with this. You could search your speech-files for certain patterns and link them to the different mouth shapes so you'd actually get an O-Mouth whenever an "o" is spoken.

Anyway, if you absolutely want to use text-to-speech, the Microsoft text-to-speech engine probably is the easiest solution, but it sounds pretty bad and the download is huge.

Robert Kosek
25-10-2004, 05:05 PM
http://www.analogx.com/contents/download/audio.htm < Try his "SayIt" program for PC/AI style voices.

Try the online tts demo from AT&T that Harry suggested ... It's quite good.

26-10-2004, 03:51 PM
The AT&T voices do sound good. But I am afraid text-to-speech is not usable for gaming after all (maybe in a few years' time). There are just too few really different voices.

But I found the solution: Pamela, which I found at: http://www-personal.monash.edu.au/~myless/catnap/pamela/index.html
That is the solution for my problem. It allows me to:
- load recorded speech
- type the text that is spoken
- choose the language, and the phonemes are transcribed over time against the recorded speech
- export a list with times and phonemes
Now in 3DAS I can play the recorded speech and use the exported list to animate the (head/mouth) mesh.
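
For anyone wanting to try the same approach, the playback side can be sketched roughly like this in Python (not Delphi). Note the assumptions: the "seconds phoneme" line format below is invented for illustration and is not Pamela's actual export format, and the phoneme names, mouth shape names and function names are all hypothetical.

```python
import bisect

# Hypothetical export: one "<time in seconds> <phoneme>" pair per line.
EXPORT = """\
0.00 rest
0.12 HH
0.20 EH
0.31 L
0.40 OW
0.65 rest
"""

# Hypothetical mapping from phoneme to the mouth mesh/pose to show.
MOUTH_SHAPES = {
    "rest": "closed",
    "HH": "open_small",
    "EH": "open_wide",
    "L": "tongue_up",
    "OW": "round",
}

def parse_phoneme_list(text):
    """Parse the assumed format into a time-sorted list of (seconds, phoneme)."""
    cues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        time_str, phoneme = line.split(None, 1)
        cues.append((float(time_str), phoneme))
    cues.sort()
    return cues

def phoneme_at(cues, t):
    """Return the phoneme active at playback time t, or None before the first cue."""
    times = [cue[0] for cue in cues]
    i = bisect.bisect_right(times, t) - 1  # last cue that starts at or before t
    return cues[i][1] if i >= 0 else None

def mouth_shape_at(cues, t):
    """Map the active phoneme to a mouth shape, defaulting to a closed mouth."""
    return MOUTH_SHAPES.get(phoneme_at(cues, t), "closed")

cues = parse_phoneme_list(EXPORT)
print(mouth_shape_at(cues, 0.25))
```

Each rendered frame would call mouth_shape_at with the current playback position of the sound and switch (or blend) the mouth mesh to the returned shape.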