Fight for the Internet 1!

Wednesday, April 22, 2009

Text to Speech in Linux

After using my friend's Amazon Kindle 2 and its Text-to-Speech (TTS) feature to read something I had written, I became interested in getting a TTS program for my own use. The best solution I found was a combination of KDE's Text-to-Speech manager and an external TTS program.

KDE's readily available TTS system integration makes me proud of how the open-source and Linux communities attempt to help people for free.

Part 1 Choosing a TTS Program
Most TTS software in Linux does not seem to provide a graphical user interface for control, just a command-line utility. This utility can be tied to a graphical controller, but first you must choose which TTS program is right for you.

Option #1: For Free-of-Charge or Open-Source-Only People
For people who are not willing to pay for any TTS software (and yes, there is some good software available for Linux), or for those who will only use completely open-source material, Festival plus some additional voices is probably your best option.

Festival
Festival is a general multi-lingual speech synthesis system and it is probably the first thing one will encounter when researching TTS in Linux. It is well supported in Linux and I personally was able to get it to read a text file within minutes.

To get it to work with kttsmgr, I did not need to edit any configuration files, but to use it directly from the command line I had to add this to /etc/festival.scm for ALSA sound support:
(Parameter.set 'Audio_Command "aplay -D plug:dmix -q -c 1 -t raw -f s16 -r $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Required_Format 'snd)
In Festival, the voice synthesis is pretty good, though I found it sometimes too fast, particularly over some punctuation, and the voice is obviously synthetic. If you are hoping for something a little different, read about Option #2.
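With the ALSA settings above in place, reading a file aloud from the command line comes down to Festival's --tts option. Here is a minimal Python sketch of that step. It is not part of my original setup: the function names festival_command and speak_file and the file name notes.txt are hypothetical, and it assumes the festival binary is on your PATH.

```python
import shutil
import subprocess


def festival_command(text_file):
    """Build the argument list for reading a text file aloud with Festival."""
    # --tts tells Festival to speak the contents of the given file.
    return ["festival", "--tts", text_file]


def speak_file(text_file):
    """Run Festival on a text file, failing early if it is not installed."""
    if shutil.which("festival") is None:
        raise FileNotFoundError("festival was not found on PATH")
    subprocess.run(festival_command(text_file), check=True)
```

Calling speak_file("notes.txt") should be roughly equivalent to typing festival --tts notes.txt in a terminal.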

This is a really great tutorial on how to get Festival working with a variety of other voices that are available.

If you are interested in using Festival, but want different voices, Option #2's voices can be made to work with Festival as well.

MBROLA
MBROLA will frequently pop up on Google during TTS searches, or at least it did for me. Wikipedia explains this project better than I can, but simply put, it is a free system for enhancing the quality of TTS systems; it is not itself a full TTS system. MBROLA can be used along with Festival, as the tutorial linked above shows.

Linux-Sound.org/Speech
This webpage is a great resource for the many TTS systems and projects in Linux and the open-source world. If you want additional information about TTS and related projects, I suggest you go there.

Option #2: Purchased Voices & Closed Source
For those willing to pay a little for a TTS voice and to use a closed-source TTS project, I suggest Cepstral. Their product is very high quality and is available for Linux (32- and 64-bit), Windows and Mac OS. Take a look at their demos and you can tell fairly quickly if their product is right for you.

Their Windows version comes with a nice, simple GUI program to read text, and provides graphical configuration of the program's voice, which is fairly customizable. Their Linux versions, as far as I can tell, come with only a command-line utility, but they provide documentation in their website's FAQ about using their program with KDE's TTS manager.

Part 2 Configuring a Frontend Graphical Controller
While there may be other controllers, I used KDE's program, kttsmgr. This is KDE's Text-to-Speech manager, and it supports a wide variety of command-line TTS programs.

Option #1 Festival with kttsmgr
There are plenty of tutorials on getting Festival to work with kttsmgr.

Update: Besides making sure you install a Festival voice, I am not sure any additional configuration is necessary beyond opening kttsmgr and adding a talker.

Option #2 Cepstral with kttsmgr
[Taken from Cepstral's own FAQ docs]
To integrate Cepstral voices into the KTTS text-to-speech system (present in KDE 3.4 or later), first select KTTS from the KDE menu or run kttsmgr from the command line to open the configuration manager.

In the Talkers tab, click the Add button to add a new voice. Now, select the "Show All" option for synthesizers, choose the Command synthesizer, and click OK. You'll now be asked to choose a language. Select anything here, as it will be ignored. Finally, it's time to specify the swift command to run. To speak using the default voice, use:

swift %t -o %w

If you want a specific voice, use the -n switch like this:

swift %t -o %w -n Isabelle

You'll also want to select Latin1 as the character set. Click OK, then Apply to set the current voice. Your voice should now work in any KDE app that uses KTTS.
In my own use, selecting UTF-8 instead of Latin1 caused no problems, but I am also not reading any foreign-language texts.
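To make the command template less mysterious: as I understand the Command synthesizer, %t stands for the text to be spoken and %w for the temporary audio file that KTTS wants the synthesizer to write. The Python sketch below only illustrates that substitution. It is not KTTS's actual implementation, and the function name expand_command and the path /tmp/out.wav are hypothetical.

```python
import re
import shlex


def expand_command(template, text, wav_path):
    """Fill in %t (text to speak) and %w (output WAV path) in a command template."""
    values = {
        "%t": shlex.quote(text),      # quote so spaces and punctuation survive the shell
        "%w": shlex.quote(wav_path),
    }
    # A single pass over the template, so a literal "%w" inside the text is not re-expanded.
    return re.sub(r"%[tw]", lambda match: values[match.group(0)], template)


print(expand_command("swift %t -o %w -n Isabelle", "Hello world", "/tmp/out.wav"))
```

The quoting step matters because the text being spoken will usually contain spaces and punctuation that a shell would otherwise split or interpret.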

Update: When using Cepstral's swift on the command line, you may encounter OSS sound compatibility errors. Install the alsa-oss package and use the aoss program it provides as a wrapper that routes OSS sound calls through ALSA. For example:
aoss swift -f file.txt
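
If you call swift from a script, the same wrapping can be applied only when aoss is actually installed. This is a small Python sketch under that assumption: the function name swift_command is hypothetical, and the only swift options used are the -f and -n switches already shown in this post.

```python
import shutil


def swift_command(text_file, voice=None, use_aoss=None):
    """Build a swift command line, prefixed with aoss when requested or available."""
    if use_aoss is None:
        # Auto-detect: wrap with aoss only if the alsa-oss wrapper is installed.
        use_aoss = shutil.which("aoss") is not None
    command = ["swift", "-f", text_file]
    if voice:
        command += ["-n", voice]
    return ["aoss"] + command if use_aoss else command
```

The resulting list can be handed to subprocess.run, for example subprocess.run(swift_command("file.txt"), check=True).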

Recommendations
Try Festival and if it meets your needs, great. Personally I needed something like Cepstral and I was happy to pay for the voice. It worked great with my software.

5 comments:

  1. This is a great post. The Cepstral voices work great. I find them easier to understand than the Festival voices. From my limited experience, the Cepstral voices also have a wider vocabulary. Your instructions about integrating Cepstral with kttsmgr are particularly useful. Thanks for taking the time to post this!

  2. Thanks for the info, it has been really helpful. I'm doing a project that involves TTS for robot applications and I'm using Cepstral. Do you know where I can find some C code examples using swift in Linux? Thanks for this post!! lfreyrea gmail

  3. Hi, thanks so much for writing this. I'm trying to get Cepstral to work in Kubuntu 12.4. The new KDE TTS program (Jovie) does not seem to have a way to integrate Cepstral anymore. I asked the people at Cepstral and they say they can't support KDE 4. Do you have any ideas on how to get Cepstral to work using KDE 4, with or without Jovie? I just want to be able to use a keyboard command to read text from the clipboard.

    Replies
    1. I tried getting it to work myself but had no luck. I did not seem to be able to add a Speech-Dispatcher or anything like that to it. However I think Jovie is a little buggy right now, because no matter what I did, it would not speak in anything other than the default voice (no female or children). I wish you luck. Let me know if you find a solution! (I will make a post if I find one first.)

  4. How can I implement Text to Speech in Linux? Any recommended tools or libraries mentioned in the article? greeting Telkom University
