The KDE Accessibility team is in the process of integrating speech synthesis into KDE. Not only does this mean better support for visually impaired and speech-impaired users, but the new features should also make for a fun desktop experience overall. An important milestone has been reached with the recent release of the KDE Text-to-Speech System (ktts). If you wish to learn more about speech synthesis support in KDE, you can also read an extensive interview with four developers at the KDE Accessibility Website.
Both ktts and KSayIt (an application to read out longer texts) will be included in KDE 3.4. They are important additions to KMouth (an application for speech-impaired people) and the other two assistive technologies in KDE, KMouseTool and KMagnifier.
The interview is part of a planned series of interviews with participants of the KDE Accessibility Project. Future interviews will cover other areas of accessibility in KDE:
- improved support for partially sighted people in KDE 3.4
- support for blind users in KDE 4.0
- foundation of the freedesktop.org accessibility initiative to find a consensus on a common speech driver API
- close cooperation with the GNOME Accessibility Project to ensure that all assistive technologies will integrate into each desktop without problems
- invitation of other accessibility projects to the last KDE conference, and participation in the FSG Accessibility Workgroup
Comments
Hi.
This sounds (easy joke) very cool to me. Accessibility is a very important goal to me, but accessibility features are not only useful for people with disabilities. I use the text zooming in konqueror a lot when I want to read something and I'm sitting far away from the screen. Now I hope that with ktts I can still read websites or emails while I'm tidying my room or having a shower. ;-)
I'm sure other geeks will find other interesting uses for these features.
Thanks!
Accessibility aside, you can't beat the coolness factor of having your computer talk to you :-) Using ktts to speak the messages from knotify sounds like fun.
One question though, since I almost always have music playing. Can you set ktts to fade the volume of the media player (noatun/juk) to something like 25% of the current volume before it speaks and turn the volume up afterwards? I don't think playing speech on top of the music, without lowering the volume or temporarily stopping it, would give anything particularly comprehensible.
file it as a wish on bugs.kde.org... the sooner, the better. should be easy to do via dcop...
Especially with JuK and Noatun having a similar dcop interface.
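For what it's worth, the fade-down/speak/fade-up idea could be sketched roughly like this. It is only an illustration: `speak` and `set_volume` are hypothetical callbacks standing in for whatever calls KTTS and the media player actually expose (the real DCOP interfaces are not shown here), and the 25% figure is simply the one suggested above.

```python
import time


def fade(set_volume, start, end, steps=5, delay=0.0):
    """Move the volume from start to end in equal steps via set_volume."""
    levels = [round(start + (end - start) * i / steps) for i in range(1, steps + 1)]
    for level in levels:
        set_volume(level)  # hypothetical callback, e.g. a wrapper around the player
        if delay:
            time.sleep(delay)
    return levels


def speak_with_ducking(speak, set_volume, current, fraction=0.25):
    """Lower the volume to a fraction of its current level, speak, then restore it."""
    ducked = round(current * fraction)
    fade(set_volume, current, ducked)
    try:
        speak()  # hypothetical callback that blocks until the speech has finished
    finally:
        # Restore the original volume even if speaking fails.
        fade(set_volume, ducked, current)


# Demo with stand-in callbacks that just record what would happen.
history = []
speak_with_ducking(lambda: history.append("speak"), history.append, current=80)
print(history)
```

The `try`/`finally` is there so the music does not stay quiet if the speech call raises an error.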
The coolness factor is the kind of thing Microsoft uses to sell you their software (despite how much cooler KDE is).
That said, it's a good feature to have for practical reasons, especially accessibility ones, and I'm glad this is the path chosen by KDE developers.
> you can't beat the coolness factor having your computer talk to you:-)
Agreed, but, I want to be able to talk to it.
IBM's ViaVoice was fairly good. I have an (oldish) copy of ViaVoice from my MDK 8.0 Pro CDs (and for whatever reason the CD that ViaVoice is on can't be read... :-\ - I'm on Gentoo these days anyway). ViaVoice worked when I last used it. I want to be able to have a mic sitting in front of me and be able to say "Computer play eno" and have music play. Can I do this yet?
> IBM's ViaVoice was fairly good.
I agree. The problem is that IBM still refuses to make ViaVoice available to Linux end users.
The speech recognition part is not available at all for Linux. The speech synthesis part of ViaVoice is available on Linux for about half of the languages that are supported on Windows, but only if you buy 300 copies at once for $1500 altogether.
I think that "talking to my PC", with the virtual reality stuff, was the biggest and most useless hype of the past years.
Interfaces would have to be completely different to be really usable and useful.
And anyway, even if only used as a substitute for typing text, let's think about the chaos that would be generated in a typical office...
So what keeps you from talking back? Use Sphinx and Perlbox Voice (perlbox.org). Installing Sphinx is really trivial and Perlbox will do the rest. Also, Perlbox has KDE desktop integration. Works like a charm for me...
But what if you want ktts to do karaoke? You won't want it to fade the volume down.
as someone who often works with tts engines, please please please include support in the api for the following:
1) dictionary file for correct pronunciations.
2) string substitution (for example "AFAIK" to "as far as i know")
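To make the second request concrete, here is a minimal sketch of whole-word string substitution applied to text before it is handed to a TTS engine. It is not how ktts does it; the table contents and the function name `expand_abbreviations` are made up for illustration.

```python
import re

# Hypothetical substitution table; the entries are illustrative only.
SUBSTITUTIONS = {
    "AFAIK": "as far as i know",
    "IMHO": "in my humble opinion",
}


def expand_abbreviations(text, table=SUBSTITUTIONS):
    """Replace whole-word abbreviations with their spoken form."""
    if not table:
        return text
    # Longest keys first, so a longer abbreviation is tried before a shorter one.
    keys = sorted(table, key=len, reverse=True)
    # \b anchors keep the match from firing inside a longer word.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: table[m.group(0)], text)


print(expand_abbreviations("AFAIK this works."))
```

Matching is case-sensitive and limited to whole words, so only exact table entries are rewritten; whether a given replacement is actually what the user wants to hear is a separate question.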
The latter is already supported using the SSML API ;)
I went down to 32 chocolate st.
The issue with automagic string replacement is that it may not be expected by the user, but this could indeed be a nice feature.
File a wish and we'll see.
I don't know much about this stuff, but perhaps you can support an api to connect to internet servers like
www.leo.org
This is a big online dictionary provided by the technical university of munich (TU),
use this link to have a look http://dict.leo.org/?lang=en&lp=ende&search=
You can search for english and german words and there's also a french dictionary online. After searching for a word you can have it pronounced by clicking on the speakers symbol.
It's a nice feature and perhaps there is a possibility to automatically connect to the server, download the sound files and have them played.
Greetings
Andy
Department of Informatics / Technical University of Munich
=
Fakultät für Informationstechnik / Fachhochschule München
Fakultät für Informatik / Technische Universität München
=
Department of Computer Science / Munich University of Technology
No matter what it may say on leo.org or on the Drehscheibe... ;-)
ktts is a really good looking project.
does anybody know the status of the backends?
can any of them produce a good voice?
The new Festival voices, i'm told, sound _very_ natural.
Pretty cool feature, but what about non-English languages, like Danish, German, Spanish, Japanese, Icelandic and so on...
KDE is in soooo many languages, will it continue to be that in speech?
Just my 0,02 kr.
Kalna
Festival supports English (British and American), Spanish and Welsh text to speech; Epos supports Czech and Slovak (read the article). These are the text to speech engines that ktts uses.
There are probably non-free tts engines that are available in other languages. I wonder if it is possible to use them?
Derek
Also German is supported via Hadifix (txt2pho plus MBROLA) and via IMS German Festival mod. Finnish is supported via Festival. See the KTTS Handbook for details.
http://accessibility.kde.org/developer/kttsd/
Besides the dozens of other languages needed, conspicuously absent are French and Italian. If anyone figures out how to get these languages working, please email your links and instructions; I'll put them in the KTTS Handbook.
AT&T used to sell their natural voices packages that supported English (UK & US), French, Spanish (ES), German, and Korean. There is a Linux client available, though the software is harder to find nowadays since AT&T stopped supporting it. This is non-free software. A company called Scansoft also makes a variety of engines: English, French (FR & CA), Spanish, Dutch, German, Japanese. I'm not sure if they have Linux versions of these engines (I've only seen Windows clients). The AT&T one, at least, should not be hard to plug in, and sounds much better than the free TTS engines.
I think you're talking about IBM ViaVoice? To properly support it, we need a permanent licensed copy. Anyone care to donate?
So how about adding capability to control the computer via speech too (on blind support for kde-4.0)?
Open Mind Speech
OSSRI
CMU Sphinx
and others come to mind... If one could be supported/used, which one? ;-)
It is already annoying to have to read so many "K"s; now we will hear them too.
On Gentoo, you can remove all the K's. Simply add a sed filter to the build chain. Works like a charm. I recommend you switch!
I am the founder of Proklam, later renamed to KTTSD (by me), and now it seems it has been renamed to KTTS. And I say C O N G R A T U L A T I O N S! Keep up the good work!
When someone asks me what's good about free software, I always mention this: I founded the project but I couldn't finish it, and someone else picked it up and it's evolving. It's awesome. Long live free software!!!
hey man! thanks for Proklam!! it's also good to be able to thank directly the guys that make your computer run :)
As a visually impaired person (20/100) and huge KDE nut I think this is really good news :)
I tried oralux 0.6a and was sort of disappointed. Emacspeak would be nice if you took away emacs and replaced it with kwrite :) Then again oralux sounded better than my attempts at getting festival to spit out oggs via the console.
If I go totally blind one day I have faith that I will be able to pop in my live distro dvd with the entire project gutenberg collection and listen my ears out :) I think it will be cool when people who are totally blind now can do the same too! ^_^
Support for blind users sounds sweet. Is there any information available yet on how this will be accomplished?
What would be _really_ nice is an OCR system for KDE which actually works properly. I have a friend who is slowly going blind. He now cannot read ordinary text, and finds life a bit shallow. I know there are commercial packages which work, but my friend does not have a private income and cannot afford any of them. Anybody know of a solution?
Have you tried Kooka's OCR before?
http://www.kde.org/apps/kooka/
The free versions of SUSE come with an OCR system, that seems to be commercial, but works very well.
> Is there any information available yet on how this will be accomplished?
The Qt toolkit used by KDE already has support for screen readers on Windows and Macintosh. The next version (Qt4) will also support the Linux screen reader Gnopernicus, and a number of other assistive technologies.
We are closely cooperating with the GNOME project to create common standards for accessibility purposes.
We will make more information about this available over the coming weeks.
Olaf
It would seem to me that a better use of development time would be things that make Linux more user friendly for the masses. Things like software installation, for example. Less time needs to be spent on "foo-faa" features like rotating background software while blending it with a webpage or yet another screen saver option. Don't get me wrong, KDE talking to you is going to be very cool, but there are more important things that need to be fixed first. In reality most of the common tasks that people do in Windows are too difficult on Linux for the end user. Microsoft will rule the world until a more concentrated effort is made to turn KDE (or Gnome) into a really usable environment for Linux and not just some "cool thing" for geeky types to play with.
KDE talking to you is more about accessibility than having a toy to play with. It is very important for those who need it.
I built a French version of perlbox which uses espeak instead of festival.
The advantage of espeak is that it supports many more languages than festival currently does. I love Festival and Sphinx, which helped me get started with voice synthesis / recognition, but years have passed and there is still nothing really usable in a straightforward way.
So to use perlbox voice french (which can be modified to be used with other languages) we need:
- Perl
- Espeak, compiled from source, since the packages included in distros are generally outdated
- Sphinx 2 in its simplest version, even if it is English-only
Then install it:
- extract the archive you downloaded in the directory /tmp
- as root, enter the newly extracted directory named perlbox-voice-fr-1.0
- launch ./install.pl
- click ok
finished.
Log out of your root account and launch ./perlbox-voice
Enjoy
download from http://www.r-kraft.com/perlbox-voice-fr-1.0.tar.bz2
or from http://perlboxfr.tuxfamily.org/
You can have a quick look at this version at work in its very early alpha state. Back then it could identify only a few words, but it has evolved since and should now be usable even with complex words: http://www.dailymotion.com/relevance/search/rkraft_fr/video/x2meug_dscf1...