Trolltech, IBM and KDE to Demo Voice-Control

Trolltech, IBM (NYSE: IBM), and KDE have teamed up at LinuxWorld Expo in New York and are demonstrating IBM's ViaVoice speech-recognition technology running on Qt and KDE. With ViaVoice integrated into Qt/KDE, it will be possible to control Qt/KDE desktop applications with speech input -- from launching applications to menu selections to text entry. Developers can easily integrate this technology into existing applications; in fact, in many cases no changes have to be made. The Trolltech press release follows.

 

Santa Clara, California -- Trolltech, IBM (NYSE: IBM), and KDE are teaming up at LinuxWorld to demonstrate IBM's ViaVoice speech-recognition technology running on Trolltech's Qt, a cross-platform C++ GUI framework, in the K Desktop Environment.

The technology preview will be running during the entire show at Trolltech's Booth, No. 1557 at LinuxWorld, which will be held at the Jacob Javits Convention
Center January 31 through February 2, 2001.

"This combination of technologies will greatly accelerate the creation and adoption of speech-enabled applications for the Linux desktop," says Patricia
McHugh, Director, New Business Development, IBM Voice Systems.

Matthias Ettrich, a senior software engineer at Trolltech and the founder of KDE, elaborates: "When ViaVoice is integrated with Qt, it will be possible to control
Qt-based Linux desktop applications with speech input that is as simple as -- if not more simple than -- keyboard input. Developers can build speech-capability
into the structure of their application from the beginning."

In other words, the two technologies running together eliminate several of the obstacles that have hampered widespread adoption of speech-recognition on the
desktop, including: inefficient resource-use; sub-optimum performance; and the difficulty of "bolting on" this functionality after a typical application has already
been written.

ViaVoice has already shown that it can handle the two typical speech-recognition tasks: command and control; and dictation. In addition, however, ViaVoice on
Qt supports: TTS (text to speech), in which the system can read any kind of text input and translate it into speech; and a function that allows programmers to
define a "grammar" in BNF format. The engine will then recognize phrases that match the grammar, e.g., special input modes for dates or numbers such as
"Monday, the first of June" or "two thousand one hundred and seventy five."
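
To make the grammar idea concrete, here is a minimal sketch in Python of matching a phrase against a small BNF-style grammar. It is an illustration only: the rule names, the text format, and the helpers `parse_bnf`, `match_symbol` and `recognizes` are all hypothetical and are not taken from ViaVoice or Qt, whose actual grammar format and API are not described in this release.

```python
# Hypothetical BNF-style grammar for spoken dates (illustrative only;
# this is NOT the ViaVoice grammar format).
GRAMMAR_TEXT = """
<date>    ::= <weekday> the <ordinal> of <month>
<weekday> ::= monday | tuesday | wednesday | thursday | friday | saturday | sunday
<ordinal> ::= first | second | third | fourth | fifth
<month>   ::= january | february | march | april | may | june
"""


def parse_bnf(text):
    """Turn 'name ::= alt | alt' lines into {name: [[symbol, ...], ...]}."""
    rules = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, body = line.partition("::=")
        rules[name.strip()] = [alt.split() for alt in body.split("|")]
    return rules


def match_symbol(rules, symbol, tokens, pos):
    """Return every token position reachable after matching `symbol` at `pos`.

    Note: this naive recursive matcher does not handle left-recursive rules.
    """
    if symbol not in rules:  # terminal: must equal the next word
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    ends = set()
    for alternative in rules[symbol]:
        positions = {pos}
        for part in alternative:
            positions = {
                end
                for start in positions
                for end in match_symbol(rules, part, tokens, start)
            }
            if not positions:
                break  # this alternative cannot match
        ends |= positions
    return ends


def recognizes(rules, start, phrase):
    """True if the whole phrase can be derived from the `start` rule."""
    tokens = phrase.lower().replace(",", " ").split()
    return len(tokens) in match_symbol(rules, start, tokens, 0)


rules = parse_bnf(GRAMMAR_TEXT)
print(recognizes(rules, "<date>", "Monday, the first of June"))  # True
print(recognizes(rules, "<date>", "Monday the first"))           # False
```

A real engine would match audio hypotheses rather than typed words, but the principle is the same: restricting input to a grammar narrows what the recognizer has to consider.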

About Trolltech

Trolltech develops, supports, and markets Qt, a C++ cross-platform toolkit and windowing system. Qt and Qt/Embedded let programmers rapidly build
state-of-the-art GUI applications for desktop and embedded environments using a "write once, compile anywhere" strategy. Qt has been used to develop
hundreds of successful commercial applications worldwide, and is the basis of the K Desktop Environment (KDE). Trolltech is headquartered in Oslo, Norway,
with offices in Santa Clara, California, and Brisbane, Australia. www.trolltech.com

CONTACT: Trolltech
Aron Kozak, 408/219-6303
[email protected]
or
Al Shugart International
Jessica Damsen, 831/464-4746
[email protected]


Comments

and for anyone who wants to flame me further,
What I speak of is KDE 2.0.1., not kde 1.x and also not kde 2.1 beta something.

Well, the only guy flaming here is you. KDE 2.0.x works with OSS since the beginning. Yes, if you use SuSE-packages then you need to install ALSA but this has got reasons which are not related to KDE at all. And now go and grab your brown paperbag ... (PS: I didn't ever install ALSA and it's not installed on my system. Strange enough kaiman, noatun etc. work decently using aRts ;)

by Lukas Tinkl (not verified)

Well the fact it doesn't work for you doesn't automatically mean it's not possible... I'd suggest you to check your kernel and sound modules, there's no reason for KDE to barf on OSS! You'd have to believe me, but I *do* use KDE with OSS...

Regards,
Lukas Tinkl [[email protected]]

by JC (not verified)

Huh, I'm running the alsa drivers on my box and sound works great for everything EXCEPT arts (I've got a VIA integrated 686 chipset). arts makes my speakers sound like a truck is running over them - horrible.

How do you go about debugging arts anyway? I haven't been able to find any info on troubleshooting...

by Stefan Westerfeld (not verified)

You can start artsd -l0 to get some debugging information. Other than that, I can tell you that KDE2.1 will support ALSA directly, so you will have the option to use aRts -> ALSA sound or aRts -> OSS sound.

by Chris Lee (not verified)

Hey, have you got the 2.4.x Linux kernel yet? Via686 soundcards (integrated into the motherboard) are supported. : )

Whoa- I just realized something. Sound works under Linux. My Visor works under Linux. My cd-burner works under linux. My bleeding-edge-CVS KDE2 builds work under Linux. Wait - no, my printer doesn't work under Linux. Whew, thought I was dreaming or something... heh. Almost had to pinch myself...

Anyway, yes - the sound support alone is worth the kernel upgrade.

by Geir Kielland (not verified)

Strange statement. I've been using OSS-drivers without problems on my Aureal 2 soundcard for a long time.
In kde2.1 beta, you can choose sound I/O-method Autodetect, OSS or ALSA in kcontrol. Artswrapper translates or something.... Are now running ALSA-drivers (new soundcard) and had to change output-method to get sound in UT and Quake3.

Try it out some more, and wait till the finished product is out before you scream....

by Mathias Homann (not verified)

OK OK OK

Shame on me!!!!

I tried the OSS support with the kernel drivers for my old SB AWE64, which did not work in KDE2 but had always worked before... Then I bought a SB Live, and after seeing "no matching card found" in the control center in KDE2, installed ALSA, which showed up in the control center...
but now, as a test, I installed the emu10k1 OSS driver, and KDE 2.0.1 works fine with it; it just can't give me information about it in the control center... feel free to flame me...

You flaming idiot!

welcome, long lost brother...

Since IBM ViaVoice is not Free Software (in the GNU sense), it's of no use!!! Again it's like building a car with its bonnet locked up!! Pity that people have not learned from the KDE mistake!!

umm... I doubt kde will ever *need* to have ViaVoice to work with a kbd and mouse. But the way I read this, there will be a nice API for such things to exist in KDE, and ViaVoice will be able to use it (if you wish to purchase a copy) and maybe someday someone will write an OSS one and those who want voice recognition won't have to buy ViaVoice.

But in the meantime, why not let them have the choice of buying functionality that isn't currently available in OSS form? I probably won't purchase it, but I can see how some people would, and it's no loss to the OSS core.

If you don't like it don't use it. The problem is that voice recognition needs to be implemented on the toolkit-level to provide people who can't enter text via a keyboard with a reasonable input-method. The API being used for this needs to be in place as soon as possible (the later it will come the more difficult it would be). If you are not satisfied with Viavoice's license feel free to create an LGPLed speech-recognition. But be aware that this won't really be a simple project ...

There is a BSD licensed speech recognition project.
It is called CMU Sphinx, and is hosted at SourceForge.

Great!!

What the hell do KDE or gnome need to bother about ViaVoice ...

this Sphinx project should be used in Open Source Desktop...not a Devil Commercial Product...

regards
gb

by Cyber Czar (not verified)

Since when does everything regarding Linux have to be free?

I am sick of this juvenile attitude within the Linux community that everything has GOT to be free or it sucks.

To expound on your analogy, it's like you were given a car for free but:
a) You want the gas/ petrol to be free, too.
b) You may not have been given the tools to repair it for free BUT:
1) How many of us actually HAVE the time to repair our cars anyway?
2) Last time I BOUGHT a car, I wasn't given the tools to repair it.
3) Forget the fact that since the car is free, the mechanic's shop can charge me less to repair it since he didn't have to waste his hard earned money on the car to begin with.
4) Even if you WERE given the tools to repair your car (which you got for FREE) the population of this world (en masse) does not have the skills or more importantly the DESIRE to repair their own cars.)

What about all the time, energy, and resources which went into designing and manufacturing the car? Did you think all the artisans who designed and built the car should not have been paid for their services? How do you expect them to survive?

The Linux community is sounding more and more communistic. "Give me, give me, give me. You take care of me, I don't want to."

Granted, on paper communism is ideal, but in practice and reality haven't we learned that it just doesn't work?

For goodness sake.

Be thankful that we DO have speech recognition software. Closed or open, this is a proud day.

Bravo bravo I'm getting sick of the idealists as well!! Open source is very cool but software companies need to make money and since IBM is really doing cool stuff with Linux get off their back!!!! Some people whine about everything. It always seems to be the GNOME people whining whining grow up losers.

Craig

by anonymous (not verified)

I agree with your subject line. But maybe you ought to study a subject a bit before you attack somebody over it. Are you aware of such things as licensing issues? Do you know that e.g. Debian wouldn't package KDE before Qt became GPLed because it thought there was a licensing conflict? Now I don't know anything about how the thing would integrate with Qt, but I think it is possible that some kind of conflict with the GPL would arise, and I think this is the thing that people were commenting on.

Note that in the message RK talked about free in the GNU way. This means not the price at all, as you assumed, but the freedom to get and re-distribute the source code. Their worry is a good one and ought to be considered seriously, not dismissed as a juvenile attitude of wanting everything for free (gratis).

In essence this makes you a big mouthed idiot who doesn't know what he is talking about.

Qt is available in several licences.

Craig

by gberrido (not verified)

I think that what is bad in integrating commercial software is that a commercial company has the exact opposite philosophy (making money)
from a non-profit project like KDE (programmers writing nice pieces of code FOR FUN).
The first one follows a logic of business,
the 2nd a logic of offering.

I mean this is the same difference between
commercial music, produced just to make money,
and real artist which just care about what he plays, without asking himself if his song will make a hit or not...

the concern is not having stuff for free, it's the freedom of the project.

regards
gb

The word FREE was not MEANT to relate to money but rather to the freedom to know what you run on your own CPU. So I do agree with the author who says that the Linux community has gone bonkers over "Free" software, thinking it should be given to them without remuneration.

For those who think they should get a FREE ride and not FREEDOM to view/modify their local cpu electrons please READ the GPL to understand the original FREE philosophy.

;) This will cost you guys 2 cents but you are FREE to interpret it anywhich way you like.

by Ron Gage (not verified)

If this all happens (direct integration of KDE/ViaVoice/TTS), then this could make some really cool opportunities for games, simulators, and even training devices!

Imagine a "game", under KDE, that fully simulates an Air-Traffic Controller's job. Said game would have a "controller" speaking his commands to the various aircraft in his airspace and those aircraft would respond (via speech) back to the controller. A true to life trainer/simulator.

Best part about such a game, we could honestly tell our Windows based friends that "Sorry, but this technology is not supported under Windows!" I would just love to be able to tell some people that!

by Zeljko Vukman (not verified)

Speech recognition is probably great :), but can we stop for a while and fix simple keyboard layout recognition? I have spent the last two weeks trying to fix the Croatian keyboard layout, and when I did it I lost my Danish keyboard layout. Switching keyboard layouts is essential for my work, and do not misunderstand me - it worked well in KDE 1.1.2.
What I want to say is: I am not against "big" steps, but do not forget that little-big things make software great.
BTW, Krayon should have more PR :). The progress that Krayon's developers have made is simply amazing.

by Shawn Gordon (not verified)

This release says you can do command and control, and text to speech, but what about speech to text? I know all of us can read and write faster than talking or hearing, but I'm thinking about handicapped people; this could be a real godsend for some types of applications.

by carbon (not verified)

It specifically mentions dictation

by X-Nc (not verified)

> I'm thinking about handicapped people; this could be a real godsend for some types of applications.

Speaking as one of said "handicapped" population, this would be more than a godsend. I have ViaVoice for Linux right now and it's helped tremendiously with any document/WP I've needed to do but the vast majority of my work is either email or coding. From years of experiance, voice recognition and coding just don't mix. But email would save me hours of time and physical pain.

As for licensing, free-as-in-speach is always best. But anything that will work is what is needed. Especially in situations like this.

---
If I actually could spell I'd have spelled it right in the first place.

Um, I have used ViaVoice, and contrary to what is stated here, even after an hour of voice training I could get no better than 50% accuracy on anything but numbers. Also, is this aRts integrated? If it is, then we could create a universal speech recognition library, and use whatever backend you want (ViaVoice or something else). I.e. the library would simply analyze the text of the message and decide what command to run, and leave the actual recognition to the engine that it is connected to.
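
The split suggested in the comment above, with a generic library that maps recognized text to commands and a swappable engine that does the actual recognition, could be sketched roughly as follows. This is only an illustration under stated assumptions: `RecognizerBackend`, `CannedBackend` and `CommandDispatcher` are hypothetical names, and nothing here reflects a real ViaVoice, Sphinx, aRts or KDE API.

```python
from abc import ABC, abstractmethod


class RecognizerBackend(ABC):
    """Hypothetical interface: any engine that turns audio into text."""

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        ...


class CannedBackend(RecognizerBackend):
    """Stand-in engine that ignores the audio and returns fixed text.

    A real backend would wrap an actual recognition engine instead.
    """

    def __init__(self, text: str):
        self.text = text

    def transcribe(self, audio: bytes) -> str:
        return self.text


class CommandDispatcher:
    """Maps recognized phrases to actions, independent of the engine."""

    def __init__(self, backend: RecognizerBackend):
        self.backend = backend
        self.commands = {}

    def register(self, phrase, action):
        # Phrases are stored lower-cased so matching is case-insensitive.
        self.commands[phrase.lower()] = action

    def handle(self, audio: bytes):
        text = self.backend.transcribe(audio).strip().lower()
        action = self.commands.get(text)
        return action() if action else None


dispatcher = CommandDispatcher(CannedBackend("Open Editor"))
dispatcher.register("open editor", lambda: "launched editor")
print(dispatcher.handle(b""))  # launched editor
```

Because the dispatcher only depends on the `transcribe` method, replacing one engine with another would not require changes to the command-handling code.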

by Shawn Gordon (not verified)

I used it on Win98 with Lotus Wordpro and the accuracy was very high after just a few minutes of training, and this is with the budget version they bundled in the Millenium suite a couple years ago. I have to think the technology has evolved even from that point.

ViaVoice is definitely a YMMV product. I attained a very high hit rate in well under an hour. The more I used it, the more accurate it got. Even when I got a cold it was still pretty darn accurate.

Others I know had problems similar to what you describe. It is much like the small percentage of the population that still can't master Pilot script.

by Shawn Gordon (not verified)

There is nothing here about how to get it and use it or who to contact. Does anyone else have more specific details?

by Shawn Gordon (not verified)

I looked at the Trolltech site, and there is nothing. I looked at the IBM site and the most current information for ViaVoice SDK for Linux is from February.

As someone who lives by press releases, we typically make it a point to update our web site before we send out announcements; that way people can find out details immediately.

by Erik Severinghaus (not verified)

> As someone who lives by press releases, we
> typically make it a point to update our web site
> before we send out announcements; that way people
> can find out details immediately.

Yeah, which is one of the reasons everyone around here loves TheKompany so much (that and the kick-ass software). I assume your comment was tongue-in-cheek to some extent, since I think everyone here knows IBM doesn't exactly live by that policy. Trolltech on the other hand surprises me a bit... I couldn't find anything either tho, does anyone have a URL??

Erik

by Matthias Ettrich (not verified)

The press release clearly says it's a demo. It's not vaporware (since it actually runs), but as of today, it's not a product but a demo. No more, no less. I hope the feedback at LinuxWorld Expo will be encouraging enough to turn it into a product, though.

All I can say at this point is that you probably should talk to IBM.

by Shawn Gordon (not verified)

But Matthias, what is the point of it if no one else can do anything with it? Saying "talk to IBM" is like saying "talk to Pakistan" - where do you start? I'm keenly interested in this technology and making use of it. I don't really understand the point of making a big announcement about something that may not be available for anyone to use. If it's there, and it works, then it sounds like it's a product. Saying it's a demo does not imply that it is unavailable, it means you are demonstrating the technology in practical terms. I want to support you guys and this effort, I'm just trying to find out how to do that.

by Daniel Franklin (not verified)

I would be particularly impressed if this also worked with CMU's open-source Sphinx speech engine. Hopefully the API is reasonably flexible and you could plug a Sphinx back-end into it.

- Daniel

NOT HAPPY :(

Is it *totally* impossible to integrate ViaVoice without including any of the commercial code? I mean, if KDE and ViaVoice are integrated, what about us who refuse to buy software for home use? Ehh? If I don't want/need ViaVoice, then I don't need the code that integrates it, thus I would like to compile without it, right? How am I gonna do that if the code is closed? There MUST be something to do here! I mean from my point of view, this is the first step towards a (non)-solution that gives IBM some sort of control over KDE.. not? Will I ultimately have to buy ViaVoice (read: get an illegal copy)? I have ALWAYS been a Die Hard(TM) KDE fan, but if this keeps up, and no solution is found so the code gets re-opened, I will be forced to switch to GNOME (please not God, pleeeeease not), since I am way too paranoid to use closed code. For all I care IBM can be as much a crook corp. as Micro$oft. How do I know that I can still trust the KDE-team if I can't read the code? Who knows, maybe we will find NSA backdoors in KDE someday! DON'T LET THIS HAPPEN!!!!

/me is begging on his knees: "Please reopen the code and kick IBM until they figure out a way to implement ViaVoice in a SAFE and OPEN way"

/kidcat, a Die Hard KDE fan.

I totally agree, and what about the other platforms if we could not have access to the sources ? Currently I'm running Linux on a PPC Amiga.

I'm very unhappy too seeing KDE turning into a non free-software open-sourced solution!!!

See? I'm not the only one! HELP ME GUYS! KEEP POSTING! THIS IS NOT ACCEPTABLE! KDE IS TOO GOOD TO BECOME ANOTHER EXPLORER!!!/kidcat, a worried Die Hard KDE fan!

Guys, you go a bit too fast. They only chose
KDE to make a demo. KDE source code is GPL and
is going to stay that way. It was never the
question to link KDE to ViaVoice. No cigar, then.
Why such a panic ?

PHEEEEEEW!/kidcat

And remember guys... HTML is not html here :) /kidcat

by kidcat (not verified)

Hi again... 'cus I'm not giving up on this!

Daniel mentioned CMU's Sphinx. So maybe it's not as good as ViaVoice... yet! If the KDE community in general is as pissed off about the closing of the code as those I have talked to, this might be the answer! http://www.speech.cs.cmu.edu/sphinx/ Those who can, help. Those who can't, keep screaming loudly!

/kidcat (the little annoying one who refuses to see KDE go NonOpenSource)

-"Render unto Caesar what is Caesar's; render unto God what is God's".... Also translates as: "Render unto Micro$oft what is Micro$oft's; render unto KDE what is KDE's"... get my point?-