The Road to KDE 4: Phonon Makes Multimedia Easier

Following the previously featured articles on new KDE 4 technologies for
Job Processes and SVG Widgets, today we
feature the shiny new multimedia technology Phonon. Phonon is designed to take
some of the complications out of writing multimedia applications in
KDE 4, and to ensure that these applications will work on a multitude
of platforms and sound architectures. Unfortunately, writing about
a sound technology produces very few snazzy screenshots, so
this week has a few more technical details instead. Read on for more.

Phonon is a new KDE technology that offers a consistent API to use audio or video within multimedia applications. The API is designed to be Qt-like, and as such, it offers KDE developers a familiar style of functionality. (If you are interested in the Phonon API, have a look at the online docs, which may or may not be up to date at any given moment.)

Firstly, it is important to state what Phonon is not: it is not a new sound server, and will not compete with xine, GStreamer, ESD, aRts, etc. Rather, due to the ever-shifting nature of multimedia programming, it offers a consistent API that wraps around these other multimedia technologies. If, for example, GStreamer decided to alter its API, only Phonon would need to be adjusted, instead of each KDE application individually.
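
To make the wrapping idea concrete, here is a minimal sketch in Python. It is not Phonon's actual C++ API: the names Engine, FakeXineEngine, FakeGStreamerEngine and MediaPlayer are all hypothetical, and the "engines" only return strings. The point is simply that the application talks to one stable interface, so a change in a backend's API would be absorbed by the matching engine class alone.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical stable interface that applications program against."""

    @abstractmethod
    def play(self, url: str) -> str:
        """Start playback and return a short status string."""


class FakeXineEngine(Engine):
    # Stand-in for an engine that would call into libxine.
    def play(self, url: str) -> str:
        return f"xine: playing {url}"


class FakeGStreamerEngine(Engine):
    # Stand-in for an engine that would drive GStreamer.
    # If the backend's API changed, only this class would need editing.
    def play(self, url: str) -> str:
        return f"gstreamer: playing {url}"


class MediaPlayer:
    """Application-facing object; it never touches a backend directly."""

    def __init__(self, engine: Engine) -> None:
        self._engine = engine

    def play(self, url: str) -> str:
        return self._engine.play(url)


if __name__ == "__main__":
    for engine in (FakeXineEngine(), FakeGStreamerEngine()):
        print(MediaPlayer(engine).play("song.ogg"))
```

Swapping the engine passed to MediaPlayer changes which backend does the work, while the calling code stays the same.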

Phonon is powered by what the developers call "engines" and there is one engine for each supported backend. Currently there are four engines in development: xine, NMM, GStreamer and avKode (the successor to aKode). You may rest comfortably in the knowledge that aRts is now pretty much dead as a future sound server, and no aRts engine is likely to be developed. However, aRts itself may live on in another form outside of KDE. The goal for KDE 4.0 is to have one 'certified to work' engine, and a few additional optional engines.

Other engines that have been suggested include MPlayer, DirectShow (for the Windows platform), and QuickTime (for the Mac OS X platform). Development on these additional engines has not yet started, as the Phonon core developers are more concerned with making sure that the API is feature-complete before worrying about additional engines. If the Phonon developers attempt to maintain too many engines at once while the API is still in flux, the situation could become quite messy. (If you would like to contribute by writing an engine, jump into the #phonon channel at irc.freenode.org.)

When an engine is selected by the user or application, Phonon will use it to determine which file formats and codecs the underlying backend supports, and will then dynamically allow the KDE application to play your media. In the KDE 3 series, by contrast, the user has to manually change engines in each application (Kaffeine, Amarok, JuK, etc.) rather than being able to select one engine for use across KDE.
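
As a rough illustration of that capability lookup, the Python sketch below models a selected engine that reports which MIME types it can handle. The class SelectedEngine, the function describe_playback, and the MIME type lists are assumptions made for this example; they do not reflect Phonon's real API or what any real backend supports.

```python
from typing import Iterable


class SelectedEngine:
    """Hypothetical stand-in for whichever engine the user picked."""

    def __init__(self, name: str, mime_types: Iterable[str]) -> None:
        self.name = name
        # Illustrative capability list; a real engine would ask its backend.
        self._mime_types = frozenset(mime_types)

    def supports(self, mime_type: str) -> bool:
        return mime_type in self._mime_types


def describe_playback(engine: SelectedEngine, mime_type: str) -> str:
    """Return what an application could tell the user about this file type."""
    if engine.supports(mime_type):
        return f"{engine.name} can play {mime_type}"
    return f"{engine.name} cannot play {mime_type}"


if __name__ == "__main__":
    engine = SelectedEngine("example-engine", ["audio/ogg", "audio/mpeg"])
    print(describe_playback(engine, "audio/ogg"))
    print(describe_playback(engine, "video/x-matroska"))
```

Because the application only asks the selected engine what it supports, choosing a different engine once would change the answer for every application at the same time.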

Once an engine is selected for Phonon, it allows the programs to do the standard multimedia operations for that engine. This includes the usual actions performed in a media player, like Play, Stop, Pause, Seek, etc. Support also exists in Phonon for higher-level functions, like defining how tracks fade into one another, so that applications can share this functionality instead of re-implementing it each time. Of course, some applications will want more control over their cross-fading, and so are still free to design their own implementation.
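
For a feel of what a shared cross-fade involves, here is a small Python sketch of an equal-power fade between two tracks. This is only one common technique and an assumption on my part: the article does not say which fade curve Phonon uses, and crossfade_gains and mix are hypothetical helper names, not Phonon functions.

```python
import math
from typing import List, Sequence, Tuple


def crossfade_gains(progress: float) -> Tuple[float, float]:
    """Return (outgoing_gain, incoming_gain) for progress in [0, 1]."""
    p = min(1.0, max(0.0, progress))  # clamp out-of-range input
    angle = p * math.pi / 2
    # Equal-power curve: the squared gains always sum to 1.
    return math.cos(angle), math.sin(angle)


def mix(outgoing: Sequence[float], incoming: Sequence[float]) -> List[float]:
    """Crossfade two equal-length sample sequences over their full length."""
    if len(outgoing) != len(incoming):
        raise ValueError("tracks must have the same number of samples")
    last = max(len(outgoing) - 1, 1)
    mixed = []
    for i, (a, b) in enumerate(zip(outgoing, incoming)):
        g_out, g_in = crossfade_gains(i / last)
        mixed.append(a * g_out + b * g_in)
    return mixed
```

An application that wants a different curve (linear, say) would swap out crossfade_gains, which is the kind of freedom the paragraph above describes.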

The engine with the greatest progress so far is xine, which I was able to set up and run on my system. I was unable to get the NMM (notoriously hard to compile/set up) or GStreamer engines to compile on my system, whilst avKode is currently disabled by default. I would show you a screenshot of JuK or Noatun playing audio with Phonon, but right now these applications look just like their KDE 3.x versions (only with a somewhat ugly/broken interface!). When they are getting polished for release, I will show them off in a later article.

Matthias Kretz offers a short video which, if you turn your speakers on while watching, demonstrates device switching. Phonon lets you switch audio devices on the fly, and you can hear the specific moment when the music switches between his various outputs (headphones, speakers, etc.).

Matthias also submits the following screenshot of output device selection using Phonon's configuration module. This is also a work-in-progress, and so take it with a grain of usability salt.

There are not many things that I can take a screenshot of which show Phonon in use (screenshots of an audio framework are notoriously difficult to compose!), but I can describe one of the neat side effects of using Phonon: network transparency. KDE has long used KIOSlaves to access files over the network as easily as if they were stored on your local computer. Multimedia apps like JuK or Amarok should be able to add files transparently over the network to their collections without having to be concerned about whether or not the back-end engine is aware of how to deal with ioslaves. This support is already partially implemented in KDE 4, and is most visible through audio thumbnails, which are working for many people over any KIO protocol, including sftp:// and fish:// - two popular protocols among KDE power users. They do not yet work for me due to some instability in the fish:// KIOSlave of my current compilation, but the developers in the #phonon IRC channel claim that this functionality will be ready and working when fish:// is more stable.

So, Phonon, while still in development, is going to be a great pillar technology for KDE application programmers, making their job easier and removing the redundancy and instability caused by constantly-shifting back-end technologies, and (eventually) making support for other platforms a piece of cake. This means that those developers can spend more time working on other parts of their applications to ensure KDE Multimedia applications shine even more brightly than they currently do.

A couple of quickies to note here: Mark Kretschmann, lead developer for Amarok, has officially opened up Amarok 2.0 development this week, and seems to be quite interested in what Phonon can do for Amarok 2.0. He doesn't rule out keeping their own engine implementations, like they currently do in the Amarok 1.4 series. However, given its early stage of development, Phonon can likely be adjusted to ensure that it will do everything Amarok asks of it.

If you're looking for a way to help out with KDE and are not a programmer, Matthias Kretz, lead developer of Phonon (Vir on IRC), has requested some help in keeping the Phonon website up-to-date.

And lastly, a few translations of these articles have been popping up around the world in various languages. Sometimes more than one translation is happening for a specific language. If you are translating or plan to translate these articles, send me a message so that we can save everyone some work and avoid redundancy (let's keep the redundancy-reduction spirit of Phonon alive!).

Until next week...

Comments

by Andre Somers (not verified)

KDE 3.5 is an old platform? What are you talking about? KDE 3.5 was released November 29, 2005. That is just over a year old. It has seen six minor revisions since then, so it is actively maintained. All these releases have made the platform more stable, faster, and more polished. It is a very mature and stable platform, and probably the platform of choice for the less adventurous users for quite some time to come.

Instead of bitching that KDE 4 is supposed to be ready, stop wasting time and actually start contributing to *getting* it ready.

by otherAC (not verified)

"Well fair enough but Microsoft did wait till the last minute to put a date on the release of Vista"

Well, they did wait until the last minute for the exact date of release.
But after the release of Windows XP they promised that the next version of Windows would be released in 2002, 2003, 2004, 2005, 2006, late 2006, 2007..
That is a delay of 5 years between the first estimated release date and the exact release date.
And for the successor of Vista (Vienna), Microsoft has given an estimated release date of autumn 2009..
We'll see if they will make it or if Vienna will be released 5 years later..

As for KDE4, KDE never mentioned an estimated release date...

by D Kite (not verified)

Is there any thought to supporting kde3 apps? Many depend on arts, so any kde3 apps will require an old unmaintained arts installed. Kde3 apps will be around for a long time yet.

How about a library in kde4 that looks like arts but calls phonon?

Could be called pharts?

Derek

by me (not verified)

pharts is a cool name! I want it. Even if it's completely useless :)

by Matthias Kretz (not verified)

If there'll ever be an aRts backend to Phonon I want it to be called pharts. :)

My hope is that aRts will continue to work for those people that have to use it in KDE4 times. I'm not able to maintain aRts!

Anyway, aRts is able to use ALSA, and dmix then does the rest so that aRts and Phonon applications can be used at the same time. There are still ways to break such a setup, but those are solvable.

Implementing a lib that looks like aRts but calls Phonon is next to impossible.

by Duncan (not verified)

I suspect KDE3 compatibility will depend on the distributions. Gentoo already has KDE slotted into /usr/kde/, where slot is x.y minor version, so for example 3.4.x and 3.5.x could exist beside each other for a time, and I'm sure that wouldn't be taken away for a change as big as KDE4. KDE 3.5.x and KDE 4.0.x will therefore exist beside each other until the Gentoo sysadmin decides to unmerge 3.5.x. When KDE 4.1 comes out, it'll be yet another slot.

It has been awhile, but I was back on Mandrake for the KDE 2.x -> 3.x upgrade, and while their arrangement was somewhat different (KDE files were distributed into appropriate directories directly under /usr, so in /usr/bin, /usr/share, and /usr/lib, for whatever version of KDE shipped with the distribution release), making it difficult to have but one "main" version installed at any point, early in the 3.x cycle they installed it to (IIRC) /opt (the as-shipped KDE default, AFAIK), so again, 2.x and 3.x could and did exist beside each other, for those admins wishing that it be so. As 3.x matured to the point they could ship it as the "main" KDE version for a Mandrake release, they killed 2.x and moved 3.x into the main /usr dirs along with everything else.

There shouldn't be anything stopping the various KDE versions from running on the same system, with the environments run one at a time, just as one can run KDE or GNOME one at a time on the same system, or indeed with various apps from one version running under the environment of the other, just as KDE apps can run on GNOME and GNOME apps on KDE, as long as your distribution arranges it that way. If you grab the sources and compile your own directly, then you are of course creating your own distribution, so it would be up to you to configure them to install to different locations if you didn't want conflicts.

As for depending on aRts specifically, as long as aRts can be set to share the device (that's the problem, as it was designed before that was normally possible and it still doesn't like to share tho it's generally possible and most other apps now share), you should be able to run KDE3/aRts apps on a KDE4 desktop, just as you can now run KDE3/aRts apps on a GNOME desktop, as long as all the necessary dependencies remain installed.

However, while many apps depend on aRts to be installed if compiled with that dependency (and a few require the dependency), fewer apps now depend on it actually /running/, as they (and the KNotify system as well) can be configured to play to ALSA or whatever directly instead of to aRts. I no longer run aRts here, as I got tired of not being able to run anything else because it was hogging the sound devices, and of all the problems keeping aRts working reliably. I still had to keep USE=arts in my USE flags (Gentoo), as disabling that disabled a bunch of other stuff one wouldn't intuitively think was related to aRts (it's likely some of that was Gentoo linking of KDE audio features to USE=arts even when it wasn't related to arts itself, however), even tho I no longer run aRts itself.

Thus, it's likely that with proper KDE3 configuration, you should be able to quit running aRts itself even if you have to keep it installed as part of your KDE3 dependency tree, and can pipe sound directly thru ALSA or whatever, just as I do now, and as KDE's Phonon will be doing indirectly thru xine/gstreamer/nmm/whatever. Of course, you won't get the benefit of the per-category and individual app volume settings in your KDE3 apps, only in your KDE4 apps, but one wouldn't expect it, either, since they are still KDE3 and not KDE4.

Duncan

by Jonathan Dehan (not verified)

i'm just thinking aloud but xmms2 backend + amarok = dream media player/server.

i love having my music go no matter what my computer is doing (in and out of x, gui crashes mainly) and allowing all qt multimedia apps to take advantage of the convenience features of xmms2 is well, just ... *drool*

by Troy Unrau (not verified)

Support would depend on someone writing an engine for phonon that interfaces to xmms2. At the moment, there isn't anyone working on this.

I love the idea of having different categories for sound, communications, music, etc. I'd like to suggest you add an extra category for music creation software.

Secondly, I'd like to suggest that maybe you can set it so that if something in one category is playing, it can mute or turn down the sound in another category. So if something in Communication turns on, music will turn off.

by Troy Unrau (not verified)

Music creation software is probably beyond the scope of phonon and would want to use a lower level interface directly, like arts or jack. Rosegarden, for instance, would not be one of the applications suggested to transition to phonon.

by Matthias Kretz (not verified)

1. suggestion)
How would you call the category? I'm reluctant to add another category. I think there are too many categories already.

2. suggestion)
Yes, I want such a policy manager, too. Anybody who wants to work on it? I can certainly guide somebody how to do it.

by Ben (not verified)

just wondering how Phonon will work with MIDI, anyone have any idea?

by Matthias Kretz (not verified)

MIDI is orthogonal to Phonon. The "Desktop" doesn't have any need for MIDI. MIDI-Applications could use Phonon for audio (PCM) stuff, but they probably want a different lib or do it all themselves.

I'm not saying you cannot make use of MIDI in general for the desktop, but it's such a special case that for now it's ignored completely in Phonon.

by Ben (not verified)

That's a shame, I play quite a few MIDIs and would have liked to control the volume inside phonon with everything else.

by Matthias Kretz (not verified)

Two possibilities:
- hardware synth: nothing for Phonon, you have to use the hardware mixer to control the volumes
- software synth: either the software synth provides the same dbus interface as Phonon applications do, or it uses Phonon to do the audio output. The former is not standardized yet. The latter is still on the todo list.

by Ljubomir (not verified)

I tend to disagree. Linux desktop is still waiting for decent karaoke players and Guitar Pro clones. This is "fun stuff", and it's important.

by Johann Ollivier... (not verified)

Did you try ktabedit, a good fork of the unmaintained kguitar?

by Matthias Kretz (not verified)

You have to consider that so far I've been designing and implementing Phonon mostly on my own. I should be learning for Uni instead. There's no way I can provide API for MIDI for KDE 4.0. I agree that it might make sense to have a Qt-style MIDI API, but it won't be me doing that.

by Davide Ferrari (not verified)

MIDI is not general purpose, it's indeed a particular use-case (amateur musicians in particular and a few others). So I find a lower priority perfectly normal.

by Fabio A. (not verified)

I don't know whether it's a problem with video/audio synchronization, but the response latency of the system to the issued commands (via the GUI) looks awful to me, it looks to be close to one second!

by Matthias Kretz (not verified)

Well, if you want to fix that you have to fix libxine. Other frameworks might be faster with switching, but libxine doesn't care that much and simply lets the buffer of the first soundcard run empty until the other soundcard starts playing. You could try and make it flush the buffer of the first soundcard, but that's
1) not to fix in Phonon, but in libxine
2) very low priority: Why should I optimize for device switching which is a rather rare case?

by Fabio A. (not verified)

Device switching is the least of the issues; lowering or raising the volume also happens close to one second after the actual manipulation of the knob. Even hitting the stop button takes effect with that same great delay.

by Matthias Kretz (not verified)

Volume is handled by libxine too, and apparently the volume is applied to the data that's not in the audio buffer yet. It's very bad when using the OSS emulation of my Headset - that's like 4 seconds delay. Completely unusable. :(

My feeling for the stop delay is something ~100ms. If you see/hear such a great delay it's probably a/v out of sync for you. You really should try it yourself to judge.

Btw, from clicking stop to the call to xine_stop it's really not far: as soon as the X event has reached the processEvents method in the main loop it's the clicked() signal and then Phonon is called. A few instructions later the stop command is sent to the xine thread, which (if the command queue is empty - that's the normal case) calls xine_stop after a few asserts I added for debugging.

by Darkelve (not verified)

Is the Phonon framework relevant for the technology known as OpenAL?

http://www.openal.com/

"OpenAL is a cross-platform 3D audio API appropriate for use with gaming applications and many other types of audio"

Or am I way off base here?

by Matthias Kretz (not verified)

This is again something that has to be integrated on the backend side. If e.g. GStreamer supports using OpenAL then the Phonon GST backend can be written to make use of it. If you're going to write a first person shooter game then you probably don't want to use Phonon anyway but rather OpenAL directly.

Perhaps one day we'll see integration of OpenAL into a backend and then a GUI to define where you want to hear notifications and so on. :)

by Darkelve (not verified)

Thanks for the explanation! (and for not making fun of me :p )

by Bjarne Alderhaug (not verified)

I think this is an enormous mistake because it puts KDE on the sideline: rather than getting actively involved with Gstreamer and helping to make Gstreamer into a totally kick-ass framework, it takes the cowardly "let us see" attitude that serves no one.

Phonon is also built on a number of fallacies:
1. It assumes that if Gstreamer changes its API, only Phonon (not the apps) needs to be changed, without considering that Gstreamer may introduce new APIs that can't be handled by the current Phonon API, thus requiring an API change that requires change in applications as well.
2. It assumes that Phonon is going to be more API stable than Gstreamer. Given that Gstreamer is approaching maturity, this is not at all proven.
3. Phonon will never be able to support everything that the lower level systems can. For every new feature that the lower level systems get, Phonon will get there slower.
4. Phonon is setting itself up for a Q & A nightmare, where apps will have different capabilities depending on what capabilities the subsystem has.
5. Phonon will be no more capable and stable than the subsystem. If all the subsystems are half-baked, then Phonon will be as well. Instead KDE could have focused on getting in on helping out with one subsystem to make sure it will stay alive and well.

1) Seems like a perfectly reasonable assumption to me.

2) Gstreamer is a lot bigger and more complex than Phonon; I'd be very surprised if Gstreamer had a more stable API.

3) Welcome to UNIX, we've got years of history so you should do a bit of catching up, starting with 80/20 ;). 80% or more of Desktop applications only need: Play, rewind, fast forward, stop, pause and a progress bar. Phonon does that and more. If you need more than Phonon can provide, use a back end directly.

4) I'll leave this one, I can't argue either way.

5) http://aseigo.blogspot.com/2006/05/id-like-another-black-eye-please.html

6) Why is Gstreamer better than NMM? Phonon makes it really easy for Kprograms to support both, giving users a choice. Sounds a lot better to me than plain Gstreamer support.

by Troy Unrau (not verified)

Gstreamer does not support all of KDE's platforms, and its API has been shifting enough to break previous KDE applications (like amarok); we'd like to not repeat that.

We'd also not like to repeat our mistake of selecting one dedicated sound engine, like we did for arts. So imagine Phonon as higher level KDE/Qt bindings for gstreamer for basic multimedia applications (playback, recording, simple effects) - but having been designed in such a way so as to be able to use other engines as well as necessary. On a mac, this could be xine or QuickTime, since they play nice on that platform.

So we aren't shunning gstreamer - we're providing a high-level API for gstreamer which happens to look the same as our high level audio API for all those other engines. And the app won't care what engine is being used.

by Bjarne Alderhaug (not verified)

"Gstreamer does not support all of KDE's platforms, it's API has been shifting enough to break previous KDE applications (like amarok) and we'd like to not repeat it."

Then get more heavily involved with GStreamer. Make sure it supports all your platforms and if you are heavily involved, you can drive the development of the API.

Or maybe KDE can just integrate with the local media framework?

by otherAC (not verified)

with phonon, kde can :)

Sounds familiar for me: last time KDE was blamed for not using CORBA/Bonobo...

by Arnomane (not verified)

Well, can you explain to me why I should use GStreamer instead of libxine? Libxine just works for me.

by Bjarne Alderhaug (not verified)

Libxine is much more limited than GStreamer and can't provide the full feature set that GStreamer is providing.

Yes, Libxine may work better in some cases for video and music playback CURRENTLY, but GStreamer does the whole thing including advanced recording.

The KDE guys have a lot of skills that could really benefit the GStreamer project. We could have one really kick-ass stable system rather than 5 half-baked ones.

by otherAC (not verified)

Gstreamer depends on Gnome technology (glib and gtk-docs); if the gstreamer developers wanted to make it an independent, platform-agnostic sound engine for both kde and gnome, they should not have used those dependencies...

by Kevin Krammer (not verified)

I am quite sure that gtk-doc is for one just a build time requirement and additionally most likely optional.

And while glib's origins are with the GIMP project and through GTK+ it has found its way into GNOME's software stack, it is not a GNOME technology, just like Qt is not a KDE technology but an externally provided base dependency the respective project builds upon.

Saying a project should not use glib if it wants to be considered desktop independent is as ill-advised as saying you can't use QtCore, and in both cases following such advice would just lead to more bugs, in the worst case even security exploitable ones.

by otherAC (not verified)

True, but let's put it the other way round: if GStreamer used the Qt equivalent of Glib, would the Gnome desktop adopt it as easily as they have done right now?

by Kevin Krammer (not verified)

I think this question can't be answered on the example of Gstreamer, because the developers who created it are used to using glib and additionally are often associated with GNOME.

We could try to base it on an example of a glib based technology that has not been created by developers associated with GNOME but is still widely used there.

However I don't have enough knowledge about technologies used by GNOME to find such an example.

by otherAC (not verified)

Well, there is nothing wrong with gstreamer using glib (i can't imagine a linux desktop without it and kde 3.3.5 has it as recommended dependency), but what bothers me is that since KDE decided to drop aRts, the GStreamer community is pushing hard and aggressively to make KDE adopt GStreamer instead of something else like phonon (that can use gstreamer).

It makes me wonder what would happen if it was the other way round: let's say that developers associated with KDE/Qt started creating GStreamer with a dependency on the Qt equivalent of Glib and GTK-docs and Gnome was looking for an ESD replacement, would they have adopted GStreamer?

by Kevin Krammer (not verified)

Well, they didn't adopt aRts, which hadn't any Qt dependency and was the best solution of its time.

I'd say we see it when Akonadi becomes available :)

by Chani (not verified)

I think this little argument demonstrates *exactly* why phonon is so important. we don't want to get caught up in religious wars over backends; we just want the sound to WORK!

by otherAC (not verified)

Can't agree more :)

Why does everyone who thinks KDE should focus on a preexisting backend believe that it's gstreamer they should focus on?

Sheesh, be happy phonon-using apps will still in many cases be using gstreamer, without getting annoyed if gstreamer's API changes.

Myself, I'll happily be using the xine backend. I've used both, and prefer it and I'm sick of the gstreamer astroturfers trying to force something different on me.

Why not use ffmpeg (integrated in xine, mplayer, videolan, openwengo, gizmo, vivia, ..) - fastest multimedia engine used for decoding/encoding and grabbing media from DVB, Video4Linux, FireWire devices and CD/VCD/DVD or Internet resources and also used for streaming.

IMO gstreamer is the ugliest (stability, performance, documentation) engine built on top of ffmpeg. Has anybody tried to use gstreamer-editor, pitivi and other related projects highly tied to gstreamer?

The avkode engine is intended to pretty much be an interface to ffmpeg, (or other standard decoding libs where ffmpeg doesn't provide the codec). Not really well developed, but definitely lighter weight than the other backends will likely be.

My KMplayer needs xine engine for listening to mp3 radio streams and it needs mplayer engine (embedded in konqueror) to show HD quicktime trailers.
The xine engine can't show those HD quicktime trailers from http://www.apple.com/trailers/ as it can't buffer enough cache of a big HD trailer (e.g. 150 MB).
Mplayer engine is not used for mp3 radio streams as it takes too long to fill the cache (I've set it to 12 MB), therefore the use of xine for mp3 radio streams.

How will Phonon deal with a problem like this?
I want KMplayer to be able to play mp3 radio streams immediately and to buffer big HD trailers enough so the video doesn't stutter.
For me now both xine and mplayer engine need to be supported.
(I know nothing about Phonon KDE multimedia technology, my apologies if my question is not relevant)

by superstoned (not verified)

both are supported by phonon, and the app can tell phonon what it wants to use. so this has to be in kmplayer, not phonon, i think... not sure though.

by Pim (not verified)

The road to KDE 4?

Release early, release often.

KDE 4 will take up when development releases are available. The question is how users will react when they need to wait more than 1.5 years for a new version of KDE.

I don't understand why KDE does not branch: Stable version (3.5.x) and development version with weekly snapshots.

by Tim Beaulen (not verified)

Quote:
"I don't understand why KDE does not branch: Stable version (3.5.x) and development version with weekly snapshots."

It does.
Although weekly snapshots = your daily svn checkout.