The Road to KDE 4: Phonon Makes Multimedia Easier

Following the previously featured articles on new KDE 4 technologies for
Job Processes and SVG Widgets, today we
feature the shiny new multimedia technology Phonon. Phonon is designed to take
some of the complications out of writing multimedia applications in
KDE 4, and ensure that these applications will work on a multitude
of platforms and sound architectures. Unfortunately, writing about
a sound technology produces very few snazzy screenshots, so instead
this week's article goes into a few more technical details. Read on.

Phonon is a new KDE technology that offers a consistent API to use audio or video within multimedia applications. The API is designed to be Qt-like, and as such, it offers KDE developers a familiar style of functionality. (If you are interested in the Phonon API, have a look at the online docs, which may or may not be up to date at any given moment.)

Firstly, it is important to state what Phonon is not: it is not a new sound server, and will not compete with xine, GStreamer, ESD, aRts, etc. Rather, because multimedia programming is an ever-shifting landscape, it offers a consistent API that wraps around these other multimedia technologies. For example, if GStreamer decided to alter its API, only Phonon would need to be adjusted, instead of each KDE application individually.
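To make the wrapping idea concrete, here is a minimal sketch in Python. It is not the Phonon API: the `Engine` interface, the two `Fake...Library` classes, and `application_play` are all hypothetical names invented for illustration. It only shows why an application that talks to a stable wrapper does not care how each backend's native calls are shaped.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical stable interface that applications program against."""

    @abstractmethod
    def play(self, url: str) -> str:
        """Start playback and return a status message."""


class FakeXineLibrary:
    """Stand-in for a native backend library (NOT the real xine API)."""

    def open_and_start(self, location: str) -> str:
        return f"xine playing {location}"


class FakeGStreamerLibrary:
    """Stand-in for a second backend with a differently shaped API."""

    def __init__(self) -> None:
        self.uri = ""

    def set_uri(self, uri: str) -> None:
        self.uri = uri

    def set_state_playing(self) -> str:
        return f"gstreamer playing {self.uri}"


class XineEngine(Engine):
    """Adapter: translates Engine.play() into the first library's calls."""

    def __init__(self) -> None:
        self._lib = FakeXineLibrary()

    def play(self, url: str) -> str:
        return self._lib.open_and_start(url)


class GStreamerEngine(Engine):
    """Adapter: translates Engine.play() into the second library's calls."""

    def __init__(self) -> None:
        self._lib = FakeGStreamerLibrary()

    def play(self, url: str) -> str:
        self._lib.set_uri(url)
        return self._lib.set_state_playing()


def application_play(engine: Engine, url: str) -> str:
    """Application code: it knows only the Engine interface."""
    return engine.play(url)


if __name__ == "__main__":
    for engine in (XineEngine(), GStreamerEngine()):
        print(application_play(engine, "song.ogg"))
```

If one of the stand-in libraries renamed its methods, only the matching adapter class would change; `application_play` would stay untouched, which is the property the article describes.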

Phonon is powered by what the developers call "engines" and there is one engine for each supported backend. Currently there are four engines in development: xine, NMM, GStreamer and avKode (the successor to aKode). You may rest comfortably in the knowledge that aRts is now pretty much dead as a future sound server, and no aRts engine is likely to be developed. However, aRts itself may live on in another form outside of KDE. The goal for KDE 4.0 is to have one 'certified to work' engine, and a few additional optional engines.

Other engines that have been suggested include MPlayer, DirectShow (for the Windows platform), and QuickTime (for the Mac OS X platform). Development on these additional engines has not yet started, as the Phonon core developers are more concerned with making sure that the API is feature-complete before worrying about additional engines. If the Phonon developers attempt to maintain too many engines at once while the API is still in flux, the situation could become quite messy. (If you would like to contribute by writing an engine, jump into the #phonon channel at irc.freenode.org.)

When an engine is selected by the user or application, Phonon will query the selected engine to determine what file formats and codecs that backend supports, and will then dynamically allow the KDE application to play your media. In the KDE 3 series, by contrast, the user has to change engines manually in each application (Kaffeine, Amarok, JuK, etc.) rather than being able to select one engine for use across KDE.
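The following Python sketch illustrates the idea of a single desktop-wide engine choice that applications query for capabilities. Everything in it is an assumption made for illustration: the `EngineInfo` and `MediaSettings` names are hypothetical, and the MIME type sets are made up rather than taken from any real engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineInfo:
    """What a hypothetical engine reports about itself."""

    name: str
    mime_types: frozenset


# Illustrative values only; a real engine would report its own capabilities.
ENGINES = {
    "xine": EngineInfo("xine", frozenset({"audio/ogg", "audio/mpeg", "video/mpeg"})),
    "gstreamer": EngineInfo("gstreamer", frozenset({"audio/ogg", "audio/flac"})),
}


class MediaSettings:
    """One engine choice shared by every application on the desktop."""

    def __init__(self, engine_name: str) -> None:
        self.select(engine_name)

    def select(self, engine_name: str) -> None:
        """Switch the shared engine; unknown names are rejected."""
        if engine_name not in ENGINES:
            raise ValueError(f"unknown engine: {engine_name}")
        self._engine = ENGINES[engine_name]

    def can_play(self, mime_type: str) -> bool:
        """Ask the currently selected engine whether it handles this type."""
        return mime_type in self._engine.mime_types


if __name__ == "__main__":
    settings = MediaSettings("xine")
    print(settings.can_play("video/mpeg"))
    settings.select("gstreamer")
    print(settings.can_play("video/mpeg"))
```

The point of the design is that applications call `can_play` on the shared settings object instead of each keeping its own engine preference.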

Once an engine is selected for Phonon, it allows the programs to do the standard multimedia operations for that engine. This includes the usual actions performed in a media player, like Play, Stop, Pause, Seek, etc. Support also exists in Phonon for higher-level functions, like defining how tracks fade into one another, so that applications can share this functionality instead of re-implementing it each time. Of course, some applications will want more control over their cross-fading, and so are still free to design their own implementation.
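As an illustration of the kind of cross-fading logic that could be shared, here is a small Python sketch of a linear cross-fade. The article does not say which fade curve Phonon uses, so the linear ramp is an assumption, and `crossfade_gains` and `mix` are hypothetical helper names.

```python
from typing import Tuple


def crossfade_gains(elapsed_ms: float, fade_ms: float) -> Tuple[float, float]:
    """Return (outgoing_gain, incoming_gain) for a linear cross-fade.

    elapsed_ms is the time since the fade began; fade_ms is its total length.
    Progress is clamped to [0, 1], so times outside the fade window are safe.
    """
    if fade_ms <= 0:
        raise ValueError("fade_ms must be positive")
    progress = min(max(elapsed_ms / fade_ms, 0.0), 1.0)
    return 1.0 - progress, progress


def mix(outgoing: float, incoming: float, elapsed_ms: float, fade_ms: float) -> float:
    """Blend one sample from each track using the current gains."""
    out_gain, in_gain = crossfade_gains(elapsed_ms, fade_ms)
    return outgoing * out_gain + incoming * in_gain


if __name__ == "__main__":
    # Print the gains at the start, middle, and end of a 4-second fade.
    for t in (0, 2000, 4000):
        print(t, crossfade_gains(t, 4000))
```

An application that wants a different curve, such as an equal-power fade, would replace only `crossfade_gains`, which mirrors the article's remark that applications remain free to supply their own implementation.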

The engine with the greatest progress so far is xine, which I was able to set up and run on my system. I was unable to get the NMM (notoriously hard to compile and set up) or GStreamer engines to compile on my system, whilst avKode is currently disabled by default. I would show you a screenshot of JuK or Noatun playing audio with Phonon, but right now these applications look just like their KDE 3.x versions (only with a somewhat ugly/broken interface!). When they are getting polished for release, I will show them off in a later article.

Matthias Kretz offers a short video which, if you turn your speakers on while watching, demonstrates device switching. Phonon lets you switch audio devices on the fly, and you can hear the specific moment when the music switches between his various outputs (headphones, speakers, etc.).

Matthias also submits the following screenshot of output device selection using Phonon's configuration module. This is also a work in progress, so take it with a grain of usability salt.

There are not many things that I can take a screenshot of which show Phonon in use (screenshots of an audio framework are notoriously difficult to compose!), but I can describe one of the neat side effects of using Phonon: network transparency. KDE has long used KIOSlaves to access files over the network as easily as if they were stored on your local computer. Multimedia apps like JuK or Amarok should be able to add files transparently over the network to their collections without having to be concerned about whether or not the back-end engine is aware of how to deal with ioslaves. This support is already partially implemented in KDE 4, and is most visible through audio thumbnails, which are working for many people over any KIO protocol, including sftp:// and fish://, two popular protocols among KDE power users. They do not yet work for me due to some instability in the fish:// KIOSlave of my current compilation, but the developers in the #phonon IRC channel say that this functionality will be ready and working when fish:// is more stable.

So, Phonon, while still in development, is going to be a great pillar technology for KDE application programmers, making their job easier and removing the redundancy and instability caused by constantly-shifting back-end technologies, and (eventually) making support for other platforms a piece of cake. This means that those developers can spend more time working on other parts of their applications to ensure KDE Multimedia applications shine even more brightly than they currently do.

A couple of quickies here to note: Mark Kretschmann, lead developer for Amarok, has officially opened up Amarok 2.0 development this week, and seems to be quite interested in what Phonon can do for Amarok 2.0. He doesn't rule out keeping Amarok's own engine implementations, as is currently done in the Amarok 1.4 series. However, given its early stage of development, Phonon can likely be adjusted to ensure that it will do everything Amarok asks of it.

If you're looking for a way to help out with KDE and are not a programmer, Matthias Kretz, lead developer of Phonon (Vir on IRC), has requested some help in keeping the Phonon website up to date.

And lastly, a few translations of these articles have been popping up around the world in various languages. Sometimes more than one translation is happening for a specific language. If you are translating or plan to translate these articles, send me a message so that we can save everyone some work and avoid redundancy (let's keep the redundancy-reduction spirit of Phonon alive!).

Until next week...

Comments

by otherAC (not verified)

That's even better than weekly snapshots, you can get SVN snapshots every minute.
Heck, even every second :o)

by reldruh (not verified)

Does anybody know how far along phonon actually is? This is a great overview of what it is but it would also be nice to know the current state of the project. The roadmap on the website seems somewhat outdated (it currently says things that were supposed to be completed in Q2 2006 are in progress). Is there any news on how far along the developers are?

by Lans (not verified)

Thank you very much Troy, it's always interesting to read "The Road to KDE4"-articles. You've done, as always, a great job.

As a user, it can be hard to follow the development. Lucky me that this kind of series exists. I want to know so much about KDE4: Plasma, Konqueror, Krunner etc.

However, I think there is an application which is mentioned very rarely when talking about KDE4: Get Hot New Stuff (don't know if this is the actual name). I think it is an application that is very good, but that would need some love from developers and artists.

Does anybody know the current state of Get Hot New Stuff, and how it is going to work in KDE4?

by Troy Unrau (not verified)

GetHotNewStuff is a library shared between many apps. It has seen some work for KDE 4, however I haven't really looked at it. I'll add it to my list of article topics to research for some day in the future.

Plasma is not ready to be shown off.
Konqueror looks pretty much like Konq in 3.x, except the backend libs for rendering html and javascript have seen some new work. Not much to show off there :)
Krunner is visually just a run dialog, and hasn't really changed since I showed that screenshot a few weeks ago, except that it's now enabled by default. It also has a few things that kdesktop used to have that have simply been ported over. It now controls screensaver activation, for example.

I'll write an article in a few weeks containing updates to old topics that'll touch on krunner some more.

by Lans (not verified)

Thanks for the answers. I look forward to reading the next article!

by RF (not verified)

Just to mention

The Pronunciation of phonon in Arabic means Arts

by superstoned (not verified)

that's really cool ;-)

by Morreale Jean Roc (not verified)

As we're talking about sound, any news on kmix ? I hope KDE4 will not ship with a similar sound manager as I would like to install it without the users being lost with all the undocumented and obscure parameters

by otherAC (not verified)

you could also try to document all those parameters to make sure your users will understand how kmix works

layer upon layer upon layer upon layer upon layer = slow machine.

and me who thought the point with audio was to be able to listen to it, and the point with video was to be able to view it.

NOW.

your assumption that layer upon layer is always slowing down is wrong ;)

Sometimes layer upon layer upon layer upon layer upon layer... is faster (and better) than just a single layer.

by somekool (not verified)

Did you say that once you switch backend in Phonon's config, it will use that backend and only that one?

I thought it could be a preference list, like the KDE language settings, so that one is preferred and if it fails, it falls back onto the next one. wouldn't that be sweet? and a non-working backend on a system should be automatically disabled. for example, on a fresh install the gstreamer backend would be the default (for example), phonon finds out it fails, phonon supposes gstreamer is not installed and disables the backend, so it won't try and fail again.

another solution would be to mimic the file association config panel. mime-type type of thing. where, I want mplayer or xine to handle video, and gstreamer for music. again, for example.

keep up the good work

thanks

by Ben (not verified)

I was wondering what happens to programs like VLC that have the codecs built in rather than using an engine like xine. Will there be some way of controlling VLC via Phonon? say by using a special phonon-engine that just takes sound from VLC and sends it to ALSA or the Sound Server?

by otherAC (not verified)

there is no need for vlc to use phonon for this, just like there is no need for xine, mplayer or xmms to use phonon.

by Ben (not verified)

Well there is no need for it to use phonon for decoding media, but what about phonon's volume control, or its ability to choose different pieces of hardware or soundservers?

by otherAC (not verified)

if VLC sees advantage in supporting phonon, then they can put an option in the media player that uses phonon as backend.
Just like xmms had an option to use aRts as backend.

by Josep (not verified)

Hi, I like very much what I have read about Phonon, and I would like to ask if there will be support for PulseAudio (http://pulseadio.org) as it seems it's evolving and becoming more popular now.

Thanks.

by otherAC (not verified)

Looks like competition for GStreamer :)

I guess if someone writes a pulseaudio backend for phonon, then pulseaudio will be supported as well..

by Kevin Krammer (not verified)

I don't think that PulseAudio is a competitor of GStreamer, they have orthogonal feature sets.

PulseAudio is very likely already a supported output option of GStreamer, in which case it is also supported by the GStreamer based Phonon backend.

by Tim Beaulen (not verified)

Judging from the name PulseAudio, it seems to deal with audio only.

But indeed, GStreamer can play to a PulseAudio server.

http://img201.imageshack.us/my.php?image=snapshot1br2.png
http://img74.imageshack.us/my.php?image=snapshot2ko4.png

by phony (not verified)

I'm just curious about authoring APIs. Something on par with QuickTime - check the example: http://developer.apple.com/documentation/QuickTime/RM/CreatingMovies/MTC...

The goal of Phonon is all nice and evolutionary. But IMHO the problem now is not playback, but creativity support. KDE (and the Linux desktop) needs decent multimedia creativity support to go forward. After all, to play some multimedia content, it (the content) first has to be created.

There are only a few applications, e.g. Cinelerra, Kino, mencoder, transcode. As my friend said, if you want to buy a camcorder and use it under Linux, do NOT buy a camcorder: video encoding is at best flaky, video editing is non-existent.