Phonon: Multimedia in KDE 4

After many months of work on the new multimedia API for KDE 4, it is finally time to announce Phonon. Phonon will provide a task-oriented API for multimedia, making it easy for KDE applications to use media playback and capture functionality (and more), so that application developers are free to concentrate on the user interface. The many possibilities for integrating multimedia into the desktop experience make Phonon especially interesting.

Phonon uses exchangeable backends to do the real work. These can be implemented using GStreamer, NMM, Xine, Helix or whatever else you can come up with. In turn, KDE applications no longer need to develop their own media engine abstractions, as Phonon provides one for them.
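
The idea of exchangeable backends can be sketched roughly as follows. This is only an illustration, not the actual Phonon API: `MediaBackend`, `FakeXineBackend`, `FakeGStreamerBackend` and `Player` are hypothetical names, and the "engines" merely return a string instead of playing anything.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface for illustration only; NOT the real Phonon backend API.
class MediaBackend {
public:
    virtual ~MediaBackend() = default;
    virtual std::string name() const = 0;
    // Returns a description of what the engine would do with the URL.
    virtual std::string play(const std::string& url) = 0;
};

// Stand-in for a backend that would wrap Xine.
class FakeXineBackend : public MediaBackend {
public:
    std::string name() const override { return "xine"; }
    std::string play(const std::string& url) override {
        return "xine plays " + url;
    }
};

// Stand-in for a backend that would wrap GStreamer.
class FakeGStreamerBackend : public MediaBackend {
public:
    std::string name() const override { return "gstreamer"; }
    std::string play(const std::string& url) override {
        return "gstreamer plays " + url;
    }
};

// Application-facing class: it only ever sees the MediaBackend interface,
// so the engine can be swapped without touching application code.
class Player {
public:
    explicit Player(std::unique_ptr<MediaBackend> backend)
        : backend_(std::move(backend)) {
        assert(backend_ != nullptr);
    }

    void setBackend(std::unique_ptr<MediaBackend> backend) {
        assert(backend != nullptr);
        backend_ = std::move(backend);
    }

    std::string backendName() const { return backend_->name(); }
    std::string play(const std::string& url) { return backend_->play(url); }

private:
    std::unique_ptr<MediaBackend> backend_;
};
```

An application would construct a `Player` with one backend and could later call `setBackend` with another; its own calls to `play` stay the same either way.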

The folks from Motama have already started work on a Phonon-NMM backend, and we will give a joint presentation at the upcoming LinuxTag this year (also see their announcement). Meet us on Saturday, May 6th at 10:00 in Hall 6.2, or later at the KDE booth.

As there is still a lot of work to be done before Phonon is released with KDE 4.0, now is a great chance to get involved, for example by doing a Google Summer of Code project.

Phonon is supported by basysKom GmbH, Motama GmbH, KDE-NL (who are organising a KDE Multimedia meeting) and, most importantly, the community, which provides feedback and code.

Comments

by liquidat (not verified)

If you do this presentation, please, please include a video for the people who cannot come to LinuxTag - it would be great to see something from Phonon in real life :)

by Matthias Kretz (not verified)

This is out of my control. Last year there were cameras to record the talks, but I never saw the files for download. I guess it will be the same this year.

by somekool (not verified)

same for me, I'm in Japan, I really would like to watch the presentation. please.

by David (not verified)

Hey, the website looks really good!

I was wondering whether it will also be possible to write a Phonon program that gets frames from a source (mic or just a file), modifies the data (for instance with real-time effects) and then sends it back to Phonon, like JACK. Do you know if this is going to be possible?

by Matthias Kretz (not verified)

> Hey, the website looks really good!

Kudos to Nuno Pinheiro. He really did a great job on the artwork!

> gets frames from a source (mic or just a file) and then the possibility
> to modify the data (for instance real time effects) and then send it back
> to phonon

Capture will be supported. An interface for getting the audio data is there already, and I'm working on the video data interface now. Feeding it back into Phonon is technically possible, but I don't think it's a good idea. If you want to process the audio or video data, that should be done at a lower level. Phonon supports audio and video effects for that.

by RJakiel (not verified)

PLEASE tell me that when this comes to pass, aRts will be dead and gone. If so, that will be the best news I have heard in a LONG time.

by FreqMod (not verified)

I think aRts will stay in KDE 3 because of binary compatibility, and then it will be replaced in KDE 4.

by RJakiel (not verified)

That's what I meant. KDE 4 + Phonon = death of aRts. If that is the case, I will be too excited for words.

by Narishma (not verified)

That's the case.

by anonymous (not verified)

Not at all.
aRts is still a sound server, and if there will be a Phonon backend for aRts, then aRts could serve as a sound backend for a KDE 4 environment.

by SadEagle (not verified)

aRts is unmaintained. Unless that changes, I don't see how that would be such a hot idea.

by Steven Brown (not verified)

Arts/esd are already dead. Current versions of ALSA come standard with dmix enabled (previously, you had to configure it which not everyone did) so ALSA-using programs can already multiplex sound just fine. Combine that with libao with an ALSA backend for platform-independent traditional audio and GStreamer with an ALSA backend for platform-independent modern audio and you're all set.

Note that for some reason the OSS emulation of ALSA isn't able to be multiplexed, so make sure you aren't running apps that use the OSS devices like /dev/dsp.

by kaiwai (not verified)

What about those who don't use Linux; such as me? I run KDE on FreeBSD, and I think that the abstraction between the soundcard/lower levels of the sound framework, and the desktop is very important.

If you don't maintain this abstraction, you'll forever have people writing directly for ALSA. That will cause problems not only for ALSA later on, when it wishes to make radical changes, but also for those whose platforms don't have ALSA, who would need major application rework to get things running on their OS of choice.

Remember, KDE is a desktop environment for UNIX, not just Linux.

by mikeyd (not verified)

Why? As a user, arts is wonderful.

How will Phonon deal with simultaneous requests from the same user and simultaneous requests from multiple users (ie. two users logged in to different X sessions by "Switch Users" or by "su - " on the command line)?

Currently, when a KDE alert appears it plays a "gong" sound. However, if amaroK is playing while the alert appears, the "gong" is only heard _after_ amaroK has finished playing the whole song!

Another problem I've encountered is that when two users are logged in, only the first user can play sounds.

by superstoned (not verified)

2 requests from the same user should work if he uses ONLY phonon: if you use arts in amaroK, you'll hear the sound immediately. the problem is, most users don't use arts in amarok. if you use gstreamer in amarok and aRts or whatever in another app, they can't play at the same time. phonon will not fix this - but at least all KDE apps will use the same backend (unlike now) so it will get better.

i don't know about the two-user problem, but i really think phonon SHOULD try to do something about it.

by Jakob Petsovits (not verified)

The amaroK thingie is for sure going to be solved with Phonon. It stems from the fact that amaroK uses a different sound engine than the standard KDE one, so all the KDE sounds have to wait until the amaroK sound backend releases its claim for the sound card. With Phonon, all of the sounds (including amaroK's) will be redirected to the same sound engine, so all of the applications can make sounds at the same time. (This should also work with non-KDE apps.)

About the user switching use case, I don't know if this is going to be tackled.

by Ian Monroe (not verified)

This isn't really the case. Phonon won't be a sound server.

What has actually solved the issue described by the parent post is that ALSA now has software sound mixing and most sound chips (even those built into the motherboard) have hardware sound mixing.

When aRts was created this wasn't the case so a sound server made a lot of sense. This isn't really the case anymore.

The backend to the Phonon system could be a sound server (aRts, NMM) but it doesn't have to be (Xine playing to ALSA isn't).

You need to set up ALSA to do software mixing on your computer, because it seems your sound card doesn't support hardware mixing. You can have ALSA do software mixing by setting up dmix yourself, or by upgrading to a fairly recent version of ALSA (which has it enabled by default). If you're still using OSS (Open Sound System), you have to upgrade to ALSA.

This only applies to Linux. If you're using *BSD or another *nix, you may be out of luck (I don't know whether they support software mixing, or how to set it up). Mixing sounds from multiple applications isn't (or shouldn't be) the job of a userland sound server like aRts; it really belongs on the other side of the sound API, so that applications don't have to know about or deal with it.

To actually answer your question (guess I kinda forgot to): if you're using software or hardware mixing, then Phonon doesn't need to do anything special to allow sounds to play at the same time. In fact, you'll be able to have multiple apps/users output directly to ALSA without any problems (you can't do that with OSS). I don't believe Phonon will include anything special to solve the problem for people who have neither hardware nor software mixing.
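
What "software mixing" amounts to can be sketched in a few lines: add up the samples of each stream and clamp the result to the valid sample range. This is only an illustration under simple assumptions (signed 16-bit PCM, same sample rate and channel layout in both streams). It is not how dmix is actually implemented, and `mixPcm16` is a made-up name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Mix two signed 16-bit PCM streams by summing samples and clamping
// (saturating) the result. Assumes both streams share the same sample
// rate and channel layout; the shorter stream is padded with silence.
std::vector<std::int16_t> mixPcm16(const std::vector<std::int16_t>& a,
                                   const std::vector<std::int16_t>& b) {
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();

    const std::size_t n = std::max(a.size(), b.size());
    std::vector<std::int16_t> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Widen to 32 bits so that the sum itself cannot overflow.
        const std::int32_t sa = i < a.size() ? a[i] : 0;
        const std::int32_t sb = i < b.size() ? b[i] : 0;
        const std::int32_t sum = sa + sb;
        out[i] = static_cast<std::int16_t>(std::min(hi, std::max(lo, sum)));
    }
    return out;
}
```

The point made in the comment above is that this work happens below the sound API, so applications simply write their own stream and never call anything like `mixPcm16` themselves.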

As I see it, Phonon will replace aRts, but with improvements, trying to standardize multimedia apps.

About the hw/sw mixing, it all depends on what sound card you use and NOT on whether you use ALSA or OSS. The Nvidia nForce2 sound card uses an OSS driver and has hw mixing, i.e., you can use xmms, amarok, flash, mplayer and xine at the same time regardless of what backend they use. I think this matter is way beyond Phonon's scope.

by Ian Monroe (not verified)

Well, ALSA can do software mixing (dmix). So it does matter if you use ALSA or OSS.

How to enable multiple sound sources simultaneously in FreeBSD is described in the FreeBSD handbook (online) chapter 7.2.3. It's easy :-)

Write the values to /etc/sysctl.conf to make them permanent.

Change amarok to the arts engine, then it works.

by Simon Edwards (not verified)

Phonon developers,

How does the scope of the Phonon API compare to NMM, Jack, gstreamer etc.? What is the philosophy here? Does Phonon aim to provide a base level of multimedia support aimed at most 'simple' applications, so that if you need advanced features you bypass Phonon and code against the APIs of NMM, Jack and friends?

sorry if this is a FAQ.

--
Simon

by superstoned (not verified)

phonon USES NMM, Jack and Gstreamer. Phonon just makes it a lot easier for application developers to add rich multimedia features to their applications without having to worry about gstreamer, NMM, Jack or whatever. and you don't bypass Phonon, Phonon should be powerful enough in its own right!

by mabinogi (not verified)

Well no. Especially not when it comes to Jack.

Not every media library has the same goals, and Phonon shouldn't try to be a superset.
It should try to be a reasonable subset that covers the common cases.
It's not the desktop environment's job to provide audio and video capabilities.

by Matthias Kretz (not verified)

The Phonon API is a high-level multimedia API, NMM and GStreamer are rather low-level multimedia APIs while Jack provides an audio in-/out API especially geared for pro-audio needs.

The idea is to make multimedia application development easy. The average developer doesn't know what a (de)muxer is or how to set up the flow graph/pipelines correctly. Like I said in the article, the API is designed by looking at common tasks ("task oriented API") instead of looking at what features can be provided. In the end, that means some things which are possible with GStreamer will not be possible with Phonon.
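
The difference between the two styles can be sketched like this. Everything here is hypothetical and only meant as an illustration: `Pipeline`, `buildOggPipelineByHand` and `playFile` are made-up names, and the stage strings are plain labels rather than calls into GStreamer, NMM or Phonon.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a media flow graph: just an ordered list of
// stage labels. No real engine is involved.
struct Pipeline {
    std::vector<std::string> stages;
    void add(const std::string& stage) { stages.push_back(stage); }
};

// Low-level style: the caller must know which demuxer and decoder are
// needed and in which order to connect them.
Pipeline buildOggPipelineByHand(const std::string& file) {
    Pipeline p;
    p.add("filesrc:" + file);
    p.add("ogg-demuxer");
    p.add("vorbis-decoder");
    p.add("audiosink");
    return p;
}

// Task-oriented style: the caller only states the task ("play this file");
// the library decides which stages are required.
Pipeline playFile(const std::string& file) {
    const std::string ext = ".ogg";
    const bool isOgg = file.size() >= ext.size() &&
        file.compare(file.size() - ext.size(), ext.size(), ext) == 0;
    if (isOgg) {
        return buildOggPipelineByHand(file);
    }
    Pipeline p;
    p.add("filesrc:" + file);
    p.add("generic-demuxer");
    p.add("generic-decoder");
    p.add("audiosink");
    return p;
}
```

The trade-off described above shows up directly: `playFile` is a single call, but the caller can no longer insert an unusual stage in the middle of the graph.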

But yes, in the cases where the Phonon API is really too limited (you should first ask me before you decide whether the Phonon API is not good enough for you) you can use whatever else you want. The only thing to take care of is that the soundcard output works nicely together with Phonon applications. I guess I should come up with a Do's and Don'ts document for such a case.

by Janne (not verified)

"The Phonon API is a high-level multimedia API, NMM and GStreamer are rather low-level multimedia APIs"

Why do I get the feeling that we have Phonon, which talks to Gstreamer/JACK/NMM/etc. and they talk to ALSA. So that's three layers on top of each other. Doesn't that make debugging harder? How about bloat?

by hugelmopf (not verified)

Adding overhead by yet another layer (besides ALSA and gstreamer/NMM/...) is exactly what I was wondering about. Is it really that hard for developers to use gstreamer & co. for playing audio/video? Or is there another reason for wrapping it?
I can't imagine that another layer is going to simplify anything for the user, but you may prove me wrong ;-)

by anon (not verified)

Gstreamer sucks, so right there is one good reason to have a wrapped API. That way when the next overhyped beta-quality media engine gets shoved in people's faces, it should be easy enough to switch to it. Then switch back to Xine.

by Ian Monroe (not verified)

And even if you think gstreamer is great, it might have an ABI change before KDE 5.0.

by Brandybuck (not verified)

It's a GNOME project, so will most likely have an ABI change before KDE 4.0.

by sundher (not verified)

So funny and so true. Xine rocks.

by meatman (not verified)

GStreamer isn't too bad. The latest release works well and even surpasses Xine in some areas; with Xine it is a pain to get "win32" codecs running with sound. With GStreamer all you have to do is install the "Ugly" branch. Don't get me wrong, Xine is still great, but I think GStreamer is a good framework right now.

by Ian Monroe (not verified)

It's not like Phonon is some sort of running process that has to "talk" to xine, gstreamer etc. It's just a layer of API, in keeping with good OOP design. So it adds a couple of extra function calls.

The ability to change multimedia backends while retaining binary compatibility is a large benefit. We don't want KDE 4.x to be stuck on one multimedia system like KDE 3.x is stuck with aRts.

by Matthias Kretz (not verified)

First: "talk" is a term which suggests that the Phonon -> GStreamer -> ALSA "communication" takes a lot of time, which doesn't have to be the case (of course it can be slow if you program it like that). Debugging would be harder if those were processes (or threads), but actually debugging will be easier than with aRts, and no harder than without Phonon.

The term bloat is unjustified. Phonon at this point has about 9500 lines of code including example code and the fake backend. libphononcore has 189892 bytes (stripped) in comparison to 2000212 for libkdecore or 385852 for libkutils.
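
To put those byte counts in proportion, the ratios can be worked out directly. The helper below is just arithmetic on the numbers quoted above (`percentOf` is a made-up name): libphononcore comes out at roughly 9.5% of the size of libkdecore and roughly 49% of the size of libkutils.

```cpp
// Size of one library as a percentage of another, given byte counts.
constexpr double percentOf(long part, long whole) {
    return 100.0 * part / whole;
}

// Stripped sizes in bytes, as quoted in the comment above.
constexpr long kPhononCoreBytes = 189892;
constexpr long kKdeCoreBytes = 2000212;
constexpr long kKUtilsBytes = 385852;

// Roughly 9.5: libphononcore relative to libkdecore.
constexpr double kVersusKdeCore = percentOf(kPhononCoreBytes, kKdeCoreBytes);
// Roughly 49: libphononcore relative to libkutils.
constexpr double kVersusKUtils = percentOf(kPhononCoreBytes, kKUtilsBytes);
```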

by superstoned (not verified)

really. look at these sites:
http://phonon.kde.org
http://plasma.kde.org
http://solid.kde.org

don't they all look BEAUTIFUL? really wonderful, professional and such?

respect to those who made them...

i think all KDE sites should use this look'n'feel, even if there are differences like between phonon and the other two - they don't have to look exactly the same, just similar.

by Ian Monroe (not verified)

I agree, http://www.kde.org could learn something. :)

by pinheiro (not verified)

:) Thanks, I feel embarrassed. We in Oxygen work very hard to make KDE's visual elements better and better.
We are working on lots of other KDE-related sites, and I can promise KDE is looking better by the day.

by superstoned (not verified)

you should feel proud of yourself :D

by Psiren (not verified)

My issue with the phonon site is the fixed width, requiring me to scroll left and right. Very little point, considering the lack of content in it. It may look good, but for me it's unusable.

by pinheiro (not verified)

As monitors get bigger and bigger, fixed width is IMO the best approach. Try to read a page on a 1900 screen in full-screen mode: you get entire texts on one line, and you constantly lose the starting point because your eyes lose their reference points.

by Jimmy (not verified)

I think 59em must be the optimum width for a page, because then you can fit ten words per line, which apparently is easiest to read. And the average word length is apparently 5 characters. So 5 characters times 10, plus 9 spaces would be 59em.

by Corbin (not verified)

If you have to scroll on a monitor set to 1024x768 with an almost full-screen window, that's a bit too large, considering most people still use 800x600. Other than making you scroll on a small monitor, I really love the new sites (the art is really nice, I can't wait for Oxygen!).

by Gary Greene (not verified)

Actually, I'd have to disagree about the 800x600 comment there.... Most I've seen in the last two years are at 1024x768, as monitors (LCD and CRT) have INCREASED in size.

by Kolbjørn Barmen (not verified)

I'm not impressed, what a waste of space.

And it's not close to validating, but I guess that's part of having a "professional look". I just hope these aren't indications of what will happen with KDE4.

by Koos (not verified)

What is the idea with the central configuration and choosing the right backend? I can imagine that depending on the use, e.g. local playback vs. web movies, one would like to choose a different one. Or can/should a movie player query all backends and let the user decide?
Btw http://developer.kde.org/documentation/library/cvs-api/kdelibs-apidocs/p... link is broken.
Who is handling network streaming for eg. mms and rtsp protocols, is that phonon or should the backends handle those (and what happens with password protected links then)?
Is there a way to specify whether the backends should run in-process or out-of-process? E.g. for konqueror web browsing one might not want to pull in the gstreamer libs, but for a movie player that might not be an issue.

by gerd (not verified)

Everything should "just work", you should not even have to think about it

by Koos (not verified)

That sounds pretty boring, count me out then :-)