An Analysis of KDE Speed

Our recent poll (courtesy of KDE.com) on the upcoming KDE 2.2 suggests that the area of greatest concern for KDE users is speed -- at this time, out of 3,463 votes, over 24% consider speed the most important issue for developers to address. Waldo Bastian, who developed the kdeinit speed hack among other things, has written a paper entitled "Making C++ ready for the desktop", in which he analyzes the various startup phases of a C++ program. Noting that one component of linking -- namely, library relocations -- is currently slow, he offers some suggestions for optimizations. An interesting read.

Comments

by Waldo Bastian (not verified)

Pedantic: "The memory footprint" doesn't drop by 800Kb; the memory footprint remains the same. It's just that 800Kb more of the footprint is shared with other processes.

The reason for this is partly that the linker modifies data in pages of memory (the so-called relocation), and by doing so such a memory page will no longer be shared. That is called "copy-on-write": normally the memory page is shared between different processes, but once you make a modification the kernel makes a copy of the page and lets you modify this copy, so that your change only affects your process and not any other process.
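
For anyone who wants to see the copy-on-write behaviour for themselves, here is a minimal sketch in Python. It is only an illustration, not what the dynamic linker does: a temporary one-page file stands in for a page of a shared library, and the function name private_write_demo is made up for this example. Two private (copy-on-write) mappings of the same file are created, one of them is written to, and the other mapping and the file are checked afterwards.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def private_write_demo():
    """Map one file twice copy-on-write and write through one mapping.

    Returns the first byte as seen by the written view, by the other
    view, and by the file on disk.
    """
    # Assumption: a one-page temp file stands in for a library page.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * PAGE)
        # ACCESS_COPY requests a private copy-on-write mapping: writes
        # change this mapping's memory but never reach the file.
        view_a = mmap.mmap(fd, PAGE, access=mmap.ACCESS_COPY)
        view_b = mmap.mmap(fd, PAGE, access=mmap.ACCESS_COPY)
        view_a[0:1] = b"B"  # stands in for the linker patching an address
        seen = (bytes(view_a[:1]), bytes(view_b[:1]))
        view_a.close()
        view_b.close()
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 1)
        return seen + (on_disk,)
    finally:
        os.close(fd)
        os.remove(path)


print(private_write_demo())
```

Only the view that was written to sees the change. The sketch shows the semantics only; it does not measure how much memory ends up unshared.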

It should be noted that about 400Kb of this 800Kb seems to be actually caused by the linker. The other 400Kb seems to be used up by other things.

Cheers,
Waldo

by ik (not verified)

hello,

Would this also be true for non-KDE applications, like applications using just glibc? (E.g. glibc also has to be relocated, and this also involves writing.)
Does this mean all used parts of glibc are no longer shared? And does this involve a lot of memory?
(Thus, could Linux memory usage in general benefit from such a kdeinit-like hack?)

by Waldo Bastian (not verified)

To answer your questions in reverse order:

> does this involve a lot of memory ?
No, it does not involve a lot of memory.

> does this mean all used parts of glibc are no
> longer shared ?
No, only a small part of glibc (the .got and .data
sections to be precise) is not shared. These
sections are typically quite small in C libraries:
2980 and 12804 bytes resp. for glibc.

> Would this also be true for non kde
> applications?

Yes, but to a much smaller degree.

Cheers,
Waldo

by Ganesan R (not verified)

A very interesting paper. I suspect that using the -Bsymbolic flag to link the KDE libraries can make a significant difference. Most of the symbols that KDE applications use should be resolved within the KDE libraries (or the dependent libraries).

Ganesan

by kdefan (not verified)

My impression about the poll: in the first days (or was it hours?) "KOffice" and "Konqueror" were far ahead. Since then "Speed" has constantly caught up, not only closing the gap but leading with over 170 votes!

Just wondering if it is possible to vote a hundred times via a script from the same IP without delay?

Of course speed is important, but I don't think that KDE is SO slow, when given enough memory, that "Speed" should get the most votes of all the options.

by Minh (not verified)

It means that people on the bleeding KDE edge are used to its speed and focus on extensions. Those that visit the site once in a while don't vote in the first days: I don't use KDE because it's too slow, but I voted nevertheless.
It reminds me of when I switched to Linux from another quick and dirty operating system: it took me weeks to adjust to the unresponsiveness of the system. Now I am used to it and don't expect windows to open instantaneously.
Please don't tell me to upgrade hardware, or send me a check with your comments.
Minh.

by kdefan (not verified)

>It means that people on the bleeding KDE edge are used to its speed and focus on extensions. Those that visit the site once in a while don't vote in the first days: I don't use KDE because it's too slow, but I voted nevertheless.

I would like to see a new poll which asks for the KDE version, MHz of CPU, RAM size and a rating for "Speed" between 1 and 5.

Users who don't use KDE but vote nevertheless will perhaps only know 2.0's speed. Users with low memory will perhaps vote for "Speed", not recognizing that it's actually their memory shortage and that they should vote for "Memory footprint".

A _possible_ summary could be: More people are pleased with 2.1's speed than with 2.0's. 80% of those who have at least 400MHz and 128MB are pleased with 2.1. 80% of those with less than 64MB are disappointed.

That would be interesting and perhaps show development progress.

by richie123 (not verified)

KDE 2.x needs about 48MB of RAM to get decent performance. You can make it run OK on 32 if you disable all the eye candy and klipper, but any machine with less than that is going to run into performance problems with any newer applications, not just KDE.

by Steve Gorwood (not verified)

I switch back and forth between Mandrake Linux and Windows 98. I recently upgraded my system from a 450MHz K6-2 with 128Meg RAM to a 900MHz Athlon with 256Meg RAM.

I find that KDE-2 runs slower (as measured by how long it takes to open a window to display my home directory, or to bring up a text editor) on the new system than Win98 runs on the old system with half the speed and half the RAM.

by Dre (not verified)

In later days other sites picked up the poll, so people who are not regular Dot readers were voting.

As for multiple voting, you can't do it (from the same IP address, anyway).

by Rob Kaper (not verified)

> Since then "Speed" has constantly caught up
> not only closing the gap but leading with over
> 170 votes!

People wanting more speed have slow computers so it took them more time to vote. ;-)

by Ghulam Sarwar (not verified)

Well... the perceived speed of KDE is a little bit slower compared to KDE 1.x, though it's still a lot faster than Gnome and Windows. And I really think that it's "perception" which matters. For example, when I open a link in Konqueror in a new window, it takes a lot of time. But if I open it in the same window, it appears immediately (though it still takes noticeable time). I can imagine that opening a new window takes a lot of time; unfortunately, that's what I have to do while I'm doing heavy surfing. Another thing I've noticed is that I can see the page rendering from top to bottom, from which I perceive that it's slow. On the other hand, IE may not render the page faster than Konqueror, but it's not noticeable.

by pasha (not verified)

Some time ago someone posted about a possible problem with GNU malloc which causes memory fragmentation and, as a result of that, unneeded memory consumption. Using an alternative malloc gave him significantly better results on memory usage. I can't find any reference on the GNU site. Is there somebody out there with a status update on this?

by Nicolas DUPEUX (not verified)

I think it was in KDE Kernel Cousins.

by Evan "JabberWok... (not verified)

Yes, it was in KC KDE:

http://kt.zork.net/kde/kde20010331_4.html#7

The last sentence of the comment reads: "It would indeed be interesting to see what the overall impact on the entirety of KDE would be if this new malloc() was used everywhere". I agree... anybody wanna recompile and see?

BTW - anybody know why KC KDE hasn't been updated this past week? It's already a regularly hit bookmark for me.

--
Evan

by Aaron J. Seigo (not verified)

Hi...

Brief answer: life imposed itself upon my schedule leaving pretty much no time to do anything on my "want to" list. This is actually my first chance to read the Dot since late last week!

KC KDE will be back this coming week however as things are once again in a (more or less) normal pattern for me.

Apologies to all those who were looking forward to this week's KC KDE.

by Theo van Klaveren (not verified)

Does anybody around here know how PHK malloc (FreeBSD's default malloc function) compares to GNU malloc in this way? Does it have the same behaviour?

by Laurence (not verified)

I find it a bit weird that when KDE starts up the CPU is between 15-30% idle (on SuSE 7.0) and a bit less on Mandrake 8. Any ideas?

by Rob Kaper (not verified)

CPU is not always the bottleneck; it is quite possible that a process is waiting for some bytes to be read from disk and thus cannot do anything but idle.

by Martin Macok (not verified)

How did you measure this? With top?

The numbers (especially about CPU usage) that top (and similar tools using libproc) report are not accurate AFAIK, especially with multithreaded applications.

by AC (not verified)

I think the kde window manager is also quite a bottleneck. It takes _much_ less time to start a kde application under Xfce or IceWM than under Kwm...
Which I find kind of strange...

AC

by Evandro (not verified)

even more strange is that kwm doesn't exist...

anyway, i tried this here with kwin and icewm and can't duplicate it. the apps take the same time to start.

by Matthias Ettrich (not verified)

Sigh, here we go again. KWin (KDE's window manager) relies on a couple of KDE libraries, including libkdecore and libkdeui.

Those libraries have a certain size, but are present in memory whenever you run a single KDE application.

Now, if you have a system with 8 MB of memory, and you do not run any KDE or Qt-based application at all, you may see a performance decrease on launching applications compared to things like blackbox or icewm due to the increased memory usage (again: not due to the speed of kwin).

If you have 32 MB of memory and you run an entire KDE desktop (that is: kdesktop, konqueror, kicker, some tools like klipper or kscd plus all the service daemons) you will notice a speed difference when starting new applications compared to a simple window manager.

But again, this difference is _NOT_ due to kwin but due to the lack of free memory.

If you are concerned that kwin isn't able to create enough window frames per second around new windows, run the following test program:

while true; do sh -c "xclock &"; sleep 1; killall xclock; done

The time you see the clock flashing is the time required to destruct a window frame, start a shell, start xclock, make xclock communicate with the X-Server, and have KWin create a decoration frame, do the smart placement and inform the titlebar. You will hardly notice any CPU usage when running the program.

If that isn't enough of a proof, run kwin standalone, without the rest of KDE.

When will people understand that all a window manager really is is a tiny little program that draws some stupid decoration frames around their application windows? I received bug reports that some programs had misbehaving scrollbars or weird menus and that this most certainly must be because KWin does something wrong.

Those things really bother me. A window manager is a frame around a window. Period. It has no influence at all on application behaviour or general look and feel (even if the S.u.S.E. manual tells a different story).

When I read on lwn that fvwm is faster than KDE, I really do not understand what they compare. Starting up a Konsole in KDE vs. firing up rxvt from the fvwm menus? Starting lynx vs. starting Konqueror? Starting KMenuEdit vs. vi ~/.fvwmrc? I cannot come up with a single thing that actually has something to do with the window manager (i.e. fvwm). Anybody out there with a clue what they possibly could mean?

by Tim N. van der Leeuw (not verified)

At home I have an aging P200MMX with 64Mb of RAM. At the time, a nice machine with lots of RAM.

Under KDE2, the computer runs slow and sluggish. Under WindowMaker, it's fast and snappy.

My wife and I have come to the conclusion that Win9X runs much, much better on this type of machine, although we both hate to admit that.

Reasons? Well, what I think is:

1) The KDE2 apps seem to take a long time to start up, which makes everything feel sluggish right away.
(I get the impression that the number of installed fonts makes a huge difference. When starting up the KDE2 desktop, the fontserver eats as much as 40%-50% of CPU for much of the time and if no fontserver runs the X server eats up that time. Starting up Gnome doesn't have the same effect. AA setting doesn't affect this.)

2) Under WMaker I often run a different set of apps, which start faster. Not entirely true, because I do use Konqueror or Konsole from time to time when using WMaker, BlackBox or FLWM.

3) WMaker is a simple window manager using little memory. KDE2 is a full-blown environment using lots of memory, running lots of little apps in fore- and background.

Conclusions:

1) We make false comparisons because we compare simple no-frills applications with applications that contain plenty of nice features and therefore extra code

2) The extra memory used by the KDE2 environment means more swapping, which vastly slows down everything - Netscape under KDE2 can be slower than under Wmaker, because more has to be swapped

So the overall computing experience while running KDE2 is that of a slow system, while the lightweight environments appear fast and snappy.

My subjective feeling is that the KDE2 environment *is* fast and snappy like FVWM, WMaker, what not -- once it's loaded. And that KDE2 applications *are* mostly fast and snappy like their non-KDE counterparts -- once they have started up.

And as long as not too much swapping is required.

So in my opinion it is most important for KDE2 to:
- Reduce memory usage
- Improve app. startup speed
- Improve overall speed, but not at the cost of vastly increased memory usage!
- Fix bugs, become more stable

- Keep improving Konqueror and KHtml - it's the greatest browser for *Nix, finally allows me to get rid of Netscape and beats the hell out of Mozilla until now
- Keep improving everything else

Anyone who wants to pay me for helping out with those things? ;-) (right now I don't even find time to play with my own pet projects, much less to start messing with KDE2 source code much as I would love to do it)

With regards and happy hacking,

--Tim :-)

by David Johnson (not verified)

Yeah, right, sure, okay, fine. But that's not the KWin window manager. That's KDE as a whole. The window manager is really small and memory efficient. It is actually much smaller than windowmaker, believe it or not.

I was once writing my own window manager just to learn how to do it. Looking over the sources to KWin, I was amazed at its elegance and simplicity. The only "bloat" to it was the kdelibs, but then, if you're running KDE already, this is nothing.

KWin by itself is extremely snappy and quick, rivaled only by Blackbox and other minimalistic window managers. Even icewm seemed sluggish alongside a bare KWin.

by Tim N. van der Leeuw (not verified)

Indeed, that's KDE2 as a whole. Not the WM. That's why I went on to explain why, in my opinion, KDE2 as a whole feels sluggish :-)

Actually, Gnome doesn't feel sluggish as a whole (discounting Nautilus for the moment!). And while I prefer Gnome over KDE1.1, there is no way Gnome can compare with KDE2 - so either I put up with the sluggishness or I revert to a bare-bones-just-wm setup. :-)

But when you run 'just a WM' the simple WM becomes in a way your desktop - it often has things like taskbar or program icons, launchers, little applets, pagers, etc. A WindowManager doesn't offer a whole desktop-suite with editor, calculator and filemanager but I don't think that those are what really makes up the 'desktop experience'. Filemanager does to some degree but old unixheads have gotten along without a filemanager for such a long time that perhaps it doesn't fully count as part of the desktop.

So when comparing KDE2 vs. WMaker, FVWM, FLWM, Blackbox etc you are comparing apples to oranges. But in a way, you are still comparing desktop experiences. Which makes the comparison unfair, but not totally invalid.

With regards & Happy Hacking,

--Tim :-)

by Joeri Sebrechts (not verified)

I agree.
I've had mixed experiences running GNOME and KDE, and eventually went with just a bare WindowMaker. It gives me elaborate window management, multiple desktops, plugin dock applets; in other words, most of the stuff the start bars of GNOME and KDE provide. And I can still run the loose apps from GNOME and KDE if I need them, without even getting much of a speed hit in comparison to a running KDE or GNOME.
WindowMaker is up and running within 4 seconds of pressing Enter after startx (on an old Pentium II/233). And the only desktop feature it doesn't give me that I miss is abundant drag-and-drop. There is _some_ support, but it's not enough. Copy/paste though is complete, once you figure out how to do it via the select/middle-mouse-click way, instead of via the keyboard shortcuts. And on top of all that, it even looks great. A lot better than I ever made kwin look anyway. WindowMaker is based on a principle I like called "less is more", meaning only minimize and close in button form, and for all the rest there are the shortcuts and menus.

Honestly, I don't get what desktop environments are about. Most of the stuff they do seems to be hogging resources I'd rather use myself (like arts taking over /dev/dsp, so only KDE-approved apps can use it, unless you know the workarounds).

by robert (not verified)

I'd be willing to bet that you have very little memory. All of KDE is huge compared to IceWM, so you are probably using virtual memory.

by Maarten ter Huurne (not verified)

From the article:
[When I now define a class "derive" that inherits "base" and that overrides a single virtual function, I will get a second vtable. I now need 20 relocations.]

I don't understand why twice the number of relocations is necessary. Can't the linker copy the vtable from the base class and then overwrite only the addresses of functions overridden by the derived class?

(I'm not that familiar with linking; it's quite possible the answer is "no", but please explain why.)
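
To make the counting in the quoted passage concrete, here is a toy model in Python. It assumes one relocation per vtable slot and a base class with 10 virtual functions (the quote implies this, since 20 is described as twice the number, but the exact class layout is my assumption). The names f0..f9 and the function vtable_relocations are made up for the example.

```python
def vtable_relocations(class_slots):
    """Count one relocation per vtable slot, summed over all classes."""
    return sum(len(slots) for slots in class_slots.values())


# Assumption: "base" declares 10 virtual functions, f0..f9.
base = [f"base::f{i}" for i in range(10)]
# "derive" overrides only f0. The other nine slots still point at the
# base implementations, but the second vtable has all ten slots anyway.
derive = ["derive::f0"] + base[1:]

only_base = vtable_relocations({"base": base})
with_derive = vtable_relocations({"base": base, "derive": derive})
print(only_base, with_derive)
```

The point of the model is that the second vtable is a full table, so every slot in it is counted again even though nine of the ten entries name the same functions as before.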

by Someone (not verified)

Because the linker doesn't know anything about C++.

by Someone (not verified)

Sorry, that was too brief. It couldn't be done by the linker alone. The compiler would have to be modified to create partial vtables, containing only function pointers which are overridden (the rest left blank), and generate special code to copy the inherited parts of the vtables. And then the loader would have to be modified to execute this code before the library or executable gets accessed (it could be a list of instructions to the loader instead of actual code, but that would be slower).

You'd have to somehow make sure the base class vtable always gets initialized before the derived class, which would be very tricky. Each vtable might need its own fragment of initialization code, or they could be grouped together based on which shared library contains their base class. The loader would have to be informed about the inheritance relationship of the vtables and would have to figure out what order to execute them in.

The extra overhead might cancel out any speed-up and the added complexity probably wouldn't be worth it.

by Shridhar Daithankar (not verified)

I may be wrong, but I was under the impression that all C++ features are taken care of by the compiler, and when the compiled binary is produced, it's like an equivalent C binary, with only one difference: the class member functions have an additional, unspecified (first/last) argument, *this, which points to the proper data to be handled.

Where does relocation come into the picture? For derived objects, the pointers are totally different.
In a compiled object, there should be no relation between base class vtable and derived class vtable.

Why have a dependency hierarchy in assembled code? What is the compiler supposed to do then?

Please enlighten me.

Shridhar

by Maarten ter Huurne (not verified)

That makes sense: the linker doesn't know about the base class / derived class relationship.

However, if the linker can see that it's resolving the same symbols twice, the results could be cached. When resolving the base class symbols there is actual work being done, when resolving non-overridden derived class symbols there is only a lookup in the cache.

So the questions are: Is a non-overridden function indeed the same symbol as the original function? If so, can that be detected by the linker?

by Waldo Bastian (not verified)

> So the questions are: Is a non-overridden
> function indeed the same symbol as the original
> function? If so, can that be detected by the
> linker?

Yes and yes. John Polstra of BSD fame pointed
this out as well and managed to get quite an
improvement by caching symbol lookups for this
case.
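
A rough sketch of the caching idea, in Python. This is not John Polstra's actual change, just a toy resolver that remembers names it has already looked up. It assumes a base class with 10 virtual functions and a derived class overriding one of them; the class name CachingResolver, the symbol names, and the addresses are all made up.

```python
class CachingResolver:
    """Toy symbol resolver that remembers names it has already looked up."""

    def __init__(self, symbol_table):
        self.symbol_table = symbol_table  # name -> address
        self.cache = {}
        self.full_lookups = 0  # how often the slow path ran

    def resolve(self, name):
        if name not in self.cache:
            self.full_lookups += 1  # stands in for the expensive search
            self.cache[name] = self.symbol_table[name]
        return self.cache[name]


# Assumption: 10 virtual functions in the base class, one overridden.
base = [f"base::f{i}" for i in range(10)]
derive = ["derive::f0"] + base[1:]
# Made-up addresses, one per distinct symbol.
table = {name: 0x1000 + 16 * i for i, name in enumerate(base + ["derive::f0"])}

resolver = CachingResolver(table)
addresses = [resolver.resolve(name) for name in base + derive]
print(len(addresses), resolver.full_lookups)
```

All the vtable slots still get an address written into them, so the number of relocations does not change; what drops is the number of full symbol searches, because the nine non-overridden entries in the derived vtable hit the cache.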

Cheers,
Waldo

by ac (not verified)

KDE should add features instead of trying to improve performance. You can always buy a new, faster computer. But you can't buy functionality... it has to be developed. Right now, KDE does not have enough features for the end user. The end user needs more configurability and options, which the KDE developers should aim to deliver.

by Marc Mutz (not verified)

Ha, ha. Good joke!

by thil (not verified)

No comment !

by dc (not verified)

Yeah right, as if everybody has so much money.
I have to wait at least another two years before I can afford a new motherboard and CPU!

by Sangohn Christian (not verified)

Could you please be more precise about which features YOU are missing in KDE?
Personally I'm happy with it. I've been running KDE since almost the beginning and the last feature I WAS missing WAS an office suite ;-).

So what exactly are you missing within KDE?

by Johann Lermer (not verified)

Well, I have an 8 year old tower case, a 5 year old motherboard and graphics card, a very new AMD K6 processor and a 1 year old 10 gig hard disk. Obviously this is no longer fast and big enough for operating systems these days, at least judging by your comment. Please feel free to send donations in any currency and any amount to: J. Lermer, Fuerstenfeldbruck, Germany. Thank you.

by Clippy (not verified)

Hi my name is Clippy. You may have seen me in such applications as Microsoft Office 97 or 2000. I'm currently job hunting and will work for RAM. I believe that I am just the right extra functionality that you need. Interested parties can contact me at [email protected]

by Bojan (not verified)

I partly agree with that. Speed is a very important issue and should not be underestimated. However, you cannot expect complex applications that use KParts and the other technological advantages that KDE offers to run fast on a 486. This is simply not possible. Flexibility and user friendliness are time-consuming issues, unfortunately. I use a 233MHz Pentium II with 64MB of RAM and KDE 2.1 is slow on that kind of machine. But this is the price for all the other benefits you get from using KDE. I really doubt that Windoze ME or 2K would run any faster (if they would run at all with that little RAM).

by Peter Nunn (not verified)

Just some other issues to think about when dealing with the shared library/dll comparison. I have spent most of my professional life working inside windows and some of the speed enhancements in NT are worth thinking about.
My understanding of the loader etc. (mostly from observation) goes like this:
The application does not call the DLL directly. Instead, stubs are generated that match the DLL entry table, so that if you call a DLL only the relatively small relocation table in the executable needs relocation. The code inside the DLL also has a jump table that uses relative code. There is generally not a lot of relocation needed.
The next big thing is that if a DLL is relocated, it is allocated in swap space and shared by all processes loading it. This means that in most cases the relocation is only ever done once.
If the memory area a DLL wants is unused, then nothing happens. The .exe will use the link-time offsets and the DLL is loaded into its memory location using a memory-mapped file. Since this is how the O/S also tries to load apps, the result is that if an app has pre-defined and non-conflicting addresses then the relocation time is 0 and the memory allocation is only the data needed.

I can suggest one way to speed up the time to load a .so, and that is to generate a proxy object that the application calls, the proxy then is the only part of the application that needs to have entry point addresses relocated. Should help?

by Chad Kitching (not verified)

Or perhaps it would just be easier to 'fix' the dynamic loader/linker. Generally when you install a new shared object (.so), you need to run the ldconfig(8) utility to update the ld.so.cache file. If I'm not mistaken, this process should also be extendable to automatically generate non-conflicting base addresses for all these libraries, much like how Windows executables and DLLs specify a preferred address to achieve the same result. This wouldn't help libraries that aren't in the standard library directories, but since few of these are shared between multiple applications, I doubt there'd be much sharing between them.
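
To illustrate what "non-conflicting base addresses" would mean, here is a small Python sketch. It is not something ldconfig does; it is only the bookkeeping part of the idea, and the function name assign_base_addresses, the start address, the page size, and the library sizes are all assumptions made up for the example.

```python
PAGE_SIZE = 4096  # assumed page size


def assign_base_addresses(library_sizes, start=0x40000000):
    """Give each library a page-aligned address range that overlaps no other."""
    bases = {}
    next_free = start
    for name, size in sorted(library_sizes.items()):
        bases[name] = next_free
        # Round the size up to whole pages so the next library starts aligned.
        pages = (size + PAGE_SIZE - 1) // PAGE_SIZE
        next_free += pages * PAGE_SIZE
    return bases


# Made-up sizes in bytes, for illustration only.
sizes = {"libkdecore.so": 1_500_000, "libkdeui.so": 2_000_000, "libexample.so": 6_000_000}
bases = assign_base_addresses(sizes)
for name in sorted(bases):
    print(name, hex(bases[name]))
```

With every library given its own range up front, each one could be loaded at the address it was prepared for, which is the situation described for Windows DLLs with a preferred address.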

The basic performance problem with KDE is the amount of RAM it consumes, more than anything else. On my laptop, without X or KDM loaded, I have roughly 22MB free (0M swap used). Starting kdm drops that figure to 6MB free (0M swap used), and after logging in, I end up down to 1MB free (2MB swap used). The moment swap starts getting used, performance is already long dead. If I had 96MB or 128MB instead of 64MB, it'd probably seem much faster. As it stands, KDE does not perform well in such 'low memory' situations. RAM may be cheap right now, but that is no help to someone who may be trying to squeeze some extra life out of an old Pentium machine (most of which can only cache up to 64MB of RAM anyway).

Then again, I could be completely wrong... I'm just rambling uneducated guesses about all of this.

by PG (not verified)

Why is it not possible to run fast on a 486? Wouldn't it be very nice if it did, especially if you live in an underdeveloped country where computers are very expensive? Comparing KDE to Windows isn't the right view. Instead, compare KDE/Linux to Be. In that view, KDE/Linux has a long way to go and Waldo has done a great service by trying to go further down that road.

by Carbon (not verified)

Speed enhancements are good, but there is a certain limit. A 486 is an extremely old piece of hardware, and expecting KDE to run quickly on it is simply asking too much.

by Con Kolivas (not verified)

Actually I disagree. I almost exclusively use linux with kde2 on Mandrake8 at home on a reasonably fast PC and find that the few minutes I'm using windows everything is faster (simple loading and closing of programs, file manager etc..). Sad but true. Even on a fast PC, it is relatively slow. I've never had click->snap! kind of performance from kde1 or 2. I still love it though.

by BillyBoy (not verified)

They run with 64MB.... and even with O2k installed.

by Alan Chandler (not verified)

I disagree. Firstly, when it is obvious from an analysis like the one we see here that there are things that can be improved, they should be, because there are sure to be other performance issues that will bite you anyway. Secondly, it is precisely the frame of mind which says "hardware will improve, so let's ignore improving the software" that has us struggling with usability timings. Add layer upon layer of that viewpoint in a complex application of today and that is precisely why the problems occur.

by Toni (not verified)

I think functionality is important, obviously, but speed is also very important.
I was using KDE 2.1, and some days ago I went back to booting Windows 98. Why? Because Konqueror may well have the same functionality as Explorer or IExplorer, but when you open Konqueror for the first time, or close it and open it again, it is very sloooow. Do the same with Explorer: the first start is somewhat slow, but opening it again takes <1 sec. (I have a Pentium II 233 MHz with 128 MB RAM.) If you use Explorer to move or copy things between folders, the operations are quicker and more flexible on Windows than in KDE. The same goes for Word or Excel: starting one of these applications is very quick, and reopening it after closing takes about 1 sec. You can compare similar programs on Linux, and imagine... There is something that does not work as it should. Perhaps it is the ELF executable format, or library linking, or libraries need to be kept resident in memory, but there is something that does not work as it should. Perhaps it would not be a bad idea to "look" at how Windows does the work.
I think that buying another machine is not a solution, because Windows will be much faster on it too. And Linux being a "robust" system is not an excuse; Windows NT is also robust, and it is faster than Win98 at running programs.
I don't want a war between Linux and Windows. I am just pointing out the facts. There is something that does not work in Linux and in its desktop environments.