Continuing the series of articles previewing KDE's World Summit,
aKademy (running from August 21st to 29th), Tom
Chance interviewed Matthias Ettrich, the founder of the KDE project, the creator of the LyX
document processor, and an employee of Trolltech. At aKademy he will be talking about
how
to design intelligent, Qt-style APIs. I asked him for his thoughts about the status of the
KDE project, its achievements, and what he is looking forward to in aKademy. You can read the
previous interview with Nils Magnus of LinuxTag here.
Q: We all know you as the founder of KDE, but what is your current role in the project?
Matthias Ettrich: Today I am very much focused on KDE's underlying technology, the Qt
toolkit. This is pretty much a full-time job, so I no longer feel bad about not actively contributing code to other parts of KDE. When you take a step back and recognize how much the KDE team achieves in relation to its financial backing and the number of developers, you'll clearly see how important a solid foundation is. We are an insanely productive development community, and we achieve that by layering our software stack and investing in the foundation instead of constantly reinventing the wheel.
It's all about developers
and what developers need to be efficient. Every hour spent on Qt and
the KDE libraries is an hour spent wisely, because an ever-growing
number of applications benefits from it. So that's what I do.
In addition, my Trolltech position allows me to contribute indirectly to
KDE's success: Some of our engineers can do part-time work on KDE, we
sponsor David Faure, and of course we are an aKademy gold sponsor. On
a more personal level I do my share of giving talks and interviews, I
make an effort to bring people together, and I try to actively help
with community events like last year's conference in Nove Hrady and
this year's aKademy.
Q: What is your favorite development in the project since you started it?
ME: The greatest thing for me is that we managed to grow the project while
keeping its initial culture and soul intact. We started out with a
relatively small group of equals that cooperated purely based on
mutual respect and technical merits. This is pretty standard for small
engineering groups. What makes KDE special, though, is that we managed
to scale this to the overwhelming size the project has today. With KDE
e.V. and its statutes we have found and established a mechanism that
makes sure KDE stays this way: a project owned and controlled by its
active community of individual contributors. Establishing KDE e.V. and seeing it gain acceptance within the KDE community was probably the most important non-technical development that has happened, and this process is far from over.
Q: Almost four years ago [1] you said that in 2005 you would be a manager due to the success of KDE (which "will be a leading desktop platform by then").
Given that you only have one year left, what are your thoughts on this
prediction?
ME: Well, I have been working as a Director of Software Development for
some time now, so for me it has already come true. Luckily my concerns about being a manager turned out to be exaggerated: managing people is not as bad as I anticipated. Lesson learned: one should
not rely on Dilbert as the only source of information. The obvious
downside is less time for coding, but it comes with a strong upside:
by working through a team you can achieve far more than what you could
do on your own. Just imagine somebody offered you 50 extra hands. And
not only that: each pair of hands came with a brain of its own, each
with extra skills and talents that complete your own. Now, how good
does that sound?
With regards to KDE becoming a leading desktop platform: we are
already, in many areas. We are a leader in terms of active community,
in terms of network integration, in terms of providing freedom and
choice to desktop users, and in terms of providing a sophisticated
development framework for application developers.
Q: What do you think the "next big thing" in KDE will be?
ME: There is one thing that will become increasingly important in the
future, not just for KDE, but for all of Linux: a convincing answer to
Microsoft's .Net. I'm not concerned about the server, I'm concerned
about the client, and about the belief that some people in the
community share, that you can successfully clone Microsoft's APIs and
then keep up with them. Free software should not be about cloning, but
about creating. If we want to be successful, we need to have our own
APIs. And guess what, we are really good at that. There is no reason
to throw everything away and start all over again from scratch. Instead we must build upon what we already have, and that is native code.
Native code is and will be the solid basis of every successful
computing platform, simply for its flexibility, its performance, and
its low memory consumption. With KDE and Qt, it's easy to develop
native code. Once you get the hang of it, it is easier than
e.g. developing complex applications with Java/Swing.
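(For readers who have not used Qt: a minimal native Qt application looks roughly like the following sketch. The Qt 3-era API is assumed and the widget choice is purely illustrative.)

// A minimal native Qt application: one stack-allocated widget, one
// signal/slot connection, no managed runtime involved.
#include <qapplication.h>
#include <qpushbutton.h>

int main(int argc, char **argv)
{
    QApplication app(argc, argv);

    QPushButton quit("Quit", 0);                 // parentless push button
    QObject::connect(&quit, SIGNAL(clicked()),
                     &app, SLOT(quit()));        // clicking it ends the event loop

    app.setMainWidget(&quit);
    quit.show();
    return app.exec();                           // enter the native event loop
}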
Still it would be nice to take advantage of JIT-compiled bytecode where it makes
sense, and have the two worlds interoperate. Currently there are two
technical options: integrating Mono and the CLR, or going for a Java
Virtual Machine. Mono at present has several advantages: First, there
is no free JIT-compiling JVM that is equally actively developed and it
doesn't look like there will be one. Second, cooperating with Miguel
and the Ximian group at Novell is probably a lot easier than
cooperating with Sun. And third, it is easier to integrate native
C++ code with the CLR than going through the JNI.
Q: What are you looking forward to in aKademy?
ME: Meeting people, having fun, and watching KDE improve! Every KDE
conference so far has been a big happy gathering of friends that
kick-started an insane commit rate to the CVS. And there's no reason
why aKademy 2004 should be any different.
Q: What do you think people should make a special effort to attend at
aKademy?
ME: There are so many interesting things going on at aKademy that it's hard to
pick just one. But if you are a developer and haven't thought much
about accessibility yet, I suggest you listen to Aaron Leventhal's
opening speech of the Unix Accessibility Forum on Sunday. Assistive
technologies are not only an interesting technical challenge, but an
area where we as a free software project can make a real difference in
many people's lives. For the Users and Administrators conference I suggest you give the groupware and collaboration track some special attention. Kolab and Kontact are exciting projects that have not yet gotten the attention they deserve. And nobody should miss the social event on Saturday when we celebrate Software Freedom Day.
Q: Thank you for your answers and your time.
ME: My pleasure :)
Comments
There is a rumor that one major distro (SuSE) will be shifting to GNOME, and another that Novell will offer two products, the SuSE Linux Desktop and the Novell Linux Desktop. A question should have been asked of ME about what he thinks of the potential shift. I know that he'd probably not want to speculate, but as a programmer, I am sure he understands what I mean when users want to see solutions to [potential] scenarios.
Cb..
Look out for an interview I'll be publishing soon with Waldo Bastian, a SuSE employee :-)
Is this going to be on the dot, or elsewhere?
thanks,
Jason
SUSE is not shifting to GNOME. They are using KDE by default for SUSE Linux 9.2 and will continue to do so in the future.
The Novell Linux Desktop (which is targeted at business users while SUSE Linux Personal and Professional will be targeted at the common user) might be released next year and will let you choose between KDE and Gnome.
Good idea.
I like choices, whether between distros or between desktops.
While choice may be a great thing for the common nerd, not too many corporations appreciate it. It just makes matters more complex, and thus, more expensive.
What Novell want to do here is give corporations a look at what is on offer, and make a decision based on a variety of needs.
of course. however not every corporation chooses the same technologies for various reasons (some good, some not so good). because an OS is not a final product for one specific person, group or company, the whole "companies will only use one thing internally anyways" argument is bunk as a reason to limit choice as to what is available.
as long as interoperability and Freedom are maintained at the highest levels, then allowing entities to choose what works best for them (and then stick to it for as long as they wish) is a great selling point. it offers "best fit" scenarios, built-in (friendly) competition and a failsafe (if one desktop stops meeting our needs......)
a good analogy may be gleaned from asking yourself why there are so many different types of vehicles on the road (sedan, coupe, SUV, pickup truck, semi, etc).
I really don't think that they're going to do that. Though SuSE allows you to run both GNOME and KDE, it is mainly a KDE distribution. I would assume that at least 90% of SuSE Linux users are running KDE on their desktop and I'm pretty sure that most of them don't want to miss it. So if Novell forced SuSE to switch from KDE to Ximian GNOME as the default desktop choice, they would risk losing a significant portion of their customers. On the other hand, Novell not only bought the KDE company SuSE, they also bought Ximian. But if you look a little closer you'll notice that they bought Ximian for about 10 or 20 million dollars and SuSE for about 220 million dollars. Therefore, risking the larger investment would be pretty stupid.
I'm afraid that there are politically opposed forces within Novell who are trying to tell everyone the way things are going to be (even in direct conflict with existing product lines, and even their own Vice Chairman). I have a feeling they're going to be rather embarrassed.
"Still it would be nice to take advantage of JIT-compiled bytecode where it makes sense, and have the two worlds interoperate. Currently there are two technical options: integrating Mono and the CLR, or going for a Java Virtual Machine. Mono at present has several advantages: First, there is no free JIT-compiling JVM that is equally actively developed and it doesn't look like there will be one. Second, cooperating with Miguel and the Ximian group at Novell is probably a lot easier than cooperating with Sun. And third, it is easier to integrate native C++ code with the CLR than going through the JNI."
Miguel is so bloody good with his marketing that he completely bamboozles even people like Matthias. There is development on a free JVM and a free Java class library that is every bit as active as that on Mono. You can, right now, use Java to code Qt applications, compile them to native code and _still_ have the advantage of a garbage collector that manages your memory for you. JNI is a crutch, but CNI is a great thing. GCJ, GIJ, Classpath and CNI are truly free software, and are at least as stable and useful as Mono. Dash it -- every time Miguel prances about showing off Eclipse, what he shows is ikvm and classpath.
Free Java on the desktop is ready _now_ -- or at least, more ready than Mono. The only reason people don't know about it is that a) you don't need it if you have Qt and C++ (Java is nicer, but the difference isn't a killer; C++ and Qt together are easy and productive enough) and b) the gcj/classpath people are really bad at marketing.
Though I definitely like working with Qt and C++, I will now have to use Java and Eclipse for my next programming job. I started using Eclipse just a few weeks ago and I have to say that it indeed has a few very nice features. The only thing that annoys me is that it looks ugly and feels slow and kind of alien in a KDE environment. I think that this is mainly the result of using GTK as the underlying toolkit for Eclipse. So my question is, do you know if there is an effort under way to implement the SWT interface with the Qt or KDE libraries?
Anything IBM looks ugly and feels slow. Have you ever tried any of the Visual Age or Visual Studio stuff by IBM? It's not pretty, and is slower than molasses flowing uphill in the winter.
Not really -- Eclipse with Qt, I mean. On the other hand, Redhat has a natively compiled Eclipse 2 that is rumoured to be as snappy as any other GTK application. (Which doesn't impress me all that much, but then, my main beef with Eclipse is that I have to do a lot of work to free the editor from the surrounding blurb.)
Hmm, too bad, because a Qt version of SWT would surely also be a nice way to write (simple) applications that work and look nice under KDE and Windows.
P.S.: Thank you for your work on Krita. I'm really looking forward to having a paint application with a sane UI under Linux. :-)
"Hmm, too bad, because a Qt version of SWT would surely also be a nice way to write (simple) applications that work and look nice under KDE and Windows."
IBM's CPL license isn't compatible with the GPL, so that isn't possible.
I think the Qt API, rendered via the QtJava bindings, is more complete and elegant than SWT - so why do we need SWT? The QtJava development and marketing budget is obviously a bit smaller than IBM's.
Hi Richard,
I'd like to write a small Java application and it would be really nice to have a native-looking user interface under KDE. However, it is also important that this application works under Windows. Does that mean that I have to write the user interface twice, once in Swing and once in QtJava?
Another question is, is QtJava still maintained and will it work with the newest KDE and Qt versions? The readme file under http://developer.kde.org/language-bindings/java/qtjava-readme.html says something like this: "Here are Java JNI based api bindings for Qt 2.2.4."
Thank you
"I'd like to write small Java application and it would be really nice to have a native looking user interface under KDE. However, it is also important that this application works under windows. Does that mean that I have to write the user interface twice, once in Swing and once in Qt Java?"
I don't have a Windows development environment, but as far as I know QtJava works perfectly fine with very little change on Windows (and Mac OS X too). I don't really understand Qt/Windows licensing issues too well - if you distribute your small app, you might have to include the QtJava sources to comply with the GPL.
"Another question is, is Qt Java still maintained and will it work with the newest KDE and Qt Versions? The readme file under http://developer.kde.org/language-bindings/java/qtjava-readme.html says something like this:
Here are Java JNI based api bindings for Qt 2.2.4."
Well developer.kde.org/language-bindings/java doesn't sound as though it is being maintained too well, but the bindings themselves are in good shape. If anyone fancies sorting out the docs on developer.kde.org/language-bindings please go ahead..
"I don't have a windows development environment, but as far as I know QtJava works perfectly fine with very little change on windows (and Mac OS X too)."
Wow, that sounds cool, but I guess I would have to buy a Qt version for Windows. Still, that would be an option.
"I don't really understand Qt/Windows licensing issues too well - if you distribute your small app, you might have to include the QtJava sources to comply with the GPL."
With a purchased version of Qt, which I guess is not under the GPL, do you know whether I would have to include the source of my program? I mean, is the code of QtJava completely GPL or is there a way to use QtJava under the same conditions as the commercial Qt version? If not, then you could possibly try to sell QtJava to Trolltech. This way it could stay GPL under Linux but could be bundled with Qt under a commercial license for Windows. But that's just a thought.
Regarding SWT, do you really think that there is no way to combine the CPL and the GPL? Because I would think that QtJava would be a pretty solid foundation for a Qt implementation of SWT. Maybe it could be done the same way as NVIDIA integrated their driver into the Linux kernel. I still think having a native-looking version of Eclipse would be a real gain for the KDE environment.
Finally, I managed to test some of the demo apps that are in the java-bindings source package (I couldn't find them on my computer even though I had the java-bindings package installed). A problem that I had with these demos was that I had to set the LD_LIBRARY_PATH. I thought that this would not be necessary since libqtjava.so is installed in the /opt/kde3/lib dir. But it seems it is necessary.
Anyways, it works now very nicely and I guess I'll use it.
Great work!
" I mean is the code of qtjava completely GPL or is there a way to use qtjava under the same conditions as the commercial Qt Version? If not, then you could possibly try to sell qtjava to Trolltech. This way it could stay GPL under Linux but could be boundled with Qt under a commercial license for Windows? But that's just a thought."
If there were sufficient demand for a commercial version of QtJava I would be happy to dual-license it (with Trolltech's permission, as normally you can't change from the GPL version of some Qt software to issue a commercial version). But there hasn't been any real demand so far..
"IBM's CPL license isn't compatible with the GPL, so that isn't possible. "
I always wondered why there was such hype about SWT when its licence makes it impossible to use it in GPL applications.
Anyway, does anyone know if CPL is also incompatible with QPL?
Cheers,
Kevin
Well, the point of showing Eclipse running on IKVM is to show that our JIT engine is mature enough to run something as complex as IKVM, with something as complex as Eclipse running on top of it.
Great kudos should go to the hackers that develop GNU Classpath, without which
Eclipse on IKVM would not be possible.
Miguel.
Um, maybe it would be a good idea then to give these kudos in public whenever possible. 'Ikvm + classpath == eclipse' running means you're using Java, C#. (Irrelevant aside: I always feel like I'm in Amsterdam when reading about that language -- hiss, see? hash!) Anyway, what I like classpath and gcj for is portable code that compiles to native code and still uses garbage collection.
C# isn't about hash, it's about Chash.
> At aKademy he will be talking about how to design intelligent, Qt-style APIs.
This sounds very, *very* interesting. Will there be a transcript of the talk for those of us who won't be at the aKademy? Thanks! :)
Will you guys (KDE community) be doing any outreach on Software Freedom Day as well? Globally, we'll be handing out about 10 000 packs containing TheOpenCD and Knoppix (both in special editions) at 18 locations. I still have some materials (printed CD covers + sleeves) available if anyone in Ludwigsburg wants to hand stuff out. Our customised Knoppix CD (3.4) has KDE 3.2 as default, though you would probably want to make a fresh one with KDE 3.3 (Knoppix 3.6?).
See info about the materials here:
http://www.softwarefreedomday.org/article.php?story=20040726161207293
and the aKademy wiki page on our site here:
http://softwarefreedomday.org/wiki/index.php/AKademy%2C_Ludwigsburg
- Henrik
"Native code is and will be the solid basis of every successful computing platform, simply for its flexibility, its performance, and its low memory consumption."
That's FUD. First of all, in theory there is no difference in performance between native code and managed code. Both are just different forms of expressing code. Every output of a native compiler can be generated by a runtime that uses managed code. And every optimization that a JIT can do is also possible with native code. The only difference is the point at which the native code is generated, but as a JIT can cache it, there is no real problem.
Things are a little bit different if you consider how much work different types of compilers need. Writing a simple JIT compiler is usually more work than writing a simple static compiler. If you want to write a static compiler with a moderate number of optimizations (like gcc), this is less work than a JIT compiler.
However, if you write a truly dynamic compiler, doing it with managed code is far easier than with native code, because managed code is easier to work with. Classic native compilers like gcc are static: they just compile the code, one compilation unit after the other. After compilation the code won't be changed anymore. The linker does not modify the code either, so it cannot inline a library function, leaving out a lot of potential for optimization. Dynamic compilers do not have such boundaries for optimization. All code is equal for them, because all code is stored in the preferred way for optimization.
There are some optimizations that a static compiler can't do and which are only possible for dynamic compilers. For example the JavaVM checks whether a virtual method is actually overridden by a class. If not, it is treated like a non-virtual method and can enjoy optimizations for non-virtual methods like inlining. Because Java allows loading new classes while the app is running (like gcc's C++ does with dlopen()), it may be necessary to recompile the method as a real virtual method while the program is running. A static compiler can't do optimizations like this, it always has to optimize for the worst case.
That's why, in the end, managed code will win. Or at least if you want the same kind of performance with native code, it will be much more work. If today managed code does not match the speed of native code yet, you can be sure that MS has sufficient resources to make it faster in the long run.
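To make the devirtualization example concrete, here is a hedged C++ sketch (the Shape class is invented for illustration): compiled statically, the loop below generally has to go through the vtable, because an overriding subclass could be linked in or loaded later via dlopen(); a JIT that sees no override loaded at runtime can inline the call and recompile if that changes.

// Illustrative only: why a static compiler usually keeps the indirect
// (virtual) call that a JIT could remove when no override is loaded.
#include <iostream>

struct Shape {
    virtual ~Shape() {}
    virtual double area() const { return 0.0; }   // could be overridden by a
                                                   // class loaded at runtime
};

double sumAreas(const Shape &s, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += s.area();   // virtual dispatch in the general case; a JIT that
                           // knows area() is never overridden could inline it
    return sum;
}

int main()
{
    Shape s;
    std::cout << sumAreas(s, 1000) << std::endl;   // prints 0
    return 0;
}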
"That's FUD. First of all, in theory there is no difference in performance between native code and managed code."
Please tell me that's some sort of joke. First of all, a managed environment has not been proven simply because major things have not been re-written in one. Once a whole desktop is re-written with a managed environment and it performs acceptably then we'll know. As it is, that hasn't happened and not even Microsoft is going to re-write Windows for the CLR - although of course, there will be 'seamless' interfaces so no one notices :).
As we have seen with various benchmarks, managed code can be fast. However, once you start running everything through it and have six or seven applications open at the same time, all targeted for managed code, that is somewhat different. That's where Java has been found wanting over the years. Because of garbage collection, and the overhead of the managed environment itself, it is always going to consume more resources than native code.
For providing an environment for desktop applications that are easier to develop for, easier to debug (potentially) and work with, managed code is certainly a plus. However, in terms of a system as a whole you're just never going to get managed code everywhere (and Microsoft will certainly never achieve it). Managed code will not win, but it will be useful for some tasks.
"First of all, a managed environment has not been proven simply because major things have not been re-written in one. Once a whole desktop is re-written with a managed environment and it performs acceptably then we'll know."
You don't have to. There's simply not a single optimization that native code can do and managed code can't. (I didn't doubt that implementing a managed code runtime is more work)
"Because of garbage collection, and the overhead of the managed environment itself, it is always going to consume more resources than native code."
Garbage collection has nothing to do with managed/native code; you can have managed code without it (there are C compilers for IL). GC is a language issue.
"For providing an environment for desktop applications that are easier to develop for, easier to debug (potentially) and work with, managed code is certainly a plus."
No, these are not advantages of managed code. I would even doubt that the programmer has to notice the difference. The reason why developing with Java or .Net is nicer than with gcc is just the age and history of make, gcc and the whole unix build system. The advantages of managed code are that it is usually much easier to
- check whether code is secure / run it in a sandbox
- run the same binary on multiple CPU architectures
- manipulate the code (optimize, analyze etc)
Note that all this is possible with native code, it's just more difficult for the developer of the system.
> - check whether code is secure / run it in a sandbox
These are features that must be built in the OS, not in the
programming language or the VM.
> - run the same binary on multiple CPU architectures
This is made possible by building multiple personalities
in the CPU. Furthermore, if you can run a JIT compiler you
can translate the code from a reference architecture to
another on the fly.
> - manipulate the code (optimize, analyze etc)
This can be done with native code even better.
There is no need for virtual machines and 'managed code'
when you have the OS and the computing architecture handling
all this stuff.
People should spend more time improving the OS instead of
trying to reinvent an OS on top of the OS.
/Gian Filippo.
>> - check whether code is secure / run it in a sandbox
>These are features that must be built in the OS, not in the
>programming language or the VM.
Java implements it in the VM without any help from the OS...
>> - run the same binary on multiple CPU architectures
> This is made possible by building multiple personalities
> in the CPU. Furthermore, if you can run a JIT compiler you
> can translate the code from a reference architecture to
> another on the fly.
As I said, it is possible (WinNT did this to run x86 software on Alphas), it is just less work to translate from an intermediate language like IL or Java's.
> - manipulate the code (optimize, analyze etc)
> This can be done with native code even better.
I doubt that. It's extremely hard to do any automatic optimization on assembly code like x86. First of all, assembly has more instructions to deal with, and in x86 they are not exactly nice. With managed code you have much more knowledge about the program's intentions. You know which method is virtual (and thus could be converted into a non-virtual one), you can easily see where a mutex is acquired (any modification without this knowledge can be pretty dangerous) and so on. In many cases you probably need to convert the assembly code back to an intermediate language anyway, because it's hard to modify code with a limited number of registers.
And then, even if you manage all this, you also have the problem that your code will be very platform specific and is useless on other architectures.
>>People should spend more time improving the OS instead of trying to reinvent an OS on top of the OS.<<
Basically I agree with that, but Linux in today's form is pretty useless in this respect. It just lacks infrastructure for too many things that would be needed.
Hmmm,
Every line of code that is run on a system is native. Otherwise, it would not run. Or am I wrong here?
(I assume that if the CPU contains emulators, the emulated code is native too, as that's in the CPU; so by native I mean something the CPU understands.)
Your JIT compiler compiles the program to native code.
It might do it in a different way than a "static" compiler.
Quote:
"And then, even if you manage all this, you also have the problem that your code will be very platform specific and is useless on other architectures."
The same with JIT. When it's compiled, it's platform specific.
JIT compiling is good for small programs, not for large ones (in my opinion).
I don't think my PII 233MHz would like "office"-size JIT compiled programs.
And I've heard stories, although I can't confirm them or know if they are true, that the garbage collector etc. sometimes deletes the wrong objects, or thinks it can delete them.
I've always found that garbage collection is something for people who do not know how to program (personal opinion, although I understand it's faster to code with). I prefer using profiling and debugging tools to analyse my programs.
"Every line of code, that is run on a system is native. Otherwise, it would not run."
The main difference is the form in which the code is distributed, and which kind(s) of code are stored on disk.
With native code you distribute and store code that's optimized to run on the CPU with as few modifications as possible (usually you still need to link before actually running).
'Managed code', byte code, IL and whatever they are called are optimized for other purposes: for example, for being easy to translate to a variety of CPU architectures, for being secure (programs can only access their own memory), for being easy to manipulate, for being small, and so on. Not necessarily all of them at the same time, depending on the designer's goals.
"The same with JIT. When it's compiled, it's platform specific."
Yes, but you can rely on having the byte code available. This byte code is the same on every platform, so you can manipulate the program in a platform-independent way, even if the runtime later turns it into native code.
"JIT compiling is good for small programs, not for large ones (in my opinion).
I don't think my PII 233MHz would like "office"-size JIT compiled programs."
Compilation results can be cached though.
"Compilation results can be cached though."
You've said this a few times, so if it can be cached, why isn't it? Apparently it's not as easy as you believe.
I think you can argue all you want about how managed code is/can be just as fast or faster than native code, but the fact is: it isn't. Perhaps it's possible to code a program in Java/C# that's just as fast as the same one in C++, but in practice, this doesn't happen. Off the top of my head, the non-trivial java apps that I use or have used:
Eclipse: usable, but still very slow.
Azureus (java bittorrent client): painfully slow.
Freenet (java): huge memory hog, takes forever to start up.
Frost (java Freenet client): same as above, almost unusable on a 256MB RAM machine.
I've had some better experiences with C# apps but still nothing close to native speed.
Theoretically, managed code may be just as fast, but realistically, with today's major platforms, it isn't. And you can't get around that reality with words.
"I think you can argue all you want about how managed code is/can be just as fast or faster than native code, but the fact is: it isn't."
Give it some time. C# is still very new. Java has already made a lot of progress from its early beginnings, but they started with an interpreter and Java is not exactly a language that has been designed with performance in mind.
I never claimed that managed code is faster today. I just complained about the statement that native code will be faster and whatnot for eternity.
True. I misread your original post a bit.
In any case, I think eventually it will become a moot point.
If I can implement features much faster in a managed language than in a static one then I can certainly live with the end result being a bit slower. Current average hardware isn't quite at the point where we can do this everywhere, but it will be. Just like the transition from assembly to C to C++. The trend is towards languages that allow faster programming at the expense of some speed.
Maybe in 50 years I'll be coding in C##++ on a PIC :)
> Java implements it in the VM without any help from the OS...
That's the problem. It's the wrong place.
> it is just less work to translate from an intermediate language
Every language is an intermediate language, to some
extent. It is the VM idea that is useless, not the idea that
the code produced by the compiler can be optimized on the
fly for the target architecture or the target CPU.
> Linux in today's form is pretty useless in this respect
I completely agree. That's sounds like a good reason for
improving it.
The main motivation for building VMs is to run code on a
wide-spread architecture and OS without writing that code
for the specific OS and, most importantly, its API. It is a
good motivation, I think, but is not going to last once you
have the same code running on that architecture and that
OS natively.
Soon we'll see Microsoft Windows running native Linux
programs and Linux running native Windows programs. The
platform that will win will be the platform that offers
the better facilities for running those programs securely,
with the best scalability and the best manageability.
There is an alternative, though. Somebody comes up with a new
computing architecture that requires programs to be written
for a new API or a new programming language. Whoever controls
the API, again, controls the platform. It is not a matter
of having the best OS anymore. It becomes a matter of having
the vast user base required to impose the API. The idea is
very good, if it works.
/Gian Filippo.
"Dynamic compilers do not have such boundaries for optimization. All code is equal for them, because all code is stored in the preferred way for optimization."
Good static compilers have no such boundaries either --- LLVM will do inter-library optimization at link time. Also, java or CLR bytecodes are a terrible representation for optimization. Before doing any optimization on CLR code, Mono first converts it from the stack-oriented CLR model to a virtual-register oriented SSA form.
"For example the JavaVM checks whether a virtual method is actually overridden by a class."
Static compilers do this too. For high-performance native compilers for Lisp or Smalltalk, such optimizations are standard operating procedure.
"A static compiler can't do optimizations like this, it always has to optimize for the worst case."
No it doesn't. The capability to do optimization in the presence of dynamic code has nothing to do with native code and everything to do with having a compiler available at runtime. Static compilers for Lisp can do these sorts of optimizations, even though they generate native code.
"Good static compilers have no such boundries either --- LLVM will do inter-library optimization at link time."
Hmm? I don't know LLVM very well, but I'd call LLVM a dynamic compiler with managed code (their virtual instruction set).
"A static compiler can't do optimizations like this, it always has to optimize for the worst case.
No it doesn't. The capability to do optimization in the presence of dynamic code has nothing to do with native code and everything to do with having a compiler available at runtime."
If they can compile at runtime, I would call them dynamic compilers with managed code. The only difference may be that they don't use an intermediate language, but work directly on the source code.
"Managed code" and "native code" have rather precise definitions. Managed code is stored as an abstract bytecode, and compiled *on-the-fly* to machine code. Native code is stored as machine code, and executed directly. JIT's and compilers available at runtime blur the lines, but the fundemental distinction is still that JIT's compile on the fly by default, and cache native code as an optimization, while native compilers use native code by default, but make it possible to regenerate that code. Certainly, using the accepted definitions of "native code compiler" and "managed code compiler," Java/C# are managed code platforms, while LLVM/CMUCL/etc are native code platforms.
Performance and memory consumption are not only a matter of the instruction set and how the runtime system executes it; they depend largely on data structures, the object model itself and on what kind of APIs and programming style the programming languages in use imply or encourage. This is not about static versus dynamic compiler optimization techniques or the benefits of having another layer on top of the CPU's instruction set. This is about comparing how developers today write native applications on both Microsoft Windows and Linux, and how they write Java or .NET applications. So I could have said "C/C++" instead of "native code".
C++ does not necessarily mean speed, even if the language makes it harder to write slow programs. Just imagine a C++ programming style where all functions and inheritances are virtual, all classes inherit from a common base class that does central object registration, all objects are always allocated on the heap and every single pointer or array access is checked beforehand. If you then create an API that encourages the creation of massive amounts of small objects that call into each other through listener interfaces, and requires dynamically checked type casts to narrow objects, then you would not end up with the speed we see with Qt and KDE today, and for that matter any other well written C/C++ software.
What makes C++ code so fast is not just that the CPU understands the assembly directly. Most of its speed stems from the powerful language. It's good when a JIT compiler can optimize function calls, but it's better if you don't even need a function call. It's good when an allocator is really fast at allocating small objects, but it's better if you don't have to allocate anything at all. And so on.
This is why C++ is here to stay, on both Microsoft Windows and on Linux.
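As a hedged illustration of that point (the types below are invented, not taken from the interview): the plain value type needs no heap allocation and its accessor can be inlined away, while the "everything virtual, everything on the heap" style pays for an allocation and an indirect call per object.

// Sketch contrasting a C++ value type with the heavyweight style
// described above. Types are illustrative only.
#include <cstddef>
#include <vector>

struct Point {                          // plain value type: no vtable, no heap
    int x, y;
    int manhattan() const { return x + y; }        // trivially inlined
};

struct AbstractPoint {                  // the "everything virtual" style
    virtual ~AbstractPoint() {}
    virtual int manhattan() const = 0;             // always an indirect call
};

int sumValues(const std::vector<Point> &pts)
{
    int total = 0;
    for (std::size_t i = 0; i < pts.size(); ++i)
        total += pts[i].manhattan();               // no allocation, no dispatch
    return total;
}

int sumPointers(const std::vector<AbstractPoint *> &pts)
{
    int total = 0;
    for (std::size_t i = 0; i < pts.size(); ++i)
        total += pts[i]->manhattan();              // heap objects, vtable calls
    return total;
}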
"Just imagine a C++ programming style where all functions and inheritances are virtual, all classes inherit from a common base class that does central object registration, all objects are always allocated on the heap and every single pointer or array access is checked beforehand"
Well, I've used Objective-C/OpenStep (i.e. Apple Cocoa) a lot. It was easily fast enough 10 years ago, and it certainly is now. In Objective-C every method is effectively virtual, including the equivalent of 'static' methods. All classes inherit from NSObject, and the NSArray class doesn't allow you to drop off the end and crash. All Objective-C instances are allocated on the heap.
"If you then create an API that encourages the creation of massive amounts of small objects that call into each other through listener interfaces, and requires dynamically checked type casts to narrow objects, then you would not end up with the speed we see with Qt and KDE today"
This is where Objective-C diverges from the Java approach. In Cocoa there are many fewer instances required to do something, and the inheritance hierarchies are flatter. You can use delegation or categories (i.e. dynamically adding/changing methods on running instances) where you would need to subclass in Java. Listener interfaces are a design disaster as they ignore the dynamic reflection possibilities in Java. Qt takes a static language and makes it more dynamic via the moc and signals/slots, while Swing takes a dynamic language (Java) and implements a clunky API based entirely on static typing.
I'm fussy and I don't like a lot of software, but I really think Qt/KDE is the best application framework since NeXTStep (especially with the KDE 3.3 ruby bindings).
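For readers who have not seen the signals/slots mechanism mentioned above, here is a hedged sketch in Qt 3-era C++ (the Counter class is invented for the example): the connection is resolved at runtime by name, so the receiver never has to implement a listener interface.

// Illustrative sketch; moc generates the signal implementation.
#include <qobject.h>

class Counter : public QObject
{
    Q_OBJECT
public:
    Counter() : m_value(0) {}
    int value() const { return m_value; }

public slots:
    void setValue(int v)
    {
        if (v != m_value) {
            m_value = v;
            emit valueChanged(v);        // notify whatever is connected
        }
    }

signals:
    void valueChanged(int newValue);

private:
    int m_value;
};

// Usage: keep two counters in sync without either knowing the other's type.
//   Counter a, b;
//   QObject::connect(&a, SIGNAL(valueChanged(int)), &b, SLOT(setValue(int)));
//   a.setValue(12);    // b.value() is now 12 as well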
"All Objective-C instances are allocated on the heap."
BTW this is not strictly necessary in Objective-C, nor in Java or C#. They could use stack allocation for many short-lived objects. It would require more/better analysis of the code though, to determine for which objects it is possible.
(Similarly all the other performance problems listed by Matthias can be solved by more intelligent compilers - we're just not there yet)
"BTW this is not strictly necessary in Objective-C,"
It isn't possible in Objective-C - you create an instance by messaging a class object, and then sending an initialization message to the new instance. Only string literals of the form @"mystring" can be statically allocated
Instances are allocated in 'autorelease pools'. You can create your own autorelease pools, and you would do that if you have a lot of short-lived objects.
C# already allows you to allocate short-lived objects on the stack.
Lots of compilers are already at that point. Stalin, CMUCL, d2c, Bigloo, etc. all do these sorts of optimizations. It's just something that hasn't come to C# and Java compilers yet (and will never come to C/C++ compilers, because of their semantics).
umm, I very much doubt this... C is here to stay, but C++ I doubt. Dynamic optimisation is not possible without a jitting mechanism and a language that doesn't overspecify. Good jit compilers do inlines of methods based on execution profiling; the best C++ compiler can only guess, and slowly at that. Sure, with profile feedback it can do much better, but still nowhere near perfect.
Also, it's because of the overspecification of interfaces in C++ that we don't see automatic removal of virtual keywords based on profiling / path analysis. The same goes for the need to explicitly use zone allocators rather than have the runtime figure it out at, er.. yeah, runtime :)
Alex
"c is here to stay. but c++ i doubt.
dynamic optimisation is not possible
without a jitting mechanism and a
language that doesn't overspecify.
good jit compiler do inlines of
the methods based on execution
profiling. the best c++ compiler
can only guess. and slowly at
that. sure with profile feedback
it can do much better. but still
nowhere near perfect."
I think you've lost the plot here. Which is more important - a gui framework based on a more expressive dynamic language such as ruby, or a statically compiled language which is a bit faster at runtime because of jit compilation? And should a toolkit be written in the same language that end users will program it in?
Which gui toolkits are so awesome that they just need a bit of jit'ing to make them perfect? MFC, Swing, WinForms, Delphi, Taligent, WxWidgets, GTK+, GTK#. To me they're all just a bunch of dogs compared with Cocoa or Qt/KDE.
C# is a complex systems programming language (as a systems programmer I find it fun), but it is most certainly not a RAD language that everyday programmers will feel comfortable with. Ruby doesn't have jit'ing, but does anyone care? It's just much easier to get stuff done with ruby (or python) than with C#, java or C++. Why can't we talk about the programming language usability vs. efficiency tradeoff? The Qt toolkit will never be as popular as it should be while it is C++ specific - a jit'ed C# version wouldn't solve that problem at all.
The other problem with JITing runtimes tends to be memory pressure, which impacts desktop performance far more than CPU overhead these days.
I think most of the advantages of JIT compiled code are bogus. Security is not to me a convincing argument as the only security you gain is via type-checking, which is not fine grained enough for real world security. For instance you can prove the code does not use the File object, but you cannot restrict which files it can access and when (at least, not without seriously hacking the class library sources itself). A worse-is-better approach seems to be an SELinux style one, where security is applied at the process level rather than the type level, but policy is easier to specify and more flexible.
CPU independence - please. The CPU is already abstracted by the compiler, no need to do it twice given the dominance of x86 (on the desktop, it may be more useful on the server).
Easier to examine the code: well, there is no rule that says you cannot have reflective native code, and indeed gcj/java does exactly this.
I don't have a strong opinion on Java vs Mono, but right now it seems they are evenly matched. Mono has the community momentum and nicer APIs (at least for GTK developers) but GCJ has a more traditional toolchain and produces native code that is easily integrated with existing systems. It also seems to have easier integration with native code via the CNI and the upcoming GDirect thing (similar to P/Invoke). I do not know which will "win", but I suspect they will both be strong. Right now Mono seems to be in the lead if only because there aren't any "real" desktop Java apps outside Eclipse that are in wide use on Linux.
I don't have an opinion on this Java vs Mono debate, but somehow seeing it reminds me a bit of the Java momentum, when Java was supposed to replace everything, and of the ORBit vs DCOP/KParts debate too.
We sure like to have heated discussions with technical arguments...
"The other problem with JITing runtimes tends to be memory pressure, which
impacts desktop performance far more than CPU overhead these days."
These problems can easily be eliminated by caching the results. The advantage is that you can do best-case optimizations that a static compiler can't afford; if the worst case happens, a dynamic compiler can still recompile at runtime. But that doesn't mean it needs to happen frequently.
"I think most of the advantages of JIT compiled code are bogus. Security is not to me a convincing argument as the only security you gain is via type-checking, which is not fine grained enough for real world security. "
There are two things that you need for Java-like sandboxing:
1. you need to integrate security checks in all libraries that communicate with the world outside of the process/VM and are meant to be used by the sandboxed code
2. you need to make sure that the only way to access any part of the system is using the public APIs
With native code, number 2 becomes very hard.
"For instance you can prove the code does not use the File object, but you cannot restrict which files it can access and when (at least, not without seriously hacking the class library sources itself)."
Java allows this:
http://java.sun.com/j2se/1.4.2/docs/api/java/io/FilePermission.html
"A worse-is-better approach seems to be an SELinux style one, where security is applied at the process level rather than the type level, but policy is easier to specify and more flexible."
This allows only very low-level security. For example, Java allows applets in a sandbox to create windows, but only if the window has a large warning sign. Try enforcing that with SELinux.
"CPU independence - please. The CPU is already abstracted by the compiler, no
need to do it twice given the dominance of x86 (on the desktop, it may be
more useful on the server)."
Even if there were only x86, there are already more than enough variants through extensions: 32-bit and 64-bit, MMX, 3DNow, SSE, SSE2...
Managed code also has the advantage that it will give x86 CPU vendors more room for improvement. Right now they have the problem that even if they come up with a good (low-level) extension, there will hardly be any software that makes use of it.
"Easier to examine the code: well, there is no rule that says you cannot have reflective native code, and indeed gcj/java does exactly this."
That's examining at runtime and not what I meant. I was talking about analysing and modifying programs that are stored on the disk in their distribution format.