[KDE Dot News]
 faq
 flatforty
 contribute
 subscribe
 configure
 search
 rdf

 main


  KDE Developer's Corner: Common Programming Mistakes
Developer Posted by Zack Rusin on Tuesday 23/Mar/2004, @18:17
from the hackers-for-hackers dept.
I'm very happy to announce a new document for inspiring KDE hackers, entitled "Common Programming Mistakes". The document aims to combine the experience of many of the top KDE developers about the Qt and KDE frameworks dos and don'ts. The way they were usually passed on to the next generation was by letting the youngsters make the mistakes and then yell at them in public. We will go over things, which are not necessarily bugs, but which make the code either slower or less readable. The document will be expanding as we see the need for it.


<  |  >

 

  Related Links
 ·   Articles on Developer
 ·   Also by Zack Rusin
 ·   Contact author

Thread Threshold:

The Fine Print: The following comments are owned by whomever posted them.
( Reply )

Fabulous!
by Max Howell on Tuesday 23/Mar/2004, @20:11
Some of those tidbits are like gold! I'm especially annoyed at not realising the QValueList iterator one. I should have realised that at some point.
[ Reply To This | View ]
  • Re: Fabulous!
    by LMCBoy on Wednesday 24/Mar/2004, @07:47
    I agree, these are pure gold! There are a lot of these "mistakes" in kstars...looks like I have some work to do :)
    [ Reply To This | View ]
Question: someCString.length()
by James Richard Tyrer on Tuesday 23/Mar/2004, @21:49
As an old FORTRAN programmer, I know that you shouldn't do this:

QCString someCString = ...;
for( int i = 0; i < someCString.length(); ++i ) {
//Do something
}

You should do:

QCString someCString = ...;
int temp = someCString.length()
for( int i = 0; i < temp; ++i ) {
//Do something
}

But, my question is: doesn't the optimizing compiler take care of this?

And my pet programing mistake. What is wrong with the example??

Answer: The index of a loop should be: "unsigned int" (as should: "temp" in my example!). And when you get the warning about comparison between signed and unsigned integers, you should fix it.

--
JRT
[ Reply To This | View ]
  • Re: Question: someCString.length()
    by Jay on Tuesday 23/Mar/2004, @21:58
    Why does it matter if you make them unsigned? Do they not still take up the same memory space? Isn't there no use for the extra positive ints? Just curious.
    [ Reply To This | View ]
    • Re: Question: someCString.length()
      by André Somers on Tuesday 23/Mar/2004, @22:27
      I'm not sure, but I guess you'll run into trouble when the loop gets very large, larger then 2^(len(int)-1) iterations, because the last bit is used for positive/negative for normal ints, and is just another bit for unsigned ints.
      [ Reply To This | View ]
    • Re: Question: someCString.length()
      by James Richard Tyrer on Wednesday 24/Mar/2004, @00:25
      I guess that writing code correctly is part of the difference between being a software engineer and being a hacker.

      :-)

      But seriously, it will help you find bugs, unless you simply ignore the compiler warnings.

      --
      JRT
      [ Reply To This | View ]
      • Re: Question: someCString.length()
        by Dawnrider on Wednesday 24/Mar/2004, @04:31
        The thing with using uints, is that while using a Uint will allow higher values than an int, we're still talking 32 bit numbers here. If you really are worried about large loop indices, then use a long at least. Mind, even then, there is always potential to overflow the type.

        Honestly, as a programmer, you should always be aware of the potential iterations of any given loop, and ultimately, you should put checks in place in very large loops to ensure that you can't roll over.
        [ Reply To This | View ]
        • Re: Question: someCString.length()
          by Spy Hunter on Wednesday 24/Mar/2004, @09:25
          longs aren't longer than ints on x86; they are 32 bits. C's types are dumb. IMHO there should be int8, int16, int32, int64, and int128. That way there would be no confusion about the length of types, and ints wouldn't magically change size and break all your code when you recompile for Alpha or x86-64 or 286 or Z80.
          [ Reply To This | View ]
          • Re: Question: someCString.length()
            by AC on Wednesday 24/Mar/2004, @14:35
            Include C99's stdint.h, and you get types like uint8_t, uint16_t, etc...
            [ Reply To This | View ]
            • Re: Question: someCString.length()
              by Robin on Thursday 25/Mar/2004, @15:01
              The point is that length() returns an unsigned int that could be >= 2^31 which will be interpreted as a negative number.

              Funny thing is that since the loop comparison involves a signed and un unsigned the signed will be cast to unsigned before comparison and the loop will go on as expected. There might be other problems with an "negative" index, but not in the example.
              [ Reply To This | View ]
  • Re: Question: someCString.length()
    by gregj on Tuesday 23/Mar/2004, @23:08
    How do you think compiler should optimise that ?
    This would be rather strange, since that method can be:

    for( ;a.bla();a.bleble() )
    {
    a.do_something_nasty();
    }

    in example above a.bla() can be testing for something, and return true or false.
    So there is no way compiler should even approach optimisation.
    I know that code above suffers from - can be neverending story. But it's just example :-D
    [ Reply To This | View ]
    • Re: Question: someCString.length()
      by Ingo Klöcker on Wednesday 24/Mar/2004, @00:09
      The compiler can easily check whether the string is modified inside the loop. If it's not modified then the value of string.length() will be calculated only once.

      And in your example: If bla(), bleble() and do_something_nasty() are all const (i.e. they don't modify a) then a.bla() can (and will) be calculated before the loop. I only wonder what happens if one of those methods change a mutable member variable of a.
      [ Reply To This | View ]
      • Re: Question: someCString.length()
        by Anonymous on Wednesday 24/Mar/2004, @08:41
        Because of exactly this compilers must not rely on 'const' as method attributes for optimization purpose, and that's exactly why blah() will be called each time.
        [ Reply To This | View ]
      • Re: Question: someCString.length()
        by Marc Mutz on Wednesday 24/Mar/2004, @09:41
        Two things:

        1. The compiler doesn't see the implementation of QCString::length() and therefore can't check that
        2. QCString::length() doesn't do:

        int QCString::length() const {
        return const_cast<QCString*>( this )->mLen++;
        }
        [ Reply To This | View ]
      • Re: Question: someCString.length()
        by Ravi on Wednesday 24/Mar/2004, @09:52
        This is an issue that has been discussed a million times already in several fora. The best description of the problem and its subtleties can be found at the venerable Guru of the Week site:
        <A HREF="http://www.gotw.ca/gotw/081.htm">Constant Optimization?</A>
        It is a little dated if you are using a globally optimizing compiler along with monolithic applications, but for the rest of us mere mortals, it is accurate. For KDE, assuming the use of dynamic libraries, it is accurate.

        Hmm, I can't select anything other than "Plain Text" for encoding with Konq from HEAD.
        [ Reply To This | View ]
        • Re: Question: someCString.length()
          by Waldo Bastian on Wednesday 24/Mar/2004, @13:57
          > Hmm, I can't select anything other than "Plain Text" for encoding with Konq from HEAD.

          That's because dot.kde.org only offers "Plain Text" to mortals.
          [ Reply To This | View ]
    • Re: Question: someCString.length()
      by James Richard Tyrer on Wednesday 24/Mar/2004, @00:11
      > How do you think compiler should optimise that ?

      One of the things an optimizing compiler does is move invariant (constant) expressions outside of the loop.

      --
      JRT
      [ Reply To This | View ]
      • Re: Question: someCString.length()
        by ac on Wednesday 24/Mar/2004, @01:32
        Yes, but how is the compiler to know that the expression is invariant? If
        .length() is an inline it may be able to work it out from the fact that someCString
        is never updated, but if it isn't it's hard to see how this would work. Or does C++
        have some way of telling the compiler that length() has no side-effects?
        [ Reply To This | View ]
        • Re: Question: someCString.length()
          by Eggert Ehmke on Wednesday 24/Mar/2004, @02:59
          Yes of course. QCString::length () is defined as
          uint length () const
          [ Reply To This | View ]
          • Re: Question: someCString.length()
            by glettieri on Wednesday 24/Mar/2004, @03:09
            Well, this means that lenght() does not modify (unmutable) members of the object it is applied to. It does not mean that lenght() has no side effects (it can modify global variables, for instance).
            [ Reply To This | View ]
        • way of telling that length() has no side-effects
          by Humberto Massa on Wednesday 24/Mar/2004, @03:07
          just put const: it indicates exactly that (at least for that invocant, recipient or object [this] :-), length() will not change anything that is not marked mutable... ;-)

          size_t QCString::length() const {
          ...
          }
          [ Reply To This | View ]
          • Re: way of telling that length() has no side-effects
            by ac on Wednesday 24/Mar/2004, @09:20
            Yes, but that's not enough for the compiler to treat it as invariant. You'd need to say
            that it didn't touch any global state either.
            [ Reply To This | View ]
          • Re: way of telling that length() has no side-effec
            by CK on Thursday 25/Mar/2004, @00:58
            Just add 'return random();' to that member function, and you cannot rely on that returning the same number every time. The function still won't modify the object, but it illustrates why const doesn't allow the compiler to optimize away calling length each time through.
            [ Reply To This | View ]
            • Re: way of telling that length() has no side-effec
              by Jon on Thursday 25/Mar/2004, @09:49
              True.

              Help is at hand, however. gcc allows you to specify that a function called with the same arguments always returns the same values, and thus can be optimised out of a loop, by use of __attribute__((const)). This is a gcc extension specifically for cases like the above.
              [ Reply To This | View ]
  • Re: Question: someCString.length()
    by matus telgarsky on Tuesday 23/Mar/2004, @23:32
    I disagree about using the temp variable. There are two obvious reasons to use it (a) avoid function call and computation overhead of length() call--or whatever you use to limit the loop--by calling once and saving/caching, and (b) in concurrent environments length() could change during the loop, so you're screwed.

    well, most libraries (stl most definitely) inline the call to length()/size(), and the function itself just returns an attribute of the class, so there is no advantage over the caching method you state.

    for (b), you'd probably want to use a mutex or something like that to shield in that situation anyway.

    So basically what that method does is create an unnecessary obfuscation and more code to read.

    unsigned int may seem more correct, however (a) most places in the code don't need this kind of protection (you know you won't go that high), (b) in situations where it will, there's probably a smarter constraint (don't forget that architectures with different word sizes define unsigned int differently!), and (c) no way I'm going to type more than I need.
    [ Reply To This | View ]
    • Re: Question: someCString.length()
      by Ingo Klöcker on Wednesday 24/Mar/2004, @00:19
      > well, most libraries (stl most definitely) inline the call to
      > length()/size(), and the function itself just returns an attribute of the
      > class, so there is no advantage over the caching method you state.

      The problem is that QCString is a pretty stupid class (it's a normal C-string with reference counting).

      http://doc.trolltech.com/3.3/qcstring.html#length says:
      "Returns the length of the string, excluding the '\0'-terminator. Equivalent to calling strlen(data())."

      So QCString::length() is not O(1) but O(n) where n is the length of the string and thus highly inefficient.
      [ Reply To This | View ]
      • Re: Question: someCString.length()
        by matus telgarsky on Wednesday 24/Mar/2004, @02:49
        well, that sucks. I'm not a QT guy, so my comment was uninformed in this context...
        [ Reply To This | View ]
        • Re: Question: someCString.length()
          by James Richard Tyrer on Wednesday 24/Mar/2004, @17:46
          It appears that the problem is ... .

          I suppose that this isn't the best place to state that C syntax sucks. But it does. One of the main offenders is the 'for' loop construct. This can be used for many kinds of loops, but code efficiency suffers. If we used an indexed loop, then the problem wouldn't exist because it is required that the parameters for an indexed loop ARE constants.

          QCString someCString = "...";
          START(N=0, someCString.length())
          //Do something
          LOOP: END

          Much better. And, you don't have to define: "N" because it isn't really a variable (but it is unsigned).

          --
          JRT
          [ Reply To This | View ]
    • Re: Question: someCString.length()
      by James Richard Tyrer on Wednesday 24/Mar/2004, @00:20
      > So basically what that method does is create an unnecessary obfuscation and
      > more code to read.

      I think that my point is that you no longer have to do this because the compiler will do it for your.

      > unsigned int may seem more correct

      Yes that is why you do it, because it is correct. Integers that can never be less than 0 should be unsigned.

      > no way I'm going to type more than I need.

      Yes, it is much better to hunt for bugs than to take the extra effort to write the code correctly.

      :-) And I guess that you, therefore, would never want to try PL/Fiv.

      --
      JRT
      [ Reply To This | View ]
      • Re: Question: someCString.length()
        by matus telgarsky on Wednesday 24/Mar/2004, @02:56
        Hi,
        I was careful to say "may _seem_ more correct", because it's naive to assume that unsigned int is a good idea on all architectures, and especially in all contexts. Usually there is a better choice if you are worry about overflow, like ptrdiff_t, size_t, class::type, etc.

        oh and while I'm being anal, isn't using != rather than < much prettier in many loops? I use that styling very often, and definitely whenever dealing with sequential incrementation of integers..
        [ Reply To This | View ]
        • Re: Question: someCString.length()
          by Anonymous on Wednesday 24/Mar/2004, @04:08
          It's not only about the overflow. Suppose you want to do something like
          textfield.setlength(i);
          where i is a loop index

          Now Textfield::setlength should really be defined as
          setlength(unsigned int l)
          because you don't want to have to handle the case of a negative length for a textfield.

          If you have warnings turned on the call to textfield.setlength(i) will give you a warning. And you really want to avoid warnings because too many of them will start to hide those that you really should see because they indicate a possible bug.
          [ Reply To This | View ]
          • Re: Question: someCString.length()
            by matus telgarsky on Friday 26/Mar/2004, @02:28
            yes yes, agreed, I write that way all the time for the reason you stated. But the fragment in question was just part of a routine, and it's possibly we could know we don't need the extra amount of int afforded to us this way.

            but yeah for any API type stuff, whenever I take something vaguely like an offset I use unsigned so that the compiler can help me.
            [ Reply To This | View ]
  • Re: Question: someCString.length()
    by Paul Balez on Tuesday 09/Aug/2005, @06:45
    Am i right to propose this oslution :


    QCString someCString = ...;
    for( unsigned int i = 0; someCString.data()[i] != 0; ++i )
    {
    // Do something
    }


    This way i values are visited sequentially,
    and (if someCString has a \0 ending char) then
    string length (string end) is tested with a O(1) complexity.

    Constraint : the string size can stay unmodified,
    or can increase, but should not decrease.

    To always respect the constraint : someCString could be const (if possible)
    [ Reply To This | View ]
Good Stuff
by David on Wednesday 24/Mar/2004, @09:09
This is a good thing to see happening, and I hope you can perhaps turn this into a series.

Although I'm not familiar with the Qt/KDE aspects (yet!), one thing I will say is that people 'just hacking' (and people who should know better :)) ignore compiler warnings all too often. "Oh, well, it compiles and the application runs...". "Oh it's OK, it'll get caught by the compiler.". Those warnings are there for a purpose, and more often than not, are the sign of redundant (and/or inefficient) code - something that you really do not want in a project the size and scope of KDE. The worst case scenario is that your code is compiled with a different compiler version, or on a different platform, and the compiler throws a wobbler. A very real problem is that in a project the size of KDE the problems tend to multiply the more that people do this, and remember that people may actually be reusing your code and depending on it. Please, if possible, try and work for a handful of minutes to get them darn warnings down to zero, if possible. You'll be satisfied with a cleanly compiling bit of code, and it will save you (and others) a ton of time and effort in the future.

Something that also came home to me in the 'QString is Null' example (a bit of a mild example in this context) was that people do not use the existing methods available to them enough. If you reuse existing methods and code available as much as possible then you spend less time getting something done, you know where any problems occur, if and when they do, you know how to solve them much quicker, more developers can help you because they are using the same methods and the same code as you.... There is also a responsibility on you to try and expose as much useful code to others as you can, and make sure that everyone knows about it :). Because of the size of a project like KDE, if you do these things, the advantages, speed and efficiency just multiply exponentially. I'm sure we all know this anyway.

None of this is code-specific in any way, but then again, good, productive programming never is - it is unseen a lot of the time unfortunately.

Just a couple of pennys' worth.
[ Reply To This | View ]
  • Helpful GCC options
    by Simon Farnsworth on Friday 26/Mar/2004, @10:28
    As something useful for open source developers; compile releases with "-O -Wall -Werror", and make sure it compiles (obviously, -O could be -O2 or -Os, or whatever). This turns all warnings into errors, and makes sure you can fix them. Where a warning is something silly (some lints require you to cast the result of printf() to void), it forces you to put something into the code that makes your intent quite clear; -Wall only turns on warnings that make your code unclear, and thus helps with making sure it's correct. -W also helps with finding errors, since it warns where your code is doing something where the "natural" interpretation doesn't fit the spec.

    If you're developing code, it's well worth turning on as many warnings as possible ("info:/gcc/Warning Options" for KDE users who've got the info documentation installed), and using -Werror to make sure that your release gives no warnings on your compiler. It won't catch every bug, but it'll force you to look at code that's not 100% clear and perfect.

    FWIW, when I'm finishing code, I use "-O2 -Wall -Werror -W -Wfloat-equal -Wundef -Wshadow -Wpointer-arith -Wcast-qual -Wwrite-strings -Wconversion -Wredundant-decls". It's a lot of warnings, but even though I'm the only user of my code, it cuts down on the trouble I have if ever I want to go back to an old project and play with it. Plus, if ever I think my code is worth releasing (usually it's not), I've got it into a nice state.
    [ Reply To This | View ]
Good Stuff
by Frerich Raabe on Thursday 25/Mar/2004, @01:46
A nice article IMHO, two things came to my mind while reading it:

- Regarding the NULL pointer issue, in particular setting 'ptr' to 0 after calling delete on it (or rather, what it points to): it might be a good idea to mention QGuardedPtr in that context (IIRC QGuardedPtr does just that, deleting something and setting the pointer to zero in the dtor. Just the name is a little suboptimal).

- Regarding iterators: a very common mistake I spotted is that iterators are acquired from temporary lists. Usually, this happens if some function returns a list (e.g. a QValueList) by value and you fail to make a local copy. So, assuming you have a function 'QValueList<int> numbers();', you shouldn't do

QValueList<int>::ConstIterator end = numbers().end();

but

QValueList<int> list = numbers();
QValueList<int>::ConstIterator it = list.end();

or something like that, I hope the point I'm trying to make became apparent.

- Frerich
[ Reply To This | View ]
Good Stuff
by Frerich Raabe on Thursday 25/Mar/2004, @01:46
A nice article IMHO, two things came to my mind while reading it:

- Regarding the NULL pointer issue, in particular setting 'ptr' to 0 after calling delete on it (or rather, what it points to): it might be a good idea to mention QGuardedPtr in that context (IIRC QGuardedPtr does just that, deleting something and setting the pointer to zero in the dtor. Just the name is a little suboptimal).

- Regarding iterators: a very common mistake I spotted is that iterators are acquired from temporary lists. Usually, this happens if some function returns a list (e.g. a QValueList) by value and you fail to make a local copy. So, assuming you have a function 'QValueList<int> numbers();', you shouldn't do

QValueList<int>::ConstIterator end = numbers().end();

but

QValueList<int> list = numbers();
QValueList<int>::ConstIterator it = list.end();

or something like that, I hope the point I'm trying to make became apparent (the problem is that in the former version, the iterator gets invalidated at the end of the line since the list it was acquired for got destroyed, as it's a temporary).

- Frerich
[ Reply To This | View ]
The Fine Print: The previous comments are owned by whomever posted them.
( Reply )

  "I can't describe the scene as this is a G-rated site, but think Pulp Fiction." -- Sirtaj Singh Kang
KDE®, "K Desktop Environment", "KDE Dot News", "got the dot?" and the KDE Logo® are trademarks or registered trademarks of KDE e.V. in the European Union, the United States and other countries. All other trademarks and copyrights on this page are owned by their respective owners. Comments are owned by the poster. The rest: Copyright © 2000-2008 KDE e.V. for The KDE Project. For further information or comments on this site, please contact the Webmaster.
[ home | post article | flat forty | subscribe | search | rdf ]