Archive for the 'C++' Category

C++ Enemy Functions

Scott Meyers wrote an article in CUJ a few years ago entitled How Non-Member Functions Improve Encapsulation. I highly recommend this article, as well as Herb Sutter’s GotW #84: Monoliths “Unstrung”, in which everyone’s favorite scapegoat, std::string, is refactored using this principle.

I have just stumbled onto a rather satisfying, if obvious, extension of this work. First, let’s review the properties of a friend function. Recall that a friend:

  • …can access the private members of a class
  • …is not passed a this pointer
  • …does not reside in the class’s namespace

Now consider a function which:

  • cannot access the private members of a class
  • is passed a this pointer
  • does reside in the class’s namespace

With tongue in cheek, I call this an enemy function (alternate names are welcome).

We can add as many enemy functions as we like to a class without reducing its degree of encapsulation, because these functions only have access to the class’s public interface. Better still, since we pass a this pointer, we don’t have the inconsistency between w.eat(.564) and nap(w) (from the Meyers article). This means we can fix std::string without breaking existing code. I envision something like this:


   class Foo
   {
      public:
          // ...
      private:
          // ...
      enemy:
          void func();  // may not access private members
   };

   // …time passes…

   Foo f;
   f.func();  //not really a member function
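
Until something like enemy: exists, the closest thing standard C++ offers is the non-member, non-friend function that the Meyers article recommends. Here is a minimal sketch of that status quo. The zoo namespace, the Wombat class, and all of its members are hypothetical names, chosen only to echo the w.eat(.564) / nap(w) example:

```cpp
namespace zoo {

class Wombat {
public:
    Wombat() : energy_(0.0), asleep_(false) {}

    void eat(double kilos) { energy_ += kilos; asleep_ = false; }
    void sleep() { asleep_ = true; }
    bool asleep() const { return asleep_; }

private:
    double energy_;
    bool asleep_;
};

// Non-member, non-friend: it can only touch Wombat's public interface,
// so adding it does not reduce Wombat's encapsulation.
inline void nap(Wombat& w) { w.sleep(); }

} // namespace zoo
```

A caller writes w.eat(.564) for the member but nap(w) for the helper, which is exactly the inconsistency an enemy function would remove.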

Update (6/4/06):
I threw caution to the wind and posted this idea to comp.lang.c++.moderated. Maxim Yegorushkin made a great suggestion: why not use a public member function in a derived class?


   namespace
   {
   class FooBase
   {
      private:
      int x;
      public:
      FooBase(int x) : x(x) {}
      int size() { return x; }
   };
   }
   class Foo : public FooBase
   {
      public:
      Foo(int x) : FooBase(x) {}
      bool empty() { return size() == 0; }
   };

This does exactly what I wanted. The only downside is that I must write forwarding functions for FooBase’s constructors. I find this much more satisfying than Scott Meyers’s class Foo / namespace FooStuff method.
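
For what it is worth, C++11 (which postdates this post) added inheriting constructors, which may remove the need for those forwarding functions. The following is only a sketch under that assumption. It reuses the FooBase and Foo names from above but leaves out the anonymous namespace, and the const qualifiers are my own addition:

```cpp
class FooBase {
public:
    explicit FooBase(int x) : x_(x) {}
    int size() const { return x_; }

private:
    int x_;
};

class Foo : public FooBase {
public:
    // C++11: inherit FooBase's constructors instead of forwarding by hand.
    using FooBase::FooBase;

    // Uses only FooBase's public interface, just like an "enemy" would.
    bool empty() const { return size() == 0; }
};
```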

Update (6/15/06):
Many who object to the subclass solution do so on the grounds that it is “misuse of inheritance”. To appease them, I’ve modified the code above to wrap FooBase in an anonymous namespace. The “world” will have no idea I misused inheritance.

Versioned C++

I couldn’t agree more with Kevin’s latest article, which tackles the issue of versioned C++.

I’ve been envious of Perl’s require functionality for years. If you don’t already know, Perl programmers use statements like require 5.6.1; to indicate a version dependency.

Maybe we can get this with a #pragma in C++?

I’d like to point out that there is already some precedent for things like this in the C/C++ world:

  • A .cpp or .cc file extension really indicates that we want the “C++ version of the C language”
  • Many compilers are implementing the latest C99 changes under a -c99 command line switch
  • C++ namespaces have been used (abused?) for this purpose before (cf. the MSVC stdext namespace)
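
One more existing mechanism is worth a mention: the standard predefined macro __cplusplus, whose value identifies the language revision (199711L for C++98, 201103L for C++11). It is not a real require, since it can only reject a compiler rather than select a language version, but a rough sketch looks like this. The language_at_least helper is a hypothetical name of my own:

```cpp
// Refuse to compile under anything older than C++98.
#if !defined(__cplusplus) || __cplusplus < 199711L
#error "This translation unit requires C++98 or later"
#endif

// Hypothetical helper: reports whether the compiler claims at least the
// given standard revision (for example, 201103L for C++11).
inline bool language_at_least(long required)
{
    return __cplusplus >= required;
}
```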

I think what I’m really trying to do here is whine. The Perl language folks get to change their minds whenever they want. If you don’t like it, you can continue using the old version — nobody is stopping you. Meanwhile, we in the C++ world are unable even to introduce a new reserved word, for fear of breaking existing source.

Update:
I just discovered that the ISO IEC JTC1/SC22/WG21 working group (better known as the C++ Standards Committee) has some papers on this subject:

Incomplete Types in C

I happened across the following bit of C code at work today. It attempts to allocate storage for an incomplete type:


/* foo.c */
struct foo bar;
struct foo { int x; } bap;

I was surprised to find that this code compiles cleanly with GCC 3.4, Intel 8.1, and Comeau C/C++ 4.3.3 (apparently reserving 4 bytes for both bar and bap). Microsoft Visual C 8.0, however, produces the following error:


foo.c(1) : error C2079: 'bar' uses undefined struct 'foo'

MSVC’s error seems pretty reasonable to me. The primary characteristic of a C definition is that it reserves storage. If you don’t yet know the size of foo, you can’t reserve storage.

I’ve got a copy of the C language spec. Can someone point me to a relevant passage?

Update:
Upon review, it seems that this is indeed legal C. The Microsoft compiler should probably accept it. (Thanks to Jonathan Caves for looking at this.)

Farewell C/C++ Users Journal

The following was attached to the front cover of my February issue of C/C++ Users Journal (emphasis mine):

For nearly 30 years, the C/C++ Users Journal has provided resources and information to serve the constantly evolving community of C and C++ developers. More recently, however, we at CMP Media LLC have come to the difficult realization that the best way to serve this community in the future is to focus on new web sites, magazines, and events. What this means is that you are holding in your hands the last issue of the C/C++ Users Journal. As a result, Dr. Dobb’s Journal, which has published C and C++ articles ranging from the days of Small-C to C++, will expand its coverage of these important programming languages even more.

The letter goes on to say that my subscription is being transferred to Dr. Dobb’s Journal, and offers a refund if this is not amenable to me.

Is C++ dying? It’s hard not to wonder.

Java and C# lure in C/C++ developers with their curly-braced good looks, and then woo them with garbage collection and regular expressions, as well as network, thread, and GUI libraries. C++ is primitive by comparison.

We need these features and we need them all yesterday.

Higher-Order Functions with boost::lambda

I’ve been learning Lisp recently (I’ll eventually post something about how and why) but, oddly enough, all this Lisping has got me thinking about C++.

So, without further ado, a higher-order function in C++:


#include <boost/lambda/lambda.hpp>
#include <boost/function.hpp>
#include <boost/shared_ptr.hpp>
#include <iostream>

using namespace std;
using namespace boost::lambda;

template< typename T >
boost::function< T(T) > foo( T n_ )
{
   // Old code.  Leaks memory.
   //T *n = new T( n_ );
   //return ( *n += _1 );

   // The shared_ptr keeps the running total alive for as long as
   // any copy of the returned functor exists.
   boost::shared_ptr< T > pn( new T(n_) );
   return ret< T& >( *constant(pn) ) += _1;
}

int main()
{
   // start with one
   boost::function< int(int) > accum = foo(1);
   cout << accum(2) << endl;
   cout << accum(10) << endl;
}

This blows my mind on so many levels, I don’t know where to start. The gist is that foo is a function which generates new functions (ok technically it generates functors, but if you squint, you can’t tell the difference).

Neat.

(Inspired by Paul Graham’s Revenge of the Nerds.)

Update: I originally declared n as a static local, but then I realized what this actually meant (all generated accumulators share n). The current code leaks, but at least it works for multiple accumulator functions. I’m still trying to figure out how to use a smart pointer to plug the leak.

Update: Thanks to Peter Dimov on boost-users, the code above now uses a boost::shared_ptr and is leak-free. For some reason, it won’t build with gcc 4.0.2.
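
For comparison, C++11 (which arrived after this post) added built-in lambdas, std::function, and std::shared_ptr, so the same idea can be sketched without Boost. This is my own rough translation, not Peter’s code, and make_accumulator is a hypothetical name:

```cpp
#include <functional>
#include <memory>

template <typename T>
std::function<T(T)> make_accumulator(T start)
{
    // The running total lives on the heap and is shared by every copy of
    // the returned closure, so nothing leaks and nothing dangles.
    auto total = std::make_shared<T>(start);
    return [total](T increment) { return *total += increment; };
}
```

Starting from make_accumulator(1), calling the result with 2 and then with 10 should yield 3 and then 13, and a second accumulator keeps its own independent total.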

The Flattening Problem

Given a C or C++ program, can you write a script that will inline all the #includes, so that compiling its output is identical to compiling the program directly?

I call this “the flattening problem,” and it sounds much easier than it actually is. My first attempt, for example, was foiled by include guards.

To overcome infinite include loops, everyone’s first inclination is to remember which files are already included, and include them only once. This seems reasonable, but it assumes that no header file can usefully be included more than once, which is generally untrue. Instead, you need to keep track of defined preprocessor symbols (and their values), so that you can simply elide unreachable #includes. If you do this, you’re left with two problems:

  1. Not all preprocessor symbol definitions are contained in a program’s source code. Defines can be automatic, like __cplusplus and __FILE__, or external, via the compiler’s -D switch.
  2. #include statements can take preprocessor symbols as their arguments. There is a rather famous IOCCC winner (vanschnitz.c) that abuses this feature to solve the Towers of Hanoi problem.

The combination of 1 and 2 means you may actually encounter include directives with externally-defined arguments. This makes the problem impossible, in the general case.
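
To make the combination concrete, here is a small sketch of a computed include. CONTAINER_HEADER and count_items are hypothetical names. Built as-is, the fallback pulls in <vector>, but a build that defines CONTAINER_HEADER with -D could pull in a different file entirely, and a script that only reads the source has no way of knowing which:

```cpp
#include <cstddef>

// Fallback used only when the build does not define CONTAINER_HEADER itself.
#ifndef CONTAINER_HEADER
#define CONTAINER_HEADER <vector>
#endif

// A computed include: the preprocessor expands the macro first, then
// includes whatever header name results.
#include CONTAINER_HEADER

std::size_t count_items()
{
    std::vector<int> items(3);  // only valid while the header is <vector>
    return items.size();
}
```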

You can, however, get an approximate solution with some shortcuts. First, you can make assumptions about automatic preprocessor defines (for example, you assume that .cpp files implicitly define __cplusplus). Second, you must be willing to simply fail on any #include statement that takes an externally-defined preprocessor symbol as its parameter.

(Thanks to Weimin Chen, Wendy Thrash, and Jonathan Caves for their help on this one.)

Const Constructors

If you’re a regular reader — and I’m pretty sure you’re not — you may have seen my infamous “craptimizing” articles: one and two. Read them if you like, but the short story is that I thought I had discovered a neat way to optimize using const member functions. Of course, there was a problem.

My approach is still flawed, but tonight I had an “ah-ha” moment. Suddenly, I knew what was missing — what had put the “crap” in “craptimizing”. What I really needed was a const constructor.

In today’s C++, during construction, we have no way of knowing if we are creating a regular ol’ Foo, or a const Foo. But with a const constructor, I could differentiate:


#include <iostream>
#include <stdexcept>

using namespace std;

template < class T >
class Foo {
    private:
        bool initialized_;
        T data_;
    public:
        Foo() :
            initialized_( false )
        {}

        // Hypothetical syntax: a constructor used only for const objects.
        // This is not valid in today's C++.
        Foo( T data ) const :
            initialized_( true ),
            data_( data )
        {}

        void set( T data )
        {
            data_ = data;
            initialized_ = true;
        }

        void fubarize()
        {
            if( !initialized_ )
                throw invalid_argument( "Uninitialized Foo" );

            cout << "Called slow fubarize()" << endl;
        }

        void fubarize() const
        {
            // No initialized_ check required.
            // It's important to fubarize quickly.
            cout << "Called fast fubarize()" << endl;
        }
};

int main()
{
    Foo<int> f;

    try
    {
        f.fubarize();   // Will throw invalid_argument
    }
    catch( const invalid_argument &e )
    {
        cout << e.what() << endl;
    }

    f.set( 42 );
    f.fubarize();   // OK, calls slow fubarize()

    const Foo<int> cf1( 42 );
    cf1.fubarize();  // OK, calls fast fubarize()

    const Foo<int> cf2;  // Error! (This is good)
}

Note that even if we exclusively used const Foos in our program, the compiler could not perform this optimization (omitting the initialized check) without our help. In short, this is because there are precious few circumstances under which the compiler can actually trust the const qualifier. If you want more details, you got ‘em.

Alas, I don’t think there is any reasonable way to get this behavior with standard C++. (To me, maintaining separate “Foo” and “ConstFoo” classes is unreasonable.) And, to make matters worse, adding this feature would break lots of existing code, because the status quo is:


const Foo cf;  // Calls Foo::Foo()

This obviously works fine: how else would we create a const Foo?
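
To see the status quo in code that actually compiles, here is a stripped-down, non-template sketch. It is my own simplification of the Foo above, and the initialized() and data() accessors are hypothetical additions. Overload resolution does pick the const fubarize() for a const object, but that object may still have come from the default constructor, so the fast path cannot safely skip the check:

```cpp
#include <string>

class Foo {
public:
    Foo() : initialized_(false), data_(0) {}
    explicit Foo(int data) : initialized_(true), data_(data) {}

    // Overloaded on const: the object's constness selects the function.
    std::string fubarize() { return "slow"; }
    std::string fubarize() const { return "fast"; }

    bool initialized() const { return initialized_; }
    int data() const { return data_; }

private:
    bool initialized_;
    int data_;
};
```

With this class, const Foo cf; compiles, cf.fubarize() takes the “fast” overload, and cf.initialized() is still false.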

PS:
Google finds tons of references to “const constructor”. One very interesting document was this ISO/IEC Discussion Draft by Kevlin Henney. His proposal is notable in that it manages not to break backwards compatibility.

#include “irony”

The other day, I had this great (read: stupid) idea: I wanted to write a Perl script to process a .c file, find all the #include directives, and replace them with the included file’s contents. I wanted to do this recursively, such that the resulting .i file would have zero #include lines.

Notice that this is not the same as just running the C preprocessor on the file. That would have processed lines like #ifdef _AMD64_ and resulted in a non-portable .i file. All I wanted to do was “flatten” a complicated program down to one (still-portable, possibly enormous) file.

Please take my word for the fact that I have a totally reasonable justification for wanting to do this. Honestly. I could convince you, but it would take another page of text.

I wrote the Perl program in about 20 minutes, and turned it loose on my .c file.

A few minutes and 26 megabytes of .i file later, I decided the program would never terminate. That’s about the time it hit me: include guards.

In CS-nerd speak, all the files in a C program form a directed graph where the edges represent inclusion. This graph contains cycles. That means that A.h can include B.h which can include A.h, and so forth, until your brain explodes. You typically fix this problem with so-called include guards:


#ifndef __FOO_H
#define __FOO_H
//... text of foo.h goes here ...
#endif

Beginner C/C++ programmers often learn about include guards the hard way. For me, they are reflexive — which is why I totally forgot about them.

So here’s the irony: I want only to process #include directives, and not #define directives, but I can’t successfully do the former without doing the latter too.
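
The obvious way to at least make such a script terminate is to remember which files have already been spliced in and skip the repeats. Below is a rough C++ sketch of that idea (my original was Perl, and this is not it). It works on a hypothetical in-memory FileMap instead of real files and only understands the #include "name" form. It is an approximation at best, because it assumes no header is ever usefully included twice:

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical in-memory "file system": file name -> file contents.
typedef std::map<std::string, std::string> FileMap;

// Naive flattener: splices each known file in once and drops repeats.
std::string flatten(const std::string& name, const FileMap& files,
                    std::set<std::string>& seen)
{
    if (!seen.insert(name).second)
        return "";  // already spliced in once: this breaks include cycles

    FileMap::const_iterator file = files.find(name);
    if (file == files.end())
        return "";  // unknown root file: nothing to emit

    const std::string prefix = "#include \"";
    std::istringstream in(file->second);
    std::string line, out;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::string::size_type close = line.find('"', prefix.size());
            if (close != std::string::npos) {
                std::string target =
                    line.substr(prefix.size(), close - prefix.size());
                if (files.count(target)) {
                    out += flatten(target, files, seen);
                    continue;
                }
            }
        }
        out += line + "\n";  // ordinary line, or an include we cannot resolve
    }
    return out;
}
```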

Pop Count

Larry Osterman recently blogged about various methods for computing the population count (or pop count) of a machine word. For the uninitiated, pop count is the canonical name for the function which counts the number of asserted bits in a binary number. Larry’s blog entry was excellent, as usual, but it was one of the comments that blew me away. Jeu George added a link to his blog describing the MIT HAKMEM pop count. Here it is:


unsigned popCount( unsigned x )
{
   unsigned r = x - ((x >> 1) & 033333333333)
                  - ((x >> 2) & 011111111111);
   return ((r + (r >> 3)) & 030707070707) % 63;
}

I spent hours figuring out how it worked.
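
If you want to convince yourself that it really works before digging into why, one option is to check it against a loop that is obviously correct. The sketch below pairs the HAKMEM version with a simple clear-the-lowest-set-bit loop. The names popCountHakmem and popCountNaive are my own, and the HAKMEM constants assume that unsigned is 32 bits wide:

```cpp
// The HAKMEM version from above. Assumes a 32-bit unsigned.
unsigned popCountHakmem(unsigned x)
{
    unsigned r = x - ((x >> 1) & 033333333333)
                   - ((x >> 2) & 011111111111);
    return ((r + (r >> 3)) & 030707070707) % 63;
}

// Reference version: x & (x - 1) clears the lowest set bit, so the loop
// runs once per asserted bit.
unsigned popCountNaive(unsigned x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}
```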


Craptimizing with Const Member Functions

Bheeshmar has pointed out a major flaw in my const-member optimization.





Creative Commons Attribution-NonCommercial 3.0 United States