In this article we will meet one of the most catastrophic ways to kill performance in a C++ program: returning objects by value. We will see why it is so slow, and how the cost can be mitigated.
Still a few flaws to fix
Before we get deeper into the example, however, we must add one missing feature to the DynamicString class we introduced in Part 2.
This is what DynamicString looks like now:
class DynamicString
{
public:
DynamicString()
{
str = strdup("");
}
DynamicString(const char *str)
{
this->str = strdup(str);
}
DynamicString(const DynamicString &that)
{
this->str = strdup(that.str);
memcpy(this->useless, that.useless, sizeof(useless));
}
~DynamicString()
{
free(str);
}
operator const char *() const
{
return str;
}
protected:
char *str;
char useless[1024];
};
At the end of Part 2 we added the copy constructor, so that DynamicString instances can be passed by value without screwing up dynamic memory allocation due to “str” pointing to the same memory location.
There is, however, still one case which is not handled correctly. Let’s look at this short code snippet:
void CountTo6AndDie()
{
DynamicString s1("Hello");
DynamicString s2;
printf("Before: %p <%s> %p <%s>\n", (const char *)s1, (const char *)s1, (const char *)s2, (const char *)s2);
s2 = s1;
printf("After: %p <%s> %p <%s>\n", (const char *)s1, (const char *)s1, (const char *)s2, (const char *)s2);
}
Here we are:
- allocating two strings, one is initialised with a const char *, the other uses the default constructor
- printing the strings’ contents as pointer and as string
- copying s1 into s2
- again, printing the strings’ contents as pointer and as string
The output is as follows:
Why is this happening? Because the compiler is translating the line
s2 = s1;
using the default assignment operator. Like the default copy constructor, the default assignment operator copies the object member by member, which for DynamicString boils down to a raw memory copy of s1 into s2. The result is an identical copy which shares the same “str” pointer. Here is the assembly implementation of that line:
leaq -1072(%rbp), %rax
leaq -2112(%rbp), %rdx
movl $129, %ecx
movq %rax, %rdi
movq %rdx, %rsi
rep movsq
We recognise here the all too familiar “rep movsq” that we have seen many times before, doing a raw memory copy.
The default assignment operator in our example presents two problems:
- s2 and s1 share the same pointer to “str”. Hence, when the two instances are destroyed at the end of the function they live in, the same block of memory will be freed twice -> crash, malfunction, pestilence, locusts, etc.
- the pointer originally owned by s2 (a pointer to an empty string, in this case) is lost and it will not be deallocated -> memory leak
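Both problems can be observed without actually crashing. The following is a minimal sketch, not the article's class: ShallowString and DuplicateCString are hypothetical names, and the struct deliberately has no destructor so that we can look at the pointers after the default assignment and then clean up by hand.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: a malloc-based duplicate, standing in for strdup.
static char *DuplicateCString(const char *s)
{
    std::size_t size = std::strlen(s) + 1;
    char *copy = static_cast<char *>(std::malloc(size));
    assert(copy != nullptr);   // sketch only: no real error handling
    std::memcpy(copy, s, size);
    return copy;
}

// Hypothetical stand-in for DynamicString. It has NO destructor on purpose,
// so the aliasing can be observed without triggering the double free.
struct ShallowString
{
    char *str;
};

// Returns true if the compiler-generated assignment left both objects
// pointing at the very same buffer.
bool DefaultAssignmentSharesBuffer()
{
    ShallowString s1{ DuplicateCString("Hello") };
    ShallowString s2{ DuplicateCString("") };
    char *orphan = s2.str;     // last remaining handle to s2's original buffer

    s2 = s1;                   // member-wise copy: only the pointer is copied

    bool shared = (s1.str == s2.str);

    std::free(orphan);         // without this saved pointer: memory leak
    std::free(s1.str);         // freeing s2.str as well would be the double free
    return shared;
}
```

The saved “orphan” pointer is exactly what a real DynamicString does not have, which is why its original buffer leaks.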
Fortunately we have a little friend who can help us track down this kind of issue: a nifty tool called “valgrind”, which we invoke as follows:
valgrind --leak-check=full ./Move
“Move” is the name of our executable. Valgrind will spit out the following output:
Both problems are clearly shown in highlighted areas.
Our own assignment operator
To fix the issue we must, hear hear, define our own assignment operator inside DynamicString. An assignment operator is a redefinition of the ‘=’ operator. It receives a const reference to the object we are assigning from, and it should return a reference to the object being assigned to (*this):
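DynamicString & operator = (const DynamicString &that)
{
// Without this guard, self-assignment (s = s) would free the string we are about to copy
if (this != &that)
{
free(str);
this->str = strdup(that.str);
memcpy(this->useless, that.useless, sizeof(useless));
}
return *this;
}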
(yes, no checks on str != nullptr to keep code short, as usual).
This looks very similar to the copy constructor we implemented in part 2:
DynamicString(const DynamicString &that)
{
this->str = strdup(that.str);
memcpy(this->useless, that.useless, sizeof(useless));
}
There are two major differences:
- We free the previous contents of “str”. When we invoke the assignment operator the object has already been defined, so its contents must be freed.
- We return a reference to this.
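To check that such an operator really produces two independent strings, here is a small self-contained sketch. MiniString and its Duplicate helper are hypothetical names used only for this illustration, and the “useless” padding is left out. It also includes a guard against self-assignment.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical, stripped-down cousin of DynamicString (no "useless" buffer).
class MiniString
{
public:
    explicit MiniString(const char *s) : str(Duplicate(s)) {}
    MiniString(const MiniString &that) : str(Duplicate(that.str)) {}
    ~MiniString() { std::free(str); }

    MiniString &operator=(const MiniString &that)
    {
        if (this != &that)          // self-assignment: nothing to do
        {
            std::free(str);         // release what we owned so far
            str = Duplicate(that.str);
        }
        return *this;
    }

    const char *c_str() const { return str; }

private:
    // malloc-based duplicate, standing in for strdup
    static char *Duplicate(const char *s)
    {
        std::size_t size = std::strlen(s) + 1;
        char *copy = static_cast<char *>(std::malloc(size));
        assert(copy != nullptr);    // sketch only: no real error handling
        std::memcpy(copy, s, size);
        return copy;
    }

    char *str;
};

// After b = a the two objects hold equal text in different buffers.
bool AssignedCopyIsIndependent()
{
    MiniString a("Hello"), b("");
    b = a;
    return a.c_str() != b.c_str() && std::strcmp(a.c_str(), b.c_str()) == 0;
}

// Assigning an object to itself must leave it untouched.
bool SelfAssignmentKeepsContents()
{
    MiniString a("Hello");
    MiniString &alias = a;
    a = alias;
    return std::strcmp(a.c_str(), "Hello") == 0;
}
```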
What to return from the assignment operator
There are no rules imposed by the language about what you should return from the assignment operator. You could, for instance, return void, or an integer, or whatever you like.
Normally the assignment operator is supposed to be chainable:
DynamicString s1("Hello"), s2, s3;
s3 = s2 = s1;
In this way s1 is copied into s2, then s2 is copied into s3.
To achieve this we can either return (in our case) a DynamicString:
DynamicString operator = (const DynamicString &that)
{
// ...omitted
return *this;
}
or a reference to a DynamicString:
DynamicString & operator = (const DynamicString &that)
{
// ...omitted
return *this;
}
The first solution allocates a temporary object on the stack using the copy constructor, and returns it.
The second solution just returns a reference to this, and is the preferred solution for two reasons:
- It’s more efficient, not requiring allocation of a temporary object and invocation of the copy constructor
- It returns an actual reference to the object on which the copy has been invoked.
Point 1 should be self-explanatory.
For Point 2 consider the following (rather absurd looking) snippet:
DynamicString s1("Hello"), s2, s3;
(s2 = s1).DoSomething();
If the assignment operator returns a reference to this, then s1 is copied onto s2 (using the assignment operator), and the (currently non-existent) DoSomething method is invoked on s2, not on a temporary object which is a copy of s2. This may not make much sense in this example, but it certainly does for the “<<” operator of C++ streams, where multiple “<<” calls can be chained to log multiple items on a stream.
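The difference between the two return types can be made visible with a tiny experiment. This is only a sketch: RefAssignable and ValueAssignable are hypothetical types, and DoSomething simply counts how many times it has been called on a given object.

```cpp
#include <cassert>

// Assignment returns a reference to *this (the preferred form).
struct RefAssignable
{
    int value = 0;
    int calls = 0;
    RefAssignable() = default;
    RefAssignable(const RefAssignable &) = default;
    RefAssignable &operator=(const RefAssignable &that)
    {
        value = that.value;
        return *this;
    }
    void DoSomething() { ++calls; }
};

// Assignment returns a copy of *this (a temporary object).
struct ValueAssignable
{
    int value = 0;
    int calls = 0;
    ValueAssignable() = default;
    ValueAssignable(const ValueAssignable &) = default;
    ValueAssignable operator=(const ValueAssignable &that)
    {
        value = that.value;
        return *this;   // copy constructor creates the returned temporary
    }
    void DoSomething() { ++calls; }
};

// DoSomething lands on s2 itself.
int CallsSeenByTargetWithReference()
{
    RefAssignable s1, s2;
    s1.value = 7;
    (s2 = s1).DoSomething();
    assert(s2.value == 7);
    return s2.calls;
}

// DoSomething lands on a temporary copy of s2, which is then thrown away.
int CallsSeenByTargetWithValue()
{
    ValueAssignable s1, s2;
    s1.value = 7;
    (s2 = s1).DoSomething();
    assert(s2.value == 7);
    return s2.calls;
}
```

With the reference version s2 sees the call (the counter is 1); with the by-value version the call is lost together with the temporary (the counter stays 0).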
Heading for disaster: returning an object by value
Now that we have a complete DynamicString class (complete in terms of how it manages memory, not functionally complete as a string) let’s see what happens when we return an object by value.
Let’s look at this simple program:
DynamicString NumberToString(int n)
{
char buf[256];
sprintf(buf, "%d", n);
return DynamicString(buf);
}
int main()
{
DynamicString str;
str = NumberToString(10);
return 0;
}
To see what happens under the hood we need to instrument all methods inside DynamicString. Let’s look at one of the constructors as an example; all other methods are instrumented in the same fashion:
DynamicString(const char *str)
{
std::cout << __func__ << " (from const char *) " <<
str << std::endl;
this->str = strdup(str);
}
A couple of notes here:
- __func__ is a predefined identifier (standard since C99 and C++11, not a GCC extension) which contains the name of the function in which it’s used. Very useful for diagnostic logging.
- Remember the “=” operator from above, returning a reference to “this” ? In the little snippet above you can see a similar operator returning a reference to “this”, operator <<. By returning a reference to “this”, logging operations can be chained and they all happen on the same object.
Let’s now enjoy the disastrous output of this program:
This is what is happening in detail:
We are performing a lot of memory allocation / deallocation operations here:
- “str” is constructed: 1 malloc
- a temporary object is constructed: 1 malloc
- the temporary object is copied to “str”: 1 free + 1 malloc (both inside the assignment operator)
- the temporary object is destroyed: 1 free
- “str” is destroyed: 1 free
Total impact of our little code: 3 malloc, 3 free. Not bad for such a little simple program.
The problem here is that in steps 2, 3 and 4 we are creating, assigning and destroying a temporary object whose only purpose is to get assigned to str, when in fact what we want to do is to move its contents to str.
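The same bookkeeping can be done by the program itself instead of by reading logs. The sketch below uses a hypothetical CopyOnlyString class (copy operations only, no “useless” buffer) with two static counters. It assumes the temporary returned by the function is constructed in place, which is guaranteed from C++17 and is what GCC and Clang do by default for older standards.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical class with copy operations only, counting its allocations.
class CopyOnlyString
{
public:
    static int allocs;
    static int frees;

    CopyOnlyString() : str(Dup("")) {}
    explicit CopyOnlyString(const char *s) : str(Dup(s)) {}
    CopyOnlyString(const CopyOnlyString &that) : str(Dup(that.str)) {}
    ~CopyOnlyString() { Release(str); }

    CopyOnlyString &operator=(const CopyOnlyString &that)
    {
        if (this != &that)
        {
            Release(str);
            str = Dup(that.str);
        }
        return *this;
    }

private:
    static char *Dup(const char *s)
    {
        std::size_t size = std::strlen(s) + 1;
        char *copy = static_cast<char *>(std::malloc(size));
        assert(copy != nullptr);   // sketch only: no real error handling
        std::memcpy(copy, s, size);
        ++allocs;
        return copy;
    }
    static void Release(char *p)
    {
        if (p) { ++frees; }
        std::free(p);
    }

    char *str;
};

int CopyOnlyString::allocs = 0;
int CopyOnlyString::frees = 0;

CopyOnlyString NumberToCopyOnlyString(int n)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%d", n);
    return CopyOnlyString(buf);
}

struct AllocationCount { int allocs; int frees; };

// Replays "construct, then assign from a returned temporary".
AllocationCount CountCopyAssignFromTemporary()
{
    CopyOnlyString::allocs = CopyOnlyString::frees = 0;
    {
        CopyOnlyString str;                  // 1 alloc
        str = NumberToCopyOnlyString(10);    // temporary: 1 alloc; assignment: 1 free + 1 alloc;
                                             // temporary destroyed: 1 free
    }                                        // str destroyed: 1 free
    return AllocationCount{ CopyOnlyString::allocs, CopyOnlyString::frees };
}
```

Under the assumption above, the counters end at 3 allocations and 3 frees, matching the list.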
The move operator comes into play
The assignment operator is what we use when we want to copy the contents of an object (“that”) into “this”. This makes sense when “that” continues to be alive after the operation.
There are cases, however, when “that” is expected to die right after the operation has been completed; for instance, in our example:
str = NumberToString(10);
The temporary object returned by NumberToString will be deallocated just after the assignment. Therefore, we want to move its contents to str, rather than copying it.
Let’s therefore define the move operator (AKA move assignment operator) inside DynamicString:
DynamicString & operator = (DynamicString &&that)
{
std::cout << __func__ << " (move assignment) " << std::endl;
free(str);
this->str = that.str;
that.str = nullptr;
memcpy(this->useless, that.useless, sizeof(useless));
return *this;
}
The move operator looks very much like the assignment operator, with a few differences:
- It receives a weird “double reference” to that (“&&that”). This is an rvalue reference: it binds to objects which are about to die, such as temporaries, and this is what makes the operator a move operator. It’s not something like a double pointer, just C++ syntax horror.
- It does not make a duplicate of that.str: it just copies the pointer.
- It nullifies that.str. The idea here is that “that” no longer owns “str” as it has been moved to “this”. This is needed so that the destructor of “that”, when it is called, does not free memory now used by “this”.
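Since “str” can now be nullptr as a result of legitimate code, and not only because of coding mistakes or runtime errors, we are obliged to protect the destructor. Calling free on nullptr is harmless, but our logging must not stream a null pointer as a string: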
~DynamicString()
{
if(str)
{
std::cout << __func__ << " " << str << std::endl;
free(str);
}
else
std::cout << __func__ << " nullptr " << std::endl;
}
Just by adding the move operator the output of our program becomes the following:
Here’s what’s happening:
Adding the move assignment operator has saved us from 1 useless malloc and 1 useless free, without any change to application code.
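The saving can also be measured rather than read from the logs. The sketch below is an illustration with a hypothetical MovableString class (no “useless” buffer) which counts allocations and remembers the last address it allocated. It assumes the returned temporary is constructed in place (guaranteed from C++17, the default behaviour of GCC and Clang before that).

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical class with copy operations plus move assignment.
class MovableString
{
public:
    static int allocs;
    static int frees;
    static const char *lastAllocated;   // address handed out by the latest allocation

    MovableString() : str(Dup("")) {}
    explicit MovableString(const char *s) : str(Dup(s)) {}
    MovableString(const MovableString &that) : str(Dup(that.str)) {}
    ~MovableString() { Release(str); }

    MovableString &operator=(const MovableString &that)
    {
        if (this != &that) { Release(str); str = Dup(that.str); }
        return *this;
    }
    MovableString &operator=(MovableString &&that) noexcept
    {
        if (this != &that)
        {
            Release(str);        // drop our old buffer
            str = that.str;      // steal the pointer, no duplication
            that.str = nullptr;  // "that" no longer owns it
        }
        return *this;
    }

    const char *c_str() const { return str; }

private:
    static char *Dup(const char *s)
    {
        if (!s) { return nullptr; }          // moved-from source
        std::size_t size = std::strlen(s) + 1;
        char *copy = static_cast<char *>(std::malloc(size));
        assert(copy != nullptr);             // sketch only: no real error handling
        std::memcpy(copy, s, size);
        ++allocs;
        lastAllocated = copy;
        return copy;
    }
    static void Release(char *p)
    {
        if (p) { ++frees; }
        std::free(p);
    }

    char *str;
};

int MovableString::allocs = 0;
int MovableString::frees = 0;
const char *MovableString::lastAllocated = nullptr;

MovableString NumberToMovableString(int n)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%d", n);
    return MovableString(buf);
}

struct MoveReport { int allocs; int frees; bool bufferReused; };

MoveReport CountMoveAssignFromTemporary()
{
    MovableString::allocs = MovableString::frees = 0;
    bool reused = false;
    {
        MovableString str;                   // 1 alloc
        str = NumberToMovableString(10);     // temporary: 1 alloc; move: 1 free of the old buffer
        // str now owns the very buffer that was allocated for the temporary
        reused = (str.c_str() == MovableString::lastAllocated);
        assert(std::strcmp(str.c_str(), "10") == 0);
    }                                        // str destroyed: 1 free
    return MoveReport{ MovableString::allocs, MovableString::frees, reused };
}
```

Under these assumptions the counters stop at 2 allocations and 2 frees, and str ends up holding the buffer originally allocated for the temporary.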
Faster, faster, faster !
Can we get any faster than this ? Yes we can, in two equivalent ways. Instead of:
DynamicString str;
str = NumberToString(10);
We can write:
DynamicString str = NumberToString(10);
or:
DynamicString str(NumberToString(10));
In both cases the output will be as follows:
So what happened here…
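- The compiler did not call the default constructor for “str”: this is an initialisation rather than an assignment, so there is no previous value to build and then overwrite
- The compiler did not call the copy / move constructors for DynamicString. The return area of NumberToString is directly initialised to overlap the address of “str” in the stack. This optimisation is known as copy elision (or return value optimisation), and since C++17 it is mandatory when a temporary is returned as it is here.
Therefore all useless temporary allocation / deallocation / copy / move operations have been omitted.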
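A quick way to convince ourselves is to count constructor calls. This is a minimal sketch with a hypothetical Traced type; the zero counts for copy and move are guaranteed from C++17 and are, to my knowledge, what mainstream compilers produce by default with older standards too.

```cpp
#include <cassert>

// Hypothetical type which counts how it gets constructed.
struct Traced
{
    static int ctorCalls;
    static int copyCalls;
    static int moveCalls;

    int value;

    explicit Traced(int v) : value(v) { ++ctorCalls; }
    Traced(const Traced &that) : value(that.value) { ++copyCalls; }
    Traced(Traced &&that) noexcept : value(that.value) { ++moveCalls; }
};

int Traced::ctorCalls = 0;
int Traced::copyCalls = 0;
int Traced::moveCalls = 0;

Traced MakeTraced(int n)
{
    return Traced(n);   // returns a temporary, like NumberToString does
}

struct ElisionCounts { int ctor; int copy; int move; };

// Initialises two objects directly from a function result,
// using both syntaxes shown in the article.
ElisionCounts MeasureDirectInitialisation()
{
    Traced::ctorCalls = Traced::copyCalls = Traced::moveCalls = 0;

    Traced a = MakeTraced(10);
    Traced b(MakeTraced(20));
    assert(a.value == 10 && b.value == 20);

    return ElisionCounts{ Traced::ctorCalls, Traced::copyCalls, Traced::moveCalls };
}
```

Two objects, two ordinary constructor calls, and no copy or move at all.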
Now try to beat that, Java 🙂 ! (or C# or Python or whichever other language is supposed to be “almost as efficient”)
The move constructor (and explicit move assignment)
Let’s assume we want to write a simple function to swap two strings. Like any well-behaved swap function, it needs a temporary variable to hold one of the values being swapped:
void Swap(DynamicString &a, DynamicString &b)
{
DynamicString tmp(a);
a = b;
b = tmp;
}
We can then invoke it as follows:
int main()
{
DynamicString str1("1234");
DynamicString str2("ABCD");
printf("Before: %s %s\n", (const char *)str1,
(const char *)str2);
Swap(str1, str2);
printf("After: %s %s\n", (const char *)str1,
(const char *)str2);
return 0;
}
Thanks to our oh-so-powerful log functions in DynamicString’s methods, we can see this output:
What’s going on here ? Let’s see:
Let’s focus on steps 4-7, the others are of little interest for us.
4) The copy constructor makes a copy of a. 1 malloc.
5) The assignment operator copies b into a. 1 free + 1 malloc.
6) The assignment operator copies tmp into b. 1 free + 1 malloc.
7) tmp is destroyed. 1 free.
So a grand total of 3 malloc and 3 free. Pretty awful, considering that in the end all we have to do is swap two pointers. We surely could tweak DynamicString’s internals and actually swap two pointers, but this would break the object oriented encapsulation approach, wouldn’t it ?
Enter the move constructor
Why are we making a copy of a in step 4 ? What we want to do is move a’s contents to tmp. We can do it by defining the move constructor for DynamicString:
DynamicString(DynamicString &&that)
{
std::cout << __func__ << " (move constructor) " << std::endl;
this->str = that.str;
that.str = nullptr;
memcpy(this->useless, that.useless, sizeof(useless));
}
The move constructor is very similar to the copy constructor, with a few differences we have already seen in the move assignment operator:
- it receives the ugly “double reference &&” which means we are dealing with move operations
- it grabs “that.str” without making a copy
- it nullifies that.str to avoid double deallocation of resources when “that” gets destroyed
Unfortunately, in our “Swap” function the compiler will not pick the move operations by itself: a, b and tmp are named objects which may well be used again after the operation, so it has no way to know that their contents can be given away. It will use copy operations, which are always safe regardless of the meaning of the program.
If we want to use move operations we need to instruct the compiler accordingly:
void Swap(DynamicString &a, DynamicString &b)
{
DynamicString tmp(std::move(a));
a = std::move(b);
b = std::move(tmp);
}
std::move is declared in the <utility> header, part of the standard C++ library, and it’s used to tell the compiler that move operations may be used on an object.
It does not move anything by itself: it just casts its argument to an rvalue reference, so the statement:
a = std::move(b);
is equivalent to:
a = static_cast<DynamicString &&>(b);
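How the compiler chooses between the copy and the move flavour of an operation can be shown with two overloads of a plain function. This is just a sketch: Payload and BindsToMove are hypothetical names, and the function only reports which overload was selected.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Payload {};

// Selected for named objects (lvalues): the "copy" flavour.
bool BindsToMove(const Payload &) { return false; }

// Selected for temporaries and for anything cast to an rvalue reference.
bool BindsToMove(Payload &&) { return true; }

// std::move applied to a Payload & yields exactly a Payload &&.
static_assert(
    std::is_same<decltype(std::move(std::declval<Payload &>())), Payload &&>::value,
    "std::move is a cast to an rvalue reference");

void CheckOverloadChoice()
{
    Payload named;
    assert(!BindsToMove(named));                          // named object: copy flavour
    assert(BindsToMove(std::move(named)));                // std::move: move flavour
    assert(BindsToMove(static_cast<Payload &&>(named)));  // same thing, spelled by hand
    assert(BindsToMove(Payload()));                       // temporary: move flavour
}
```

Nothing is moved here: the cast only changes which overload gets picked, and it is the selected function which does (or does not do) the actual moving.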
With these modifications the output is as follows (just showing the interesting bits) :
As can be seen, we are using only move operations, no memory allocation or deallocation is done, and efficiency is as high as it could ever be.
As a final note, there’s a standard C++ library function to swap objects using move operators:
std::swap(str1, str2);
It’s a template function which can be used on any copyable or movable class: it will use the move operations if available, and it will fall back to copy operations otherwise.
Its implementation is very similar to our “Swap” function… mostly, I have to admit, because std::swap is where I started from to create the example.
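To compare the two approaches by numbers, the sketch below counts the allocations performed by a copy-based swap and by std::swap. SwapString is a hypothetical class written for this illustration (all four copy / move operations, no “useless” buffer), and CopySwap is the first version of our Swap function.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <utility>

// Hypothetical string class with copy and move operations.
class SwapString
{
public:
    static int allocs;

    explicit SwapString(const char *s) : str(Dup(s)) {}
    SwapString(const SwapString &that) : str(Dup(that.str)) {}
    SwapString(SwapString &&that) noexcept : str(that.str) { that.str = nullptr; }
    ~SwapString() { std::free(str); }   // free(nullptr) is a no-op

    SwapString &operator=(const SwapString &that)
    {
        if (this != &that) { std::free(str); str = Dup(that.str); }
        return *this;
    }
    SwapString &operator=(SwapString &&that) noexcept
    {
        if (this != &that) { std::free(str); str = that.str; that.str = nullptr; }
        return *this;
    }

    const char *c_str() const { return str; }

private:
    static char *Dup(const char *s)
    {
        if (!s) { return nullptr; }          // moved-from source
        std::size_t size = std::strlen(s) + 1;
        char *copy = static_cast<char *>(std::malloc(size));
        assert(copy != nullptr);             // sketch only: no real error handling
        std::memcpy(copy, s, size);
        ++allocs;
        return copy;
    }

    char *str;
};

int SwapString::allocs = 0;

// Copy-based swap: one copy construction and two copy assignments.
void CopySwap(SwapString &a, SwapString &b)
{
    SwapString tmp(a);
    a = b;
    b = tmp;
}

int AllocationsDuringCopySwap()
{
    SwapString a("1234"), b("ABCD");
    int before = SwapString::allocs;
    CopySwap(a, b);
    assert(std::strcmp(a.c_str(), "ABCD") == 0 && std::strcmp(b.c_str(), "1234") == 0);
    return SwapString::allocs - before;
}

int AllocationsDuringStdSwap()
{
    SwapString a("1234"), b("ABCD");
    int before = SwapString::allocs;
    std::swap(a, b);   // uses the move constructor and move assignment
    assert(std::strcmp(a.c_str(), "ABCD") == 0 && std::strcmp(b.c_str(), "1234") == 0);
    return SwapString::allocs - before;
}
```

The copy-based version costs 3 allocations, while std::swap on a class with move operations costs none.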
Conclusions
In this very long article we have seen that a complete definition of all components of a class, especially when dealing with dynamic memory allocation, is equally crucial for correct behaviour and efficiency.
When managing any kind of resource (be it memory, files, handles, sockets…) there are simple golden rules to be always followed:
- Constructors and destructors are a must
- Copy constructors are a must
- Assignment operators are a must
- Move constructors and move operators are important for efficiency
- Don’t pass objects by value, unless absolutely needed
- Careful when you return objects by value, especially if you did not define move constructors / operators