Playing with move semantics in C++ - Part 1

(Kjell Schubert contributed to these posts with ideas, discussion, feedback, and corrections).

I did a lot of C++ programming back in the 90s, when the language didn’t have universal references, move semantics, lambdas, shared_ptrs, or any of the cool features that have been introduced since then. I moved to C# when it first became available, and that was my programming language of choice until I joined Facebook. Oh boy am I old! 🙂

As a way of getting a deeper understanding of some of the new C++ features, I’ve been experimenting by writing toy code. I am sharing some of my notes related to C++’s move() and rvalue references so that experts can correct me, or in case other C++ newbies find them useful. BTW… there is a ton of related information on the web and in books… Just a search away.

Let’s assume the ExpensiveToCopy class, instances of which are… expensive to copy (e.g. because they maintain millions of strings).

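#include <cmath>     // floor
#include <iomanip>   // setprecision
#include <iostream>  // cout, endl, fixed
#include <string>    // string, to_string
#include <utility>   // move
#include <vector>    // vector

using namespace std;

class ExpensiveToCopy {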
 public:
  ExpensiveToCopy() {
    log("constructor");
  }
  ~ExpensiveToCopy() {
    log("destructor");
    evolution = 0;
  }

  ExpensiveToCopy(const ExpensiveToCopy& etc): msgs(etc.msgs), evolution(floor(etc.evolution + 1)) {
    log("copy constructor (expensive)");
  }
  ExpensiveToCopy& operator=(const ExpensiveToCopy& rhs) {
    msgs = rhs.msgs;
    evolution = floor(rhs.evolution + 1);
    log("copy operator=");
    return *this;
  }

  ExpensiveToCopy(ExpensiveToCopy&& etc) noexcept: msgs(move(etc.msgs)), evolution(etc.evolution + 0.1) {
    log("move constructor (cheap)");
  }
  ExpensiveToCopy& operator=(ExpensiveToCopy&& rhs) {
    msgs = move(rhs.msgs);
    evolution = rhs.evolution + 0.1;
    log("move operator=");
    return *this;
  }

  void print() const {
    log("contains " + to_string(msgs.size()) + " messages");
  }
 private:
  float evolution = 1.0;
  vector<string> msgs {"hello", "world"};

  void log(const string& str) const {
    cout << " ExpensiveToCopy (" << fixed << setprecision(1) << evolution << "): " << str << endl;
  }
};

Let’s see what happens when we call a function that returns a closure. The closure captures an instance of the above class by value.

auto func1() {
  ExpensiveToCopy expensiveToCopy;
  return [etc = expensiveToCopy] { etc.print(); };
}

{
  cout << "Get lambda with value capture" << endl;
  auto f = func1();
  cout << "Call lambda" << endl;
  f();
}

// output
Get lambda with value capture
ExpensiveToCopy (1.0): constructor
ExpensiveToCopy (2.0): copy constructor (expensive)
ExpensiveToCopy (1.0): destructor
Call lambda
ExpensiveToCopy (2.0): contains 2 messages
ExpensiveToCopy (2.0): destructor

So the object constructed in func1() (evolution 1.0) was copied into the closure (creating evolution 2.0) and then destroyed when func1() returned. The copy lived on inside the returned closure and was destroyed together with it, after the call. Can we avoid the expensive copy? One might immediately think of capturing the instance by reference.

auto func2() {
  ExpensiveToCopy expensiveToCopy;
  return [&etc = expensiveToCopy] { etc.print(); };
}

{
  // If this works with your compiler, you just got lucky!
  cout << "Get lambda with reference capture" << endl;
  auto f = func2();
  cout << "Call lambda" << endl;
  f();
}

// output
Get lambda with reference capture
ExpensiveToCopy (1.0): constructor
ExpensiveToCopy (1.0): destructor
Call lambda
ExpensiveToCopy (6468747264.0): contains 18446738209960882049 messages

Oops! What happened? Well, the object that the captured reference refers to no longer exists: it was destroyed when func2() returned, so calling the closure is undefined behavior. With some compilers you might actually get the right answer, but don’t be fooled. This is wrong. On my system, with the compiler I used, I was lucky to get garbage instead of an application crash. C++’s move semantics can help…

auto func3() {
  ExpensiveToCopy expensiveToCopy;
  return [etc = move(expensiveToCopy)] { etc.print(); };
}

{  
  cout << "Get lambda with rvalue" << endl;
  auto f = func3();
  cout << "Call lambda" << endl;
  f();
}

// output
Get lambda with rvalue
ExpensiveToCopy (1.0): constructor
ExpensiveToCopy (1.1): move constructor (cheap)
ExpensiveToCopy (1.0): destructor
Call lambda
ExpensiveToCopy (1.1): contains 2 messages
ExpensiveToCopy (1.1): destructor

That’s better. The move constructor is cheap, so this isn’t that bad. We effectively told the compiler to treat the object as if it were an rvalue (roughly, a temporary whose resources may be taken over). The move constructor or move assignment operator is used in such cases. A new object was created (1.1) and took over the expensive-to-copy resources of the original (the standard library vector class defines a move constructor that hands its internal buffer over to the new instance instead of copying the elements). After “moving” from the original object, we shouldn’t rely on its contents anymore: it is left in a valid but unspecified state. In fact, we will try to use it in part 2 just for fun 🙂
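To make the “took over the resources” part concrete, here is a minimal sketch that is not part of the original toy code. The helper names moveTransfersBuffer() and reuseAfterMove() are made up for illustration. It assumes a mainstream standard library with the default allocator, where move-constructing a vector hands over the existing heap buffer rather than touching each string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the post): reports whether move-constructing
// a vector hands the existing heap buffer over to the new vector.
bool moveTransfersBuffer() {
  std::vector<std::string> source{"hello", "world"};
  const std::string* bufferBefore = source.data();

  // std::move does not move anything by itself; it is only a cast to an
  // rvalue reference. The move constructor selected by that cast does the work.
  std::vector<std::string> target = std::move(source);

  // Same buffer address: no string was copied or moved individually.
  return target.data() == bufferBefore && target.size() == 2;
}

// Hypothetical helper: shows that a moved-from vector can be safely reused
// once it has been put back into a known state.
std::size_t reuseAfterMove() {
  std::vector<std::string> source{"hello", "world"};
  std::vector<std::string> target = std::move(source);

  // The moved-from vector is still a valid object, but its contents are
  // unspecified, so reset it before relying on what it holds.
  source.clear();
  source.push_back("again");
  return source.size() + target.size();  // 1 element + 2 elements
}
```

Calling both helpers from main() should show the buffer being handed over and the moved-from vector being reusable after clear(). What you should not do is read the old contents of a moved-from object and expect them to still be there.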

Let’s see what happens if we try to use move in order to “optimize” the return of an object by value from a function. Our intuition might be that a copy would be created in cases when we write…

ExpensiveToCopy func() {
  ExpensiveToCopy etc;
  ...
  return etc;
}

We’ll come back to this. Let’s try to optimize first.

ExpensiveToCopy func4() {
  ExpensiveToCopy expensiveToCopy;
  return move(expensiveToCopy);
}

{
  cout << "Return object with move" << endl;
  auto f = func4();
}

Let’s see what happens…

// output
Return object with move
ExpensiveToCopy (1.0): constructor
ExpensiveToCopy (1.1): move constructor (cheap)
ExpensiveToCopy (1.0): destructor
ExpensiveToCopy (1.1): destructor

So we avoided the copy constructor, but in this particular case we can do even better…

ExpensiveToCopy func5() {
  ExpensiveToCopy expensiveToCopy;
  return expensiveToCopy;
}

{
  cout << "Return object without move. Named return value optimization by compiler." << endl;
  auto f = func5();
}

// output
Return object without move. Named return value optimization by compiler.
ExpensiveToCopy (1.0): constructor
ExpensiveToCopy (1.0): destructor

Ah… very nice! We avoided the intermediate object altogether. But how did this happen? Well, the compiler detected that we were returning a local object by value, so it constructed that object directly in the storage the caller provides for the return value. This is called Named Return Value Optimization (NRVO). It is optional, but most compilers perform it. Wrapping the returned object in move(), as func4() does, prevents it.

Since we don’t use the object, we can rewrite func5() as…

ExpensiveToCopy func6() {
  return ExpensiveToCopy();
}

{
  cout << "Return object without move. Return value optimization by compiler." << endl;
  auto f = func6();
}

// output
Return object without move. Return value optimization by compiler.
ExpensiveToCopy (1.0): constructor
ExpensiveToCopy (1.0): destructor
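
The three ways of returning an object shown above can also be compared by counting constructor calls instead of reading log output. The following is only a sketch: Counted is a hypothetical stand-in for ExpensiveToCopy, and movesUsedBy() is a made-up helper. The exact counts for the plain return forms depend on the compiler and language version, so the comments only state what is certain.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for ExpensiveToCopy that counts constructor calls
// instead of logging them.
struct Counted {
  static int copies;
  static int moves;
  Counted() = default;
  Counted(const Counted&) { ++copies; }
  Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted returnWithMove() {
  Counted local;
  return std::move(local);  // always runs the move constructor; blocks NRVO
}

Counted returnNamed() {
  Counted local;
  return local;  // NRVO candidate; without NRVO this is a move, not a copy
}

Counted returnTemporary() {
  return Counted();  // RVO; the elision is guaranteed from C++17 onwards
}

// Resets the counters, runs one of the functions above, and reports how many
// move constructions it cost. Counted::copies can be inspected afterwards.
template <typename F>
int movesUsedBy(F makeObject) {
  Counted::copies = 0;
  Counted::moves = 0;
  Counted result = makeObject();
  (void)result;  // silence unused-variable warnings
  return Counted::moves;
}
```

With the optimizations described above, movesUsedBy(returnWithMove) should report one move, while the other two should report none. Recent GCC and Clang versions also warn about the first form (-Wpessimizing-move), which is a handy reminder that the plain return is the better default.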

Side note: I forgot to mention that I’ve been using C++14. C++11 lambdas don’t have the initializer form of capture ([etc = move(x)]), so it’s not possible to move an object into a closure that way with C++11. Folly (Facebook’s open source C++ library of cool stuff) offers the MoveWrapper class for C++11. Thanks to Viswanath Sivakumar for pointing this out and for spotting a copy-paste mistake in part 2. BTW… Viswanath was my awesome mentor during my 6 weeks of bootcamp 🙂
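
For completeness, here is one way the C++11 limitation from the side note can be worked around using only the standard library. This is a sketch of the well-known std::bind idiom, not of Folly’s MoveWrapper, and callWithBoundMove() is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the post): moves a vector into a callable in
// C++11, where lambdas cannot capture with an initializer.
std::size_t callWithBoundMove() {
  std::vector<std::string> msgs{"hello", "world"};

  // std::bind move-constructs its own vector from the rvalue and passes that
  // stored vector to the lambda as an lvalue on every call.
  auto f = std::bind(
      [](const std::vector<std::string>& etc) { return etc.size(); },
      std::move(msgs));

  // msgs has been moved from at this point; the bind object owns the strings.
  return f();
}
```

The lambda receives the vector as a parameter rather than as a capture, which is the price of this idiom. With C++14 and later the [etc = move(x)] form used in func3() is simpler.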

Continue to Part 2.

savas parastatidis
