Smart pointers

The standard library and Boost smart pointers fall into two types: scoped pointers and reference counted pointers.

Scoped pointers

boost::scoped_ptr and std::unique_ptr
The object is destroyed when the pointer instance goes out of scope.
They cannot be copied (although you can pass them by reference).
The lifetime of the object is tied either to:

  • The current scope
  • The lifetime of the containing object

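The second of these makes them suitable as class members. Because std::unique_ptr can be moved, it can also be stored in standard containers; boost::scoped_ptr can be neither copied nor moved, so it cannot.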
The old std::auto_ptr is like a scoped pointer, except that it can be copied, which transfers ownership.
It was deprecated in C++11 and removed in C++17, and should not be used.
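
As a minimal sketch of the two lifetime cases above, using std::unique_ptr: the Tracker type and its live counter are hypothetical, introduced only so that destruction can be observed.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper type: counts how many instances are currently alive.
struct Tracker
{
    static int live;
    Tracker() { ++live; }
    ~Tracker() { --live; }
    Tracker(const Tracker&) = delete;
    Tracker& operator=(const Tracker&) = delete;
};
int Tracker::live = 0;

// Case 1: the lifetime is tied to the current scope.
int live_inside_scope()
{
    std::unique_ptr<Tracker> p(new Tracker);
    return Tracker::live; // p destroys the Tracker when this function returns
}

// Case 2: the lifetime is tied to a containing object (here, a vector).
int live_inside_container()
{
    std::vector<std::unique_ptr<Tracker>> items;
    std::unique_ptr<Tracker> p(new Tracker);
    // std::unique_ptr<Tracker> q = p;  // would not compile: copying is disabled
    items.push_back(std::move(p));      // ownership moves into the vector
    assert(p == nullptr);               // the moved-from pointer is now empty
    items.push_back(std::unique_ptr<Tracker>(new Tracker));
    return Tracker::live; // both Trackers are destroyed along with items
}
```

Each function reports how many Trackers were alive just before it returned; once it has returned, Tracker::live is back to zero without any explicit delete.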

Reference counted pointers

boost::shared_ptr and std::shared_ptr
They allow the pointer to be copied; the object is destroyed when the last copy of the pointer is destroyed.
Useful when the lifetime of the object is more complicated.
Potential problems:

  • Creating them on the heap
  • Passing them by reference
  • Creating circular references

The last point is mitigated by boost::weak_ptr and std::weak_ptr, which define a weak (i.e., uncounted) reference to an object owned by a shared_ptr.
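
The following sketch shows the reference count changing on copy, and a weak_ptr breaking what would otherwise be a parent/child cycle. The Node type and its live counter are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type: owns its child, but only observes its parent.
struct Node
{
    static int live;
    std::shared_ptr<Node> child;  // counted (owning) reference
    std::weak_ptr<Node> parent;   // uncounted reference, so no ownership cycle
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Copying a shared_ptr increments the count; returns the count seen with two owners.
long count_with_two_owners()
{
    std::shared_ptr<Node> a = std::make_shared<Node>();
    std::shared_ptr<Node> b = a; // copy: both now share ownership
    return a.use_count();
}

// Builds a parent/child pair that refer to each other, then lets it go out of scope.
// Returns true if the child could reach its parent while the pair was alive.
bool link_and_release()
{
    std::shared_ptr<Node> parent = std::make_shared<Node>();
    parent->child = std::make_shared<Node>();
    parent->child->parent = parent;   // weak: the parent's count stays at 1
    assert(parent.use_count() == 1);
    std::shared_ptr<Node> locked = parent->child->parent.lock(); // temporary owner
    return locked != nullptr;
}
```

If parent were a shared_ptr instead of a weak_ptr, the two nodes would keep each other alive and Node::live would never return to zero.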

Finding a substring in Python

You can simply use in to find out if a string contains another string:

line = "A license for my pet fish, Eric."
if 'license' in line:
    print('"license" is in line')

Use find() to get the position of the substring:

pos = line.find('license')
if pos != -1:
    print('"license" starts at position {0} of line'.format(pos))

Move semantics in C++

Consider the following simple string class:

#include <cstddef>
#include <cstring>
#include <iostream>
#include <utility>

class String
{
public:
    String()
        : data_(strdup_("")), len_(0)
    {
    }

    String(const char* str)
        : data_(strdup_(str)), len_(std::strlen(data_))
    {
    }

    String(const char *str, size_t len)
        : data_(strndup_(str, len)), len_(len)
    {
    }

    String(const String& other)
        : data_(strdup_(other.data_)), len_(other.len_)
    {
    }

    ~String()
    {
        delete[] data_;
    }

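    String& operator=(String other)
    {
        // Copy-and-swap: both members must be exchanged, not just the pointer
        std::swap(data_, other.data_);
        std::swap(len_, other.len_);
        return *this;
    }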

    String& operator+=(const String& other)
    {
        size_t len = len_ + other.len_;
        char *data = new char[len + 1];
        std::strcpy(data, data_);
        std::strcat(data, other.data_);
        delete[] data_;
        data_ = data;
        len_ = len;
        return *this;
    }

    String substring(size_t start, size_t len) const
    {
        // start must lie inside the string and the range must not run past its end
        if (start < len_ && len <= len_ - start) {
            return String(&data_[start], len);
        }
        return String();
    }

    friend String operator+(String lhs, const String& rhs);
    friend std::ostream& operator<<(std::ostream& os, const String& str);

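private:
    // Minimal versions of the helpers used above; the original listing does not show them
    static char* strdup_(const char* str)
    {
        return strndup_(str, std::strlen(str));
    }

    static char* strndup_(const char* str, size_t len)
    {
        char* copy = new char[len + 1];
        std::memcpy(copy, str, len);
        copy[len] = '\0';
        return copy;
    }

    char* data_;
    size_t len_;
};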

The important things to note are:

  • It manages a dynamically allocated block of memory, so it needs to observe the Rule of Three
  • In addition to its constructors, it has two functions that return an object: operator+(), and substring()

Now consider the following uses of the class:

int main()
{
    String a = "The quick brown fox jumps over";
    String b = " the lazy dog";

    String x = a;                  // Copy an existing object
    String y = a + b;              // Create a new object by addition and then copy it
    String z = a.substring(16, 3); // Create a new object by function call and then copy it

    std::cout << x << "\n";
    std::cout << y << "\n";
    std::cout << z << "\n";
}

Conceptually, there are three copy operations in this code (in practice the compiler may elide some of them). In the first, we are copying the instance a, so we naturally expect to be able to use a again after the copy has been made.
In the other two, though, temporary objects are created (by operator+() and substring() respectively), and there is no way we could use them before they are destroyed at the end of the expression. These temporary objects are called rvalues because, historically, such values could appear only on the right-hand side of an assignment, never on the left-hand side.

As these temporaries can never be used afterwards, it’s wasteful and unnecessary for them to go through the same construction and destruction process as objects we do use. In particular, it shouldn’t be necessary to go through the process of copying the data_ member in the copy constructor, only for the original to be deleted immediately after the copy is made.

What move semantics provide, amongst other things, is a way of writing a constructor that behaves differently from the copy constructor when the object being copied from is an rvalue. This is called a move constructor. Below is a move constructor for our string class:

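String(String&& other) noexcept
    : data_(other.data_), len_(other.len_)
{
    // Leave the source empty so that its destructor has nothing to delete
    other.data_ = nullptr;
    other.len_ = 0;
}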

Note the &&, which indicates an rvalue reference. It is this that makes the constructor a move constructor. This constructor will be called in situations like those above, where the compiler can tell that the object being copied is an rvalue. In this case we can simply steal the data_ member from the soon-to-be-destroyed other object by making it the data_ member of the new object, and then setting the other object’s member to nullptr so that the block won’t be deleted when the other object is destroyed (delete[] on a null pointer does nothing). This makes the whole process of copy initialisation much more efficient when the object being copied is an rvalue.

There is a lot more to rvalues and move semantics than this, but as a first step it is well worth considering adding a move constructor to a class that has methods that return an object, as these objects are quite likely to be constructed as rvalues in practice.
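
To see which constructor runs, here is a small sketch that counts copies and moves. Buffer is a hypothetical stand-in for the String class above, cut down to the essentials; the counters exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical stand-in for the String class: it owns a char array and
// counts how often the copy and move constructors run.
class Buffer
{
public:
    static int copies;
    static int moves;

    explicit Buffer(const char* text)
        : data_(new char[std::strlen(text) + 1])
    {
        std::strcpy(data_, text);
    }

    Buffer(const Buffer& other)              // copy: allocate and duplicate
        : data_(new char[std::strlen(other.data_) + 1])
    {
        std::strcpy(data_, other.data_);
        ++copies;
    }

    Buffer(Buffer&& other) noexcept          // move: steal the block
        : data_(other.data_)
    {
        other.data_ = nullptr;               // delete[] on nullptr is a no-op
        ++moves;
    }

    ~Buffer() { delete[] data_; }

    Buffer& operator=(Buffer other)          // copy-and-swap handles both cases
    {
        std::swap(data_, other.data_);
        return *this;
    }

    const char* c_str() const { return data_; }
    bool is_empty() const { return data_ == nullptr; }

private:
    char* data_;
};
int Buffer::copies = 0;
int Buffer::moves = 0;
```

Given Buffer a("fox"), the statement Buffer b = a; runs the copy constructor, while Buffer c = std::move(a); runs the move constructor and leaves a empty. std::move is used here to force an rvalue, because compilers are often able to construct a returned temporary directly in place, calling neither constructor.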

C++ stream synchronization

By default, C++ standard I/O streams are synchronized with their C counterparts.
What this means is that calls to C++ streams are directly forwarded to the corresponding C stream without any buffering in the C++ stream.

For example, given FILE pointers called fin and fout, a std::istream called is, and a std::ostream called os, the following calls are equivalent:

std::fputc(c, fout) and os.rdbuf()->sputc(c)
std::fgetc(fin) and is.rdbuf()->sbumpc()
std::ungetc(c, fin) and is.rdbuf()->sputbackc(c)

This allows calls to C and C++ I/O functions to be intermixed freely with predictable results. For example, consider the following program:

#include <iostream>
#include <cstdio>

int main()
{
    std::cout << "Wynken, Blynken, and Nod one night\n";
    printf("Sailed off in a wooden shoe -\n");
    std::cout << "Sailed on a river of crystal light,\n";
    printf("Into a sea of dew.\n");
}

This produces the following output:

Wynken, Blynken, and Nod one night
Sailed off in a wooden shoe -
Sailed on a river of crystal light,
Into a sea of dew.

This synchronization means that the expected output is produced, but it can severely impact performance.
You can improve the performance of C++ streams, at the expense of no longer being able to intermix them safely with C I/O functions, by disabling synchronization with the std::ios_base::sync_with_stdio function:

#include <iostream>

int main()
{
    std::ios::sync_with_stdio(false); // call this before performing any I/O
}

To illustrate what I mean about forgoing the ability to intermix C++ streams with C I/O, here is the output of the program above on my system when I disable synchronization:

Sailed off in a wooden shoe -
Into a sea of dew.
Wynken, Blynken, and Nod one night
Sailed on a river of crystal light,
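
One detail that can be handy: sync_with_stdio returns the previous synchronization state, so a program can tell whether the setting had already been changed. A minimal sketch follows; disable_stdio_sync is a hypothetical helper name, not a library function.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical helper: disables synchronization and reports whether the
// standard streams were synchronized beforehand.
// It should be called before any I/O is performed.
bool disable_stdio_sync()
{
    return std::ios_base::sync_with_stdio(false);
}
```

The first call in a program returns true, because the streams start out synchronized; a second call returns false.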

Difference between append and extend in Python

append() adds an item at the end.
extend() adds the contents of an iterable at the end.

def main():
    dishes = ['spam', 'eggs']
    dishes.append('ham')
    print(dishes)
    dishes.extend(['spam', 'spam', 'spam'])
    print(dishes)

if __name__ == '__main__':
    main()

This prints:

['spam', 'eggs', 'ham']
['spam', 'eggs', 'ham', 'spam', 'spam', 'spam']

Pass by reference in Python

There is no pass by reference in Python, so it isn’t possible to pass a variable to a function and have it assigned a new value within the function. In particular, you can’t pass in a value of an immutable type, such as a string or a number, and have it replaced with a modified value.

It is possible to do something similar though by passing in the value inside a data structure or other object.

For example, in the following program a string has its value modified in two different ways. Firstly, a class object is used to store the string’s value, and secondly, a list is used for the same purpose.

class Wrapper(object):
    def __init__(self, value):
        self.value = value

def pass_by_reference1(wrapper):
    wrapper.value = wrapper.value + ", and spam"

def pass_by_reference2(wrapper):
    wrapper[0] = wrapper[0] + ", and spam"

def main():
    var1 = Wrapper("A string")
    pass_by_reference1(var1)
    print(var1.value)

    var2 = ["Another string"]
    pass_by_reference2(var2)
    print(var2[0])

if __name__ == '__main__':
    main()

Pointers in C++

The main reasons to use pointers in C++ are:

Polymorphism

Derived class overrides are called when a method is called on a base class pointer (or reference).
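
A minimal sketch of this, using a hypothetical Shape/Circle hierarchy. It also shows what happens when the same object is passed by value instead.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class hierarchy used only for illustration.
struct Shape
{
    virtual ~Shape() = default;   // virtual destructor for deletion via a base pointer
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape
{
    std::string name() const override { return "circle"; }
};

// Called through a base class pointer: dispatches to the override.
std::string describe(const Shape* shape)
{
    assert(shape != nullptr);
    return shape->name();
}

// Called with a value: the Circle part is sliced away, so the base version runs.
std::string describe_by_value(Shape shape)
{
    return shape.name();
}
```

For a Circle c, describe(&c) yields "circle", while describe_by_value(c) yields "shape".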

Optional objects

A pointer can be NULL or nullptr. This is a useful sentinel value indicating that an object has no value.
References have no equivalent, while value objects would need to use their field values to indicate that they did not have a real value.
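
For example, a search function can return nullptr to mean "nothing found". This is only a sketch; find_longer_than is a hypothetical function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns a pointer to the first word longer than min_length,
// or nullptr if there is no such word.
const std::string* find_longer_than(const std::vector<std::string>& words,
                                    std::size_t min_length)
{
    for (const std::string& word : words) {
        if (word.size() > min_length) {
            return &word;
        }
    }
    return nullptr;
}
```

The caller must check the result against nullptr before dereferencing it. The pointer refers to an element of the vector, so it is only valid while that vector is alive and unmodified.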

Separating compilation units

The size of a pointer is known at compile time without knowing the size of what it points to.
This means that code using a class that holds only a pointer to another class does not need to be recompiled when the pointed-to class changes.
This is the basis of the Pimpl idiom.
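
A single-file sketch of the idiom follows. Widget, Impl, and the header/source split marked in the comments are all hypothetical; in a real project the two halves would live in separate files.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- What would go in the header (hypothetical widget.h) ---
class Widget
{
public:
    explicit Widget(const std::string& label);
    ~Widget();                      // defined where Impl is a complete type
    std::string label() const;

private:
    struct Impl;                    // forward declaration: size unknown here
    std::unique_ptr<Impl> impl_;    // a pointer's size does not depend on Impl
};

// --- What would go in the source file (hypothetical widget.cpp) ---
struct Widget::Impl
{
    std::string label;              // members can change without touching the header
};

Widget::Widget(const std::string& label) : impl_(new Impl{label}) {}
Widget::~Widget() = default;
std::string Widget::label() const { return impl_->label; }
```

The destructor is declared in the class but defined after Impl, because std::unique_ptr needs the complete type at the point where it deletes the object.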

Interface with C library

C objects are generally exposed as pointers to structs allocated on the heap, so you need to use pointers to manipulate them.
You can wrap the C object in a smart pointer to make this easier and safer.
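
As a sketch of that wrapping, the counter_* functions below are a hypothetical C-style API, defined here only so that the example is self-contained. The wrapper uses std::unique_ptr with a custom deleter.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// --- Hypothetical C-style API, standing in for a real C library ---
struct counter_t { int value; };
static int g_open_counters = 0;     // tracks objects not yet destroyed

counter_t* counter_create()
{
    counter_t* c = static_cast<counter_t*>(std::malloc(sizeof(counter_t)));
    if (c != nullptr) {
        c->value = 0;
        ++g_open_counters;
    }
    return c;
}

void counter_destroy(counter_t* c)
{
    if (c == nullptr) {
        return;
    }
    std::free(c);
    --g_open_counters;
}

// --- C++ wrapper: the custom deleter calls the C cleanup function ---
using CounterPtr = std::unique_ptr<counter_t, decltype(&counter_destroy)>;

CounterPtr make_counter()
{
    return CounterPtr(counter_create(), &counter_destroy);
}

int open_counters() { return g_open_counters; }
```

A CounterPtr calls counter_destroy automatically when it goes out of scope, so the cleanup cannot be forgotten on an early return or an exception. The same pattern works for real C handles, such as a std::FILE* paired with std::fclose.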

Data structures

You need pointers to make data structures like linked lists and binary trees.
To make such a structure it is necessary to be able to associate objects indirectly, which means you can’t use value objects, and also change associations as the data structure is modified, which references don’t allow.
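
A minimal singly linked list, as a sketch of both requirements: nodes are associated indirectly through a pointer, and the links are re-pointed as the list changes. ListNode and IntList are hypothetical names, and std::unique_ptr is used for the owning links as one possible choice.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical singly linked list of ints.
struct ListNode
{
    int value;
    std::unique_ptr<ListNode> next;   // owning pointer; null marks the end of the list
};

class IntList
{
public:
    // Inserts at the head by re-pointing the links.
    void push_front(int value)
    {
        std::unique_ptr<ListNode> node(new ListNode{value, nullptr});
        node->next = std::move(head_);
        head_ = std::move(node);
    }

    // Removes the head node, if any, by re-pointing head_ at its successor.
    void pop_front()
    {
        if (head_) {
            head_ = std::move(head_->next);
        }
    }

    // Walks the list with a non-owning raw pointer.
    std::vector<int> to_vector() const
    {
        std::vector<int> out;
        for (const ListNode* p = head_.get(); p != nullptr; p = p->next.get()) {
            out.push_back(p->value);
        }
        return out;
    }

private:
    std::unique_ptr<ListNode> head_;
};
```

Pushing 3, 2, and 1 onto the front gives the list 1, 2, 3; pop_front then leaves 2, 3.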

Reference semantics

Mostly, you want to pass an object to a function without copying, and pointers and references allow you to do this.
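
The difference can be made visible by counting copies. In this sketch, Document and its copies counter are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type that counts how many times it has been copied.
struct Document
{
    static int copies;
    std::string text;
    explicit Document(std::string t) : text(std::move(t)) {}
    Document(const Document& other) : text(other.text) { ++copies; }
};
int Document::copies = 0;

// Pass by value: the argument is copied.
std::size_t length_by_value(Document doc) { return doc.text.size(); }

// Pass by reference: no copy is made.
std::size_t length_by_reference(const Document& doc) { return doc.text.size(); }

// Pass by pointer: no copy is made, but the pointer may be null.
std::size_t length_by_pointer(const Document* doc)
{
    assert(doc != nullptr);
    return doc->text.size();
}

// A non-const pointer (or reference) also lets the callee modify the caller's object.
void append_spam(Document* doc) { doc->text += ", and spam"; }
```

Calling length_by_value increments Document::copies; the reference and pointer versions do not.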

Note that the first and last advantages are shared by references, while the other four are specific to pointers.

Mutable default arguments

Python’s basic built-in types, such as numbers, strings, and tuples, are immutable. Other objects, though, including lists and dictionaries, are mutable.
This can cause a problem when they’re used as the default argument to a function.

Consider the following code:

def append(item, dest=[]):
    dest.append(item)
    return dest

def main():
    order1 = append('spam')
    print(order1)
    order2 = append('eggs')
    print(order2)

if __name__ == '__main__':
    main()

You would think that this code would print:

['spam']
['eggs']

In fact it prints:

['spam']
['spam', 'eggs']

This is because the default argument, rather than being evaluated every time the function is called, is evaluated just once, when the def statement that defines the function is executed.

The way to get the expected behaviour is to rewrite the function as follows:

def append(item, dest=None):
    if dest is None:
        dest = []
    dest.append(item)
    return dest

Difference between #include <filename> and #include "filename"

Strangely, this isn’t really specified: the C and C++ standards leave the search rules to the implementation.

Most compilers, however, will only search for files in the first format (angle brackets) in the standard include paths.
They will search the source file’s directory first for files in the second format (quotes), and then search the standard include paths.

References:
Search Path – The C Preprocessor in the GCC online documentation
#include Directive (C/C++) in the Visual Studio documentation