Thinking in C++ - Practical Programming

Thinking in C++ Vol 2 - Practical Programming

What's in a string?


In C, a string is simply an array of characters that always includes a binary zero (often called the null terminator) as its final array element. There are significant differences between C++ strings and their C progenitors. First, and most important, C++ strings hide the physical representation of the sequence of characters they contain. You don't need to be concerned about array dimensions or null terminators. A string also contains certain housekeeping information about the size and storage location of its data. Specifically, a C++ string object knows its starting location in memory, its content, its length in characters, and the length in characters to which it can grow before the string object must resize its internal data buffer. C++ strings thus greatly reduce the likelihood of making three of the most common and destructive C programming errors: overwriting array bounds, trying to access arrays through uninitialized or incorrectly valued pointers, and leaving pointers dangling after an array ceases to occupy the storage that was once allocated to it.
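The housekeeping described above can be observed through the standard members size(), capacity(), and at(). The following is a minimal sketch, not part of the original listings; the helper names checkedRead and appendGrowsBuffer are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads s[i] through the bounds-checked at().
// Returns false instead of overrunning the buffer when i is invalid.
bool checkedRead(const std::string& s, std::size_t i, char& out) {
  try {
    out = s.at(i); // at() throws std::out_of_range if i >= s.size()
    return true;
  } catch(const std::out_of_range&) {
    return false;
  }
}

// Hypothetical helper: appends n copies of c and reports whether
// the string had to change its capacity to make room.
bool appendGrowsBuffer(std::string& s, std::size_t n, char c) {
  std::size_t before = s.capacity();
  s.append(n, c); // The string resizes its own buffer if needed
  assert(s.size() <= s.capacity()); // Length never exceeds capacity
  return s.capacity() != before;
}
```

Given string s("12345"), checkedRead(s, 5, c) returns false rather than reading past the end, and appending more characters than s.capacity() forces the string to reallocate without any action on the caller's part. The exact capacity values are implementation-dependent, so the sketch only compares them and does not print them.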

The exact implementation of memory layout for the string class is not defined by the C++ Standard. This architecture is intended to be flexible enough to allow differing implementations by compiler vendors, yet guarantee predictable behavior for users. In particular, the exact conditions under which storage is allocated to hold data for a string object are not defined. String allocation rules were formulated to allow but not require a reference-counted implementation, but whether or not the implementation uses reference counting, the semantics must be the same. To put this a bit differently, in C, every char array occupies a unique physical region of memory. In C++, individual string objects may or may not occupy unique physical regions of memory, but if reference counting avoids storing duplicate copies of data, the individual objects must look and act as though they exclusively own unique regions of storage. For example:

//: C03:StringStorage.h
#ifndef STRINGSTORAGE_H
#define STRINGSTORAGE_H
#include <iostream>
#include <string>
#include "../TestSuite/Test.h"
using std::cout;
using std::endl;
using std::string;

class StringStorageTest : public TestSuite::Test {
public:
  void run() {
    string s1("12345");
    // This may copy the first to the second or
    // use reference counting to simulate a copy:
    string s2 = s1;
    test_(s1 == s2);
    // Either way, this statement must ONLY modify s1:
    s1[0] = '6';
    cout << "s1 = " << s1 << endl; // 62345
    cout << "s2 = " << s2 << endl; // 12345
    test_(s1 != s2);
  }
};
#endif // STRINGSTORAGE_H ///:~

//: C03:StringStorage.cpp
//{L} ../TestSuite/Test
#include "StringStorage.h"

int main() {
  StringStorageTest t;
  t.run();
  return t.report();
} ///:~

We say that an implementation that only makes unique copies when a string is modified uses a copy-on-write strategy. This approach saves time and space when strings are used only as value parameters or in other read-only situations.
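To make the copy-on-write idea concrete, here is a minimal single-threaded sketch. The class CowString and its members are hypothetical, and sharing the buffer through std::shared_ptr is an assumption chosen for brevity; this is not how any particular library implements std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical class, for illustration only (single-threaded).
class CowString {
  std::shared_ptr<std::string> data_; // Buffer, possibly shared
  // Make a private copy if any other object shares the buffer:
  void detach() {
    if(data_.use_count() > 1)
      data_ = std::make_shared<std::string>(*data_);
    assert(data_.use_count() == 1); // Now the sole owner
  }
public:
  explicit CowString(const char* s)
    : data_(std::make_shared<std::string>(s)) {}
  // The compiler-generated copy operations copy only the
  // pointer, so a "copy" initially shares the buffer.
  char get(std::size_t i) const { return data_->at(i); }
  void set(std::size_t i, char c) {
    detach(); // Copy on write
    data_->at(i) = c;
  }
  const std::string& str() const { return *data_; }
  bool sharesBufferWith(const CowString& other) const {
    return data_ == other.data_;
  }
};
```

After CowString s2 = s1; the two objects share one buffer, and reads through get() leave it shared. The first call to s1.set() gives s1 a private copy, so s2 still holds its original content, which is the behavior the StringStorage example requires. Checking the reference count and then copying is not safe when several threads are involved, so the sketch makes no attempt at thread safety.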

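Whether a library implementation uses reference counting or not should be transparent to users of the string class. Unfortunately, this is not always the case. In multithreaded programs, it is practically impossible to use a reference-counting implementation safely.[32] Note that since C++11 the Standard's requirements on std::string (in particular, the rules about when references and iterators may be invalidated) effectively rule out copy-on-write, so reference-counted strings are found mainly in older library implementations.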
