Thinking in C++ Vol 2 - Practical Programming
In C, a string is simply an array of characters that always
includes a binary zero (often called the null terminator) as its final
array element. There are significant differences between C++ strings and
their C progenitors. First, and most important, C++ strings hide the
physical representation of the sequence of characters they contain. You don't need
to be concerned about array dimensions or null terminators. A string
also contains certain housekeeping information about the size and storage
location of its data. Specifically, a C++ string object knows its
starting location in memory, its content, its length in characters, and the
length in characters to which it can grow before the string object must
resize its internal data buffer. C++ strings thus greatly reduce the likelihood
of making three of the most common and destructive C programming errors:
overwriting array bounds, trying to access arrays through uninitialized or
incorrectly valued pointers, and leaving pointers dangling after an array
ceases to occupy the storage that was once allocated to it.
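The bookkeeping described above can be observed through the size() and capacity() member functions. The following is a minimal sketch rather than anything from the original text: the helper countReallocations is a hypothetical name introduced only for illustration, and the actual capacity values depend on the library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the book's code): appends `count`
// copies of `ch` to `s` and returns how many times the capacity changed,
// that is, how many times the string resized its internal buffer.
std::size_t countReallocations(std::string& s, std::size_t count, char ch) {
  std::size_t reallocations = 0;
  std::size_t lastCapacity = s.capacity();
  for (std::size_t i = 0; i < count; ++i) {
    s.push_back(ch);
    // The length in use never exceeds the room currently reserved.
    assert(s.size() <= s.capacity());
    if (s.capacity() != lastCapacity) {
      ++reallocations;
      lastCapacity = s.capacity();
    }
  }
  return reallocations;
}
```

Calling countReallocations on a short string with a large count typically reports far fewer buffer resizes than characters appended, because the string reserves extra room each time it grows. The exact number is implementation-specific.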
The exact implementation of memory layout for the string
class is not defined by the C++ Standard. This architecture is intended to be
flexible enough to allow differing implementations by compiler vendors, yet
guarantee predictable behavior for users. In particular, the exact conditions
under which storage is allocated to hold data for a string object are not
defined. String allocation rules were formulated to allow but not require a
reference-counted implementation, but whether or not the implementation uses reference counting, the semantics must be the same. To put this a bit differently,
in C, every char array occupies a unique physical region of memory. In
C++, individual string objects may or may not occupy unique physical
regions of memory, but if reference counting avoids storing duplicate copies of
data, the individual objects must look and act as though they exclusively own unique
regions of storage. For example:
//: C03:StringStorage.h
#ifndef STRINGSTORAGE_H
#define STRINGSTORAGE_H
#include <iostream>
#include <string>
#include "../TestSuite/Test.h"
using std::cout;
using std::endl;
using std::string;
class StringStorageTest : public TestSuite::Test {
public:
  void run() {
    string s1("12345");
    // This may copy the first to the second or
    // use reference counting to simulate a copy:
    string s2 = s1;
    test_(s1 == s2);
    // Either way, this statement must ONLY modify s1:
    s1[0] = '6';
    cout << "s1 = " << s1 << endl; // 62345
    cout << "s2 = " << s2 << endl; // 12345
    test_(s1 != s2);
  }
};
#endif // STRINGSTORAGE_H ///:~
//: C03:StringStorage.cpp
//{L} ../TestSuite/Test
#include "StringStorage.h"
int main() {
  StringStorageTest t;
  t.run();
  return t.report();
} ///:~
We say that an implementation that only makes unique copies
when a string is modified uses a copy-on-write strategy. This approach
saves time and space when strings are used only as value parameters or in other
read-only situations.
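To make the copy-on-write idea concrete, here is a minimal single-threaded sketch. It is an assumption-laden illustration, not how any particular standard library implements string: the class CowString and its members are hypothetical names, and std::shared_ptr stands in for a hand-written reference count.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration only. Single-threaded use is assumed.
class CowString {
  std::shared_ptr<std::string> data_;  // character buffer, possibly shared

  // Give this object a private buffer if it currently shares one.
  void detach() {
    if (data_.use_count() > 1)
      data_ = std::make_shared<std::string>(*data_);
  }

public:
  explicit CowString(std::string text)
    : data_(std::make_shared<std::string>(std::move(text))) {}

  // Copying only shares the buffer and bumps the reference count.
  CowString(const CowString&) = default;
  CowString& operator=(const CowString&) = default;

  // Reading never copies.
  char get(std::size_t i) const { return data_->at(i); }
  const std::string& str() const { return *data_; }

  // Writing copies first if the buffer is shared (the "copy on write").
  void set(std::size_t i, char c) {
    detach();
    assert(data_.use_count() == 1);  // now exclusively owned
    data_->at(i) = c;
  }

  // True if both objects currently refer to the same buffer.
  bool sharesBufferWith(const CowString& other) const {
    return data_ == other.data_;
  }
};
```

With this sketch, copying a CowString leaves both objects sharing one buffer until set() is called on either of them. At that point the writer detaches and the other object keeps its original contents, which mirrors the behavior StringStorageTest checks for string.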
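Whether a library implementation uses reference counting or
not should be transparent to users of the string class. Unfortunately,
this is not always the case. In multithreaded programs, it is practically
impossible to use a reference-counting implementation safely, because string
objects that appear independent may share one buffer and one reference count
behind the scenes. Partly for this reason, C++11 tightened the requirements on
std::string so that copy-on-write implementations are effectively ruled out.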