Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Programming
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Databases
Mail Systems
openSolaris
Eclipse Documentation
Techotopia.com
Virtuatopia.com
Answertopia.com

How To Guides
Virtualization
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Windows
Problem Solutions
Privacy Policy

  




 

 

Thinking in C++ Vol 2 - Practical Programming
Prev Home Next

vector

The vector class template is intentionally made to look like a souped-up array, since it has array-style indexing, but also can expand dynamically. The vector class template is so fundamentally useful that it was introduced in a primitive way early in this book and was used regularly in previous examples. This section will give a more in-depth look at vector.

To achieve maximally-efficient indexing and iteration, vector maintains its storage as a single contiguous array of objects. This is a critical point to observe in understanding the behavior of vector. It means that indexing and iteration are lightning-fast, being basically the same as indexing and iterating over an array of objects. But it also means that inserting an object anywhere but at the end (that is, appending) is not really an acceptable operation for a vector. In addition, when a vector runs out of preallocated storage, to maintain its contiguous array it must allocate a whole new (larger) chunk of storage elsewhere and copy the objects to the new storage. This approach produces a number of unpleasant side-effects.

Cost of overflowing allocated storage

A vector starts by grabbing a block of storage, as if it s taking a guess at how many objects you plan to put in it. As long as you don t try to put in more objects than can be held in the initial block of storage, everything proceeds rapidly. (If you do know how many objects to expect, you can preallocate storage using reserve( ).) But eventually you will put in one too many objects, and the vector responds by:

1.      Allocating a new, bigger piece of storage.

2.      Copying all the objects from the old storage to the new (using the copy-constructor).

3.      Destroying all the old objects (the destructor is called for each one).

4.      Releasing the old memory.

For complex objects, this copy-construction and destruction can end up being expensive if you often overfill your vector, which is why vectors (and STL containers in general) are designed for value types (i.e. types that are cheap to copy). This includes pointers.

To see what happens when you re filling a vector, here is the Noisy class mentioned earlier. It prints information about its creations, destructions, assignments, and copy-constructions:

//: C07:Noisy.h
// A class to track various object activities.
#ifndef NOISY_H
#define NOISY_H
#include <iostream>
using std::endl;
using std::cout;
using std::ostream;
 
class Noisy {
static long create, assign, copycons, destroy;
long id;
public:
Noisy() : id(create++) {
cout << "d[" << id << "]" << endl;
}
Noisy(const Noisy& rv) : id(rv.id) {
cout << "c[" << id << "]" << endl;
++copycons;
}
Noisy& operator=(const Noisy& rv) {
cout << "(" << id << ")=[" << rv.id << "]" << endl;
id = rv.id;
++assign;
return *this;
}
friend bool operator<(const Noisy& lv, const Noisy& rv) {
return lv.id < rv.id;
}
friend bool operator==(const Noisy& lv,const Noisy& rv) {
return lv.id == rv.id;
}
~Noisy() {
cout << "~[" << id << "]" << endl;
++destroy;
}
friend ostream& operator<<(ostream& os, const Noisy& n) {
return os << n.id;
}
friend class NoisyReport;
};
 
struct NoisyGen {
Noisy operator()() { return Noisy(); }
};
 
// A Singleton. Will automatically report the
// statistics as the program terminates:
class NoisyReport {
static NoisyReport nr;
NoisyReport() {} // Private constructor
NoisyReport & operator=(NoisyReport &); // Disallowed
NoisyReport(const NoisyReport&); // Disallowed
public:
~NoisyReport() {
cout << "\n-------------------\n"
<< "Noisy creations: " << Noisy::create
<< "\nCopy-Constructions: " << Noisy::copycons
<< "\nAssignments: " << Noisy::assign
<< "\nDestructions: " << Noisy::destroy << endl;
}
};
#endif // NOISY_H ///:~
 
//: C07:Noisy.cpp {O}
#include "Noisy.h"
long Noisy::create = 0, Noisy::assign = 0,
Noisy::copycons = 0, Noisy::destroy = 0;
NoisyReport NoisyReport::nr;
///:~
 

Each Noisy object has its own identifier, and static variables keep track of all the creations, assignments (using operator=), copy-constructions, and destructions. The id is initialized using the create counter inside the default constructor; the copy-constructor and assignment operator take their id values from the rvalue. With operator= the lvalue is already an initialized object, so the old value of id is printed before it is overwritten with the id from the rvalue.

To support certain operations such as sorting and searching (which are used implicitly by some of the containers), Noisy must have an operator< and operator==. These simply compare the id values. The ostream inserter follows the usual form and simply prints the id.

Objects of type NoisyGen are function objects (since there is an operator( )) that produce Noisy objects during testing.

NoisyReport is a Singleton object[107] because we only want one report printed at program termination. It has a private constructor so no additional NoisyReport objects can be created, it disallows assignment and copy-construction, and it has a single static instance of NoisyReport called nr. The only executable statements are in the destructor, which is called as the program exits and static destructors are called. This destructor prints the statistics captured by the static variables in Noisy.

Using Noisy.h, the following program shows a vector overflowing its allocated storage:

//: C07:VectorOverflow.cpp {-bor}
// Shows the copy-construction and destruction
// that occurs when a vector must reallocate.
//{L} Noisy
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include "Noisy.h"
using namespace std;
 
int main(int argc, char* argv[]) {
int size = 1000;
if(argc >= 2) size = atoi(argv[1]);
vector<Noisy> vn;
Noisy n;
for(int i = 0; i < size; i++)
vn.push_back(n);
cout << "\n cleaning up << endl;
} ///:~
 

You can use the default value of 1000, or you can use your own value by putting it on the command line.

When you run this program, you ll see a single default constructor call (for n), then a lot of copy-constructor calls, then some destructor calls, then some more copy-constructor calls, and so on. When the vector runs out of space in the linear array of bytes it has allocated, it must (to maintain all the objects in a linear array, which is an essential part of its job) get a bigger piece of storage and move everything over, first copying and then destroying the old objects. You can imagine that if you store a lot of large and complex objects, this process could rapidly become prohibitive.

There are two solutions to this problem. The nicest one requires that you know beforehand how many objects you re going to make. In that case, you can use reserve( ) to tell the vector how much storage to preallocate, thus eliminating all the copies and destructions and making everything very fast (especially random access to the objects with operator[ ]). Note that the use of reserve( ) is different from using the vector constructor with an integral first argument; the latter initializes a prescribed number of elements using the element type s default constructor.

Generally you won t know how many objects you ll need. If vector reallocations are slowing things down, you can change sequence containers. You could use a list, but as you ll see, the deque allows speedy insertions at either end of the sequence and never needs to copy or destroy objects as it expands its storage. The deque also allows random access with operator[ ], but it s not quite as fast as vector s operator[ ]. So if you re creating all your objects in one part of the program and randomly accessing them in another, you may find yourself filling a deque and then creating a vector from the deque and using the vector for rapid indexing. You don t want to program this way habitually just be aware of these issues (that is, avoid premature optimization).

There is a darker side to vector s reallocation of memory, however. Because vector keeps its objects in a nice, neat array, the iterators used by vector can be simple pointers. This is good of all the sequence containers, these pointers allow the fastest selection and manipulation. Whether they are simple pointers, or whether they are iterator objects that hold an internal pointer into their container, consider what happens when you add the one additional object that causes the vector to reallocate storage and move it elsewhere. The iterator s pointer is now pointing off into nowhere:

//: C07:VectorCoreDump.cpp
// Invalidating an iterator.
#include <iterator>
#include <iostream>
#include <vector>
using namespace std;
 
int main() {
vector<int> vi(10, 0);
ostream_iterator<int> out(cout, " ");
vector<int>::iterator i = vi.begin();
*i = 47;
copy(vi.begin(), vi.end(), out);
cout << endl;
// Force it to move memory (could also just add
// enough objects):
vi.resize(vi.capacity() + 1);
// Now i points to wrong memory:
*i = 48; // Access violation
copy(vi.begin(), vi.end(), out); // No change to vi[0]
} ///:~
 

This illustrates the concept of iterator invalidation. Certain operations cause internal changes to a container s underlying data, so any iterators in effect before such changes may no longer be valid afterward. If your program is breaking mysteriously, look for places where you hold onto an iterator while adding more objects to a vector. You ll need to get a new iterator after adding elements or use operator[ ] instead for element selections. If you combine this observation with the awareness of the potential expense of adding new objects to a vector, you may conclude that the safest way to use a vector is to fill it up all at once (ideally, knowing first how many objects you ll need) and then just use it (without adding more objects) elsewhere in the program. This is the way vector has been used in the book up to this point. The Standard C++ library documents the container operations that invalidate iterators.

You may observe that using vector as the basic container in the earlier chapters of this book might not be the best choice in all cases. This is a fundamental issue in containers and in data structures in general the best choice varies according to the way the container is used. The reason vector has been the best choice up until now is that it looks a lot like an array and was thus familiar and easy for you to adopt. But from now on it s also worth thinking about other issues when choosing containers.

Inserting and erasing elements

The vector is most efficient if:

1.      You reserve( ) the correct amount of storage at the beginning so the vector never has to reallocate.

2.      You only add and remove elements from the back end.

It is possible to insert and erase elements from the middle of a vector using an iterator, but the following program demonstrates what a bad idea this is:

//: C07:VectorInsertAndErase.cpp {-bor}
// Erasing an element from a vector.
//{L} Noisy
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
#include "Noisy.h"
using namespace std;
 
int main() {
vector<Noisy> v;
v.reserve(11);
cout << "11 spaces have been reserved" << endl;
generate_n(back_inserter(v), 10, NoisyGen());
ostream_iterator<Noisy> out(cout, " ");
cout << endl;
copy(v.begin(), v.end(), out);
cout << "Inserting an element:" << endl;
vector<Noisy>::iterator it =
v.begin() + v.size() / 2; // Middle
v.insert(it, Noisy());
cout << endl;
copy(v.begin(), v.end(), out);
cout << "\nErasing an element:" << endl;
// Cannot use the previous value of it:
it = v.begin() + v.size() / 2;
v.erase(it);
cout << endl;
copy(v.begin(), v.end(), out);
cout << endl;
} ///:~
 

When you run the program, you ll see that the call to reserve( ) really does only allocate storage no constructors are called. The generate_n( ) call is busy: each call to NoisyGen::operator( ) results in a construction, a copy-construction (into the vector), and a destruction of the temporary. But when an object is inserted into the vector in the middle, it must shift everything down to maintain the linear array, and, since there is enough space, it does this with the assignment operator. (If the argument of reserve( ) is 10 instead of 11, it must reallocate storage.) When an object is erased from the vector, the assignment operator is once again used to move everything up to cover the place that is being erased. (Notice that this requires that the assignment operator properly clean up the lvalue.) Last, the object on the end of the array is deleted.

Thinking in C++ Vol 2 - Practical Programming
Prev Home Next

 
 
   Reproduced courtesy of Bruce Eckel, MindView, Inc. Design by Interspire