Thinking in C++ - Practical Programming

Thinking in C++ Vol 2 - Practical Programming
Prev	Home	Next

Associative containers

C# Essentials
eBook

$9.99

eBookFrenzy.com

The set, map, multiset, and multimap are called associative containers because they associate keys with values. Well, at least maps and multimaps associate keys with values, but you can look at a set as a map that has no values, only keys (and they can in fact be implemented this way), and the same for the relationship between multiset and multimap. So, because of the structural similarity, sets and multisets are lumped in with associative containers.

The most important basic operations with associative containers are putting things in and, in the case of a set, seeing if something is in the set. In the case of a map, you want to first see if a key is in the map, and if it exists, you want the associated value for that key. There are many variations on this theme, but that s the fundamental concept. The following example shows these basics:

//: C07:AssociativeBasics.cpp {-bor}

// Basic operations with sets and maps.

//{L} Noisy

#include <cstddef>

#include <iostream>

#include <iterator>

#include <map>

#include <set>

#include "Noisy.h"

using namespace std;

int main() {

Noisy na[7];

// Add elements via constructor:

set<Noisy> ns(na, na + sizeof na/sizeof(Noisy));

Noisy n;

ns.insert(n); // Ordinary insertion

cout << endl;

// Check for set membership:

cout << "ns.count(n)= " << ns.count(n) << endl;

if(ns.find(n) != ns.end())

cout << "n(" << n << ") found in ns" << endl;

// Print elements:

copy(ns.begin(), ns.end(),

ostream_iterator<Noisy>(cout, " "));

cout << endl;

cout << "\n----------- << endl;

map<int, Noisy> nm;

for(int i = 0; i < 10; i++)

nm[i]; // Automatically makes pairs

cout << "\n----------- << endl;

for(size_t j = 0; j < nm.size(); j++)

cout << "nm[" << j <<"] = " << nm[j] << endl;

cout << "\n----------- << endl;

nm[10] = n;

cout << "\n----------- << endl;

nm.insert(make_pair(47, n));

cout << "\n----------- << endl;

cout << "\n nm.count(10)= " << nm.count(10) << endl;

cout << "nm.count(11)= " << nm.count(11) << endl;

map<int, Noisy>::iterator it = nm.find(6);

if(it != nm.end())

cout << "value:" << (*it).second

<< " found in nm at location 6" << endl;

for(it = nm.begin(); it != nm.end(); it++)

cout << (*it).first << ":" << (*it).second << ", ";

cout << "\n----------- << endl;

} ///:~

The set<Noisy> object ns is created using two iterators into an array of Noisy objects, but there is also a default constructor and a copy-constructor, and you can pass in an object that provides an alternate scheme for doing comparisons. Both sets and maps have an insert( ) member function to put things in, and you can check to see if an object is already in an associative container in two ways. The count( ) member function, when given a key, will tell you how many times that key occurs. (This can only be zero or one in a set or map, but it can be more than one with a multiset or multimap.) The find( ) member function will produce an iterator indicating the first occurrence (with set and map, the only occurrence) of the key that you give it or will produce the past-the-end iterator if it can t find the key. The count( ) and find( ) member functions exist for all the associative containers, which makes sense. The associative containers also have member functions lower_bound( ), upper_bound( ), and equal_range( ), which only make sense for multiset and multimap, as you will see. (But don t try to figure out how they would be useful for set and map, since they are designed for dealing with a range of duplicate keys, which those containers don t allow.)

Designing an operator[ ] always presents a bit of a dilemma. Because it s intended to be treated as an array-indexing operation, people don t tend to think about performing a test before they use it. But what happens if you decide to index out of the bounds of the array? One option is to throw an exception, but with a map, indexing out of the array could mean that you want to create a new entry at that location, and that s the way the STL map treats it. The first for loop after the creation of the map<int, Noisy> nm just looks up objects using the operator[ ], but this is actually creating new Noisy objects! The map creates a new key-value pair (using the default constructor for the value) if you look up a value with operator[ ] and it isn t there. This means that if you really just want to look something up and not create a new entry, you must use the member functions count( ) (to see if it s there) or find( ) (to get an iterator to it).

A number of problems are associated with the for loop that prints the values of the container using operator[ ]. First, it requires integral keys (which we happen to have here). Next and worse, if all the keys are not sequential, you ll end up counting from zero to the size of the container, and if some spots don t have key-value pairs, you ll automatically create them and miss some of the higher values of the keys. Finally, if you look at the output from the for loop, you ll see that things are very busy, and it s quite puzzling at first why there are so many constructions and destructions for what appears to be a simple lookup. The answer only becomes clear when you look at the code in the map template for operator[ ], which will be something like this:

mapped_type& operator[] (const key_type& k) {

value_type tmp(k,T());

return (*((insert(tmp)).first)).second;

}

The map::insert( ) function takes a key-value pair and does nothing if there is already an entry in the map with the given key otherwise it inserts an entry for the key. In either case, it returns a new key-value pair holding an iterator to the inserted pair as its first element and holding true as the second element if an insertion took place. The members first and second give the key and value, respectively, because map::value_type is really just a typedef for a std::pair:

typedef pair<const Key, T> value_type;

You ve seen the std::pair template before. It s a simple holder for two values of independent types, as you can see by its definition:

template<class T1, class T2> struct pair {

typedef T1 first_type;

typedef T2 second_type;

T1 first;

T2 second;

pair();

pair(const T1& x, const T2& y) : first(x), second(y) {}

// Templatized copy-constructor:

template<class U, class V> pair(const pair<U, V> &p);

};

The pair template class is very useful, especially when you want to return two objects from a function (since a return statement only takes one object). There s even a shorthand for creating a pair called make_pair( ), which is used in AssociativeBasics.cpp.

So to retrace the steps, map::value_type is a pair of the key and the value of the map actually, it s a single entry for the map. But notice that pair packages its objects by value, which means that copy-constructions are necessary to get the objects into the pair. Thus, the creation of tmp in map::operator[ ] will involve at least a copy-constructor call and destructor call for each object in the pair. Here, we re getting off easy because the key is an int. But if you want to really see what kind of activity can result from map::operator[ ], try running this:

//: C07:NoisyMap.cpp

// Mapping Noisy to Noisy.

//{L} Noisy

#include <map>

#include "Noisy.h"

using namespace std;

int main() {

map<Noisy, Noisy> mnn;

Noisy n1, n2;

cout << "\n-------- << endl;

mnn[n1] = n2;

cout << "\n-------- << endl;

cout << mnn[n1] << endl;

cout << "\n-------- << endl;

} ///:~

You ll see that both the insertion and lookup generate a lot of extra objects, and that s because of the creation of the tmp object. If you look back up at map::operator[ ], you ll see that the second line calls insert( ), passing it tmp that is, operator[ ] does an insertion every time. The return value of insert( ) is a different kind of pair, where first is an iterator pointing to the key-value pair that was just inserted, and second is a bool indicating whether the insertion took place. You can see that operator[ ] grabs first (the iterator), dereferences it to produce the pair, and then returns the second, which is the value at that location.

So on the upside, map has this fancy make a new entry if one isn t there behavior, but the downside is that you always get a lot of extra object creations and destructions when you use map::operator[ ]. Fortunately, AssociativeBasics.cpp also demonstrates how to reduce the overhead of insertions and deletions, by avoiding operator[ ] if you don t need it. The insert( ) member function is slightly more efficient than operator[ ]. With a set, you hold only one object, but with a map, you hold key-value pairs; so insert( ) requires a pair as its argument. Here s where make_pair( ) comes in handy, as you can see.

For looking objects up in a map, you can use count( ) to see whether a key is in the map, or you can use find( ) to produce an iterator pointing directly at the key-value pair. Again, since the map contains pairs, that s what the iterator produces when you dereference it, so you have to select first and second. When you run AssociativeBasics.cpp, you ll notice that the iterator approach involves no extra object creations or destructions. It s not as easy to write or read, though.

Thinking in C++ Vol 2 - Practical Programming
Prev	Home	Next