Thinking in C++ Vol 2 - Practical Programming |
Prev |
Home |
Next |
You ve seen the set, which allows only one object of
each value to be inserted. The multiset is odd by comparison since it allows more than one object of each value to be inserted. This seems to go against the whole
idea of setness, where you can ask, Is it in this set? If there can be
more than one it, what does that question mean?
With some thought, you can see that it makes little sense to
have more than one object of the same value in a set if those duplicate objects
are exactly the same (with the possible exception of counting
occurrences of objects, but as seen earlier in this chapter that can be handled
in an alternative, more elegant fashion). Thus, each duplicate object will have
something that makes it different from the other duplicates most likely
different state information that is not used in the calculation of the key
during the comparison. That is, to the comparison operation, the objects look
the same, but they contain some differing internal state.
Like any STL container that must order its elements, the multiset
template uses the less function object by default to determine element
ordering. This uses the contained class s operator<, but you can always
substitute your own comparison function.
Consider a simple class that contains one element that is
used in the comparison and another that is not:
//: C07:MultiSet1.cpp
// Demonstration of multiset behavior.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <set>
using namespace std;
class X {
char c; // Used in comparison
int i; // Not used in comparison
// Don't need default constructor and operator=
X();
X& operator=(const X&);
// Usually need a copy-constructor (but the
// synthesized version works here)
public:
X(char cc, int ii) : c(cc), i(ii) {}
// Notice no operator== is
required
friend bool operator<(const X& x, const X&
y) {
return x.c < y.c;
}
friend ostream& operator<<(ostream& os,
X x) {
return os << x.c << ":"
<< x.i;
}
};
class Xgen {
static int i;
// Number of characters to select from:
enum { SPAN = 6 };
public:
X operator()() {
char c = 'A' + rand() % SPAN;
return X(c, i++);
}
};
int Xgen::i = 0;
typedef multiset<X> Xmset;
typedef Xmset::const_iterator Xmit;
int main() {
Xmset mset;
// Fill it with X's:
srand(time(0)); // Randomize
generate_n(inserter(mset, mset.begin()), 25, Xgen());
// Initialize a regular set from mset:
set<X> unique(mset.begin(), mset.end());
copy(unique.begin(), unique.end(),
ostream_iterator<X>(cout, " "));
cout << "\n---- << endl;
// Iterate over the unique values:
for(set<X>::iterator i = unique.begin();
i != unique.end(); i++) {
pair<Xmit, Xmit> p = mset.equal_range(*i);
copy(p.first,p.second,
ostream_iterator<X>(cout, " "));
cout << endl;
}
} ///:~
In X, all the comparisons are made with the char c.
The comparison is performed with operator<, which is all that is
necessary for the multiset, since in this example the default less
comparison object is used. The class Xgen randomly generates X
objects, but the comparison value is restricted to the span from A to
E . In main( ), a multiset<X> is created and
filled with 25 X objects using Xgen, guaranteeing that there will
be duplicate keys. So that we know what the unique values are, a regular set<X>
is created from the multiset (using the iterator, iterator
constructor). These values are displayed, and then each one produces the equal_range( )
in the multiset (equal_range( ) has the same meaning here as
it does with multimap: all the elements with matching keys). Each set of
matching keys is then printed.
As a second example, a (possibly) more elegant version of WordCount.cpp
can be created using multiset:
//: C07:MultiSetWordCount.cpp
// Count occurrences of words using a multiset.
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;
int main(int argc, char* argv[]) {
const char* fname =
"MultiSetWordCount.cpp";
if(argc > 1) fname = argv[1];
ifstream in(fname);
assure(in, fname);
multiset<string> wordmset;
string word;
while(in >> word)
wordmset.insert(word);
typedef multiset<string>::iterator MSit;
MSit it = wordmset.begin();
while(it != wordmset.end()) {
pair<MSit, MSit> p = wordmset.equal_range(*it);
int count = distance(p.first, p.second);
cout << *it << ": " <<
count << endl;
it = p.second; // Move to the next word
}
} ///:~
The setup in main( ) is identical to WordCount.cpp,
but then each word is simply inserted into the multiset<string>.
An iterator is created and initialized to the beginning of the multiset;
dereferencing this iterator produces the current word. The equal_range( )
member function (not generic algorithm) produces the starting and ending
iterators of the word that s currently selected, and the algorithm distance( )
(defined in <iterator>) counts the number of elements in that
range. The iterator it is then moved forward to the end of the range,
which puts it at the next word. If you re unfamiliar with the multiset,
this code can seem more complex. The density of it and the lack of need for
supporting classes such as Count has a lot of appeal.
In the end, is this really a set, or should it be called
something else? An alternative is the generic bag that is defined in some
container libraries, since a bag holds anything, without
discrimination including duplicate objects. This is close, but it doesn t quite
fit since a bag has no specification about how elements should be ordered. A multiset
(which requires that all duplicate elements be adjacent to each other) is even
more restrictive than the concept of a set. A set implementation might use a
hashing function to order its elements, which would not put them in sorted
order. Besides, if you want to store a bunch of objects without any special
criteria, you will probably just use a vector, deque, or list.
Thinking in C++ Vol 2 - Practical Programming |
Prev |
Home |
Next |