Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Programming
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Databases
Mail Systems
openSolaris
Eclipse Documentation
Techotopia.com
Virtuatopia.com
Answertopia.com

How To Guides
Virtualization
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Windows
Problem Solutions
Privacy Policy

  




 

 

Thinking in C++ Vol 2 - Practical Programming
Prev Home Next

Multisets

You ve seen the set, which allows only one object of each value to be inserted. The multiset is odd by comparison since it allows more than one object of each value to be inserted. This seems to go against the whole idea of setness, where you can ask, Is it in this set? If there can be more than one it, what does that question mean?

With some thought, you can see that it makes little sense to have more than one object of the same value in a set if those duplicate objects are exactly the same (with the possible exception of counting occurrences of objects, but as seen earlier in this chapter that can be handled in an alternative, more elegant fashion). Thus, each duplicate object will have something that makes it different from the other duplicates most likely different state information that is not used in the calculation of the key during the comparison. That is, to the comparison operation, the objects look the same, but they contain some differing internal state.

Like any STL container that must order its elements, the multiset template uses the less function object by default to determine element ordering. This uses the contained class s operator<, but you can always substitute your own comparison function.

Consider a simple class that contains one element that is used in the comparison and another that is not:

//: C07:MultiSet1.cpp
// Demonstration of multiset behavior.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <set>
using namespace std;
 
class X {
char c; // Used in comparison
int i; // Not used in comparison
// Don't need default constructor and operator=
X();
X& operator=(const X&);
// Usually need a copy-constructor (but the
// synthesized version works here)
public:
X(char cc, int ii) : c(cc), i(ii) {}
// Notice no operator== is required
friend bool operator<(const X& x, const X& y) {
return x.c < y.c;
}
friend ostream& operator<<(ostream& os, X x) {
return os << x.c << ":" << x.i;
}
};
 
class Xgen {
static int i;
// Number of characters to select from:
enum { SPAN = 6 };
public:
X operator()() {
char c = 'A' + rand() % SPAN;
return X(c, i++);
}
};
 
int Xgen::i = 0;
 
typedef multiset<X> Xmset;
typedef Xmset::const_iterator Xmit;
 
int main() {
Xmset mset;
// Fill it with X's:
srand(time(0)); // Randomize
generate_n(inserter(mset, mset.begin()), 25, Xgen());
// Initialize a regular set from mset:
set<X> unique(mset.begin(), mset.end());
copy(unique.begin(), unique.end(),
ostream_iterator<X>(cout, " "));
cout << "\n---- << endl;
// Iterate over the unique values:
for(set<X>::iterator i = unique.begin();
i != unique.end(); i++) {
pair<Xmit, Xmit> p = mset.equal_range(*i);
copy(p.first,p.second, ostream_iterator<X>(cout, " "));
cout << endl;
}
} ///:~
 

In X, all the comparisons are made with the char c. The comparison is performed with operator<, which is all that is necessary for the multiset, since in this example the default less comparison object is used. The class Xgen randomly generates X objects, but the comparison value is restricted to the span from A to E . In main( ), a multiset<X> is created and filled with 25 X objects using Xgen, guaranteeing that there will be duplicate keys. So that we know what the unique values are, a regular set<X> is created from the multiset (using the iterator, iterator constructor). These values are displayed, and then each one produces the equal_range( ) in the multiset (equal_range( ) has the same meaning here as it does with multimap: all the elements with matching keys). Each set of matching keys is then printed.

As a second example, a (possibly) more elegant version of WordCount.cpp can be created using multiset:

//: C07:MultiSetWordCount.cpp
// Count occurrences of words using a multiset.
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;
 
int main(int argc, char* argv[]) {
const char* fname = "MultiSetWordCount.cpp";
if(argc > 1) fname = argv[1];
ifstream in(fname);
assure(in, fname);
multiset<string> wordmset;
string word;
while(in >> word)
wordmset.insert(word);
typedef multiset<string>::iterator MSit;
MSit it = wordmset.begin();
while(it != wordmset.end()) {
pair<MSit, MSit> p = wordmset.equal_range(*it);
int count = distance(p.first, p.second);
cout << *it << ": " << count << endl;
it = p.second; // Move to the next word
}
} ///:~
 

The setup in main( ) is identical to WordCount.cpp, but then each word is simply inserted into the multiset<string>. An iterator is created and initialized to the beginning of the multiset; dereferencing this iterator produces the current word. The equal_range( ) member function (not generic algorithm) produces the starting and ending iterators of the word that s currently selected, and the algorithm distance( ) (defined in <iterator>) counts the number of elements in that range. The iterator it is then moved forward to the end of the range, which puts it at the next word. If you re unfamiliar with the multiset, this code can seem more complex. The density of it and the lack of need for supporting classes such as Count has a lot of appeal.

In the end, is this really a set, or should it be called something else? An alternative is the generic bag that is defined in some container libraries, since a bag holds anything, without discrimination including duplicate objects. This is close, but it doesn t quite fit since a bag has no specification about how elements should be ordered. A multiset (which requires that all duplicate elements be adjacent to each other) is even more restrictive than the concept of a set. A set implementation might use a hashing function to order its elements, which would not put them in sorted order. Besides, if you want to store a bunch of objects without any special criteria, you will probably just use a vector, deque, or list.

Thinking in C++ Vol 2 - Practical Programming
Prev Home Next

 
 
   Reproduced courtesy of Bruce Eckel, MindView, Inc. Design by Interspire