Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Programming
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Databases
Mail Systems
openSolaris
Eclipse Documentation
Techotopia.com
Virtuatopia.com

How To Guides
Virtualization
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Windows
Problem Solutions
Privacy Policy

  




 

 

Thinking in C++ Vol 2 - Practical Programming
Prev Home Next

Removing characters from strings

Removing characters is easy and efficient with the erase( ) member function, which takes two arguments: where to start removing characters (which defaults to 0), and how many to remove (which defaults to string::npos). If you specify more characters than remain in the string, the remaining characters are all erased anyway (so calling erase( ) without any arguments removes all characters from a string). Sometimes it s useful to take an HTML file and strip its tags and special characters so that you have something approximating the text that would be displayed in the Web browser, only as a plain text file. The following example uses erase( ) to do the job:

//: C03:HTMLStripper.cpp {RunByHand}
//{L} ReplaceAll
// Filter to remove html tags and markers.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include "ReplaceAll.h"
#include "../require.h"
using namespace std;
 
string& stripHTMLTags(string& s) {
static bool inTag = false;
bool done = false;
while(!done) {
if(inTag) {
// The previous line started an HTML tag
// but didn't finish. Must search for '>'.
size_t rightPos = s.find('>');
if(rightPos != string::npos) {
inTag = false;
s.erase(0, rightPos + 1);
}
else {
done = true;
s.erase();
}
}
else {
// Look for start of tag:
size_t leftPos = s.find('<');
if(leftPos != string::npos) {
// See if tag close is in this line:
size_t rightPos = s.find('>');
if(rightPos == string::npos) {
inTag = done = true;
s.erase(leftPos);
}
else
s.erase(leftPos, rightPos - leftPos + 1);
}
else
done = true;
}
}
// Remove all special HTML characters
replaceAll(s, "&lt;", "<");
replaceAll(s, "&gt;", ">");
replaceAll(s, "&amp;", "&");
replaceAll(s, "&nbsp;", " ");
// Etc...
return s;
}
 
int main(int argc, char* argv[]) {
requireArgs(argc, 1,
"usage: HTMLStripper InputFile");
ifstream in(argv[1]);
assure(in, argv[1]);
string s;
while(getline(in, s))
if(!stripHTMLTags(s).empty())
cout << s << endl;
} ///:~
 

This example will even strip HTML tags that span multiple lines.[35] This is accomplished with the static flag, inTag, which is true whenever the start of a tag is found, but the accompanying tag end is not found in the same line. All forms of erase( ) appear in the stripHTMLFlags( ) function.[36] The version of getline( ) we use here is a (global) function declared in the <string> header and is handy because it stores an arbitrarily long line in its string argument. You don t need to worry about the dimension of a character array as you do with istream::getline( ). Notice that this program uses the replaceAll( ) function from earlier in this chapter. In the next chapter, we ll use string streams to create a more elegant solution.

Thinking in C++ Vol 2 - Practical Programming
Prev Home Next

 
 
   Reproduced courtesy of Bruce Eckel, MindView, Inc. Design by Interspire