The sed FAQ

4.3. How do I convert files with toggle characters, like +this+, to look like [i]this[/i]?

Input files, especially message-oriented text files, often contain toggle characters for emphasis, like ~this~, *this*, or =this=. Sed can make the same input pattern produce alternating output each time it is encountered. Typical needs might be to generate HTML codes or print codes for boldface, italic, or underscore. This script accommodates multiple occurrences of the toggle pattern on the same line, as well as cases where the pattern starts on one line and finishes several lines later, even at the end of the file:

     # sed script to convert +this+ to [i]this[/i]
     :a
     /+/{ x;        # If "+" is found, switch hold and pattern space
       /^ON/{       # If "ON" is in the (former) hold space, then ..
         s///;      # .. delete it
         x;         # .. switch hold space and pattern space back
         s|+|[/i]|; # .. turn the next "+" into "[/i]"
         ba;        # .. jump back to label :a and start over
       }
       s/^/ON/;     # Else, "ON" was not in the hold space; create it
       x;           # Switch hold space and pattern space
       s|+|[i]|;    # Turn the first "+" into "[i]"
       ba;          # Branch to label :a to find another pattern
     }
     #---end of script---

This script uses the hold space as a "flag" that records whether the toggle is currently ON. We have added remarks to illustrate the script logic, but most versions of sed do not permit remarks after labels or 'b'ranch commands, so strip the remarks before running the script with such a version.
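
As a rough illustration, the following sketch runs the script with its remarks stripped and one command per line, which should sidestep the restriction on remarks after labels and branch commands. The temporary file and the three sample input lines are assumptions made up for this example; the second toggle deliberately opens on one line and closes on the next.

```shell
# Hypothetical temporary file holding the comment-free script.
script_file=$(mktemp)
cat > "$script_file" <<'EOF'
:a
/+/{
x
/^ON/{
s///
x
s|+|[/i]|
ba
}
s/^/ON/
x
s|+|[i]|
ba
}
EOF

# Made-up input: one toggle pair on a single line, one pair split across two lines.
out=$(printf '%s\n' 'a +b+ c' 'start +here' 'end+ done' | sed -f "$script_file")
printf '%s\n' "$out"

# Remove the temporary script file.
rm -f "$script_file"
```

With this input the output should be "a [i]b[/i] c", "start [i]here", and "end[/i] done" on three lines: the flag left in the hold space after the second line is what makes the "+" on the third line a closing tag.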

If you are sure that the +toggle+ characters never cross line boundaries (i.e., never begin on one line and end on another), this script can be reduced to one line:

     s|+\([^+][^+]*\)+|[i]\1[/i]|g
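
A quick way to see what the one-line version does, and where it stops working, is to feed it a few lines. The sample input below is invented for this sketch: the first line has two complete pairs, while the last two lines carry a toggle that crosses a line boundary.

```shell
# Made-up input: two pairs on one line, then a toggle split across two lines.
out=$(printf '%s\n' 'a +b+ and +c+ d' 'start +here' 'end+ done' |
  sed 's|+\([^+][^+]*\)+|[i]\1[/i]|g')
printf '%s\n' "$out"
```

The first line should come out as "a [i]b[/i] and [i]c[/i] d". The other two lines pass through unchanged, because each contains only a single "+" and the expression needs both toggles on the same line.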

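If your toggle pattern contains regex metacharacters (such as '*'), remember to quote them with backslashes. Whether '+' and '?' need quoting depends on the regex flavor: in the basic regular expressions that sed uses by default they are ordinary characters (and in GNU sed, '\+' and '\?' are operators rather than literals), whereas in extended regular expressions they are operators and must be quoted to match literally.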
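
For instance, a sketch of the one-line form adapted to an asterisk toggle might look like the following. The "[b]" and "[/b]" output codes and the sample line are assumptions chosen for illustration. The asterisks that delimit the toggle are written as "\*" so that they match literally; inside the bracket expression the asterisk is already literal and needs no backslash.

```shell
# Made-up input using "*" as the toggle character.
out=$(printf '%s\n' 'make *this* bold' |
  sed 's|\*\([^*][^*]*\)\*|[b]\1[/b]|g')
printf '%s\n' "$out"
```

This should print "make [b]this[/b] bold".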

Reprinted courtesy of Eric Pement. Also available at https://sed.sourceforge.net/sedfaq.html