The sed FAQ

4.27. How do I change all paragraphs to long lines?

A frequent request is how to convert DOS-style textfiles, in which each line ends with a "paragraph marker", to Microsoft-style textfiles, in which the "paragraph marker" appears only at the end of real paragraphs. Sometimes this question is framed as, "How do I remove the hard returns at the end of each line in a paragraph?"

The problem occurs because newer word processors don't work the same way older text editors did. Older text editors used a newline (CR/LF in DOS; LF alone in Unix) to end each line on screen or on disk, and used two newlines to separate paragraphs. Certain word processors, to make paragraph reformatting and reflowing easy, use one newline to end a paragraph and never allow newlines within a paragraph. As a result, textfiles created with standard editors (Emacs, vi, Vedit, Boxer, etc.) appear to have "hard returns" at inappropriate places. The following sed script finds blocks of consecutive nonblank lines (i.e., paragraphs of text) and converts each block into one long line with one "hard return" at the end.

     # sed script to change all paragraphs to long lines
     /./{H; $!d;}             # Put each paragraph into hold space
     x;                       # Swap hold space and pattern space
     s/^\(\n\)\(..*\)$/\2\1/; # Move leading \n to end of PatSpace
     s/\n\(.\)/ \1/g;         # Replace all other \n with 1 space
     # Uncomment the following line to remove excess blank lines:
     # /./!d;
     #---end of sed script---
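
As a rough illustration of the script in use, the sketch below saves it to a file and runs it on a two-paragraph sample. The file names (para.sed, sample.txt), the temporary directory, and the sample text are all made up for this example; they are not part of the FAQ.

```shell
# Hypothetical file names and sample text, used only for this sketch.
tmpdir=$(mktemp -d)

# Save the script from above (comments omitted) to a file.
cat > "$tmpdir/para.sed" <<'EOF'
/./{H; $!d;}
x;
s/^\(\n\)\(..*\)$/\2\1/;
s/\n\(.\)/ \1/g;
EOF

# Two paragraphs, each broken across two lines, separated by a blank line.
printf 'first line\nsecond line\n\nnext paragraph\ncontinues here\n' \
  > "$tmpdir/sample.txt"

# Run the script; each paragraph should come out as one long line.
result=$(sed -f "$tmpdir/para.sed" "$tmpdir/sample.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With this input, the output should be "first line second line", a blank line, and then "next paragraph continues here".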

If the input files have formatting or indentation that conveys special meaning (like program source code), this script will remove it. If such text still needs to be joined into long lines, try 'par' (paragraph reformatter) or the 'fmt' utility with the -t or -c switches and the width option (-w) set to a large number such as 9999. Some versions of fmt limit the maximum width, so a smaller value may be required.
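
For comparison, here is a minimal sketch of the fmt approach on a short sample. The sample text and the width of 200 are assumptions chosen for this example (any width larger than the longest paragraph should do), and the -t and -c switches are left out because the sample has no indentation.

```shell
# Sample paragraph with hard returns; the text is made up for this sketch.
# -w 200 is an arbitrary width, chosen only to exceed the joined length.
joined=$(printf 'first line\nsecond line\nthird line\n' | fmt -w 200)
printf '%s\n' "$joined"
```

Here the three short lines should be joined into the single line "first line second line third line".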

Reprinted courtesy of Eric Pement. Also available at https://sed.sourceforge.net/sedfaq.html