The fileinput Module

The fileinput module interacts with sys.argv. The fileinput.input function opens files based on all the values of sys.argv[1:]. It carefully skips sys.argv[0], which is the name of the Python script file. For each file, it reads all of the lines as text, allowing a program to read and process multiple files, like many standard Unix utilities.

The typical use case is:

import fileinput
for line in fileinput.input():
    process(line)

This iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty. If a filename is '-', it is also replaced by sys.stdin at that position in the list of files. To specify an alternative list of filenames, pass it as the argument to input. A single filename is also accepted in place of a list of filenames.
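As a minimal sketch of passing an explicit list of names instead of relying on sys.argv, the following example (written for Python 3) creates two small temporary files and reads them back in order. The helper name read_all and the file names a.txt and b.txt are hypothetical, chosen only for illustration.

```python
import fileinput
import os
import tempfile

def read_all(names):
    """Return every line from the named files, in order (hypothetical helper)."""
    lines = []
    # Passing a list (or a single name) overrides the default of sys.argv[1:].
    with fileinput.input(files=names) as stream:
        for line in stream:
            lines.append(line.rstrip("\n"))
    return lines

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.txt")
    second = os.path.join(tmp, "b.txt")
    with open(first, "w") as f:
        f.write("alpha\nbeta\n")
    with open(second, "w") as f:
        f.write("gamma\n")
    combined = read_all([first, second])  # a list of names
    single = read_all(first)              # a single name

print(combined)
print(single)
```

The two files are read as one continuous stream of lines, which is the behavior the module is built around.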

While processing input, several functions are available in the fileinput module:

fileinput.filename → string

the filename of the line that has just been read.

fileinput.lineno → int

the cumulative line number of the line that has just been read.

fileinput.filelineno → int

the line number in the current file.

fileinput.isfirstline → int

true if the line just read is the first line of its file.

fileinput.isstdin → int

true if the line was read from sys.stdin.

fileinput.nextfile

close the current file so that the next iteration will read the first line from the next file (if any); lines not read from the file will not count towards the cumulative line count; the filename is not changed until after the first line of the next file has been read.

fileinput.close

closes the sequence.
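The functions listed above can be exercised with a short sketch like the following, written for Python 3. The helpers describe and first_lines and the temporary file names are hypothetical. The first helper records the state reported for every line; the second reads only the first line of each file and then calls nextfile.

```python
import fileinput
import os
import tempfile

def describe(names):
    """Return (file, lineno, filelineno, isfirstline) for every line read."""
    rows = []
    with fileinput.input(files=names) as stream:
        for _ in stream:
            rows.append((
                os.path.basename(fileinput.filename()),
                fileinput.lineno(),       # cumulative across all files
                fileinput.filelineno(),   # restarts at 1 in each file
                fileinput.isfirstline(),
            ))
    return rows

def first_lines(names):
    """Return (file, lineno, text) for the first line of each file."""
    heads = []
    with fileinput.input(files=names) as stream:
        for line in stream:
            heads.append((
                os.path.basename(fileinput.filename()),
                fileinput.lineno(),
                line.rstrip("\n"),
            ))
            fileinput.nextfile()  # skip the rest; unread lines are not counted
    return heads

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "alpha\nbeta\n"), ("b.txt", "gamma\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    rows = describe(paths)
    heads = first_lines(paths)

for row in rows:
    print(row)
for head in heads:
    print(head)
```

Because the lines skipped by nextfile never count toward the cumulative total, the first line of the second file is reported as line 2 by first_lines, but as line 3 by describe.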

All files are opened in text mode. If an I/O error occurs during opening or reading a file, the IOError exception is raised.
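One way to handle such an error is sketched below for Python 3, where IOError is an alias of OSError. The helper count_lines and the file names are hypothetical. Note that the file is only opened when the first line is requested, so the exception surfaces inside the loop rather than at the call to input.

```python
import fileinput
import os
import tempfile

def count_lines(names):
    """Return the number of lines read, or None if a file cannot be read."""
    count = 0
    try:
        with fileinput.input(files=names) as stream:
            for _ in stream:
                count += 1
    except OSError as err:  # IOError is an alias of OSError in Python 3
        print("cannot read input:", err)
        return None
    return count

with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.txt")
    missing = os.path.join(tmp, "missing.txt")  # deliberately never created
    with open(present, "w") as f:
        f.write("one\ntwo\n")
    good = count_lines([present])
    bad = count_lines([present, missing])

print(good, bad)
```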

This makes it easy to write a Python version of the common Unix utility, grep. The grep utility searches a list of files for a given pattern.

Example 33.1. greppy.py

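#!/usr/bin/env python
import sys, re, fileinput
# The first argument is the pattern; the remaining arguments are the files.
pattern = re.compile(sys.argv[1])
for line in fileinput.input(sys.argv[2:]):
    # match() returns None when the line does not match.
    if pattern.match(line):
        # line keeps its trailing newline, so suppress the one print adds.
        print(fileinput.filename(), fileinput.filelineno(), line, end="")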

This contains the essential features of grep. For non-Unix users, the grep utility looks for the given regular expression in any number of files. The name grep is an acronym of Global Regular Expression Print.

The re module provides the pattern matching, and the fileinput module makes searching an arbitrary list of files simple. We cover the re module in more depth in Chapter 31, Complex Strings: the re Module.

The first command-line argument (sys.argv[0]) is the name of the script, which this program ignores. This program uses the second command-line argument as the pattern that defines the target of the search. The remaining command-line arguments are given to fileinput.input so that all files will be examined. The pattern regular expression is matched against each individual input line. Note that match only succeeds when the pattern matches at the beginning of the line. If match returns None, the line did not match. If match returns an object, the program prints the current file name, the current line number of the file and the actual input line that matched.
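The effect of using match rather than search can be sketched as follows. The sample lines are made up for illustration; the pattern is the one used in the example run of greppy.py.

```python
import re

pattern = re.compile("import.*random")

# Hypothetical sample input lines.
lines = [
    "import random\n",
    "    import random  # indented\n",
    "from os import urandom\n",
]

# match() anchors the pattern at the start of each line.
anchored = [pattern.match(line) is not None for line in lines]
# search() looks for the pattern anywhere in the line, as grep does.
anywhere = [pattern.search(line) is not None for line in lines]

print(anchored)
print(anywhere)
```

Only the first line passes the match test, while all three pass the search test. A version of the program that should behave more like grep could call pattern.search instead.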

After we do a chmod +x greppy.py, we can use this program as follows. Note that we have to provide quotes to prevent the shell from doing globbing on our pattern string.

$ greppy.py 'import.*random' *.py
demorandom.py 2 import random
dice.py 1 import random
functions.py 2 import random


 
 
Published under the terms of the Open Publication License