Tab Files: Nothing Special
Tab-delimited files are text files organized around data that has
rows and columns. This format is used to exchange data between
spreadsheet programs or databases. A tab-delimited file uses just two
punctuation rules to encode the data.
- Each row is delimited by an ordinary newline character. This
  is usually the standard \n. If you are exchanging files
  across platforms, you may need to open files for reading using the
  "rU" mode to get universal newline handling.
- Within a row, columns are delimited by a single character,
  often \t. The column punctuation character that is
  chosen is one that will never occur in the data. It is usually (but
  not always) an unprintable character like \t.
In the ideal case, a tab-delimited file will have the same number of
columns in each row, and the first row will be column titles. Almost as
pleasant is a file without column titles, but with a known sequence of
columns. In the more complex cases, the number of columns per row
varies.
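The cases described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name rows_as_dicts, the "extra" key, and the sample boat data are all hypothetical. The first row is taken as the column titles unless a known sequence of titles is supplied, and short rows are padded with None, which is only one of several reasonable ways to cope with a varying number of columns.

```python
def rows_as_dicts(lines, titles=None):
    """Pair each row's columns with column titles.

    If titles is None, the first row is assumed to hold the titles.
    Short rows are padded with None; surplus columns are kept under
    the hypothetical key "extra".
    """
    result = []
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if titles is None:
            # First row: remember the column titles and move on.
            titles = columns
            continue
        # Multiplying a list by a negative number gives an empty list,
        # so long rows are left unpadded.
        padded = columns + [None] * (len(titles) - len(columns))
        row = dict(zip(titles, padded))
        if len(columns) > len(titles):
            row["extra"] = columns[len(titles):]
        result.append(row)
    return result

# Hypothetical sample data: a title row, a full row, and a short row.
sample = ["name\tlength\tsails\n", "Red Witch\t42\tmain,jib\n", "Dinghy\t8\n"]
boats = rows_as_dicts(sample)
```

For a file without a title row, the known column sequence can be passed as the titles argument, and every line is then treated as data.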
When we have a single, standard punctuation mark, we can simply
use two operations of the string class to process files. We use the
split method of a string to parse the rows. We use the join method of
the delimiter string, applied to a list of column values, to assemble
the rows.
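A short round trip shows the two operations working together. The sample text is hypothetical. Note that join is called on the delimiter string and is given the list of column values.

```python
# Hypothetical sample: a title row and one data row.
text = "name\tlength\nRed Witch\t42\n"

# Parse: split the text into rows, then split each row into columns.
rows = [line.split("\t") for line in text.splitlines()]

# Assemble: join the columns with tabs, then end each row with a newline.
rebuilt = "".join("\t".join(columns) + "\n" for columns in rows)
```

Because the delimiter never occurs inside the data, parsing and reassembling gives back the original text.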
We don't actually need a separate module to handle tab-delimited
files. We looked at a related example in the section called “Reading a Text File”.
Reading. The most general case for reading tab-delimited data is shown in
the following example.
myFile = open( "somefile", "rU" )
for aRow in myFile:
    # Strip the trailing newline so it doesn't end up in the last column.
    print aRow.rstrip('\n').split('\t')
myFile.close()
Each row will be a list of column
values.
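The same reading pattern can be wrapped in a function that collects the rows. The sketch below is an assumption-laden variant rather than the original example: it is written for Python 3, where the plain "r" text mode already provides universal newline handling, and the names read_tab_file and demo_path are hypothetical. It strips the trailing newline from each line so that it does not become part of the last column.

```python
import os
import tempfile

def read_tab_file(path):
    """Return the rows of a tab-delimited file as lists of strings."""
    rows = []
    # Python 3 assumption: text mode translates line endings to "\n".
    with open(path, "r") as source:
        for line in source:
            rows.append(line.rstrip("\n").split("\t"))
    return rows

# Build a small throwaway file so the sketch can run on its own.
handle, demo_path = tempfile.mkstemp(suffix=".tab")
with os.fdopen(handle, "w") as target:
    target.write("name\tlength\nRed Witch\t42\n")

demo_rows = read_tab_file(demo_path)
os.remove(demo_path)
```

Using a with statement closes the file even if a row fails to parse, which the explicit close() call cannot guarantee.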
Writing. The writing case is the inverse of the reading case.
Essentially, we use "\t".join( someList ) to create the
tab-delimited row. Here's our sailboat example, done as tab-delimited
data.
test = open( "boats.tab", "w" )
test.write( "\t".join( Boat.csvHeading ) )
test.write( "\n" )
for d in db:
    test.write( "\t".join( map( str, d.csvRow() ) ) )
    test.write( "\n" )
test.close()
Note that some elements of our data objects aren't string values.
In this case, the value for sails is a tuple, which needs to be
converted to a proper string. The expression map(str, someList)
applies the str function to each element of the original list,
creating a new list which will have all string values. See the section
called “Sequence Processing Functions: map, filter, reduce and zip”.
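The effect of map(str, ...) on a row that contains a tuple can be seen in a self-contained sketch. The Boat class here is a hypothetical stand-in, since the real sailboat class is defined elsewhere; its attributes, the write_tab helper, and the sample boat are assumptions made for illustration. The sketch assumes Python 3 and writes to an in-memory buffer instead of boats.tab so that the result can be inspected directly.

```python
import io

class Boat(object):
    """Hypothetical stand-in for the sailboat class used above."""
    csvHeading = ["name", "rig", "sails"]

    def __init__(self, name, rig, sails):
        self.name = name
        self.rig = rig
        self.sails = sails  # a tuple, so it is not a string

    def csvRow(self):
        return [self.name, self.rig, self.sails]

def write_tab(target, heading, rows):
    """Write a heading and rows to a file-like object, one row per line."""
    target.write("\t".join(heading) + "\n")
    for row in rows:
        # str is applied to every column, so non-string values can be joined.
        target.write("\t".join(map(str, row)) + "\n")

db = [Boat("Red Witch", "sloop", ("main", "jib"))]
buffer = io.StringIO()
write_tab(buffer, Boat.csvHeading, [d.csvRow() for d in db])
output = buffer.getvalue()
```

The tuple is written in its str form, parentheses and quotes included. A program reading the file back gets that text as a single column, not the original tuple.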