Tab Files: Nothing Special

Tab-delimited files are text files organized around data that has rows and columns. This format is used to exchange data between spreadsheet programs or databases. A tab-delimited file uses just two punctuation rules to encode the data.

  • Each row is delimited by an ordinary newline character. This is usually the standard \n. If you are exchanging files across platforms, you may need to open files for reading using the "rU" mode to get universal newline handling.

  • Within a row, columns are delimited by a single character, often \t. The delimiter is chosen to be a character that will never occur in the data. It is usually (but not always) an unprintable character like \t.
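The second rule only works if no value contains the delimiter or a newline. Here is a minimal sketch of a check to run before writing a row. It uses Python 3 syntax, and the name check_row is hypothetical, not part of any library.

```python
def check_row(values, delimiter="\t"):
    """Return True if every value is safe to write; raise ValueError otherwise.

    check_row is a hypothetical helper used only for illustration.
    """
    for value in values:
        text = str(value)
        # A delimiter or newline inside a value would be misread as punctuation.
        if delimiter in text or "\n" in text:
            raise ValueError("value %r contains a delimiter or newline" % (text,))
    return True


ok = check_row(["KaDiMa", "sloop", 2])
```

A value such as "Red\tWitch" would raise ValueError. That signals either the data needs cleaning or a different delimiter character should be chosen.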

In the ideal case, a tab-delimited file will have the same number of columns in each row, and the first row will be column titles. Almost as pleasant is a file without column titles, but with a known sequence of columns. In the more complex cases, the number of columns per row varies.
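The cases above can be sketched as follows: treat the first row as column titles, then pad any short row so it lines up with them. This is one possible approach in Python 3 syntax; the sample data and column names are made up for illustration.

```python
# Hypothetical sample data: a title row, one full row, and one short row.
text = "name\trig\tsails\nKaDiMa\tsloop\t2\nRed Witch\tketch\n"

lines = text.splitlines()
header = lines[0].split("\t")

records = []
for line in lines[1:]:
    values = line.split("\t")
    # Pad short rows with empty strings so every row matches the header.
    # Rows longer than the header lose their extra columns in zip().
    values += [""] * (len(header) - len(values))
    records.append(dict(zip(header, values)))
```

Each record is a dictionary keyed by column title, so the short second row ends up with an empty string for sails.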

When we have a single, standard punctuation mark, we can process files with just two string methods. We use the split method of a string to parse a row into a list of column values. We use the join method of the delimiter string to assemble a list of column values into a row.
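As a small illustration, the two operations are inverses of each other for a row of strings. The values below are made up for the example.

```python
row = ["KaDiMa", "sloop", "2"]

# join is called on the delimiter string and given the list of columns.
line = "\t".join(row)

# split is called on the row string and given the delimiter.
columns = line.split("\t")
```

Here line is "KaDiMa\tsloop\t2" and columns is equal to the original row.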

We don't actually need a separate module to handle tab-delimited files. We looked at a related example in the section called “Reading a Text File”.

Reading. The most general case for reading tab-delimited data is shown in the following example.

myFile= open( "somefile", "rU" )
for aRow in myFile:
    # Drop the line ending so it doesn't stick to the last column
    print aRow.rstrip('\n').split('\t')
myFile.close()

Each row will be a list of column values.
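For readers on Python 3, where print is a function and text mode already provides universal newline handling, the same loop might look like the sketch below. The function name read_tab_file and the sample data are hypothetical.

```python
import os
import tempfile


def read_tab_file(path):
    """Return the rows of a tab-delimited file as lists of strings."""
    rows = []
    # Text mode in Python 3 already translates \r\n and \r to \n.
    with open(path, "r") as source:
        for line in source:
            # Strip only the newline; trailing tabs mark empty columns.
            rows.append(line.rstrip("\n").split("\t"))
    return rows


# Build a small throwaway file to read back (made-up data).
with tempfile.NamedTemporaryFile("w", suffix=".tab", delete=False) as tmp:
    tmp.write("name\trig\nKaDiMa\tsloop\n")
rows = read_tab_file(tmp.name)
os.remove(tmp.name)
```

The with statement closes the file even if an error occurs partway through, so no explicit close() call is needed.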

Writing. The writing case is the inverse of the reading case. Essentially, we use "\t".join( someList ) to create the tab-delimited row. Here's our sailboat example, done as tab-delimited data.

test= open( "boats.tab", "w" )
test.write( "\t".join( Boat.csvHeading ) )
test.write( "\n" )
for d in db:
    test.write( "\t".join( map( str, d.csvRow() ) ) )
    test.write( "\n" )
test.close()

Note that some elements of our data objects aren't string values. In this case, the value for sails is a tuple, which needs to be converted to a proper string. The expression map( str, someList ) applies the str function to each element of the original list, creating a new list in which every value is a string. See the section called “Sequence Processing Functions: map, filter, reduce and zip”.
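A short sketch of that conversion, using a made-up row with a string, a number and a tuple. It is written in Python 3 syntax, where map returns an iterator rather than a list; join accepts either.

```python
# Hypothetical row: name, length, and a tuple of sails.
row = ["KaDiMa", 18, ("main", "jib")]

# Joining the raw row would fail, because join only accepts strings.
try:
    "\t".join(row)
    joined_raw = True
except TypeError:
    joined_raw = False

# Applying str to each element first makes every column a string.
line = "\t".join(map(str, row))
```

The tuple is written using its str form, so line is "KaDiMa\t18\t('main', 'jib')". A reader of the file gets that text back as a single string, not as a tuple.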