Python - String Built-in Functions

String Built-in Functions

The following built-in functions are relevant to string manipulation:

chr( i ) → character: Return a string of one character with ordinal i; 0 ≤ i < 256.
len( object ) → integer: Return the number of items of a sequence or mapping.
ord( c ) → integer: Return the integer ordinal of a one-character string.
repr( object ) → string: Return the canonical string representation of the object. For most object types, eval(repr(object)) == object.
str( object ) → string: Return a nice string representation of the object. If the argument is a string, the return value is the same object.
unichr( i ) → Unicode string: Return a Unicode string of one character with ordinal i; 0 ≤ i < 65536.
unicode( string , [ encoding , ] [ errors ]) → Unicode string: Create a new Unicode object from the given encoded string. encoding defaults to the current default string encoding; errors, which defines the error handling, defaults to 'strict'.

For character code manipulation, there are three related functions: chr, ord and unichr. chr returns the character that belongs to a code number in the range 0 to 255; for codes below 128 this is the ASCII character. unichr returns the Unicode character that belongs to a Unicode number. ord goes the other way: it transforms an ASCII character to its ASCII code number, or a Unicode character to its Unicode number.
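
The round trip between characters and code numbers can be sketched as follows. This is a minimal illustration, not part of the original text: the unichr fallback is an assumption added so the same lines also run on Python 3, where chr covers the full Unicode range and unichr no longer exists, and the variable names are hypothetical.

```python
# Assumption: on Python 3 there is no unichr, so fall back to chr,
# which handles every Unicode code point there.
try:
    unichr
except NameError:
    unichr = chr

code = ord("A")          # character -> code number
letter = chr(code)       # code number -> character
euro = unichr(0x20AC)    # Unicode number -> Unicode character (EURO SIGN)

# ord and chr/unichr undo each other
round_trip_ok = (chr(ord("z")) == "z") and (ord(unichr(0x20AC)) == 0x20AC)
```

Here code is 65 and letter is the original "A", so each function reverses the other.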

The len function returns the length of the string.

>>> len("abcdefg")
7
>>> len(r"\n")
2
>>> len("\n")
1

The str function converts any object to a string.

>>> a= str(355.0/113.0)
>>> a
'3.14159292035'
>>> len(a)
13

The repr function also converts an object to a string. However, repr usually creates a string suitable for use as Python source code. For simple numeric types, the result is not terribly interesting. For more complex types, however, it reveals details of their structure. repr can also be invoked using reverse quotes (`), also called the grave accent (underneath the tilde, ~, on most keyboards).

>>> a="""a very
... long string
... on multiple lines"""
>>> print repr(a)
'a very\nlong string\non multiple lines'
>>> print `a`
'a very\nlong string\non multiple lines'

This representation shows the newline characters (\n, which older interpreters display as the octal escape \012) embedded within the triple-quoted string. If we simply print a or str( a ), we see the string interpreted instead of represented.

>>> a="""a very
... long string
... on multiple lines"""
>>> print a
a very
long string
on multiple lines
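
To make the contrast between str and repr concrete, the sketch below compares the two on the same value and checks the eval(repr(object)) == object property for a few simple objects. The variable names and sample values are illustrative assumptions. Recent interpreters write the newline as \n, which is the same character as the octal escape \012.

```python
text = "a very\nlong string"

shown = str(text)     # the string itself: the newline is a real line break
source = repr(text)   # Python source for the string: quotes plus an escape

# repr adds the two quote characters and spells the newline as two
# characters (a backslash and the letter n), so it is three characters longer.
extra = len(source) - len(shown)

# For simple objects, evaluating the repr gives back an equal object.
samples = [42, "it's", [1, 2, 3], ("a", None)]
round_trips = [eval(repr(obj)) == obj for obj in samples]
```

The round trip holds for these built-in types; it is not guaranteed for every object, which is why the property is stated for most object types rather than all.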

The unicode( string , [ encoding , ] [ errors ]) function decodes an encoded string into a Unicode object. The default encoding is the interpreter's default string encoding (normally 'ascii', as reported by sys.getdefaultencoding()), with 'strict' error handling. Choices for errors are 'strict', 'replace' and 'ignore'. 'strict' raises an exception (UnicodeDecodeError) for bytes that cannot be decoded, 'replace' substitutes the Unicode replacement character (\uFFFD) and 'ignore' skips over the invalid bytes. The codecs and unicodedata modules provide more functions for working with Unicode.
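
The three error-handling choices can be compared on one deliberately damaged input. This is a sketch under stated assumptions: it uses the decode method, which accepts the same encoding and errors arguments and runs on both Python 2 and Python 3, in place of unicode(data, 'utf-8', 'replace'), which exists only in Python 2. The sample bytes are made up for illustration.

```python
# UTF-8 bytes for "caf" plus e-acute, followed by one byte (0xff)
# that is never valid in UTF-8.
data = b"caf\xc3\xa9\xff"

replaced = data.decode("utf-8", "replace")   # bad byte becomes U+FFFD
ignored = data.decode("utf-8", "ignore")     # bad byte is dropped

try:
    data.decode("utf-8", "strict")           # bad byte raises an exception
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

With 'replace' the result keeps a visible marker where data was lost, while 'ignore' silently shortens the string; 'strict' is the safest default when damaged input should not pass unnoticed.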

