String Literal Values

A string is a sequence of characters. The literal value for a string is written by surrounding the value with quotes or apostrophes. There are several variations to provide some additional features.

Basic String

"xyz" or 'xyz'. A basic string must be completed on a single line, or continued with a \ as the very last character of a line.

Multi-Line String, Triple-Quoted String

"""xyz""" or '''xyz'''. A multi-line string continues on until the concluding triple-quote or triple-apostrophe.

Unicode String

u"Unicode", u'Unicode', u"""Unicode""", etc. Unicode is the Universal Character Set; it permits any character in any of the languages in common use around the world. Depending on the encoding used, each character requires from 1 to 4 bytes of storage. ASCII, by contrast, is a 7-bit character set; each of the 128 ASCII characters fits in a single byte.

Raw String

r"raw\nstring", r'raw\nstring', etc. The backslash characters (\) are not interpreted by Python, but are left as is. This is handy for Windows file names that contain backslashes. It is also handy for regular expressions that make extensive use of backslashes. Example: '\n' is a one-character string with a non-printing newline; r'\n' is a two-character string containing a backslash and the letter n.
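
The literal forms above can be compared side by side. This is a minimal sketch, assuming a Python 3 interpreter; the variable names are illustrative only and do not come from the original text.

```python
# Basic strings: quotes and apostrophes produce the same value.
basic_a = "xyz"
basic_b = 'xyz'

# A backslash as the very last character of a line continues the literal.
continued = "abc\
def"

# A triple-quoted string may span lines; the line break is part of the value.
multi = """first
second"""

# Raw strings leave backslashes uninterpreted.
cooked = '\n'   # one character: a newline
raw = r'\n'     # two characters: a backslash and the letter n

print(basic_a == basic_b)     # True
print(continued)              # abcdef
print(multi.splitlines())     # ['first', 'second']
print(len(cooked), len(raw))  # 1 2
```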

Outside of raw strings, non-printing characters and Unicode characters that aren't found on your keyboard are created using escapes. A table of escapes is provided below. These are Python representations for unprintable ASCII characters. They're called escapes because the \ is an escape from the usual meaning of the following character.

Escape                  Meaning
\ at end of a line      The end-of-line is ignored; the string continues on the next line
\\                      Backslash (\)
\'                      Apostrophe (')
\"                      Quote (")
\a                      ASCII Bell (BEL), an audible signal. Some operating systems translate this to a screen flash or ignore it completely.
\b                      ASCII Backspace (BS)
\f                      ASCII Formfeed (FF)
\n                      ASCII Linefeed (LF)
\r                      ASCII Carriage Return (CR)
\t                      ASCII Horizontal Tab (TAB)
\v                      ASCII Vertical Tab (VT)
\ooo                    Character with octal value ooo. One to three octal digits are accepted.
\xhh                    Character with hex value hh. Exactly two hex digits are required.
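
One way to check the table is to ask Python for the numeric code of each escaped character. The following is a small sketch, assuming Python 3; the names escapes and codes are made up for this illustration.

```python
# Each escape in this list produces exactly one character.
escapes = ['\\', '\'', '\"', '\a', '\b', '\f', '\n', '\r', '\t', '\v']

# ord() gives the numeric code of a one-character string.
codes = [ord(ch) for ch in escapes]
print(codes)  # [92, 39, 34, 7, 8, 12, 10, 13, 9, 11]

# Octal and hex escapes name a character by its numeric value.
print('\101' == 'A')  # True: octal 101 is decimal 65
print('\x41' == 'A')  # True: hex 41 is decimal 65
```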

Note that adjacent string literals, separated only by whitespace, are automatically joined to make a single longer string.

"ab" "cd" "ef" is the same as "abcdef".

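The joining of adjacent literals can be tried directly. This is a brief sketch, assuming Python 3; wrapping the literals in parentheses to spread them over several lines is a common idiom rather than something the text above prescribes.

```python
# Adjacent literals become one string.
joined = "ab" "cd" "ef"
print(joined)  # abcdef

# Inside parentheses the literals may sit on separate lines,
# and the two quote styles may be mixed freely.
message = ("adjacent literals "
           'may mix quote styles')
print(message)  # adjacent literals may mix quote styles
```
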
For Unicode, a special \uxxxx escape is provided. This requires the four-digit hexadecimal Unicode character identification. 日本 is written in Python as u'\u65e5\u672c' using two Unicode characters provided via escapes. There are a variety of encodings for Unicode text, for example, UTF-8 and UTF-16; LATIN-1 is a single-byte encoding that covers only the first 256 Unicode characters. The codecs module provides mechanisms for encoding and decoding Unicode strings.
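
The relationship between a Unicode string and its encoded bytes might be sketched as follows. The example assumes Python 3, where the u prefix is accepted but optional; the variable names are illustrative.

```python
import codecs

# Two characters, each written with a \u escape.
nihon = u'\u65e5\u672c'
print(len(nihon))  # 2

# The same two characters occupy different numbers of bytes
# depending on the encoding chosen.
utf8_bytes = nihon.encode('utf-8')
utf16_bytes = nihon.encode('utf-16-le')  # little-endian, no byte-order mark
print(len(utf8_bytes))   # 6
print(len(utf16_bytes))  # 4

# Decoding with the matching codec recovers the original string.
round_trip = codecs.decode(utf8_bytes, 'utf-8')
print(round_trip == nihon)  # True
```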


 
 
Published under the terms of the Open Publication License