Sequence Processing Functions: `map`, `filter`, `reduce` and `zip`
	Chapter 20. Advanced Sequences

The map, filter, and reduce built-in functions are handy functions for processing sequences. These owe much to the world of functional programming languages. The idea is to take a small function you write and apply it to all the elements of a sequence. This saves you writing an explicit loop. The implicit loop within each of these functions may be faster than an explicit for or while loop.

Additionally, each of these is a pure function, returning a result value. This allows the results of the functions to be combined into complex expressions relatively easily.

It is common to have multiple-step processes on lists of values: for instance, filtering a large set of data to locate useful samples, transforming those samples, and then computing a sum or average. Rather than write three explicit loops, the result can be computed in a single expression.
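As a minimal sketch of this idea, the following filters, transforms and sums a list in one expression. The sample data and the helper names is_useful and to_celsius are hypothetical, invented for illustration. The reduce function is imported from functools so that the snippet should run on Python 2.6 or later as well as on Python 3.

```python
from functools import reduce  # a built-in on Python 2; lives in functools on Python 3

# Hypothetical sample data: Fahrenheit readings, where -999.0 marks a bad sample.
readings = [68.0, -999.0, 77.0, 86.0, -999.0, 59.0]

def is_useful(x):
    """Keep only plausible readings."""
    return x > -100.0

def to_celsius(f):
    """Transform one Fahrenheit reading to Celsius."""
    return (f - 32.0) * 5.0 / 9.0

def add(a, b):
    return a + b

# Filter, transform and sum in a single expression instead of three loops.
total = reduce(add, map(to_celsius, filter(is_useful, readings)), 0.0)
print(total)
```

Reading the expression from the inside out gives the order of the steps: filter first, then map, then reduce.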

Let's say we have two sequences and we have a multi-step process like this:

for x in seq1:.... apply function f1 to x and append the result to sequence r1
for x in seq2:.... apply function f2 to x and append the result to sequence r2
for i in range(len(r1)):.... apply function f3 to r1[i] and r2[i]

Instead of these lengthy explicit loops, we can take a functional approach that looks like this:

f= reduce( f3, zip( map(f1,seq1), map(f2,seq2) ) )

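Where f1, f2, and f3 are functions that define the body of each of the above loops. Note that reduce calls f3 with two arguments: the running result (initially the first tuple, since no initial value is given) and the next (r1[i], r2[i]) tuple produced by zip.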
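To make this concrete, here is a small sketch with hypothetical choices for f1, f2 and f3 (none of them come from the text above): f1 converts a string to a float, f2 doubles a number, and f3 adds the product of each pair to a running total. An initial value of 0.0 is passed to reduce so that f3 always receives a number as its first argument.

```python
from functools import reduce  # needed on Python 3; also available on Python 2.6+

seq1 = ["1.5", "2.0", "4.0"]   # hypothetical input data
seq2 = [1, 2, 3]

def f1(s):
    return float(s)            # body of the first loop

def f2(n):
    return n * 2               # body of the second loop

def f3(total, pair):
    a, b = pair                # one (r1[i], r2[i]) tuple from zip
    return total + a * b       # body of the third loop

# Explicit-loop version.
r1 = []
for x in seq1:
    r1.append(f1(x))
r2 = []
for x in seq2:
    r2.append(f2(x))
looped = 0.0
for i in range(len(r1)):
    looped = f3(looped, (r1[i], r2[i]))

# Functional version: the same computation as a single expression.
functional = reduce(f3, zip(map(f1, seq1), map(f2, seq2)), 0.0)
print(functional == looped)
```

Both versions compute 1.5*2 + 2.0*4 + 4.0*6, so the comparison prints True.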

Definitions. These functions transform lists. The map and filter functions each apply some function to a sequence to create a new sequence. The reduce function applies a function which will reduce the sequence to a single value. The zip function interleaves values from lists to create a list of tuples.

The map, zip and filter functions have no internal state; they simply process each value of the sequence independently. The reduce function, in contrast, maintains an internal state which is seeded from an initial value, passed to the function along with each value of the sequence, and returned as the final result.

Here are the formal definitions.

map ( function , sequence , [ sequence... ] ) → list: Create a new list from the results of applying the given function to the items of the given sequence. If more than one sequence is given, the function is called with multiple arguments, consisting of the corresponding item of each sequence. If any sequence is too short, None is used for the missing values. If the function is None, map will create tuples from corresponding items in each list, much like the zip function.
filter ( function , sequence ) → list: Return a list containing those items of sequence for which function( item ) is true. If function is None, return a list of items that are equivalent to True.
reduce ( function , sequence , [ initial ]) → value: Apply a function of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value. For example, reduce( lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty.
zip ( seq1 , [ seq2,... ] ) → [ ( seq1 [0], seq2 [0],...), ... ]: Return a list of tuples, where each tuple contains the matching element from each of the argument sequences. The returned list is truncated in length to the length of the shortest argument sequence.
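The following sketch exercises a few of the points in these definitions. It is written to run on either Python 2.6+ or Python 3: reduce is imported from functools, and each result is wrapped in list() because in Python 3 map, filter and zip return iterators rather than lists. The helper name add is our own.

```python
from functools import reduce

def add(x, y):
    return x + y

# map with two sequences: the function receives one item from each.
powers = list(map(pow, [2, 3, 4], [3, 2, 1]))

# filter with None keeps the items that are equivalent to True.
truthy = list(filter(None, [0, 1, "", "a", [], [2], None]))

# reduce without an initial value: ((((1+2)+3)+4)+5)
total = reduce(add, [1, 2, 3, 4, 5])

# reduce with an initial value, which is the result for an empty sequence.
empty_total = reduce(add, [], 0)

# zip truncates to the shortest argument sequence.
pairs = list(zip("abc", [1, 2, 3, 4]))

print(powers)
print(truthy)
print(total)
print(empty_total)
print(pairs)
```

The None-padding behaviour of map described above is specific to Python 2, so it is left out of this sketch.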

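Costs and Benefits. What are the advantages? First, the functional version can be clearer: it's a single line of code that summarizes the processing. Second, and more important, Python can often execute the sequence processing functions faster than the equivalent explicit loop, because the looping happens inside the built-in function rather than in interpreted Python statements. The gain is usually largest when the function being applied is itself a built-in.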
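If you want to check the speed claim on your own machine, a rough sketch using the standard timeit module might look like this. The data and the function names with_loop and with_map are hypothetical. The timings depend heavily on the Python version and the hardware, so no particular result is claimed here.

```python
import timeit

data = [str(n) for n in range(1000)]   # hypothetical input: 1000 numeric strings

def with_loop():
    """Convert every string with an explicit for loop."""
    result = []
    for s in data:
        result.append(int(s))
    return result

def with_map():
    """Convert every string with map; list() makes this work on Python 3 too."""
    return list(map(int, data))

# Run each version 200 times and report the total elapsed seconds.
loop_time = timeit.timeit(with_loop, number=200)
map_time = timeit.timeit(with_map, number=200)
print("explicit loop: %.4f s" % loop_time)
print("map:           %.4f s" % map_time)
```

Measure with data and functions that resemble your real workload before drawing conclusions.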

You can see that map and filter are equivalent to simple list comprehensions. This gives you two ways to specify these operations, both of which have approximately equivalent performance. This means that map and filter aren't essential to Python, but they are widely used.
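The equivalence can be sketched side by side. The functions square and is_even are hypothetical examples of our own, and list() is used so the comparison also holds on Python 3, where map and filter return iterators.

```python
def square(x):
    return x * x

def is_even(x):
    return x % 2 == 0

numbers = range(10)

# map versus the equivalent list comprehension.
mapped = list(map(square, numbers))
mapped_lc = [square(x) for x in numbers]

# filter versus the equivalent list comprehension.
filtered = list(filter(is_even, numbers))
filtered_lc = [x for x in numbers if is_even(x)]

# A single comprehension can fold a map and a filter together.
combined = list(map(square, filter(is_even, numbers)))
combined_lc = [square(x) for x in numbers if is_even(x)]

print(mapped == mapped_lc)
print(filtered == filtered_lc)
print(combined == combined_lc)
```

Each comparison prints True.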

The reduce function is a bit of a problem. It can have remarkably bad performance if it is misused. Consequently, there is some debate about the value of having this function. We'll present the function along with the caveat that it can lead to remarkable slowness.
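One commonly cited way to misuse reduce is to build up a large list by repeated concatenation. This is offered as an illustration and is not necessarily the case the text has in mind. Each step copies everything accumulated so far, so the work grows roughly with the square of the total length. The names below are hypothetical.

```python
from functools import reduce

# 100 small lists that together hold the numbers 0 through 299.
chunks = [[n, n + 1, n + 2] for n in range(0, 300, 3)]

def concat(a, b):
    return a + b          # builds a brand-new list on every call

# Misuse: every step copies the whole accumulated list again.
flat_slow = reduce(concat, chunks, [])

# A single pass that extends one list in place avoids the repeated copying.
flat_fast = []
for chunk in chunks:
    flat_fast.extend(chunk)

print(flat_slow == flat_fast)
```

The two results are identical; only the amount of copying differs.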

The map Function. The map function transforms a sequence into another sequence by applying a function to each item in the sequence. The idea is to apply a mapping transformation to a sequence. This is a common design pattern within numerous kinds of programs. Generally, the transformation is the interesting part of the programming, and the loop is just another boring old loop.

The function call map( aFunction , aSequence ) behaves as if it had the following definition.

def map( aFunction, aSequence ):
    return [ aFunction(v) for v in aSequence ]

For example:

>>> map( int, [ "10", "12", "14", 3.1415926, 5L ] )
[10, 12, 14, 3, 5]

This applies the int function to each element of the input sequence (a list that contains some strings, a floating point value and a long integer value) to create the output sequence (a list of integers).

The function used in map can be a built-in function, or a user-defined function created with the def statement (see Chapter 9, Functions).

>>> def oddn(x):
...     return x*2+1
...
>>> map( oddn, range(6) )
[1, 3, 5, 7, 9, 11]

This example defines a function oddn, which creates an odd number from the input. The map function applies our oddn function to each value of the sequence created by range( 6 ). We get the first 6 odd numbers.

The filter Function. The filter function chooses elements from the input sequence where a supplied function is True. Elements for which the supplied function is False are discarded.

The function call filter( aFunction , aSequence ) behaves as if it had the following definition.

def filter( aFunction, aSequence ):
    return [ v for v in aSequence if aFunction(v) ]

For example:

>>> def gt2( a ):
...     return a > 2
...
>>> filter( gt2, range(8) )
[3, 4, 5, 6, 7]

This example uses a function, gt2, which returns True for inputs greater than 2. We create a range from 0 to 7, but only keep the values for which the filter function returns True.

Here's another example that keeps all numbers that are evenly divisible by 3.

>>> def div3( a ):
...     return a % 3 == 0
...
>>> filter( div3, range(10) )
[0, 3, 6, 9]

Our function, div3, returns True when the remainder of dividing a number by 3 is exactly 0. This will keep numbers that are evenly divisible by 3. We create a range, apply the function to each value in the range, and keep only the values where the filter is True.

The reduce Function. The reduce function can be used to implement the common spreadsheet functions that compute sums and products. This function works by seeding a result with an initial value (or, if none is supplied, with the first item of the sequence). It then calls the user-supplied function with this result and each remaining value of the sequence. This is a remarkably common pattern, but it can't be done as a list comprehension.

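The function call reduce( aFunction , aSequence [, init ] ) behaves as if it had the following definition.

def reduce( aFunction, aSequence, *init ):
    items= list( aSequence )
    if init:
        r= init[0]    # seed with the supplied initial value
    else:
        r, items= items[0], items[1:]    # otherwise seed with the first item
    for s in items:
        r= aFunction( r, s )
    return r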

The important thing to note is that the function is applied to the internal value, r, and each element of the list to compute a new internal value.
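To watch the internal value change, here is a small sketch in which the function passed to reduce records every call. The names traced_add and steps are hypothetical. The import makes the snippet work on Python 3, where reduce is no longer a built-in.

```python
from functools import reduce

steps = []   # one (r, s, new r) entry per call

def traced_add(r, s):
    """Add s to the running result r and remember the call."""
    result = r + s
    steps.append((r, s, result))
    return result

total = reduce(traced_add, [1, 2, 3, 4], 0)

for r, s, result in steps:
    print("r=%d  s=%d  ->  r=%d" % (r, s, result))
print(total)
```

Each line of output shows the previous internal value, the next item of the sequence, and the new internal value that is passed to the following call.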

For example:

>>> def add(a,b):
...     return a+b
...
>>> reduce( add, range(10) )
45

This expression computes the sum of the 10 numbers from zero through nine. The function we defined, add, adds the previous result and the next sequence value together.
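A product works the same way as a sum. The sketch below uses operator.mul and operator.add from the standard library in place of hand-written functions. The helper name product is our own. An initial value of 1 is passed because it is the identity for multiplication, so an empty sequence gives 1.

```python
from functools import reduce
import operator

def product(values):
    """Multiply all the values together; the empty product is 1."""
    return reduce(operator.mul, values, 1)

factorial_5 = product(range(1, 6))            # 1*2*3*4*5
empty_product = product([])
total = reduce(operator.add, range(10), 0)    # same sum as the add example

print(factorial_5)
print(empty_product)
print(total)
```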

Here's an interesting example that combines reduce and map. This uses two functions defined in earlier examples, add and oddn.

for i in range(10):
    sq=reduce( add, map(oddn, range(i)), 0 )
    print i, sq

Let's look at the evaluation from innermost to outermost. The range( i ) generates a list of numbers from 0 to i-1. The map applies the oddn function to create a sequence of the first i odd numbers, from 1 to 2i-1. The reduce then adds this sequence of odd numbers. Interestingly, these sums add to the square of i.
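The claim about squares is easy to check. This sketch repeats the add and oddn definitions from the earlier examples so that it stands alone, and it is written to run on both Python 2.6+ and Python 3.

```python
from functools import reduce

def add(a, b):
    return a + b

def oddn(x):
    return x*2 + 1

# Sum of the first i odd numbers, for i = 0 through 9.
sums = [reduce(add, map(oddn, range(i)), 0) for i in range(10)]
squares = [i * i for i in range(10)]

print(sums)
print(sums == squares)
```

The initial value of 0 matters here: when i is 0 the sequence is empty, and without the initial value reduce would raise an error.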

The zip Function. The zip function interleaves values from two or more sequences to create a new sequence. The new sequence is a sequence of tuples. Each item of a tuple is the corresponding value from each sequence.

>>> zip( range(5), range(1,20,2) )
[(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]

In this example, we zipped two sequences together. The first sequence was range(5), which has five values. The second sequence was range(1,20,2) which has 10 odd numbers from 1 to 19. Since zip truncates to the shorter list, we get five tuples, each of which has the matching values from both lists.
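A typical use of zip is pairing up two related lists. The sketch below uses hypothetical data. It also shows two things that go slightly beyond the text: passing the pairs to dict, and undoing a zip with zip(*pairs). The list() calls are there so the snippet also works on Python 3, where zip returns an iterator.

```python
names = ["red", "green", "blue"]
values = [255, 128, 0, 64]    # one extra item, which zip drops

# Pair each name with its value; the result has three tuples, not four.
pairs = list(zip(names, values))

# A list of 2-tuples can be turned directly into a dictionary.
lookup = dict(zip(names, values))

# zip(*pairs) reverses the operation, splitting the tuples apart again.
names_again, values_again = zip(*pairs)

print(pairs)
print(lookup["green"])
print(names_again)
print(values_again)
```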

The map function behaves a little like zip when there is no function provided, just sequences. However, map does not truncate; it fills the shorter list with None values.

>>> map( None, range(5), range(1,20,2) )
[(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (None, 11), (None, 13), (None, 15), 
(None, 17), (None, 19)]
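The map( None, ... ) form is specific to Python 2. In Python 3, map requires a callable and stops at the end of the shortest sequence. As a sketch of a portable alternative, the itertools module provides zip_longest (named izip_longest in Python 2.6 and 2.7), which pads the shorter sequence with None by default.

```python
try:
    from itertools import zip_longest                    # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest    # Python 2.6+

# Pads the shorter sequence with None, like map( None, ... ) in Python 2.
padded = list(zip_longest(range(5), range(1, 20, 2)))

# For comparison, zip truncates to the shorter sequence.
truncated = list(zip(range(5), range(1, 20, 2)))

print(padded)
print(truncated)
```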

