## Sequence Processing Functions: `map`, `filter`, `reduce` and `zip`

The `map`, `filter`, and `reduce` built-in functions are handy functions for processing sequences. These owe much to the world of functional programming languages. The idea is to take a small function you write and apply it to all the elements of a sequence. This saves you writing an explicit loop. The implicit loop within each of these functions may be faster than an explicit for or while loop.

Additionally, each of these is a pure function, returning a result value. This allows the results of the functions to be combined into complex expressions relatively easily.

It is common to have multiple-step processes on `list`s of values. For instance, filtering a large set of data to locate useful samples, transforming those samples and then computing a sum or average. Rather than write three explicit loops, the result can be computed in a single expression.

Let's say we have two sequences and we have a multi-step process like this:

1. for i in seq1:.... apply function f1 and create sequence r1
2. for i in seq2:.... apply function f2 and create sequence r2
3. for i in len(r1):.... apply function f3 to r1[i] and r2[i]

Instead of these lengthy explicit loops, we can take a functional approach that look like this:

```f= reduce( f3, zip( map(f1,seq1), map(f2,seq2) ) )```

Where `f1`, `f2`, and `f3` are functions that define the body of the each of the above loops.

Definitions. These functions transform lists. The `map` and `filter` each apply some function to a sequence to create a new sequence. The `reduce` function applies a function which will reduce the sequence to a single value. The `zip` function interleaves values from lists to create a list of tuples.

The `map`, `zip` and `filter` functions have no internal state, they simply apply the function to each individual value of the sequence. The `reduce` function, in contrast, maintains an internal state which is seeded from an initial value, passed to the function along with each value of the sequence and returned as the final result.

Here are the formal definitions.

`map `( `function` , `sequence` , [ `sequence...` ] ) → list

Create a new `list` from the results of applying the given `function` to the items of the the given `sequence` . If more than one sequence is given, the function is called with multiple arguments, consisting of the corresponding item of each sequence. If any sequence is too short, `None` is used for missing value. If the `function` is `None`, `map` will create tuples from corresponding items in each list, much like the `zip` function.

`filter `( `function` , `sequence` ) → list

Return a `list` containing those items of `sequence` for which ```function ```( `item` ) is true. If `function` is `None`, return a `list` of items that are equivalent to `True`.

`reduce `( `function` , `sequence` , [ `initial` ]) → value

Apply a `function` of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value. For example, reduce( lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If `initial` is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty.

`zip `( `seq1` , [ `seq2,...` ] ) → [ ( `seq1` , `seq2` ,...), ... ]

Return a `list` of `tuple`s, where each `tuple` contains the matching element from each of the argument sequences. The returned `list` is truncated in length to the length of the shortest argument sequence.

Costs and Benefits. What are the advantages? First, the functional version can be clearer. It's a single line of code that summarizes the processing. Second, and more important, Python can execute the sequence processing functions far faster than the equivalent explicit loop.

You can see that `map` and `filter` are equivalent to simple list comprehensions. This gives you two ways to specify these operations, both of which have approximately equivalent performance. This means that `map` and `filter` aren't essential to Python, but hey are widely used.

The `reduce` function is a bit of a problem. It can have remarkably bad performance if it is misused. Consequently, there is some debate about the value of having this function. We'll present the function along with the caveat that it can lead to remarkable slowness.

The `map` Function. The `map` function transforms a sequence into another sequence by applying a function to each item in the sequence. The idea is to apply a mapping tranformation to a sequence. This is is a common design pattern within numerous kinds of programs. Generally, the transformation is the interesting part of the programming, and the loop is just another boring old loop.

The function call `map`( `aFunction` , `aSequence` ) behaves as if it had the following definition.

```def map( aFunction, aSequence ):
return [ aFunction(v) for v in aSequence ]
```

For example:

````>>> `

`map( int, [ "10", "12", "14", 3.1415926, 5L ] )`

`[10, 12, 14, 3, 5]`
```

This applies the `int` function to each element of the input sequence (a `list` that contains some `string`s, a floating point value and a long integer value) to create the output sequence (a `list` of integers).

The function used in `map` can be a built-in function, or a user-defined function created with the def statement (see Chapter 9, Functions ).

````>>>`
`def oddn(x):`

`...`
`    return x*2+1`

`...`
`>>>`
`map( oddn, range(6) )`

`[1, 3, 5, 7, 9, 11]`
```

This example defines a function `oddn`, which creates an odd number from the input. The `map` function applies our `oddn` function to each value of the sequence created by `range`( `6` ). We get the first 6 odd numbers.

The `filter` Function. The `filter` function chooses elements from the input sequence where a supplied function is `True`. Elements for which the supplied function is `False` are discarded.

The function call ```filter ```( `aFunction` , `aSequence` ) behaves as if it had the following definition.

```def filter( aFunction, aSequence ):
return [ v for v in aSequence if aFunction(v) ]```

For example:

````>>>`
`def gt2( a ):`

`...`
`    return a > 2`

`...`
`>>>`
`filter( gt2, range(8) )`

`[3, 4, 5, 6, 7]`
```

This example uses a function, `gt2`, which returns `True` for inputs greater than 2. We create a range from 0 to 7, but only keep the values for which the filter function returns `True`.

Here's another example that keeps all numbers that are evenly divisible by 3.

````>>>`
`def div3( a ):`

`...`
`    return a % 3 == 0`

`...`
`>>>`
`filter( div3, range(10) )`

`[0, 3, 6, 9]`
```

Our function, `div3`, returns `True` when the remainder of dividing a number by 3 is exactly 0. This will keep numbers that are evenly divisible by 3. We create a range, apply the function to each value in the range, and keep only the values where the filter is `True`.

The `reduce` Function. The `reduce` function can be used to implement the common spread-sheet functions that compute sums and products. This function works by seeding a result with an initial value. It then calls the user-supplied function with this result and each value of the sequence. This is remarkably common, but it can't be done as a list comprehension.

The function call ```reduce ```( `aFunction` , `aSequence ` , [, `init` ]) behaves as if it had the following definition.

```def reduce( aFunction, aSequence, init= 0 ):
r= init
for s in aSequence:
r= aFunction( r, s )
return r
```

The important thing to note is that the function is applied to the internal value, `r`, and each element of the list to compute a new internal value.

For example:

````>>>`
`def add(a,b):`

`...`
`    return a+b`

`...`
`>>>`
`reduce( add, range(10) )`

`45`
```

This expression computes the sum of the 10 numbers from zero through nine. The function we defined, `add`, adds the previous result and the next sequence value together.

Here's an interesting example that combines `reduce` and `map`. This uses two functions defined in earlier examples, `add` and `oddn`.

```for i in range(10):
sq=reduce( add, map(oddn, range(i)), 0 )
print i, sq
```

Let's look at the evaluation from innermost to outermost. The `range`( `i` ) generates a `list` of numbers from 0 to `i`-1. The `map` applies the `oddn` function form to create a sequence of `i` odd numbers from 1 to 2`i`+1. The `reduce` then adds this sequence of odd numbers. Interestingly, these sums add to the square of `i`.

The `zip` Function. The `zip` function interleaves values from two sequences to create a new sequence. The new sequence is a sequence of `tuple`s. Each item of a `tuple` is the corresponding values from from each sequence.

````>>>`
`zip( range(5), range(1,20,2) )`

`[(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]`
```

In this example, we zipped two sequences together. The first sequence was `range(5)`, which has five values. The second sequence was range(1,20,2) which has 10 odd numbers from 1 to 19. Since `zip` truncates to the shorter list, we get five tuples, each of which has the matching values from both lists.

The `map` function behaves a little like `zip` when there is no function provided, just sequences. However, `map` does not truncate, it fills the shorter list with `None` values.

````>>>`
`map( None, range(5), range(1,20,2) )`

```[(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (None, 11), (None, 13), (None, 15),
(None, 17), (None, 19)]```
``` Published under the terms of the Open Publication License Design by Interspire