6.6. Filter
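Selects the elements of an iterable for which a function returns a true value
Returns an iterator (lazily evaluated)
Built-in
Syntax:
filter(callable, iterable)
callable
- required; function called with one element, returning a truthy or falsy value
iterable
- required; exactly one sequence or iterator object (unlike map(), filter() does not accept several iterables)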
>>> def even(x):
...     return x % 2 == 0
>>>
>>> result = (x for x in range(0,5) if even(x))
>>> result = filter(even, range(0,5))
>>> result = (x for x in range(0,5) if x%2==0)
>>> result = filter(lambda x: x%2==0, range(0,5))
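The four forms above all build a lazy stream of the same values. A minimal sketch that expands each one with `list()` to check that they agree (the variable names are illustrative only):

```python
def even(x):
    return x % 2 == 0


# Each expression is lazy; list() forces evaluation.
genexp_named = list(x for x in range(0, 5) if even(x))
filter_named = list(filter(even, range(0, 5)))
genexp_inline = list(x for x in range(0, 5) if x % 2 == 0)
filter_lambda = list(filter(lambda x: x % 2 == 0, range(0, 5)))

print(genexp_named)   # [0, 2, 4]
print(genexp_named == filter_named == genexp_inline == filter_lambda)   # True
```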
6.6.1. Problem
Plain code:
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = []
>>>
>>> for x in DATA:
...     if even(x):
...         result.append(x)
>>>
>>> print(result)
[2, 4, 6]
Comprehension:
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = [x for x in DATA if even(x)]
>>>
>>> print(result)
[2, 4, 6]
6.6.2. Solution
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, DATA)
>>>
>>> list(result)
[2, 4, 6]
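The object returned by `filter()` can be consumed only once. A short sketch of that behaviour, reusing the data above (`first_pass` and `second_pass` are illustrative names):

```python
def even(x):
    return x % 2 == 0


DATA = [1, 2, 3, 4, 5, 6]
result = filter(even, DATA)

first_pass = list(result)    # consumes the iterator
second_pass = list(result)   # nothing left to yield

print(first_pass)    # [2, 4, 6]
print(second_pass)   # []
```

If the values are needed more than once, keep the list rather than the filter object.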
6.6.3. Lazy Evaluation
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, DATA)
>>>
>>> next(result)
2
>>> next(result)
4
>>> next(result)
6
>>> next(result)
Traceback (most recent call last):
StopIteration
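Lazy evaluation also means the function is not called until a value is requested. A small sketch that records every call in a list (the `calls` list is an addition made only for this illustration):

```python
calls = []


def even(x):
    calls.append(x)          # record every value the function is asked about
    return x % 2 == 0


DATA = [1, 2, 3, 4, 5, 6]
result = filter(even, DATA)
calls_after_create = list(calls)   # nothing evaluated yet

first = next(result)               # tests 1 (rejected), then 2 (accepted)
calls_after_first = list(calls)

print(calls_after_create)   # []
print(first)                # 2
print(calls_after_first)    # [1, 2]
```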
6.6.4. Rationale
>>> people = [
...     {'age': 21, 'name': 'Pan Twardowski'},
...     {'age': 25, 'name': 'Mark Watney'},
...     {'age': 18, 'name': 'Melissa Lewis'}]
>>>
>>>
>>> def adult(person):
...     return person['age'] >= 21
>>>
>>>
>>> result = filter(adult, people)
>>> list(result)
[{'age': 21, 'name': 'Pan Twardowski'},
{'age': 25, 'name': 'Mark Watney'}]
>>> people = [
...     {'is_astronaut': False, 'name': 'Pan Twardowski'},
...     {'is_astronaut': True, 'name': 'Mark Watney'},
...     {'is_astronaut': True, 'name': 'Melissa Lewis'}]
>>>
>>>
>>> def astronaut(person):
...     return person['is_astronaut']
>>>
>>>
>>> result = filter(astronaut, people)
>>> list(result)
[{'is_astronaut': True, 'name': 'Mark Watney'},
{'is_astronaut': True, 'name': 'Melissa Lewis'}]
>>> astronauts = ['Mark Watney', 'Melissa Lewis']
>>> people = ['Mark Watney', 'Melissa Lewis', 'Jimenez']
>>>
>>>
>>> def is_astronaut(person):
...     return person in astronauts
>>>
>>>
>>> result = filter(is_astronaut, people)
>>> list(result)
['Mark Watney', 'Melissa Lewis']
6.6.5. Performance
>>> # %%timeit -r 10 -n 100_000
>>> # result = (x for x in range(0,5) if x%2==0)
>>> # 490 ns ± 44 ns per loop (mean ± std. dev. of 10 runs, 100000 loops each)
>>> # %%timeit -r 10 -n 100_000
>>> # result = filter(lambda x: x%2==0, range(0,5))
>>> # 384 ns ± 34.2 ns per loop (mean ± std. dev. of 10 runs, 100000 loops each)
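Both timed statements above only create a lazy object; nothing is consumed, so no element is ever tested. A rough sketch that times full consumption with `list()`, assuming the standard-library `timeit` module in place of the IPython `%%timeit` magic. The helper names `consume_genexp` and `consume_filter` are illustrative, and the numbers will vary by machine and Python version:

```python
from timeit import timeit


def consume_genexp():
    # Build the generator expression and pull every value out of it.
    return list(x for x in range(0, 5) if x % 2 == 0)


def consume_filter():
    # Build the filter object and pull every value out of it.
    return list(filter(lambda x: x % 2 == 0, range(0, 5)))


N = 100_000
t_genexp = timeit(consume_genexp, number=N)
t_filter = timeit(consume_filter, number=N)

print(f'genexp: {t_genexp / N * 1e9:.0f} ns per loop')
print(f'filter: {t_filter / N * 1e9:.0f} ns per loop')
```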
6.6.6. Assignments
"""
* Assignment: Idioms Filter Chain
* Complexity: easy
* Lines of code: 5 lines
* Time: 8 min
English:
1. Use generator expression to create `result`
2. Use `range()` to get numbers:
a. from 0 (inclusive)
b. to 10 (exclusive)
3. Use `filter()` to get odd numbers from `result`
(and assign to `result`)
4. Use `map()` to cube all numbers in `result`
5. Create `result: float` with arithmetic mean of `result`
6. Do not use `lambda` expressions
7. Note that you are working on a single data stream the whole time
8. Run doctests - all must succeed
Hints:
* convert to `list()` to expand the iterator before calculating the mean
* `mean = sum(...) / len(...)`
* TypeError: object of type 'map' has no len()
* ZeroDivisionError: division by zero
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from inspect import isfunction
>>> isfunction(odd)
True
>>> isfunction(cube)
True
>>> type(result) is float
True
>>> result
245.0
"""
def odd(x):
    return x % 2


def cube(x):
    return x ** 3
# Range numbers from 0 to 10 (exclusive)
# Filter odd numbers
# Cube result
# Calculate mean
# type: float
result = ...
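The two errors listed in the hints can be reproduced on unrelated data. A minimal sketch, assuming a hypothetical `double()` helper and a three-element list that are not part of the assignment:

```python
def double(x):
    return x * 2


stream = map(double, [1, 2, 3])

# map objects are iterators and have no length.
try:
    len(stream)
except TypeError as err:
    len_error = str(err)     # "object of type 'map' has no len()"

total = sum(stream)          # consumes the iterator: 2 + 4 + 6
leftover = list(stream)      # already exhausted, so this is empty
# total / len(leftover) would raise ZeroDivisionError at this point

# Expanding to a list first allows both sum() and len().
values = list(map(double, [1, 2, 3]))
mean = sum(values) / len(values)

print(mean)   # 4.0
```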