共计 12768 个字符,预计需要花费 32 分钟才能阅读完成。
引言
functools
, itertools
, operator
是Python标准库为我们提供的支持函数式编程的三大模块,合理的使用这三个模块,我们可以写出更加简洁可读的Pythonic代码,接下来我们通过一些example来了解三大模块的使用。
functools的使用
functools是Python中很重要的模块,它提供了一些非常有用的高阶函数。高阶函数就是说一个可以接受函数作为参数或者以函数作为返回值的函数,因为Python中函数也是对象,因此很容易支持这样的函数式特性
partial
>>> from functools import partial
>>> basetwo = partial(int, base=2)
>>> basetwo('10010')
18
basetwo('10010')
实际上等价于调用int('10010', base=2)
,当函数的参数个数太多的时候,可以通过使用functools.partial来创建一个新的函数来简化逻辑从而增强代码的可读性,而partial内部实际上就是通过一个简单的闭包来实现的
def partial(func, *args, **keywords):
def newfunc(*fargs, **fkeywords):
newkeywords = keywords.copy()
newkeywords.update(fkeywords)
return func(*args, *fargs, **newkeywords)
newfunc.func = func
newfunc.args = args
newfunc.keywords = keywords
return newfunc
partialmethod
partialmethod和partial类似,但是对于绑定一个非对象自身的方法
的时候,这个时候就只能使用partialmethod了,我们通过下面这个例子来看一下两者的差异
>>> from functools import partial, partialmethod
>>> def standalone(self, a=1, b=2):
... "Standalone function"
... print(' called standalone with:', (self, a, b))
... if self is not None:
... print(' self.attr =', self.attr)
>>> class MyClass:
... "Demonstration class for functools"
... def __init__(self):
... self.attr = 'instance attribute'
... method1 = functools.partialmethod(standalone) # 使用partialmethod
... method2 = functools.partial(standalone) # 使用partial
>>> o = MyClass()
>>> o.method1()
called standalone with: (<__main__.MyClass object at 0x7f46d40cc550>, 1, 2)
self.attr = instance attribute
# 不能使用partial
>>> o.method2()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: standalone() missing 1 required positional argument: 'self'
singledispatch
虽然Python不支持同名方法允许有不同的参数类型,但是我们可以借用singledispatch来动态指定相应的方法所接收的参数类型
,而不用把参数判断放到方法内部去判断从而降低代码的可读性
from functools import singledispatch
class TestClass(object):
@singledispatch
def test_method(arg, verbose=False):
if verbose:
print("Let me just say,", end=" ")
print(arg)
@test_method.register(int)
def _(arg):
print("Strength in numbers, eh?", end=" ")
print(arg)
@test_method.register(list)
def _(arg):
print("Enumerate this:")
for i, elem in enumerate(arg):
print(i, elem)
if __name__ == '__main__':
TestClass.test_method(55555) # call @test_method.register(int)
TestClass.test_method([33, 22, 11]) # call @test_method.register(list)
TestClass.test_method('hello world', verbose=True) # call default
根据运行结果来解释一下上面这段代码,上面通过@test_method.register(int)和@test_method.register(list)指定当test_method的第一个参数为int或者list的时候,分别调用不同的方法来进行处理
Enumerate this:
0 33
1 22
2 11
Let me just say, hello world
wraps
装饰器会遗失被装饰函数的__name__和__doc__等属性,可以使用@wraps来恢复
>>> from functools import wraps
>>> def my_decorator(f):
... @wraps(f)
... def wrapper():
... """wrapper_doc"""
... print('Calling decorated function')
... return f()
... return wrapper
...
>>> @my_decorator
... def example():
... """example_doc"""
... print('Called example function')
...
>>> example.__name__
'example'
>>> example.__doc__
'example_doc'
# 尝试去掉@wraps(f)来看一下运行结果,example自身的__name__和__doc__都已经丧失了
>>> example.__name__
'wrapper'
>>> example.__doc__
'wrapper_doc'
我们也可以使用update_wrapper来改写
from itertools import update_wrapper
def g():
...
g = update_wrapper(g, f)
# 等价于
@wraps(f)
def g():
...
@wraps内部实际上就是基于update_wrapper来实现的
def wraps(wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES):
def decorator(wrapper):
return update_wrapper(wrapper, wrapped=wrapped...)
return decorator
total_ordering
Python2中可以通过自定义__cmp__的返回值0/-1/1来比较对象的大小,在Python3中废弃了__cmp__,但是我们可以通过total_ordering然后修改 __lt__() , __le__() , __gt__(), __ge__(), __eq__(), __ne__() 等魔术方法来自定义类的比较规则。p.s: 如果使用必须在类里面定义 __lt__() , __le__() , __gt__(), __ge__()中的一个,以及给类添加一个__eq__() 方法
import functools
@functools.total_ordering
class MyObject:
def __init__(self, val):
self.val = val
def __eq__(self, other):
print(' testing __eq__({}, {})'.format(
self.val, other.val))
return self.val == other.val
def __gt__(self, other):
print(' testing __gt__({}, {})'.format(
self.val, other.val))
return self.val > other.val
a = MyObject(1)
b = MyObject(2)
for expr in ['a < b', 'a <= b', 'a == b', 'a >= b', 'a > b']:
print('\n{:<6}:'.format(expr))
result = eval(expr)
print(' result of {}: {}'.format(expr, result))
下面是运行结果
a < b :
testing __gt__(1, 2)
testing __eq__(1, 2)
result of a < b: True
a <= b:
testing __gt__(1, 2)
result of a <= b: True
a == b:
testing __eq__(1, 2)
result of a == b: False
a >= b:
testing __gt__(1, 2)
testing __eq__(1, 2)
result of a >= b: False
a > b :
testing __gt__(1, 2)
result of a > b: False
itertools的使用
itertools为我们提供了非常有用的用于操作迭代对象的函数
无限迭代器
count
count(start=0, step=1) 会返回一个无限的整数iterator,每次增加1。可以选择提供起始编号,默认为0
>>> from itertools import count
>>> for i in zip(count(1), ['a', 'b', 'c']):
... print(i, end=' ')
...
(1, 'a') (2, 'b') (3, 'c')
cycle
cycle(iterable) 会把传入的一个序列无限重复下去,不过可以提供第二个参数就可以制定重复次数
>>> from itertools import cycle
>>> for i in zip(range(6), cycle(['a', 'b', 'c'])):
... print(i, end=' ')
...
(0, 'a') (1, 'b') (2, 'c') (3, 'a') (4, 'b') (5, 'c')
repeat
repeat(object[, times]) 返回一个元素无限重复下去的iterator,可以提供第二个参数就可以限定重复次数
>>> from itertools import repeat
>>> for i, s in zip(count(1), repeat('over-and-over', 5)):
... print(i, s)
...
1 over-and-over
2 over-and-over
3 over-and-over
4 over-and-over
5 over-and-over
Iterators terminating on the shortest input sequence
accumulate
accumulate(iterable[, func])
>>> from itertools import accumulate
>>> import operator
>>> list(accumulate([1, 2, 3, 4, 5], operator.add))
[1, 3, 6, 10, 15]
>>> list(accumulate([1, 2, 3, 4, 5], operator.mul))
[1, 2, 6, 24, 120]
chain
itertools.chain(*iterables)可以将多个iterable组合成一个iterator
>>> from itertools import chain
>>> list(chain([1, 2, 3], ['a', 'b', 'c']))
[1, 2, 3, 'a', 'b', 'c']
chain的实现原理如下
def chain(*iterables):
# chain('ABC', 'DEF') --> A B C D E F
for it in iterables:
for element in it:
yield element
chain.from_iterable
chain.from_iterable(iterable)和chain类似,但是只是接收单个iterable,然后将这个iterable中的元素组合成一个iterator
>>> from itertools import chain
>>> list(chain.from_iterable(['ABC', 'DEF']))
['A', 'B', 'C', 'D', 'E', 'F']
实现原理也和chain类似
def from_iterable(iterables):
# chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
for it in iterables:
for element in it:
yield element
compress
compress(data, selectors)接收两个iterable作为参数,只返回selectors中对应的元素为True的data,当data/selectors之一用尽时停止
>>> list(compress([1, 2, 3, 4, 5], [True, True, False, False, True]))
[1, 2, 5]
zip_longest
zip_longest(*iterables, fillvalue=None)和zip类似,但是zip的缺陷是iterable中的某一个元素被遍历完,整个遍历都会停止,具体差异请看下面这个例子
from itertools import zip_longest
r1 = range(3)
r2 = range(2)
print('zip stops early:')
print(list(zip(r1, r2)))
r1 = range(3)
r2 = range(2)
print('\nzip_longest processes all of the values:')
print(list(zip_longest(r1, r2)))
下面是输出结果
zip stops early:
[(0, 0), (1, 1)]
zip_longest processes all of the values:
[(0, 0), (1, 1), (2, None)]
islice
islice(iterable, stop) or islice(iterable, start, stop[, step]) 与Python的字符串和列表切片有一些类似,只是不能对start、start和step使用负值
>>> from itertools import islice
>>> for i in islice(range(100), 0, 100, 10):
... print(i, end=' ')
...
0 10 20 30 40 50 60 70 80 90
tee
tee(iterable, n=2) 返回n个独立的iterator,n默认为2
from itertools import islice, tee
r = islice(count(), 5)
i1, i2 = tee(r)
print('i1:', list(i1))
print('i2:', list(i2))
for i in r:
print(i, end=' ')
if i > 1:
break
下面是输出结果,注意tee(r)后,r作为iterator已经失效,所以for循环没有输出值
i1: [0, 1, 2, 3, 4]
i2: [0, 1, 2, 3, 4]
starmap
starmap(func, iterable)假设iterable将返回一个元组流,并使用这些元组作为参数调用func:
>>> from itertools import starmap
>>> import os
>>> iterator = starmap(os.path.join,
... [('/bin', 'python'), ('/usr', 'bin', 'java'),
... ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
>>> list(iterator)
['/bin/python', '/usr/bin/java', '/usr/bin/perl', '/usr/bin/ruby']
filterfalse
filterfalse(predicate, iterable) 与filter()相反,返回所有predicate返回False的元素
itertools.filterfalse(is_even, itertools.count()) =>
1, 3, 5, 7, 9, 11, 13, 15, ...
takewhile
takewhile(predicate, iterable) 只要predicate返回True,不停地返回iterable中的元素。一旦predicate返回False,iteration将结束
def less_than_10(x):
return x < 10
itertools.takewhile(less_than_10, itertools.count())
=> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
itertools.takewhile(is_even, itertools.count())
=> 0
dropwhile
dropwhile(predicate, iterable) 在predicate返回True时舍弃元素,然后返回其余迭代结果
itertools.dropwhile(less_than_10, itertools.count())
=> 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
itertools.dropwhile(is_even, itertools.count())
=> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
groupby
groupby(iterable, key=None) 把iterator中相邻的重复元素
挑出来放在一起。p.s: The input sequence needs to be sorted on the key value in order for the groupings to work out as expected.
- [k for k, g in groupby(‘AAAABBBCCDAABBB’)] –> A B C D A B
- [list(g) for k, g in groupby(‘AAAABBBCCD’)] –> AAAA BBB CC D
>>> import itertools
>>> for key, group in itertools.groupby('AAAABBBCCDAABBB'):
... print(key, list(group))
...
A ['A', 'A', 'A', 'A']
B ['B', 'B', 'B']
C ['C', 'C']
D ['D']
A ['A', 'A']
B ['B', 'B', 'B']
city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
('Anchorage', 'AK'), ('Nome', 'AK'),
('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
...
]
def get_state(city_state):
return city_state[1]
itertools.groupby(city_list, get_state) =>
('AL', iterator-1),
('AK', iterator-2),
('AZ', iterator-3), ...
iterator-1 => ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
iterator-2 => ('Anchorage', 'AK'), ('Nome', 'AK')
iterator-3 => ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
Combinatoric generators
product
product(*iterables, repeat=1)
- product(A, B) returns the same as ((x,y) for x in A for y in B)
- product(A, repeat=4) means the same as product(A, A, A, A)
from itertools import product
def show(iterable):
for i, item in enumerate(iterable, 1):
print(item, end=' ')
if (i % 3) == 0:
print()
print()
print('Repeat 2:\n')
show(product(range(3), repeat=2))
print('Repeat 3:\n')
show(product(range(3), repeat=3))
Repeat 2:
(0, 0) (0, 1) (0, 2)
(1, 0) (1, 1) (1, 2)
(2, 0) (2, 1) (2, 2)
Repeat 3:
(0, 0, 0) (0, 0, 1) (0, 0, 2)
(0, 1, 0) (0, 1, 1) (0, 1, 2)
(0, 2, 0) (0, 2, 1) (0, 2, 2)
(1, 0, 0) (1, 0, 1) (1, 0, 2)
(1, 1, 0) (1, 1, 1) (1, 1, 2)
(1, 2, 0) (1, 2, 1) (1, 2, 2)
(2, 0, 0) (2, 0, 1) (2, 0, 2)
(2, 1, 0) (2, 1, 1) (2, 1, 2)
(2, 2, 0) (2, 2, 1) (2, 2, 2)
permutations
permutations(iterable, r=None)返回长度为r的所有可能的组合
from itertools import permutations
def show(iterable):
first = None
for i, item in enumerate(iterable, 1):
if first != item[0]:
if first is not None:
print()
first = item[0]
print(''.join(item), end=' ')
print()
print('All permutations:\n')
show(permutations('abcd'))
print('\nPairs:\n')
show(permutations('abcd', r=2))
下面是输出结果
All permutations:
abcd abdc acbd acdb adbc adcb
bacd badc bcad bcda bdac bdca
cabd cadb cbad cbda cdab cdba
dabc dacb dbac dbca dcab dcba
Pairs:
ab ac ad
ba bc bd
ca cb cd
da db dc
combinations
combinations(iterable, r) 返回一个iterator,提供iterable中所有元素可能组合的r元组。每个元组中的元素保持与iterable返回的顺序相同。下面的实例中,不同于上面的permutations,a总是在bcd之前,b总是在cd之前,c总是在d之前
from itertools import combinations
def show(iterable):
first = None
for i, item in enumerate(iterable, 1):
if first != item[0]:
if first is not None:
print()
first = item[0]
print(''.join(item), end=' ')
print()
print('Unique pairs:\n')
show(combinations('abcd', r=2))
下面是输出结果
Unique pairs:
ab ac ad
bc bd
cd
combinations_with_replacement
combinations_with_replacement(iterable, r)函数放宽了一个不同的约束:元素可以在单个元组中重复,即可以出现aa/bb/cc/dd等组合
from itertools import combinations_with_replacement
def show(iterable):
first = None
for i, item in enumerate(iterable, 1):
if first != item[0]:
if first is not None:
print()
first = item[0]
print(''.join(item), end=' ')
print()
print('Unique pairs:\n')
show(combinations_with_replacement('abcd', r=2))
下面是输出结果
aa ab ac ad
bb bc bd
cc cd
dd
operator的使用
attrgetter
operator.attrgetter(attr)和operator.attrgetter(*attrs)
- After f = attrgetter(‘name’), the call f(b) returns b.name.
- After f = attrgetter(‘name’, ‘date’), the call f(b) returns (b.name, b.date).
- After f = attrgetter(‘name.first’, ‘name.last’), the call f(b) returns (b.name.first, b.name.last).
我们通过下面这个例子来了解一下itergetter的用法
>>> class Student:
... def __init__(self, name, grade, age):
... self.name = name
... self.grade = grade
... self.age = age
... def __repr__(self):
... return repr((self.name, self.grade, self.age))
>>> student_objects = [
... Student('john', 'A', 15),
... Student('jane', 'B', 12),
... Student('dave', 'B', 10),
... ]
>>> sorted(student_objects, key=lambda student: student.age) # 传统的lambda做法
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
>>> from operator import itemgetter, attrgetter
>>> sorted(student_objects, key=attrgetter('age'))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
# 但是如果像下面这样接受双重比较,Python脆弱的lambda就不适用了
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
attrgetter的实现原理
def attrgetter(*items):
if any(not isinstance(item, str) for item in items):
raise TypeError('attribute name must be a string')
if len(items) == 1:
attr = items[0]
def g(obj):
return resolve_attr(obj, attr)
else:
def g(obj):
return tuple(resolve_attr(obj, attr) for attr in items)
return g
def resolve_attr(obj, attr):
for name in attr.split("."):
obj = getattr(obj, name)
return obj
itemgetter
operator.itemgetter(item)和operator.itemgetter(*items)
- After f = itemgetter(2), the call f(r) returns r[2].
- After g = itemgetter(2, 5, 3), the call g(r) returns (r[2], r[5], r[3]).
我们通过下面这个例子来了解一下itergetter的用法
>>> student_tuples = [
... ('john', 'A', 15),
... ('jane', 'B', 12),
... ('dave', 'B', 10),
... ]
>>> sorted(student_tuples, key=lambda student: student[2]) # 传统的lambda做法
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
>>> from operator import attrgetter
>>> sorted(student_tuples, key=itemgetter(2))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
# 但是如果像下面这样接受双重比较,Python脆弱的lambda就不适用了
>>> sorted(student_tuples, key=itemgetter(1,2))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
itemgetter的实现原理
def itemgetter(*items):
if len(items) == 1:
item = items[0]
def g(obj):
return obj[item]
else:
def g(obj):
return tuple(obj[item] for item in items)
return g
methodcaller
operator.methodcaller(name[, args…])
- After f = methodcaller(‘name’), the call f(b) returns b.name().
- After f = methodcaller(‘name’, ‘foo’, bar=1), the call f(b) returns b.name(‘foo’, bar=1).
methodcaller的实现原理
def methodcaller(name, *args, **kwargs):
def caller(obj):
return getattr(obj, name)(*args, **kwargs)
return caller
References
转载自http://www.ziwenxie.site/2017/01/15/python-functools-itertools-operator/