场景：数字统计任务

Ethan 是一位 Python 初学者，在学习了如何用 Python 读取文件后，他想要做一个小练习：统计某个文件中数字字符的数量：

def count_digits(fname):
    count = 0
    with open(fname) as file:
        for line in file:
            for s in line:
                if s.isdigit():
                    count += 1
    return count

首先用 with open(file_name) 上下文管理器语法获得一个文件对象，然后用 for 循环迭代它，逐行获取文件里的内容。它有两个好处：

with 上下文管理器会自动关闭文件描述符；
在迭代文件对象时，内容是一行一行返回的，不会占用太多内存；

可是，假如被读取的文件里没有任何换行符，程序遍历文件对象时就不知道该何时中断，最终只能一次性生成一个巨大的字符串对象，白白消耗大量时间和内存。我们可以调用更底层的 file.read(chunk_size) 方法，读取从当前游标位置往后 chunk_size 大小的文件内容，不必等待任何换行符出现：

def count_digits(fname):
    count = 0
    chunk_size = 1024 * 8
    with open(fname) as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            for s in chunk:
                if s.isdigit():
                    count += 1
    return count

新代码虽然解决了大文件读取时的性能问题，循环内的逻辑却变得更零碎了，如果使用 iter() 函数，我们可以进一步简化代码。以 iter(callable, sentinel) 的方式调用 iter() 函数时，会拿到一个特殊的迭代器对象，用循环遍历这个迭代器，会不断返回调用 callable() 的结果，假如结果等于 sentinel，迭代过程中止：

from functools import partial

def count_digits(fname):
    count = 0
    chunk_size = 1024 * 8
    with open(fname) as fp:
        # 使用 functools.partial 构造一个新的无需参数的函数
        _read = partial(fp.read, chunk_size)

        for chunk in iter(_read, ''):
            for s in chunk:
                if s.isdigit():
                    count += 1
    return count

上面的代码，数据生产和数据消费两个独立的逻辑被放在了同一个循环体内，耦合在了一起，解耦循环体生成器（迭代器）是首选：

from functools import partial

def read_file_digits(fp, chunk_size=1024 * 8):
    _read = partial(fp.read, block_size)
    for chunk in iter(_read, ''):
        for s in chunk:
            if s.isdigit():
                yield s

def count_digits(fname):
    count = 0
    with open(fname) as file:
        for num in read_file_digits(file):
            count += 1
    return count