程序流程控制

条件语句、循环语句和推导式

作者

HPDell

1 条件语句

如果程序需要根据一定条件执行不同的操作，就需要使用条件语句

1.1 格式

if condition1:
    expression1
else
    expression0

if condition1:
    expression1
elif condition2:
    expression2
elif condition2:
    expression2
else
    expression0

if condition1:
    expression1
    if condition2:
        expression2
    else condition3:
        expression3
else
    expression0

condition 条件表达式的结果必须是 bool 类型
elif 可以有任意多个，也可以是 0 个
if-else 结构可以嵌套
注意保持缩进

1.1.1 案例：根据房龄进行分类标记

根据二手房的建成年份判断是不是 10 年以上的老房子。如果是，标记为 "aged10"；如果不是，标记为 "new"。

def is_old_property(built_year, current_year=2024):
    if (current_year - built_year) > 10:
        return "aged10"
    else:
        return "new"

is_old_property(2010)

'aged10'

根据二手房的建成年份进行房龄分级。如果是 20 年以上，标记为 "aged20"；如果是 10 年以上，标记为 "aged10"；如果 10 年以下，标记为 "new"。

. . .

def proeprty_age_label(built_year, current_year=2024):
    if (current_year - built_year) > 20:
        return "aged20"
    elif (current_year - built_year) > 10:
        return "aged10"
    else:
        return "new"

print(proeprty_age_label(2000))
print(proeprty_age_label(2010))
print(proeprty_age_label(2020))

aged20
aged10
new

1.1.2 案例：校验数据

上次我们定义了一个函数，根据二手房总数和每页显示的数量计算总页数。
def page_num(page_size, total):
    return total // page_size + int(total % page_size > 0)
但是有一个隐藏条件： page_size 的值不能是 0，而且一般也不能超过 20。修改这个函数，当条件不满足时，返回 0 。

提示

这里需要用到条件运算符 and or 等。

def page_num(page_size, total):
    if page_size > 0 and page_size <= 20:
        return total // page_size + int(total % page_size > 0)
    else:
        return 0

print(page_num(0, 100))
print(page_num(1000, 845))
print(page_num(20, 845))

0
0
43

注释

这里事实上使用 0 作为一种错误代码。一般情况下，page_num 为 0 ，表示没有页，属于一种错误情况。这种情况下，我们就可以停止爬虫。

1.2 缩进

Python 中通过缩进区分不同层级的代码。

: 后就表示开始一个新的层级
缩进可以使用空格（Space）或者制表符（Tab）

1.3 短路判断

当 if 语句中的条件由很多部分组成时，Python 一般从左到右逐个判断。但是当某一个语句的结果判断完成后，如果后面的判断无法改变这个结果，Python 就不再继续判断了。

当多个判断由 or 连接，其中一个是 True 时，触发短路，结果是 True
当多个判断由 and 连接，其中一个是 False 时，触发短路，结果是 False

这个特性可以保证可读性和正确性。

1.3.1 案例：空字符串

通常情况下，如果一个变量的值可以是 None，那么我们称这个变量是可空变量。但是对于字符串，不一定只有值是 None 的字符串才是空字符串。值是 "" 的字符串实际上也是空字符串。写一个函数，判断一个字符串实际上是不是“空”字符串。

def is_null_str(value):
    if (value is None) or (len(value) == 0):
        return True
    else:
        return False

print(is_null_str(None))
print(is_null_str(""))
print(is_null_str("abc"))

True
True
False

1.3.2 案例：非空字符串

同样的问题，但是写一个函数判断是不是非空字符串。

def not_null_str(value):
    if (value is not None) and (len(value) > 0):
        return True
    else:
        return False

print(not_null_str(None))
print(not_null_str(""))
print(not_null_str("abc"))

False
False
True

2 循环语句

如果要对一组变量执行同样的操作，可以使用循环语句。循环语句有两种，for 循环和 while 循环。

2.1 For 循环

For 循环又称“迭代”，是最常用的一种循环。

for item in iterable:
    # 对 item 执行一些操作

重要

item 是循环变量，依次从 iterable 中取一个值。
放在 iterable 位置上的一定是可以迭代的。
- 一般 list dict set tuple 都是可以迭代的。
- 函数 range 的返回值也是可以迭代的
- enumerate zip 的返回值也是可以迭代的

2.1.1 案例：批量房龄分级

现在有一批房子的建成年份，根据这些年份进行分级。

prop_years = [1998, 2005, 2007, 2012, 2016, 2019]

for item in prop_years:
    print(proeprty_age_label(item))

aged20
aged10
aged10
aged10
new
new

2.1.2 案例：查找因数

写一个函数，输出一个整数除了1和它本身以外的所有因数。

def fractions(value):
    for i in range(2, value):
        if value % i == 0:
            print(i)

fractions(24)

fractions(15)

3
5

2.2 While 循环

当循环范围无法确定，只能确定循环执行的条件，使用 while 循环。

while (condition):
    operations

当 condition 是 True 时，执行 operations 的操作；
每次执行完 operations 的操作，检查 condition 是否还满足；
- 如果 condition 仍然满足，再次执行 operations 的操作，然后再检查；
- 一旦 condition 不满足了，就不再执行 operations 的操作。

2.2.1 案例：阶乘

计算一个正整数的阶乘。

def factorial(x):
    result = 1
    while x > 0:
        result = x * result
        x = x - 1
    return result

factorial(5)

注释

避免了递归函数，提高了计算性能。
更便于读懂程序。

2.2.2 案例：级数

计算几何级数 \(y = \sum_{i=0}^\infty \frac{1}{2^i}\) 的数值，精度为 \(1 \times 10^{-6}\)。也就是说，当计算结果的变化不超过 \(1 \times 10^{-6}\) 时，停止计算。

def geometric_progression(base=2, eps=1e-6):
    result0 = 0
    result = 1
    power = 0
    while (result - result0) > eps:
        result0 = result
        power = power + 1
        result = result + 1.0 / (base ** power)
    return result

geometric_progression()

1.9999990463256836

2.3 循环嵌套

循环里面也可以写循环，但是要注意循环变量不要重复。这种循环被称为多重循环。

for i in range(0, 10):
    for j in range(0, 10):
        print((i, j))

多重循环有的时候不得不用，但是尽量减少使用。

Python 的效率并不是很高，循环嵌套层数太多影响效率。
循环层数越多，代码越难以维护和分析。

2.3.1 案例：查找质数

给定一个最大值，查找不大于该最大值的自然数中的质数。

def prime_number(upper):
    if upper <= 1:
        return []
    elif upper == 2:
        return [2]
    primes = []
    for i in range(2, upper + 1):
        has_factor = False
        for j in range(2, i):
            if i % j == 0:
                has_factor = True
        if not has_factor:
            primes = primes + [i]
    return primes

prime_number(20)

[2, 3, 5, 7, 11, 13, 17, 19]

prime_number(50)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

警告

当输入值越来越大时，得到结果的过程也会越来越慢。

2.4 中断和继续

在循环中，使用 break continue 两个关键词可以主动控制循环的走向。

break 使循环立刻终止，不再执行循环体内后面的语句，直接回到循环体外。
continue 跳过这一次循环中后面的语句，直接进入下一次循环。

通常和判断语句一起使用。

2.4.1 案例：有限制的级数计算

计算之前那个几何级数，但是要求迭代次数不超过 10 次，如果超过了，就直接返回当前的结果。

def geometric_progression2(base=2, eps=1e-6, max_iter=10):
    result0 = 0
    result = 1
    power = 0
    it = 0
    while (result - result0) > eps:
        result0 = result
        power = power + 1
        result = result + 1.0 / (base ** power)
        if it >= 10:
            break
    return result

geometric_progression2()

1.9999990463256836

2.4.2 案例：处理二手房均价

房地产交易平台上，通常只有别墅的房价是以总价的形式显示的，其他都是按单位面积均价显示。编写一个函数，将所有价格统一处理成单位面积均价的形式。房地产信息由字典给出。

def process_price(property_list):
    for item in property_list:
        if item['type'] != '别墅':
            continue
        item['price'] = item['price'] / item['size']

重要

continue 并不是总是非常必要，通过调整判断条件大多数情况下都可以避免。
一般只有当处理非常复杂、代码嵌套层级很多时，在处理开始前使用 continue，可以减少嵌套层级。

3 推导式

3.1 列表推导式

与循环语句类似，由一个叫“列表推导式”的语法，可以从一个列表得到另一个列表。也可以设置一些条件。

[opeartion(x) for x in iterable]
[opeartion(x) for x in iterable if condition]

上面的例子可以直接使用列表推导式

[proeprty_age_label(item) for item in prop_years]

['aged20', 'aged10', 'aged10', 'aged10', 'new', 'new']

重要

列表推导式非常重要！

3.1.1 优势

这种处理称为“向量化处理”，相比于循环语句，效率更高。
对于简单的处理，列表推导式更易读。
不仅可以做一维循环，也可以做多维循环。

3.1.2 劣势

多维循环时不易理解和阅读。
执行处理复杂时代码会比较长，不便于调试。

3.1.3 案例：根据坐标生成唯一标识

现在有一组坐标，精确到小数点后 6 位，按照之前的方法转换成唯一标识。

coords = [(112.2452, 20.45456), (112.456127, 20.145678), (112.976825, 30.142501)]

[f"{round(c[0], 6)}-{round(c[1], 6)}" for c in coords]

['112.2452-20.45456', '112.456127-20.145678', '112.976825-30.142501']

我们也可以将循环变量直接展开为两个元素，这种语法被称为“解包”。

[f"{round(lon, 6)}-{round(lat, 6)}" for lon, lat in coords]

['112.2452-20.45456', '112.456127-20.145678', '112.976825-30.142501']

3.1.4 案例：向量内积

给两个向量（默认为列向量），计算内积 \(\mathbf{x}^\mathrm{T} \mathbf{y}\)。如果 \(\mathbf{x}=(x_1,x_2,\cdots,x_n)\)， \(\mathbf{y}=(y_1,y_2,\cdots,y_n)\)，那么 \(\mathbf{x}^\mathrm{T} \mathbf{y} = \sum_{i=1}^n x_i y_i\)。

x = [1,2,3]
y = [4,5,6]

提示

使用 zip 可以将两个列表并列。

sum([xi * yi for xi, yi in zip(x, y)])

3.1.5 案例：莱布尼兹级数计算圆周率

莱布尼兹级数 \(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \cdots = \frac{\pi}{4}\) 。只计算前 100 项的和并乘以 4。

(1 + sum([((-1) ** i) / (2 * i + 1) for i in range(1, 100)])) * 4

3.131592903558553

3.1.6 案例：计算向量累积和

假设一个向量有 \(n\) 个元素，计算从第一个元素开始到每个元素位置的累积和。比如，\([1,2,3]\) 的累积和是 \([1,3,6]\)。

提示

解包时可以使用 _ 表示不需要的元素。
enumerate 可以获取元素的序号。

def accumulate(vec):
    return [sum(vec[0:(i+1)]) for i, _ in enumerate(vec)]

accumulate([1,2,3])

[1, 3, 6]

3.2 字典推导式

类似于列表推导式，字典推导式如下

{key_operation(x): value_operation(x) for x in iterable}
{key_operation(x): value_operation(x) for x in iterable if condition}

需要注意的是：

不论是 key_operation 还是 value_operation 都可以和 x 无关
key_operation(x) 的结果必须是一种“可哈希”的类型，包括 int float str bool tuple 以及其他所有具有自定义哈希方法的类型。

哈希的具体原理暂时不用深究，记住这几个类型即可。

3.2.1 案例：元组列表转字典

现在有一组元组构成的列表，元组有三个元素，分别表示小区名、小区均价和物业费。为了便于读取，将元组列表转换成字典，以小区名为键，值是字典 {'avg_price': price, 'management': fee}

communities = [('A', 1.2, 1.2), ('b', 2.3, 1.3)]

{x[0]: {'avg_price': x[1], 'management': x[2]} for x in communities}

{'A': {'avg_price': 1.2, 'management': 1.2},
 'b': {'avg_price': 2.3, 'management': 1.3}}

3.2.2 案例：字符串列表记录长度

将字符串列表转换为字典，键是字符串，值是长度。

tokens = [
    'abcdefg',
    '1927h3ksmfd',
    '19lpwwodksj'
]
{i: len(i) for i in tokens}

{'abcdefg': 7, '1927h3ksmfd': 11, '19lpwwodksj': 11}

聚合查询

实际中，我们会对数据进行分组，然后计算这个组的某个指标，比如数量、平均数等。

每个小区里面有多少二手房
每个省的基尼系数

3.3 其他推导式

3.3.1 集合推导式

{opeartion(x) for x in iterable}
{opeartion(x) for x in iterable if condition}

重复元素会自动被去除。

3.3.2 元组推导式

(opeartion(x) for x in iterable)
(opeartion(x) for x in iterable if condition)

元组推导式的结果并不是一个元组，而是一个类似于 range 的生成器，并不会立刻求值。

4 下节

自定义类

成员和方法
构造函数
属性
静态函数