python3 cookbook中常遇問題的解答記錄

將序列分解為單獨的變量

問題：現在有一個包含 N 個元素的元組或者是序列，怎樣將它里面的值解壓后同時賦值給 N 個變量？

解答：任何的序列（或者是可迭代對象）可以通過一個簡單的賦值操作來分解為單獨的變量。唯一的要求就是變量的總數和結構必須與序列相吻合。

代碼示例：（當前環境為python3.8版本）

1p = (1, 2)
 2x, y = p
 3
 4x   # 1
 5y   # 2
 6
 7data = [ 'ABC', 20, 51.1, (2023, 2, 5) ]
 8name, age, price, date = data
 9name  # 'ABC'
10date  #  (2023, 2, 5)

解壓可迭代對象賦值給多個變量

問：如果一個可迭代對象的元素個數超過變量個數時，會拋出一個 ValueError。那么怎樣才能從這個可迭代對象中解壓出 N 個元素出來？

解答：Python 的星號表達式可以用來解決這個問題。比如，你在學習一門課程，在學期末的時候，你想統計下家庭作業的平均成績，但是排除掉第一個和最后一個分數。如果只有四個分數，你可能就直接去簡單的手動賦值，但如果有 24 個呢？這時候星號表達式就派上用場了。

1def drop_first_last(grades):
2    first, *middle, last = grades
3    return avg(middle)

另外一種情況，假設你現在有一些用戶的記錄列表，每條記錄包含一個名字、郵件，接著就是不確定數量的電話號碼。你可以像下面這樣分解這些記錄：

1record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')
2name, email, *phone_numbers = record
3name  # 'Dave'
4email  # 'dave@example.com'
5phone_numbers   #  ['773-555-1212', '847-555-1212']

值得注意的是上面解壓出的 phone_numbers 變量永遠都是列表類型，不管解壓的電話號碼數量是多少（包括 0 個）。

查找最大或最小的 N 個元素

問題：怎樣從一個集合中獲得最大或者最小的 N 個元素列表？

解答：heapq 模塊有兩個函數：nlargest() 和 nsmallest() 可以完美解決這個問題。

1import heapq
2nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
3print(heapq.nlargest(3, nums)) # Prints [42, 37, 23]
4print(heapq.nsmallest(3, nums)) # Prints [-4, 1, 2]

兩個函數都能接受一個關鍵字參數，用于更復雜的數據結構中：

1portfolio = [
 2    {'name': 'IBM', 'shares': 100, 'price': 91.1},
 3    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
 4    {'name': 'FB', 'shares': 200, 'price': 21.09},
 5    {'name': 'HPQ', 'shares': 35, 'price': 31.75},
 6    {'name': 'YHOO', 'shares': 45, 'price': 16.35},
 7    {'name': 'ACME', 'shares': 75, 'price': 115.65}
 8]
 9cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
10cheap
11# [{'name': 'YHOO', 'shares': 45, 'price': 16.35}, {'name': 'FB', 'shares': 200, 'price': 21.09},{'name': 'HPQ', 'shares': 35, 'price': 31.75}]
12expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])
13expensive
14# [{'name': 'AAPL', 'shares': 50, 'price': 543.22},{'name': 'ACME', 'shares': 75, 'price': 115.65}, {'name': 'IBM', 'shares': 100, 'price': 91.1}]

字典的運算

問題：怎樣在數據字典中執行一些計算操作（比如求最小值、最大值、排序等等）？

解答：考慮下面的股票名和價格映射字典

1prices = {
2    'ACME': 45.23,
3    'AAPL': 612.78,
4    'IBM': 205.55,
5    'HPQ': 37.20,
6    'FB': 10.75
7}

為了對字典值執行計算操作，通常需要使用 zip() 函數先將鍵和值反轉過來。比如，下面是查找最小和最大股票價格和股票值的代碼：

1min_price = min(zip(prices.values(), prices.keys()))
2# min_price is (10.75, 'FB')
3max_price = max(zip(prices.values(), prices.keys()))
4# max_price is (612.78, 'AAPL')

類似的，可以使用 zip() 和 sor ted() 函數來排列字典數據：

1prices_sorted = sorted(zip(prices.values(), prices.keys()))
2# prices_sorted is [(10.75, 'FB'), (37.2, 'HPQ'),
3#                   (45.23, 'ACME'), (205.55, 'IBM'),
4#                   (612.78, 'AAPL')]

執行這些計算的時候，需要注意的是 zip() 函數創建的是一個只能訪問一次的迭代器。比如，下面的代碼就會產生錯誤：

1prices_and_names = zip(prices.values(), prices.keys())
2print(min(prices_and_names)) # (10.75, 'FB')
3print(max(prices_and_names)) # ValueError: max() arg is an empty sequence

查找兩字典的相同點

問題：怎樣在兩個字典中尋找相同點（比如相同的鍵、相同的值等等）？

解答：考慮下面兩個字典

1a = {
 2    'x' : 1,
 3    'y' : 2,
 4    'z' : 3
 5}
 6
 7b = {
 8    'w' : 10,
 9    'x' : 11,
10    'y' : 2
11}

為了尋找兩個字典的相同點，可以簡單的在兩字典的 keys() 或者 items() 方法返回結果上執行集合操作。比如：

1# Find keys in common
2a.keys() & b.keys() # { 'x', 'y' }
3# Find keys in a that are not in b
4a.keys() - b.keys() # { 'z' }
5# Find (key,value) pairs in common
6a.items() & b.items() # { ('y', 2) }

這些操作也可以用于修改或者過濾字典元素。比如，假如你想以現有字典構造一個排除幾個指定鍵的新字典。下面利用字典推導來實現這樣的需求：

1# Make a new dictionary with certain keys removed
2c = {key:a[key] for key in a.keys() - {'z', 'w'}}
3# c is {'x': 1, 'y': 2}

序列中出現次數最多的元素

問題：怎樣找出一個序列中出現次數最多的元素呢？

解答：collections.Counter 類就是專門為這類問題而設計的，它甚至有一個有用的 most_common() 方法直接給了你答案。

先假設你有一個單詞列表并且想找出哪個單詞出現頻率最高。你可以這樣做：

1words = [
 2    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
 3    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
 4    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
 5    'my', 'eyes', "you're", 'under'
 6]
 7from collections import Counter
 8word_counts = Counter(words)
 9# 出現頻率最高的3個單詞
10top_three = word_counts.most_common(3)
11print(top_three)
12# Outputs [('eyes', 8), ('the', 5), ('look', 4)]

通過某個字段將記錄分組

問題：你有一個字典或者實例的序列，然后你想根據某個特定的字段比如 date 來分組迭代訪問。

解答：itertools.groupby() 函數對于這樣的數據分組操作非常實用。假設你已經有了下列的字典列表：

1rows = [
 2    {'address': '5412 N CLARK', 'date': '07/01/2012'},
 3    {'address': '5148 N CLARK', 'date': '07/04/2012'},
 4    {'address': '5800 E 58TH', 'date': '07/02/2012'},
 5    {'address': '2122 N CLARK', 'date': '07/03/2012'},
 6    {'address': '5645 N RAVENSWOOD', 'date': '07/02/2012'},
 7    {'address': '1060 W ADDISON', 'date': '07/02/2012'},
 8    {'address': '4801 N BROADWAY', 'date': '07/01/2012'},
 9    {'address': '1039 W GRANVILLE', 'date': '07/04/2012'},
10]

現在假設你想在按 date 分組后的數據塊上進行迭代。為了這樣做，你首先需要按照指定的字段(這里就是date)排序，然后調用 itertools.groupby() 函數：

1from operator import itemgetter
 2from itertools import groupby
 3
 4# Sort by the desired field first
 5rows.sort(key=itemgetter('date'))
 6# Iterate in groups
 7for date, items in groupby(rows, key=itemgetter('date')):
 8    print(date)
 9    for i in items:
10        print(' ', i)

運行結果：

107/01/2012
 2  {'date': '07/01/2012', 'address': '5412 N CLARK'}
 3  {'date': '07/01/2012', 'address': '4801 N BROADWAY'}
 407/02/2012
 5  {'date': '07/02/2012', 'address': '5800 E 58TH'}
 6  {'date': '07/02/2012', 'address': '5645 N RAVENSWOOD'}
 7  {'date': '07/02/2012', 'address': '1060 W ADDISON'}
 807/03/2012
 9  {'date': '07/03/2012', 'address': '2122 N CLARK'}
1007/04/2012
11  {'date': '07/04/2012', 'address': '5148 N CLARK'}
12  {'date': '07/04/2012', 'address': '1039 W GRANVILLE'}

過濾序列元素

問題：你有一個數據序列，想利用一些規則從中提取出需要的值或者是縮短序列。

解答：最簡單的過濾序列元素的方法就是使用列表推導。比如：

1mylist = [1, 4, -5, 10, -7, 2, 3, -1]
2[n for n in mylist if n > 0]
3# [1, 4, 10, 2, 3]
4[n for n in mylist if n < 0]
5# [-5, -7, -1]

有時候，過濾規則比較復雜，不能簡單的在列表推導或者生成器表達式中表達出來。比如，假設過濾的時候需要處理一些異常或者其他復雜情況。這時候你可以將過濾代碼放到一個函數中，然后使用內建的 filter() 函數。示例如下：

1values = ['1', '2', '-3', '-', '4', 'N/A', '5']
 2def is_int(val):
 3    try:
 4        x = int(val)
 5        return True
 6    except ValueError:
 7        return False
 8ivals = list(filter(is_int, values))
 9print(ivals)
10# Outputs ['1', '2', '-3', '4', '5']

filter() 函數創建了一個迭代器，因此如果你想得到一個列表的話，就得像示例那樣使用 list() 去轉換。

轉換并同時計算數據

問題：你需要在數據序列上執行聚集函數（比如 sum() , min() , max()），但是首先你需要先轉換或者過濾數據

解答：一個非常優雅的方式去結合數據計算與轉換就是使用一個生成器表達式參數。比如，如果你想計算平方和，可以像下面這樣做：

1nums = [1, 2, 3, 4, 5]
2s = sum(x * x for x in nums)

下面是更多的例子：

1# Determine if any .py files exist in a directory
 2import os
 3files = os.listdir('dirname')
 4if any(name.endswith('.py') for name in files):
 5    print('There be python!')
 6else:
 7    print('Sorry, no python.')
 8# Output a tuple as CSV
 9s = ('ACME', 50, 123.45)
10print(','.join(str(x) for x in s))
11# Data reduction across fields of a data structure
12portfolio = [
13    {'name':'GOOG', 'shares': 50},
14    {'name':'YHOO', 'shares': 75},
15    {'name':'AOL', 'shares': 20},
16    {'name':'SCOX', 'shares': 65}
17]
18min_shares = min(s['shares'] for s in portfolio)

合并多個字典或映射

問題：現在有多個字典或者映射，你想將它們從邏輯上合并為一個單一的映射后執行某些操作，比如查找值或者檢查某些鍵是否存在。

解答：假如你有如下兩個字典

1a = {'x': 1, 'z': 3 }
2b = {'y': 2, 'z': 4 }

現在假設你必須在兩個字典中執行查找操作（比如先從 a 中找，如果找不到再在 b 中找）。一個非常簡單的解決方案就是使用 collections 模塊中的 ChainMap 類。比如：

1from collections import ChainMap
2c = ChainMap(a,b)
3print(c['x']) # Outputs 1 (from a)
4print(c['y']) # Outputs 2 (from b)
5print(c['z']) # Outputs 3 (from a)