Python zip Function Tutorial (Simple Examples)

2023-10-12

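zip() is a built-in Python function that takes iterables (such as lists or strings) and returns an iterator that aggregates their elements in parallel. It is most often called with two or more iterables.

This process of combining values is called "zipping", from the idea of zipping two separate collections of items together.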

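Table of Contents

1 Python zip() syntax and usage
- 1.1 Using the strict keyword
2 zip() with two lists
3 zip() with more than two lists
4 zip() with tuples
5 zip() with sets
6 zip() with dictionaries
7 zip() with strings
8 Zipping iterables of different lengths
9 An alternative to zip(): zip_longest()
10 zip() with the * operator (unzipping)
11 Common errors when using zip()
12 Memory optimization with zip()
13 zip() with lambda functions
14 Nested zip()
15 Real-world examples of using zip()
16 Resources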

Python zip() syntax and usage

The basic syntax of the zip() function in Python is as follows:


zip(*iterables, strict=False)

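Here *iterables can be zero or more iterables. The call returns an iterator of tuples: the first tuple contains the first item from each iterable, the second tuple contains the second item from each, and so on.

If the iterables have different lengths, the shortest one determines the length of the result, unless strict=True is passed.
Here is an example of basic usage: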


fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3]
result = zip(fruits, numbers)

# convert result to list
print(list(result))

Output:


[('apple', 1), ('banana', 2), ('cherry', 3)]

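In this example we have two lists, fruits and numbers. We pass both lists to zip(), which returns a zip object.

A zip object is an iterator of tuples: the first items of the arguments (lists, in this case) are paired together, then the second items, and so on.

Finally, we convert the zip object to a list of tuples.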
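Because zip() returns an iterator, its result can be consumed only once. The following short sketch reuses the fruits and numbers lists to illustrate this; the names first_pass and second_pass are chosen here purely for illustration.

```python
fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3]

result = zip(fruits, numbers)

# The first pass consumes the iterator and collects every pair.
first_pass = list(result)

# The iterator is now exhausted, so a second pass yields nothing.
second_pass = list(result)

print(first_pass)   # [('apple', 1), ('banana', 2), ('cherry', 3)]
print(second_pass)  # []

# Calling zip() with no arguments is allowed and gives an empty iterator.
print(list(zip()))  # []
```

If the pairs are needed more than once, store the list or call zip() again.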

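Using the strict keyword

In Python 3.10 and later, zip() accepts an additional keyword argument, strict.

If strict is True, zip() checks that all input iterables have the same length and raises a ValueError if they do not.

Here is an example of zip() with strict=True: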


numbers = [1, 2, 3]
letters = ['a', 'b']
try:
    zipped = zip(numbers, letters, strict=True)    
    zipped_list = list(zipped)
    print(zipped_list)
except ValueError as ve:
    print(f"Caught an exception: {ve}")

Output:


Caught an exception: zip() argument 2 is shorter than argument 1

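In this code, zip() raises a ValueError because the numbers list contains three items while the letters list contains only two, and strict is set to True. The check happens during iteration, so the error is raised by the list(zipped) call rather than by the zip() call itself.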

zip() with two lists

zip() is most often used with two lists, although it works with more. Here is an example of zip() with two lists:


fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]
zipped = zip(fruits, colors)
print(list(zipped))

Output:


[('Apple', 'Red'), ('Banana', 'Yellow'), ('Cherry', 'Red')]

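In this example we define two lists, fruits and colors, and use zip() to combine them into pairs.

zip() returns a zip object, an iterator of tuples. Each tuple consists of the elements at the same index in the input lists.

The first element of each tuple comes from the first list and the second element from the second list. Finally, we convert the zip object to a list so it is easier to read.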
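A common use of two zipped lists is building a dictionary, since dict() accepts an iterable of key-value pairs. A minimal sketch with the same fruits and colors lists (the name fruit_colors is illustrative):

```python
fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]

# Each (fruit, color) pair becomes one key-value entry.
fruit_colors = dict(zip(fruits, colors))

print(fruit_colors)  # {'Apple': 'Red', 'Banana': 'Yellow', 'Cherry': 'Red'}
```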

zip() with more than two lists

zip() accepts any number of iterables, and each resulting tuple has as many elements as there are iterables.


fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]
weights = [120, 150, 50]
result = zip(fruits, colors, weights)
print(list(result))

Output:


[('Apple', 'Red', 120), ('Banana', 'Yellow', 150), ('Cherry', 'Red', 50)]

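In this example we use zip() to group elements from three lists: fruits, colors, and weights.

zip() iterates over the lists in parallel, taking one element from each and grouping them into a tuple.

This repeats until the shortest list is exhausted. Here all lists have the same length, so every element is included.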
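In practice the tuples are often unpacked directly in a for loop instead of being converted to a list first. A short sketch using the same three lists (the descriptions list and its text format are illustrative):

```python
fruits = ["Apple", "Banana", "Cherry"]
colors = ["Red", "Yellow", "Red"]
weights = [120, 150, 50]

descriptions = []
# Each tuple produced by zip() is unpacked into three loop variables.
for fruit, color, weight in zip(fruits, colors, weights):
    descriptions.append(f"{fruit}: {color}, {weight}")

print(descriptions)  # ['Apple: Red, 120', 'Banana: Yellow, 150', 'Cherry: Red, 50']
```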

zip() with tuples

zip() also works with tuples. It pairs the elements at the same index of each tuple, just as it does with lists.


tuple1 = ("Apple", "Banana", "Cherry")
tuple2 = ("Red", "Yellow", "Red")
result = zip(tuple1, tuple2)
print(list(result))

Output:


[('Apple', 'Red'), ('Banana', 'Yellow'), ('Cherry', 'Red')]

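In this example we define two tuples, tuple1 and tuple2, and use zip() to create a zip object that pairs their corresponding elements. Converting it with list() produces a list of tuples.

zip() with sets

Python's zip() also works with sets. Keep in mind, however, that a set is an unordered collection of unique elements, so the order in which elements are paired can differ between runs, and duplicate values are stored only once.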


set1 = {"Apple", "Banana", "Cherry"}
set2 = {"Red", "Yellow", "Red"}
result = zip(set1, set2)
print(list(result))

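Output (the order may vary because sets are unordered):


[('Cherry', 'Red'), ('Banana', 'Yellow')]

In this example we define two sets, set1 and set2, and use zip() to pair their elements. Because the duplicate "Red" is stored only once, set2 has just two elements, so zip() stops after two pairs.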
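When a reproducible pairing is needed, one option is to sort each set before zipping. The sketch below is only an illustration: "Green" is a hypothetical third color added so that both sets have three elements, and the alphabetical pairing it produces carries no meaning beyond being deterministic.

```python
set1 = {"Apple", "Banana", "Cherry"}
set2 = {"Red", "Yellow", "Green"}  # "Green" is a hypothetical value for this sketch

# A set literal with a repeated value keeps only one copy of it.
print(len({"Red", "Yellow", "Red"}))  # 2

# sorted() returns lists in a fixed order, so the pairing no longer varies.
pairs = list(zip(sorted(set1), sorted(set2)))

print(pairs)  # [('Apple', 'Green'), ('Banana', 'Red'), ('Cherry', 'Yellow')]
```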

zip() with dictionaries

When zip() is used with dictionaries, it iterates over their keys by default:


dict1 = {"name": "Alice", "age": 25, "country": "USA"}
dict2 = {"name": "Bob", "age": 30, "country": "Canada"}
result = zip(dict1, dict2)
print(list(result))

Output:


[('name', 'name'), ('age', 'age'), ('country', 'country')]

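In this example we define two dictionaries, dict1 and dict2, and use zip() to create a zip object that pairs their corresponding keys, giving a list of tuples.

To zip the dictionaries' values instead, use the values() method: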


result_values = zip(dict1.values(), dict2.values())
print(list(result_values))

Output:


[('Alice', 'Bob'), (25, 30), ('USA', 'Canada')]

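Here zip() is applied to the values of dict1 and dict2, so the resulting list of tuples contains paired values from the two dictionaries. The pairing follows each dictionary's insertion order, which is the same for both dictionaries here.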
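The keys and the paired values can also be combined into a single dictionary. This sketch assumes, as in the example above, that both dictionaries have the same keys in the same insertion order; the name combined is illustrative.

```python
dict1 = {"name": "Alice", "age": 25, "country": "USA"}
dict2 = {"name": "Bob", "age": 30, "country": "Canada"}

# The inner zip pairs the values; the outer zip attaches dict1's keys to them.
combined = dict(zip(dict1, zip(dict1.values(), dict2.values())))

print(combined)
# {'name': ('Alice', 'Bob'), 'age': (25, 30), 'country': ('USA', 'Canada')}
```

If the key order might differ between the dictionaries, looking values up by key is safer than relying on position.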

zip() with strings

A string is an iterable of characters, so zip() pairs the characters at the same index of each string.


str1 = "ABC"
str2 = "123"
result = zip(str1, str2)
print(list(result))

Output:


[('A', '1'), ('B', '2'), ('C', '3')]

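In this example we define two strings, str1 and str2, and use zip() to create a zip object that pairs their corresponding characters.

Zipping iterables of different lengths

When zip() is used with iterables of different lengths, it stops producing tuples as soon as the shortest iterable is exhausted. Let's see this in action: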


# Define three lists of different lengths
list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c"]
list3 = [1.1, 2.2, 3.3, 4.4]

# Use zip function
result = zip(list1, list2, list3)

# Convert result to list
print(list(result))

Output:


[(1, 'a', 1.1), (2, 'b', 2.2), (3, 'c', 3.3)]

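When we call zip() on these three lists, it stops after the third tuple because list2, the shortest list, has only three elements. The resulting list therefore contains just three tuples.

An alternative to zip(): zip_longest()

As shown above, when the iterables have unequal lengths the shortest one determines the length of the result. But what if that is not what you want?

This is where the zip_longest() function from the itertools module comes in.

It works like zip(), but instead of stopping when the shortest iterable is exhausted, it continues until the longest one is exhausted and fills the missing values with None.

You can supply a different fill value with the fillvalue parameter.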


import itertools
list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c"]
result = itertools.zip_longest(list1, list2)
print(list(result))

Output:


[(1, 'a'), (2, 'b'), (3, 'c'), (4, None), (5, None)]

When we call itertools.zip_longest() on these lists, it keeps producing tuples after the third element, using None in place of the missing list2 values.
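The fillvalue parameter mentioned earlier replaces None with a value of your choice. A minimal sketch with the same two lists, using "?" as an arbitrary placeholder:

```python
from itertools import zip_longest

list1 = [1, 2, 3, 4, 5]
list2 = ["a", "b", "c"]

# Missing positions in the shorter list are filled with "?" instead of None.
result = list(zip_longest(list1, list2, fillvalue="?"))

print(result)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, '?'), (5, '?')]
```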

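zip() with the * operator (unzipping)

Python's zip() can be combined with the asterisk (*) operator, which unpacks an iterable into separate arguments. This is commonly used to "unzip" a list of tuples.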


pairs = [("a", 1), ("b", 2), ("c", 3)]
letters, numbers = zip(*pairs)
print(letters)
print(numbers)

Output:


('a', 'b', 'c')
(1, 2, 3)

In this example we use zip() with the * operator to unzip the list pairs into two tuples: letters, which contains the first element of every tuple, and numbers, which contains the second.
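Zipping and unzipping are inverse operations, which the following sketch checks with a round trip. It also shows one edge case worth knowing: unzipping an empty list yields nothing, so unpacking the result into two names would fail. The names rezipped and empty are illustrative.

```python
pairs = [("a", 1), ("b", 2), ("c", 3)]

# Unzip into two tuples, then zip them back together.
letters, numbers = zip(*pairs)
rezipped = list(zip(letters, numbers))

print(rezipped == pairs)  # True

# With no pairs, zip(*empty) is zip() with no arguments: an empty iterator.
# Writing "letters, numbers = zip(*empty)" would raise a ValueError.
empty = []
print(list(zip(*empty)))  # []
```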

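Common errors when using zip()

Errors can occur when zip() is called with the wrong kind of argument or used in a way it does not support. Here are some common ones:

TypeError: zip argument must support iteration

This error occurs when you call zip() with an argument that is not iterable.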


number = 5
list(zip(number))

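In recent Python versions this raises TypeError: 'int' object is not iterable; older versions word the message as zip argument #1 must support iteration.

Pass only iterable arguments to zip(), such as lists, tuples, sets, dictionaries, and strings.

TypeError: 'zip' object is not subscriptable

This error occurs when you try to index a zip object.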


list1 = [1, 2, 3]
list2 = ["a", "b", "c"]
zipped = zip(list1, list2)
print(zipped[0])

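This raises TypeError: 'zip' object is not subscriptable.

Zip objects are iterators and do not support indexing. To access a pair by index, first convert the zip object to a list or another subscriptable type: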


list1 = [1, 2, 3]
list2 = ["a", "b", "c"]
zipped = zip(list1, list2)
print(list(zipped)[0])
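When only one pair is needed, building the whole list can be avoided. As a sketch of one possible approach, next() returns the first pair, and itertools.islice() can skip ahead to a later one (the names first_pair and second_pair are illustrative):

```python
from itertools import islice

list1 = [1, 2, 3]
list2 = ["a", "b", "c"]

# next() pulls just the first pair from a fresh zip object.
first_pair = next(zip(list1, list2))

# islice(..., 1, None) skips one pair, so next() returns the second one.
second_pair = next(islice(zip(list1, list2), 1, None))

print(first_pair)   # (1, 'a')
print(second_pair)  # (2, 'b')
```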

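Memory optimization with zip()

One advantage of zip() is its memory efficiency, which comes from the fact that zip() returns an iterator rather than a list or other sequence.

The iterator produces one pair at a time, only when needed, instead of storing all pairs in memory at once.
With large datasets this can make a significant difference.
To illustrate, we use the sys module to check the memory usage of the lists and of the zip object.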


import sys

# Two large lists of numbers
list1 = list(range(1, 10000001))  # numbers from 1 to 10 million
list2 = list(range(10000001, 20000001))  # numbers from 10 million + 1 to 20 million

list_memory = sys.getsizeof(list1) + sys.getsizeof(list2)
zipped = zip(list1, list2)
zip_memory = sys.getsizeof(zipped)

print(f"Memory used by lists: {list_memory} bytes")
print(f"Memory used by zip object: {zip_memory} bytes")

Output:


Memory used by lists: 160000112 bytes
Memory used by zip object: 64 bytes

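The zip object is far smaller than the two lists, which illustrates how little memory zip() itself needs. Note that sys.getsizeof() is shallow: it does not count the integers the lists refer to, and the zip object only holds references to the existing lists rather than copying them. The saving is relative to building a new list of all the pairs, and exact byte counts depend on the Python version and platform.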
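A more direct comparison is between the lazy zip object and a list holding every pair. The sketch below uses smaller lists so that it runs quickly; sys.getsizeof() is shallow, so the figure for the list excludes the tuples it contains, and the exact numbers vary by Python version and platform.

```python
import sys

list1 = list(range(100_000))
list2 = list(range(100_000, 200_000))

# The zip object stores only references to its inputs.
lazy_pairs = zip(list1, list2)

# Materializing the pairs allocates a list with one slot per pair
# (plus one tuple per pair, which getsizeof() does not count here).
materialized_pairs = list(zip(list1, list2))

zip_size = sys.getsizeof(lazy_pairs)
list_size = sys.getsizeof(materialized_pairs)

print(f"zip object: {zip_size} bytes")
print(f"list of pairs: {list_size} bytes")
```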

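zip() with lambda functions

Let's look at an example that pairs the elements of two lists and then uses a lambda function to build a new list in which each element is the sum of a pair.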


list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9, 10]
pairs = zip(list1, list2)

# Use a lambda function to sum each pair
sums = list(map(lambda pair: pair[0] + pair[1], pairs))
print(sums)

Output:


[7, 9, 11, 13, 15]

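In this example we first use zip(list1, list2) to create a zip object that pairs the elements of list1 and list2.

We then use map() with the lambda function lambda pair: pair[0] + pair[1] to build the new list sums.

The lambda function takes a pair of numbers and returns their sum.
In this way, zip() pairs the elements of several iterables, and the lambda function processes those pairs compactly.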
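The same sums can be computed without a lambda. Two common alternatives, sketched here with illustrative variable names, are a list comprehension that unpacks each pair and map() called with operator.add and both lists:

```python
from operator import add

list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9, 10]

# Unpack each pair directly in a list comprehension.
sums_comprehension = [a + b for a, b in zip(list1, list2)]

# map() accepts several iterables and passes one element of each to add().
sums_map = list(map(add, list1, list2))

print(sums_comprehension)  # [7, 9, 11, 13, 15]
print(sums_map)            # [7, 9, 11, 13, 15]
```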

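Nested zip()

You can nest zip() calls to combine the elements of several iterables in more complex ways. Consider an example with two lists of tuples, where you want to group the first elements and the second elements of each list and then pair those groups across the two lists.
Here is the code:


# Two lists of tuples
list1 = [(1, 2), (3, 4), (5, 6)]
list2 = [('a', 'b'), ('c', 'd'), ('e', 'f')]

# Use nested zip to pair the unzipped columns of the two lists
pairs = list(zip(*[zip(*list1), zip(*list2)]))
print(pairs)

Output:


[((1, 3, 5), ('a', 'c', 'e')), ((2, 4, 6), ('b', 'd', 'f'))]

In this example, zip(*list1) and zip(*list2) first unzip the tuples in list1 and list2 into separate tuples of first elements and second elements.

Then zip(*[zip(*list1), zip(*list2)]) pairs the corresponding results of zip(*list1) and zip(*list2): the first elements of both lists are grouped together, followed by the second elements. To pair the original tuples themselves, a plain zip(list1, list2) is enough.
Note that heavy or deeply nested use of zip() can become hard to read and understand.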
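Another form of nesting, shown here as a sketch rather than as part of the original example, applies zip() once to walk the two lists in parallel and a second time inside each step to pair the elements of the two tuples. The name nested is illustrative.

```python
list1 = [(1, 2), (3, 4), (5, 6)]
list2 = [('a', 'b'), ('c', 'd'), ('e', 'f')]

# Outer zip: walk both lists together. Inner zip: pair the tuple elements.
nested = [tuple(zip(t1, t2)) for t1, t2 in zip(list1, list2)]

print(nested)
# [((1, 'a'), (2, 'b')), ((3, 'c'), (4, 'd')), ((5, 'e'), (6, 'f'))]
```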

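Real-world examples of using zip()

To get a better sense of how useful zip() is, let's look at some real-world examples where it comes in handy.

Pairing lines from two files

Suppose you have two text files, file1.txt and file2.txt, and you want to pair the corresponding lines of each file.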


with open('file1.txt', 'r') as file1, open('file2.txt', 'r') as file2:
    # Use zip to pair lines from the two files
    for line1, line2 in zip(file1, file2):
        print(line1.strip(), line2.strip())

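In this example, zip(file1, file2) pairs the corresponding lines of file1 and file2, and the loop prints each pair. If one file has more lines than the other, the extra lines are ignored.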
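The same pattern can be used to report which lines differ between two files. The sketch below uses io.StringIO objects with made-up contents as stand-ins for real files so that it runs on its own; with actual files you would pass the opened file objects instead.

```python
import io

# In-memory stand-ins for two text files (hypothetical contents).
file1 = io.StringIO("alpha\nbeta\ngamma\n")
file2 = io.StringIO("alpha\nBETA\ngamma\n")

# enumerate() numbers the line pairs produced by zip(), starting at 1.
differences = [
    (line_number, line1.strip(), line2.strip())
    for line_number, (line1, line2) in enumerate(zip(file1, file2), start=1)
    if line1 != line2
]

print(differences)  # [(2, 'beta', 'BETA')]
```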

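Separating features and labels

In machine learning you often work with datasets in which each instance is represented as a tuple (or list) of features and a label.

During preprocessing you may need to separate the features from the labels. zip() is well suited to this.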


dataset = [
    ((1.1, 2.2, 3.3), 0),
    ((4.4, 5.5, 6.6), 1),
    ((7.7, 8.8, 9.9), 0),
]

# Use zip to separate the features and labels
features, labels = zip(*dataset)

print("Features:")
for feature in features:
    print(feature)

print("\nLabels:")
for label in labels:
    print(label)

Output:


Features:
(1.1, 2.2, 3.3)
(4.4, 5.5, 6.6)
(7.7, 8.8, 9.9)

Labels:
0
1
0

In this example, zip(*dataset) splits the tuples in dataset into one tuple of features and one tuple of labels.
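Since zip(*dataset) produces tuples, code that expects lists may need a conversion. One possible approach, sketched here, is to apply list() to both results with map(); zipping the two lists back together restores the original structure (the name rebuilt is illustrative).

```python
dataset = [
    ((1.1, 2.2, 3.3), 0),
    ((4.4, 5.5, 6.6), 1),
    ((7.7, 8.8, 9.9), 0),
]

# map(list, ...) converts the feature tuple and the label tuple to lists.
features, labels = map(list, zip(*dataset))

print(labels)  # [0, 1, 0]

# Zipping the separated parts again gives back the original instances.
rebuilt = list(zip(features, labels))
print(rebuilt == dataset)  # True
```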

Resources

https://docs.python.org/3/library/functions.html#zip
