If you're trying to group consecutive lines of the file that share the same first two words, then this is a use case for itertools.groupby, e.g.:
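from itertools import groupby

with open('somefile') as fin:
    # Pair each non-blank line with its first two words (the grouping key).
    lines = ((line.split(None, 2)[:2], line) for line in fin if line.strip())
    for k, g in groupby(lines, lambda L: L[0]):
        # Collect the original lines of this consecutive run.
        lines = [el[1] for el in g]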
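Here k is the grouping key (up to the first two words of a line), and lines is the list of consecutive lines in the file that share that key.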
Example somefile input:
one two three four five
one two five six seven
three four something
three four something else
one two start of new one two block
The result of print(k, lines) inside the loop (print k, lines in Python 2):
['one', 'two'] ['one two three four five\n', 'one two five six seven\n']
['three', 'four'] ['three four something\n', 'three four something else\n']
['one', 'two'] ['one two start of new one two block\n']
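The output above can be reproduced without a file on disk. This is a minimal sketch, assuming the sample text is held in an in-memory io.StringIO object; the names SAMPLE and group_by_first_two are made up for this illustration. It also shows that groupby only merges consecutive lines, which is why the key ['one', 'two'] appears twice:

```python
import io
from itertools import groupby

# Hypothetical in-memory stand-in for 'somefile'.
SAMPLE = (
    "one two three four five\n"
    "one two five six seven\n"
    "three four something\n"
    "three four something else\n"
    "one two start of new one two block\n"
)


def group_by_first_two(fin):
    """Return (key, lines) pairs for runs of consecutive lines
    that share their first two words."""
    # Pair each non-blank line with its first two words.
    keyed = ((line.split(None, 2)[:2], line) for line in fin if line.strip())
    # Consume each group immediately; groupby invalidates it on the next step.
    return [(k, [el[1] for el in g]) for k, g in groupby(keyed, lambda L: L[0])]


if __name__ == "__main__":
    for k, lines in group_by_first_two(io.StringIO(SAMPLE)):
        print(k, lines)
```

If all lines with the same first two words should end up together regardless of position, the input would need to be sorted by that key first, since groupby never looks back at earlier runs.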
To exclude the first two words from each line in the list, use:
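from itertools import groupby

with open('somefile') as fin:
    # Split each non-blank line into [word1, word2, rest-of-line].
    lines = (line.split(None, 2) for line in fin if line.strip())
    for k, g in groupby(lines, lambda L: L[:2]):
        # el[2:] is empty for lines with fewer than three words, avoiding an IndexError.
        lines = [''.join(el[2:]) for el in g]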
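A quick way to check this variant, again assuming in-memory sample text in place of the file (strip_first_two is a hypothetical helper name, not part of itertools). Because split(None, 2) returns only two items for a two-word line, the sketch takes a slice of the remainder so that such lines yield an empty string instead of raising IndexError:

```python
import io
from itertools import groupby


def strip_first_two(fin):
    """Group consecutive lines by their first two words,
    keeping only the text that follows those words."""
    parts = (line.split(None, 2) for line in fin if line.strip())
    return [
        # el[2:] holds zero or one item, so join gives '' or the remainder.
        (k, ["".join(el[2:]) for el in g])
        for k, g in groupby(parts, lambda L: L[:2])
    ]


# Made-up sample: the second line has no text after its first two words.
sample = io.StringIO("one two three four five\none two\nthree four something\n")
for k, rest in strip_first_two(sample):
    print(k, rest)
```

The remainder keeps its trailing newline, just as the full lines do in the first version; call rstrip('\n') on it if that is not wanted.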