Use collections.Counter (https://docs.python.org/2/library/collections.html#collections.Counter).
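As a quick illustration of why Counter fits here, the sketch below counts a few made-up words and adds two counters together. The sample strings are placeholders and are not taken from the files in this answer.

```python
from collections import Counter

# Count words in a made-up sentence (illustrative data only).
counts = Counter("foo bar foo baz foo".split())
print(counts['foo'])      # 3
print(counts['missing'])  # 0 (missing keys count as zero, no KeyError)

# Adding two Counters sums the counts key by key.
combined = Counter(a=1, b=1) + Counter(b=1, c=1)
print(combined['b'])      # 2
```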
Example files:
/tmp/foo.txt
hello world
hello world
foo bar
foo bar baz
/tmp/bar.txt
hello world
hello world
foo bar
foo bar baz
foo foo foo
You can create a Counter for each file and then add them together!
from collections import Counter
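def word_count(filename):
    """Return a Counter of the words in the given file."""
    c = Counter()
    with open(filename, 'r') as f:
        for line in f:
            # split() with no argument splits on any whitespace and skips empty strings
            c.update(line.split())
    return c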
files = ['/tmp/foo.txt', '/tmp/bar.txt']
counters = [word_count(filename) for filename in files]
# counters content (example):
# [Counter({'world': 2, 'foo': 2, 'bar': 2, 'hello': 2, 'baz': 1}),
# Counter({'foo': 5, 'world': 2, 'bar': 2, 'hello': 2, 'baz': 1})]
# Add all the word counts together:
total = sum(counters, Counter()) # sum needs an empty counter to start with
# total content (example):
# Counter({'foo': 7, 'world': 4, 'bar': 4, 'hello': 4, 'baz': 2})
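If there are many files, sum() builds a new intermediate Counter at every step. A possible alternative, sketched here, is to merge into a single Counter in place with update(). The helper name merge_counts is hypothetical, and the two Counters are hard-coded stand-ins for the per-file results above so that the snippet runs on its own.

```python
from collections import Counter

def merge_counts(counters):
    """Merge an iterable of Counters into one (hypothetical helper)."""
    total = Counter()
    for c in counters:
        # Counter.update adds counts to existing ones rather than replacing them.
        total.update(c)
    return total

# Stand-ins for the per-file counters from the example above.
counters = [
    Counter({'hello': 2, 'world': 2, 'foo': 2, 'bar': 2, 'baz': 1}),
    Counter({'foo': 5, 'hello': 2, 'world': 2, 'bar': 2, 'baz': 1}),
]
total = merge_counts(counters)
print(total.most_common(1))  # [('foo', 7)]
```

For a handful of files the difference is negligible, so sum(counters, Counter()) remains the shorter choice.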