我正在尝试使用以下代码打印最常见的 10 个单词。但是,它不起作用。关于如何修复它有什么想法吗?
def reducer_count_words(self, word, counts):
# send all (num_occurrences, word) pairs to the same reducer.
# num_occurrences is so we can easily use Python's max() function.
yield None, (sum(counts), word)
# discard the key; it is just None
def reducer_find_max_10_words(self, _, word_count_pairs):
# each item of word_count_pairs is (count, word),
# so yielding one results in key=counts, value=word
tmp = sorted(word_count_pairs)[0:10]
yield tmp
Use collections.Counter https://docs.python.org/2/library/collections.html和它的most_common
method:
>>>from collections import Counter
>>>my_words = 'a a foo bar foo'
>>>Counter(my_words.split()).most_common()
[('foo', 2), ('a', 2), ('b', 1)]
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)