Print the 10 most common words
This program is supposed to print the 10 most common words in a file, but I can't get it to print them.
from string import *
file = open('shakespeare.txt').read().lower().split()
number_of_words = 0
onlyOneWord = []
for i in file:
    if i in onlyOneWord: continue
    else: onlyOneWord.append(i)
lot_of_words = {}
for all_Words in onlyOneWord:
    all_Words = all_Words.strip(punctuation)
    number_of_words = 0
    for orignal_file in file:
        orignal_file = orignal_file.strip(punctuation)
        if all_Words == orignal_file:
            number_of_words += 1
    lot_of_words[all_Words] = number_of_words
for x,y in sorted(lot_of_words.items()):
    print(max(y))
Right now it prints the entire file.
I need it to print the 10 most common words like this, and to run a lot faster:
the:251 apple:234 etc.
Answer:
You can do this easily with collections.Counter.most_common. I also use str.translate to strip the punctuation.
from collections import Counter
from string import punctuation
strip_punc = str.maketrans('', '', punctuation)
with open('shakespeare.txt') as f:
    wordCount = Counter(f.read().lower().translate(strip_punc).split())
print(wordCount.most_common(10))
This prints a list of tuples, for example [('the', 251), ('apple', 100), ...]
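If you want the word:count layout from the question rather than the raw list, you can loop over what most_common returns. A minimal sketch: the helper name top_words and the sample string are made up for illustration, and the string stands in for the contents of shakespeare.txt so the snippet runs on its own.

```python
from collections import Counter
from string import punctuation

# Translation table that deletes every ASCII punctuation character.
strip_punc = str.maketrans('', '', punctuation)


def top_words(text, n=10):
    """Return the n most common words in text as (word, count) tuples."""
    words = text.lower().translate(strip_punc).split()
    return Counter(words).most_common(n)


# Hypothetical sample text standing in for the file contents.
sample = "The apple, the pear; the APPLE! A pear? The end."

# Print each pair in the word:count form asked for in the question.
for word, count in top_words(sample, 3):
    print(f"{word}:{count}")
```

For the real file, you would pass f.read() from the with block to top_words instead of the sample string.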
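Edit: We can probably speed this up by letting the same translate call that strips the punctuation also change the case of the letters, which makes the separate .lower() call unnecessary. Note that this table maps lower case to upper case, so the words come out in upper case:
from string import punctuation, ascii_uppercase, ascii_lowercase
strip_punc = str.maketrans(ascii_lowercase, ascii_uppercase, punctuation)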
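Put together, the single-pass variant could look roughly like this. It is a sketch, not a benchmark: the names fold_and_strip and count_words and the sample string are assumptions for illustration, and whether it really beats .lower() plus translate on your file is worth checking with timeit.

```python
from collections import Counter
from string import punctuation, ascii_uppercase, ascii_lowercase

# One table does both jobs: it maps a-z to A-Z and deletes punctuation.
# Only ASCII letters are case-folded; other characters pass through unchanged.
fold_and_strip = str.maketrans(ascii_lowercase, ascii_uppercase, punctuation)


def count_words(text):
    """Count words in text using a single translate pass."""
    return Counter(text.translate(fold_and_strip).split())


# Hypothetical sample text standing in for the file contents.
sample = "The apple, the pear; the APPLE!"
print(count_words(sample).most_common(2))
# prints [('THE', 3), ('APPLE', 2)]
```

If you would rather keep the words in lower case, swap the first two arguments to str.maketrans.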