Print the 10 most common words

This program tries to print the 10 most common words in a file, but I can't get it to print them.

from string import *

file = open('shakespeare.txt').read().lower().split()
number_of_words = 0
onlyOneWord = []

for i in file:
    if i in onlyOneWord: continue
    else: onlyOneWord.append(i)

lot_of_words = {}

for all_Words in onlyOneWord:
    all_Words = all_Words.strip(punctuation)
    number_of_words = 0
    for orignal_file in file:
        orignal_file = orignal_file.strip(punctuation)
        if all_Words == orignal_file:
            number_of_words += 1
    lot_of_words[all_Words] = number_of_words

for x,y in sorted(lot_of_words.items()):
    print(max(y))

Right now it prints the entire file.

I need it to print the 10 most common words in the format below, and to run much faster:

the:251 apple:234 etc.

Answer:

You can do this easily with collections.Counter.most_common. I also use str.translate to remove the punctuation.

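from collections import Counter
from string import punctuation

# Translation table that deletes every punctuation character
strip_punc = str.maketrans('', '', punctuation)

with open('shakespeare.txt') as f:
    # Lowercase, strip punctuation, split on whitespace, then count
    wordCount = Counter(f.read().lower().translate(strip_punc).split())

print(wordCount.most_common(10))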

This prints a list of tuples:

[('the', 251), ('apple', 100), ...] 
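
To get the word:count layout the question asks for, the tuples can be unpacked in a loop. Here is a minimal sketch of that, using the same Counter approach as above. The helper name `top_words` and the sample string are my own assumptions for illustration; it takes a string instead of a file so it can be tried without shakespeare.txt.

```python
from collections import Counter
from string import punctuation

# Translation table that deletes every ASCII punctuation character.
strip_punc = str.maketrans('', '', punctuation)

def top_words(text, n=10):
    """Hypothetical helper: return the n most common words as (word, count) tuples."""
    words = text.lower().translate(strip_punc).split()
    return Counter(words).most_common(n)

# Made-up sample text, only for demonstration.
sample = "The apple, the pear. The apple!"

for word, count in top_words(sample):
    print(f"{word}:{count}")
```

For the real file, something like `top_words(open('shakespeare.txt').read())` should work the same way.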

Edit: We can probably speed this up by using the same translate call that strips the punctuation to also change the case of the letters:

from string import punctuation, ascii_uppercase, ascii_lowercase 

strip_punc = str.maketrans(ascii_lowercase, ascii_uppercase, punctuation)
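
Put together, the single-table variant might look like the sketch below. Note that this table maps lowercase to uppercase, so the words come out in uppercase and the separate .lower() call is dropped. The function name `count_words_upper` and the sample text are assumptions of mine, and I have not timed it against the first version. Also keep in mind that the table only covers ASCII letters, whereas str.lower() handles Unicode too.

```python
from collections import Counter
from string import punctuation, ascii_uppercase, ascii_lowercase

# One table does both jobs: map a-z to A-Z and delete punctuation.
strip_punc = str.maketrans(ascii_lowercase, ascii_uppercase, punctuation)

def count_words_upper(text):
    """Hypothetical helper: count words after a single translate pass."""
    return Counter(text.translate(strip_punc).split())

# Made-up sample text, only for demonstration.
counts = count_words_upper("The apple, the pear. The apple!")
print(counts.most_common(2))
```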
