TypeError: ufunc 'add' did not contain a loop with signature matching types

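I am building a bag-of-words representation of a sentence. The words present in the sentence are then looked up in the file "vectors.txt" to get their embedding vectors. Once I have the vector for each word in the sentence, I average those vectors. Here is my code: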

import nltk
import numpy as np
from nltk import FreqDist
from nltk.corpus import brown

news = brown.words(categories='news')
news_sents = brown.sents(categories='news')

fdist = FreqDist(w.lower() for w in news)
vocabulary = [word for word, _ in fdist.most_common(10)]
num_sents = len(news_sents)

def averageEmbeddings(sentenceTokens, embeddingLookupTable):
    listOfEmb=[]
    for token in sentenceTokens:
        embedding = embeddingLookupTable[token]
        listOfEmb.append(embedding)
    return sum(np.asarray(listOfEmb)) / float(len(listOfEmb))

embeddingVectors = {}
with open("D:\\Embedding\\vectors.txt") as file:
    for line in file:
        (key, *val) = line.split()
        embeddingVectors[key] = val

for i in range(num_sents):
    features = {}
    for word in vocabulary:
        features[word] = int(word in news_sents[i])
    print(features)
    print(list(features.values()))
    sentenceTokens = []
    for key, value in features.items():
        if value == 1:
            sentenceTokens.append(key)
    sentenceTokens.remove(".")
    print(sentenceTokens)
    print(averageEmbeddings(sentenceTokens, embeddingVectors))

print(features.keys())

I don't know why, but I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-4-643ccd012438> in <module>()
     39     sentenceTokens.remove(".")
     40     print(sentenceTokens)
---> 41     print(averageEmbeddings(sentenceTokens, embeddingVectors))
     42
     43 print(features.keys())

<ipython-input-4-643ccd012438> in averageEmbeddings(sentenceTokens, embeddingLookupTable)
     18         listOfEmb.append(embedding)
     19
---> 20     return sum(np.asarray(listOfEmb)) / float(len(listOfEmb))
     21
     22 embeddingVectors = {}

TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U9') dtype('<U9') dtype('<U9')

P.S. The embedding vectors look like this:

the 0.011384 0.010512 -0.008450 -0.007628 0.000360 -0.010121 0.004674 -0.000076 

of 0.002954 0.004546 0.005513 -0.004026 0.002296 -0.016979 -0.011469 -0.009159

and 0.004691 -0.012989 -0.003122 0.004786 -0.002907 0.000526 -0.006146 -0.003058

one 0.014722 -0.000810 0.003737 -0.001110 -0.011229 0.001577 -0.007403 -0.005355

in -0.001046 -0.008302 0.010973 0.009608 0.009494 -0.008253 0.001744 0.003263

After switching to np.sum, I get this error instead:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-8a7edbb9d946> in <module>()
     40     sentenceTokens.remove(".")
     41     print(sentenceTokens)
---> 42     print(averageEmbeddings(sentenceTokens, embeddingVectors))
     43
     44 print(features.keys())

<ipython-input-13-8a7edbb9d946> in averageEmbeddings(sentenceTokens, embeddingLookupTable)
     18         listOfEmb.append(embedding)
     19
---> 20     return np.sum(np.asarray(listOfEmb)) / float(len(listOfEmb))
     21
     22 embeddingVectors = {}

C:\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py in sum(a, axis, dtype, out, keepdims)
   1829         else:
   1830             return _methods._sum(a, axis=axis, dtype=dtype,
-> 1831                                  out=out, keepdims=keepdims)
   1832
   1833

C:\Anaconda3\lib\site-packages\numpy\core\_methods.py in _sum(a, axis, dtype, out, keepdims)
     30
     31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
---> 32     return umr_sum(a, axis, dtype, out, keepdims)
     33
     34 def _prod(a, axis=None, dtype=None, out=None, keepdims=False):

TypeError: cannot perform reduce with flexible type

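Answer:

You have a NumPy array of strings, not floats. That is what dtype('<U9') means: a little-endian Unicode string of at most 9 characters. The values come from line.split(), which returns strings, and your code never converts them to numbers.

Try: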

return sum(np.asarray(listOfEmb, dtype=float)) / float(len(listOfEmb))
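To see the diagnosis and the fix side by side, here is a minimal sketch. The two rows are taken from the sample vectors in the question (shortened to three values), and the variable names are made up for illustration.

```python
import numpy as np

# Two rows as they would come out of line.split(): every value is still a string.
rows = [
    ["0.011384", "0.010512", "-0.008450"],
    ["0.002954", "0.004546", "0.005513"],
]

# Without an explicit dtype, NumPy keeps the values as text.
as_strings = np.asarray(rows)
print(as_strings.dtype.kind)  # 'U' stands for a Unicode string dtype

# Asking for floats parses each string into a number.
as_floats = np.asarray(rows, dtype=float)
print(as_floats.dtype)

# Arithmetic now works; this is the per-dimension average of the two rows.
column_means = as_floats.sum(axis=0) / len(rows)
print(column_means)
```

Checking `.dtype` on the array is usually the quickest way to confirm this kind of problem before changing any arithmetic.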

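However, you don't need NumPy here at all. Each embedding is a list of strings, so you can convert and average dimension by dimension in plain Python:

return [sum(float(x) for x in dim) / len(listOfEmb) for dim in zip(*listOfEmb)]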

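Or, if you really want to use NumPy (axis=0 averages across the words and keeps one value per embedding dimension):

return np.asarray(listOfEmb, dtype=float).mean(axis=0)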
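Another option is to convert the values once, while reading the file, so that every later step already works with numbers. The sketch below is an illustration under assumptions, not the asker's code: `load_embeddings` and `average_embeddings` are hypothetical names, the sample text stands in for vectors.txt, and silently skipping tokens that have no embedding is a design choice you may not want. Note that calling `.mean()` without an axis collapses everything into a single number, whereas `axis=0` returns one average per embedding dimension.

```python
import io
import numpy as np

# Two sample lines copied from the question; a real run would open vectors.txt.
SAMPLE = """\
the 0.011384 0.010512 -0.008450 -0.007628 0.000360 -0.010121 0.004674 -0.000076
of 0.002954 0.004546 0.005513 -0.004026 0.002296 -0.016979 -0.011469 -0.009159
"""

def load_embeddings(handle):
    """Map each word to a float array, converting while reading (hypothetical helper)."""
    table = {}
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        key, *values = parts
        table[key] = np.array(values, dtype=float)
    return table

def average_embeddings(tokens, table):
    """Per-dimension mean of the vectors for the tokens found in the table."""
    # Assumption: tokens without an embedding (such as ".") are ignored.
    vectors = [table[t] for t in tokens if t in table]
    if not vectors:
        raise ValueError("none of the tokens has an embedding")
    return np.mean(vectors, axis=0)

embeddings = load_embeddings(io.StringIO(SAMPLE))
avg = average_embeddings(["the", "of", "."], embeddings)
print(avg.shape)
print(avg)
```

With the lookup table holding float arrays, the explicit `sentenceTokens.remove(".")` step would no longer be needed in this version, because unknown tokens are filtered out inside the averaging function.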

Source: utcz.com/qa/409338.html
