KNN in Scikit-learn does not work with string values

I am using Scikit-learn to do K-nearest-neighbour classification:

from sklearn.neighbors import KNeighborsClassifier 

model=KNeighborsClassifier()

model.fit(train_input,train_labels)

If I print my data:

print("train_input:") 

print(train_input.iloc[0])

print("\n")

print("train_labels:")

print(train_labels.iloc[0])

I get this:

train_input: 

PassengerId 1

Pclass 3

Name Braund, Mr. Owen Harris

Sex male

Age 22

SibSp 1

Parch 0

Ticket A/5 21171

Fare 7.25

Cabin NaN

Embarked S

Name: 0, dtype: object

train_labels:

0

The code fails with this error:

ValueError        Traceback (most recent call last) 

<ipython-input-21-1f18eec1e602> in <module>()

63

64 model=KNeighborsClassifier()

---> 65 model.fit(train_input,train_labels)

ValueError: could not convert string to float: 'Q'

So, does the KNN algorithm not work with string values?

How can I modify my data so that it fits the KNN implementation in Scikit-learn?

Answer:

For nominal string features, consider one-hot encoding: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
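As a minimal sketch of one-hot encoding, the snippet below encodes two string columns shaped like `Sex` and `Embarked` from the question. The rows are made-up toy values rather than the real data set, and `handle_unknown="ignore"` is an optional choice made here, not something the answer prescribes.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy frame with invented values that mimic the Sex and Embarked columns.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "Q", "S"],
})

# Each distinct string becomes its own 0/1 column, so no artificial
# ordering is imposed on the categories.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(df[["Sex", "Embarked"]]).toarray()

print(encoder.categories_)  # categories found in each input column
print(encoded.shape)        # one row per sample, one column per category
```

The resulting numeric array can be combined with the other numeric columns before calling `model.fit`. Free-text columns such as `Name` or `Ticket` would produce roughly one column per row under this scheme, so they are usually dropped or transformed differently.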

For ordinal string features, consider label encoding (choosing an order that makes sense given your understanding of the feature): http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
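The following sketch shows one way to encode an ordinal string feature with an explicit order and then fit KNN on it. Note the swap: `LabelEncoder` assigns integer codes in sorted order of the values and is documented for target labels, so this example uses `OrdinalEncoder` with its `categories` parameter instead, which lets the order be spelled out. The `Deck` column, its values, and the labels are hypothetical and do not come from the question's data.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: "Deck" is an invented ordinal feature.
df = pd.DataFrame({
    "Deck": ["low", "high", "medium", "low", "high", "medium"],
    "Fare": [7.25, 71.28, 13.00, 8.05, 53.10, 26.00],
})
labels = [0, 1, 1, 0, 1, 0]

# State the order explicitly so that low < medium < high.
encoder = OrdinalEncoder(categories=[["low", "medium", "high"]])
df["Deck_code"] = encoder.fit_transform(df[["Deck"]]).ravel()

# Only numeric columns are passed to the classifier.
features = df[["Deck_code", "Fare"]]
model = KNeighborsClassifier(n_neighbors=3)
model.fit(features, labels)

print(df["Deck_code"].tolist())
print(model.predict(features))
```

Because KNN is distance based, columns on very different scales (such as `Fare` next to a 0 to 2 code) can dominate the distance, so scaling the features may also be worth considering.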

