Scikit-learn does not work with string values in KNN
I am using scikit-learn to do K-nearest-neighbour classification:
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier()
model.fit(train_input,train_labels)
If I print my data:
print("train_input:")
print(train_input.iloc[0])
print("\n")
print("train_labels:")
print(train_labels.iloc[0])
I get this:
train_input:
PassengerId 1
Pclass 3
Name Braund, Mr. Owen Harris
Sex male
Age 22
SibSp 1
Parch 0
Ticket A/5 21171
Fare 7.25
Cabin NaN
Embarked S
Name: 0, dtype: object
train_labels:
0
The code fails with this error:
ValueError Traceback (most recent call last)
<ipython-input-21-1f18eec1e602> in <module>()
63
64 model=KNeighborsClassifier()
---> 65 model.fit(train_input,train_labels)
ValueError: could not convert string to float: 'Q'
So, does the KNN algorithm not work with string values?
How can I modify my data so that it fits the KNN implementation in scikit-learn?
Answer:
For nominal string features, consider one-hot encoding: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html.
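As a rough illustration of the one-hot approach, the sketch below encodes an Embarked column and fits the classifier on the result. The four-row DataFrame is made up for the example and only mimics the shape of the data in the question; the choice of numeric columns and n_neighbors=3 are likewise assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OneHotEncoder

# Made-up sample rows (assumption): same column names as the question's data.
train_input = pd.DataFrame({
    "Pclass": [3, 1, 3, 2],
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Fare": [7.25, 71.28, 7.92, 13.0],
    "Embarked": ["S", "C", "Q", "S"],
})
train_labels = pd.Series([0, 1, 1, 0])

# One-hot encode the nominal string column; each category becomes a 0/1 column.
encoder = OneHotEncoder(handle_unknown="ignore")
embarked_encoded = encoder.fit_transform(train_input[["Embarked"]]).toarray()

# Stack the numeric columns next to the encoded ones.
numeric = train_input[["Pclass", "Age", "Fare"]].to_numpy()
features = np.hstack([numeric, embarked_encoded])

# The classifier now only sees numbers, so fit() no longer raises ValueError.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(features, train_labels)
predictions = model.predict(features)
```

Free-text columns such as Name or Ticket would probably need to be dropped or engineered separately, and missing values (the NaN in Cabin) have to be handled before fitting.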
For ordinal string features, consider label encoding (with an order that makes sense given your understanding of the feature): http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html.
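A minimal sketch of encoding an ordinal feature follows. Note that LabelEncoder assigns integers in sorted order of the labels, which may not match the order you have in mind, so this example swaps in OrdinalEncoder with an explicit categories list instead. The "Deck" column and its low/medium/high values are hypothetical and do not come from the question's data.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical ordinal feature (assumption): not a column in the question's data.
df = pd.DataFrame({"Deck": ["low", "high", "medium", "low"]})

# State the order explicitly, from lowest to highest.
encoder = OrdinalEncoder(categories=[["low", "medium", "high"]])

# fit_transform returns a 2-D float array; ravel() flattens it to one column.
df["Deck_encoded"] = encoder.fit_transform(df[["Deck"]]).ravel()
```

With this order, "low" maps to 0, "medium" to 1 and "high" to 2, so distances computed by KNN reflect the intended ranking.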