How to normalize training and test data with sklearn's MinMaxScaler

Z时代
2024-01-10
Category: Q&A

I have had this doubt for a while and have been looking for an answer. The question is what happens when I use

import pandas as pd
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
df = pd.DataFrame({'A':[1,2,3,7,9,15,16,1,5,6,2,4,8,9],'B':[15,12,10,11,8,14,17,20,4,12,4,5,17,19],'C':['Y','Y','Y','Y','N','N','N','Y','N','Y','N','N','Y','Y']})
df[['A','B']] = min_max_scaler.fit_transform(df[['A','B']])
df['C'] = df['C'].apply(lambda x: 0 if x.strip()=='N' else 1)

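After this, I train and test a model (A and B as features, C as the label) and get some accuracy score. My doubt is what happens when I have to predict the labels for a new dataset. Say,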

data = pd.DataFrame({'A':[25,67,24,76,23],'B':[2,54,22,75,19]})

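When I normalize these columns, the values of A and B are scaled according to the new data, not the data the model was trained on. So my data after the preparation step below will be: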

data[['A','B']] = min_max_scaler.fit_transform(data[['A','B']])

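After this step the values of A and B are scaled with respect to the Min and Max of the new data, whereas the training data df[['A','B']] was scaled with respect to the Min and Max of df[['A','B']].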

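How can the data preparation be valid when it is based on different numbers? I don't understand how the prediction can be correct here.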

Answer:

Step 1: fit the scaler on the TRAINING data

Step 2: use the scaler to transform the TRAINING data

Step 3: use the transformed training data to fit the predictive model

Step 4: use the same scaler to transform the TEST data

Step 5: predict using the trained model (step 3) and the transformed TEST data (step 4).

import pandas as pd
from sklearn import preprocessing
from sklearn.svm import SVC  # assumption: the answer never defines `model`; SVC is a stand-in, any classifier works

min_max_scaler = preprocessing.MinMaxScaler()
model = SVC()

# training data
df = pd.DataFrame({'A':[1,2,3,7,9,15,16,1,5,6,2,4,8,9],'B':[15,12,10,11,8,14,17,20,4,12,4,5,17,19],'C':['Y','Y','Y','Y','N','N','N','Y','N','Y','N','N','Y','Y']})
# fit the scaler on the training data and transform it (steps 1 and 2)
df[['A','B']] = min_max_scaler.fit_transform(df[['A','B']])
df['C'] = df['C'].apply(lambda x: 0 if x.strip()=='N' else 1)
# fit the model on the scaled features and the labels (step 3)
model.fit(df[['A','B']], df['C'])
# after training the model on the transformed training data, define the testing data df_test
df_test = pd.DataFrame({'A':[25,67,24,76,23],'B':[2,54,22,75,19]})
# before predicting on the test data, ONLY APPLY the fitted scaler: transform, not fit_transform (step 4)
df_test[['A','B']] = min_max_scaler.transform(df_test[['A','B']])
# predict on the scaled test data (step 5)
y_predicted_from_model = model.predict(df_test[['A','B']])
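
To see why step 4 calls transform rather than fit_transform, here is a minimal sketch that compares the two on the new data from the question. The variable names (train, new, consistent, manual, refit) are made up for this illustration; data_min_ and data_max_ are the attributes in which a fitted MinMaxScaler stores the per-column minimum and maximum it learned.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Feature columns from the question: training data and the new data to predict on
train = pd.DataFrame({'A': [1, 2, 3, 7, 9, 15, 16, 1, 5, 6, 2, 4, 8, 9],
                      'B': [15, 12, 10, 11, 8, 14, 17, 20, 4, 12, 4, 5, 17, 19]})
new = pd.DataFrame({'A': [25, 67, 24, 76, 23], 'B': [2, 54, 22, 75, 19]})

# fit() learns the per-column min and max from the training data only
scaler = MinMaxScaler().fit(train)

# transform() reuses those training statistics on the new data
consistent = scaler.transform(new)

# The same result by hand: (x - train_min) / (train_max - train_min)
manual = (new.to_numpy() - scaler.data_min_) / (scaler.data_max_ - scaler.data_min_)

# A fresh fit_transform() relearns min and max from the new data instead,
# so the same raw value ends up at a different scaled value
refit = MinMaxScaler().fit_transform(new)

print(consistent[0])  # first new row, scaled with the training min/max
print(refit[0])       # same row, scaled with the new data's own min/max
```

Because the new values lie outside the training range, the consistently scaled values fall outside [0, 1]. That is expected: the model sees the new points on the same scale it was trained on. Recent scikit-learn versions also accept clip=True on MinMaxScaler if values must stay inside the feature range.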

A complete example using the iris dataset:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# load the data and hold out 30% of it as a test set
data = datasets.load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# fit the scaler on the training data only, then transform the training data
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

# train the model on the scaled training data
model = SVC()
model.fit(X_train_scaled, y_train)

# apply the already-fitted scaler to the test data, then predict
X_test_scaled = scaler.transform(X_test)
y_pred = model.predict(X_test_scaled)
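
As an optional variation that is not part of the original answer, the same five steps can be bundled with scikit-learn's make_pipeline, so the scaler is fitted only when fit is called and is merely applied when predict is called. A minimal sketch, assuming the same iris split as above; the names clf and accuracy are chosen for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# fit() fits the scaler on X_train, transforms X_train, then fits the SVC on the result
clf = make_pipeline(MinMaxScaler(), SVC())
clf.fit(X_train, y_train)

# predict() and score() only apply the already-fitted scaler before calling the SVC
y_pred = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)
print(accuracy)
```

The pipeline does not change the result compared with calling the scaler by hand. It only makes it harder to call fit_transform on the test data by accident.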

Hope this helps.

Source: utcz.com/qa/429563.html
