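The GridSearchCV documentation (http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html) says that I can pass a scoring function:
scoring : string, callable or None, default=None
I want to use the native accuracy_score (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) as the scoring function.
Here is my attempt. Imports and some data: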
import numpy as np
from sklearn.cross_validation import KFold, cross_val_score
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn import neighbors
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([0, 1, 0, 0, 0, 1])
Now, when I use only k-fold cross-validation without a scoring function, everything works as intended:
parameters = {
    'n_neighbors': [2, 3, 4],
    'weights': ['uniform', 'distance'],
    'p': [1, 2, 3]
}
model = neighbors.KNeighborsClassifier()
k_fold = KFold(len(Y), n_folds=6, shuffle=True, random_state=0)
clf = GridSearchCV(model, parameters, cv=k_fold) # TODO will change
clf.fit(X, Y)
print clf.best_score_
But when I change that line to
clf = GridSearchCV(model, parameters, cv=k_fold, scoring=accuracy_score) # or accuracy_score()
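I get the error: ValueError: Cannot have number of folds n_folds=10 greater than the number of samples: 6.
I don't think this message reflects the real problem.
I think the problem is that accuracy_score does not follow the signature scorer(estimator, X, y), which is what the documentation requires.
So how can I fix this?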
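For reference, here is a minimal sketch of the signature mismatch as I understand it. accuracy_score takes (y_true, y_pred), whereas a scorer is called as scorer(estimator, X, y). The wrapper name accuracy_scorer is hypothetical (my own, not part of scikit-learn), and I am assuming that make_scorer from sklearn.metrics produces a callable of the same shape. I have only called these directly on a fitted model here, not inside GridSearchCV.

```python
import numpy as np
from sklearn import neighbors
from sklearn.metrics import accuracy_score, make_scorer

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([0, 1, 0, 0, 0, 1])


def accuracy_scorer(estimator, X, y):
    """Hypothetical wrapper: adapts accuracy_score(y_true, y_pred)
    to the scorer(estimator, X, y) calling convention."""
    return accuracy_score(y, estimator.predict(X))


# Fit a single model on the full toy data, only to have something to score.
model = neighbors.KNeighborsClassifier(n_neighbors=3).fit(X, Y)

# Hand-written wrapper, called the way a scorer is called.
manual = accuracy_scorer(model, X, Y)

# Assumption: make_scorer(accuracy_score) yields an equivalent callable.
wrapped = make_scorer(accuracy_score)(model, X, Y)

print(manual, wrapped)
```

If this reading is right, then passing either accuracy_scorer or make_scorer(accuracy_score) as scoring should match the documented signature, while passing accuracy_score itself would not.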