TypeError: get_params() missing 1 required positional argument: 'self'

2024-03-17

I am trying to use the scikit-learn package with python-3.4 to do a grid search:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model.logistic import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.grid_search import GridSearchCV
import pandas as pd
from sklearn.cross_validation import train_test_split
from sklearn.metrics import precision_score, recall_score, accuracy_score
from sklearn.preprocessing import LabelBinarizer
import numpy as np

pipeline = Pipeline([
    ('vect', TfidfVectorizer(stop_words='english')),
    ('clf', LogisticRegression)
])

parameters = {
    'vect__max_df': (0.25, 0.5, 0.75),
    'vect__stop_words': ('english', None),
    'vect__max_features': (2500, 5000, 10000, None),
    'vect__ngram_range': ((1, 1), (1, 2)),
    'vect__use_idf': (True, False),
    'vect__norm': ('l1', 'l2'),
    'clf__penalty': ('l1', 'l2'),
    'clf__C': (0.01, 0.1, 1, 10)
}

if __name__ == '__main__':
    grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1, verbose=1, scoring='accuracy', cv = 3)
    df = pd.read_csv('SMS Spam Collection/SMSSpamCollection', delimiter='\t', header=None)
    lb = LabelBinarizer()
    X, y = df[1], np.array([number[0] for number in lb.fit_transform(df[0])])
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    grid_search.fit(X_train, y_train)
    print('Best score: ', grid_search.best_score_)
    print('Best parameter set:')
    best_parameters = grid_search.best_estimator_.get_params()
    for param_name in sorted(best_parameters):
        print(param_name, best_parameters[param_name])

However, it does not run successfully. The error message is as follows:

Fitting 3 folds for each of 1536 candidates, totalling 4608 fits
Traceback (most recent call last):
  File "/home/xiangru/PycharmProjects/machine_learning_note_with_sklearn/grid search.py", line 36, in <module>
    grid_search.fit(X_train, y_train)
  File "/usr/local/lib/python3.4/dist-packages/sklearn/grid_search.py", line 732, in fit
    return self._fit(X, y, ParameterGrid(self.param_grid))
  File "/usr/local/lib/python3.4/dist-packages/sklearn/grid_search.py", line 493, in _fit
    base_estimator = clone(self.estimator)
  File "/usr/local/lib/python3.4/dist-packages/sklearn/base.py", line 47, in clone
    new_object_params[name] = clone(param, safe=False)
  File "/usr/local/lib/python3.4/dist-packages/sklearn/base.py", line 35, in clone
    return estimator_type([clone(e, safe=safe) for e in estimator])
  File "/usr/local/lib/python3.4/dist-packages/sklearn/base.py", line 35, in <listcomp>
    return estimator_type([clone(e, safe=safe) for e in estimator])
  File "/usr/local/lib/python3.4/dist-packages/sklearn/base.py", line 35, in clone
    return estimator_type([clone(e, safe=safe) for e in estimator])
  File "/usr/local/lib/python3.4/dist-packages/sklearn/base.py", line 35, in <listcomp>
    return estimator_type([clone(e, safe=safe) for e in estimator])
  File "/usr/local/lib/python3.4/dist-packages/sklearn/base.py", line 45, in clone
    new_object_params = estimator.get_params(deep=False)
TypeError: get_params() missing 1 required positional argument: 'self'

I have also tried using only

if __name__ == '__main__':
    pipeline.get_params()

and it gives the same error message. Does anyone know how to solve this problem?


This error is almost always misleading, and actually means that you're calling an instance method on the class, rather than the instance (like calling dict.keys() instead of d.keys() on a dict named d).*

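That is exactly what is going on here. The docs (http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html) imply that the best_estimator_ attribute, like the estimator parameter to the initializer, is not an estimator instance but an estimator type, and that "an object of that type is instantiated for each grid point."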

So, if you want to call methods, you have to construct an object of that type for some particular grid point.

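However, from a quick glance at the docs, if you are trying to get the params that were used for the particular instance of the best estimator that returned the best score, isn't that just going to be best_params_? (I apologize that this part is a bit of a guess...)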


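For the Pipeline call, you definitely have an instance there. The only documentation for that method (http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) is a param spec showing that it takes one optional argument, deep. But under the covers, it is probably forwarding the get_params() call to one of its attributes. And with ('clf', LogisticRegression), it looks like you are constructing the pipeline with the class LogisticRegression rather than an instance of that class, so if that is what the call ends up being forwarded to, that would explain the problem.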
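To illustrate the forwarding guess above, here is a minimal sketch that uses only the standard library. ToyPipeline and ToyClassifier are hypothetical stand-ins invented for this example; they are not scikit-learn's actual implementation and only mimic the idea of a container forwarding get_params() to its steps. If this is indeed what happens, the likely fix in the question's code would be to write ('clf', LogisticRegression()) with parentheses.

```python
class ToyClassifier:
    """Hypothetical stand-in for an estimator such as LogisticRegression."""

    def __init__(self, C=1.0):
        self.C = C

    def get_params(self, deep=True):
        return {"C": self.C}


class ToyPipeline:
    """Hypothetical stand-in for Pipeline (assumed behaviour, not the real code)."""

    def __init__(self, steps):
        self.steps = steps

    def get_params(self, deep=True):
        params = {"steps": self.steps}
        if deep:
            for name, step in self.steps:
                # Forward the call to each step. If `step` is a class rather
                # than an instance, nothing fills in `self` and this raises.
                for key, value in step.get_params(deep=True).items():
                    params["%s__%s" % (name, key)] = value
        return params


broken = ToyPipeline([("clf", ToyClassifier)])    # class object, no parentheses
fixed = ToyPipeline([("clf", ToyClassifier())])   # instance

try:
    broken.get_params()
    broken_error = None
except TypeError as exc:
    broken_error = str(exc)

fixed_params = fixed.get_params()

print("broken:", broken_error)
print("fixed: clf__C =", fixed_params["clf__C"])
```

The container itself is a proper instance in both cases; only the step it forwards to differs, which is why the error can surface from a call that looks correct at the call site.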


* The reason the error says "missing 1 required positional argument: 'self'" instead of "must be called on an instance" or something is that in Python, d.keys() is effectively turned into dict.keys(d), and it's perfectly legal (and sometimes useful) to call it that way explicitly, so Python can't really tell you that dict.keys() is illegal, just that it's missing the self argument.
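The footnote can be checked with a short sketch. Settings is a hypothetical class made up for this illustration. A class defined in Python is used instead of dict because built-in types may word the error differently.

```python
class Settings:
    """Hypothetical class with an ordinary instance method."""

    def __init__(self, C=1.0):
        self.C = C

    def get_params(self, deep=True):
        return {"C": self.C}


s = Settings(C=0.1)

# Normal call on an instance: Python supplies `s` as `self`.
bound_result = s.get_params()

# Explicit form: call through the class and pass the instance yourself.
explicit_result = Settings.get_params(s)

# Calling through the class with no instance leaves `self` unfilled.
try:
    Settings.get_params()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(bound_result == explicit_result)
print(error_message)
```

Because the explicit form is legal, Python can only report that an argument is missing, not that the method was called on the wrong kind of object.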
