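This code used to run without showing any errors.
It is a machine learning project for sentiment analysis. The code fits a logistic regression model on word counts: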
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

c = CountVectorizer(stop_words='english')

def text_fit(X, y, model, clf_model, coef_show=1):
    # Turn the raw text into a word-count matrix.
    X_c = model.fit_transform(X)
    print('# features: {}'.format(X_c.shape[1]))
    X_train, X_test, y_train, y_test = train_test_split(X_c, y, random_state=0)
    print('# train records: {}'.format(X_train.shape[0]))
    print('# test records: {}'.format(X_test.shape[0]))
    # Fit the classifier and report accuracy on the held-out split.
    clf = clf_model.fit(X_train, y_train)
    acc = clf.score(X_test, y_test)
    print('Model Accuracy: {}'.format(acc))

    if coef_show == 1:
        # Pair each vocabulary word with its learned coefficient.
        w = model.get_feature_names()
        coef = clf.coef_.tolist()[0]
        coeff_df = pd.DataFrame({'Word': w, 'Coefficient': coef})
        coeff_df = coeff_df.sort_values(['Coefficient', 'Word'], ascending=[0, 1])
        print('')
        print('-Top 20 positive-')
        print(coeff_df.head(20).to_string(index=False))
        print('')
        print('-Top 20 negative-')
        print(coeff_df.tail(20).to_string(index=False))

# X (review texts) and y (labels) are defined earlier in the project.
text_fit(X, y, c, LogisticRegression())
I deleted the project and created a new one, and the code worked. But a few days later it started showing the same error again.
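
The error message itself is not quoted above, so the following is only a guess at one possible cause. `CountVectorizer.get_feature_names()` was deprecated in scikit-learn 1.0 and removed in 1.2 in favor of `get_feature_names_out()`. If the environment picked up a newer scikit-learn at some point, that call would start failing even though the code had not changed. A minimal sketch of a version-tolerant lookup, assuming that is the failing line (the helper name `feature_names` and the toy documents are hypothetical, not part of the project):

```python
import sklearn
from sklearn.feature_extraction.text import CountVectorizer


def feature_names(vectorizer):
    """Return the fitted vocabulary as a list on old and new scikit-learn."""
    # get_feature_names_out() is available from scikit-learn 1.0 onward.
    if hasattr(vectorizer, "get_feature_names_out"):
        return vectorizer.get_feature_names_out().tolist()
    # Older releases only provide get_feature_names().
    return vectorizer.get_feature_names()


# Toy documents, invented for illustration only.
docs = ["wonderful film", "terrible film", "wonderful acting"]
vec = CountVectorizer(stop_words="english")
vec.fit(docs)

names = feature_names(vec)
print("scikit-learn version:", sklearn.__version__)
print(names)
```

Inside `text_fit`, this would mean writing `w = feature_names(model)` in place of `model.get_feature_names()`. Comparing the printed `sklearn.__version__` between the environment where the code works and the one where it fails would help confirm or rule out this guess.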