LinearSVC does not provide predict_proba, but it does provide decision_function, which returns each sample's signed distance to the hyperplane.
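From the documentation:
decision_function(self, X):
Predict confidence scores for samples.
The confidence score for a sample is the signed distance of that sample to the hyperplane.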
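As a rough illustration of what those scores look like in the multiclass case, here is a minimal sketch. The dataset and all parameter values are assumptions chosen for the example. It shows that decision_function returns one column of scores per class, and that the highest-scoring column, mapped through classes_, matches what predict returns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic 3-class data; every parameter value here is an illustrative assumption.
X, y = make_classification(n_samples=200, n_features=8, n_informative=5,
                           n_classes=3, n_clusters_per_class=1,
                           random_state=0)

model = LinearSVC(random_state=0, max_iter=10000).fit(X, y)

# For multiclass problems there is one column of scores per class.
scores = model.decision_function(X)
print(scores.shape)  # (n_samples, n_classes)

# The highest-scoring column, mapped through classes_, is the predicted label.
top1 = model.classes_[scores.argmax(axis=1)]
print(np.array_equal(top1, model.predict(X)))
```

The scores are not probabilities: they are unbounded and do not sum to one, so they are best used for ranking classes rather than read as confidence percentages.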
Following @warped's comment, we can use the decision_function output to find the top n predicted classes from the model.
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=1000,
                           n_clusters_per_class=1,
                           n_informative=10,
                           n_classes=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=42)

clf = make_pipeline(StandardScaler(),
                    LinearSVC(random_state=0, tol=1e-5))
# Fit on the training split only, so the test rows are unseen at prediction time.
clf.fit(X_train, y_train)
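
top_n_classes = 2
# Sort each row of scores in ascending order, keep the last n column indices,
# then reverse so the best class comes first. The labels here are 0..4,
# so the column indices are the class labels.
predictions = clf.decision_function(
    X_test).argsort()[:, -top_n_classes:][:, ::-1]

pred_df = pd.DataFrame(predictions,
                       columns=[f'{i+1}_pred' for i in range(top_n_classes)])
df = pd.DataFrame({'true_class': y_test})
df = df.assign(**pred_df)
print(df)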
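
One caveat with the approach above: argsort returns column positions, which only coincide with the class labels when the labels are 0, 1, ..., n-1. For other labels, the positions need to be mapped through classes_. Below is a minimal sketch of that idea. The helper name top_n_labels and the string labels are hypothetical, introduced only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def top_n_labels(model, X, n=2):
    """Return the n highest-scoring class labels per sample, best first.

    Hypothetical helper (not part of scikit-learn); assumes a fitted
    multiclass model that exposes decision_function and classes_.
    """
    scores = model.decision_function(X)
    # Sort by descending score and keep the first n column indices.
    top_idx = np.argsort(-scores, axis=1, kind="stable")[:, :n]
    # Translate column positions into the actual class labels.
    return model.classes_[top_idx]


X, y_int = make_classification(n_samples=300, n_informative=10,
                               n_classes=5, n_clusters_per_class=1,
                               random_state=42)
# String labels (assumed for the example) show that positions are not labels.
names = np.array(["ant", "bee", "cat", "dog", "eel"])
y = names[y_int]

clf = make_pipeline(StandardScaler(), LinearSVC(random_state=0, max_iter=10000))
clf.fit(X, y)

top2 = top_n_labels(clf, X[:5], n=2)
print(top2)
```

The first column of the result should agree with clf.predict, since predict also picks the class with the highest decision score.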