【LightGBM】Getting feature importance with feature_importance()

2023-05-16

Use the Booster.feature_importance() method to rank the features of a trained LightGBM model by importance.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# clf is the trained lightgbm.Booster; train_features is the list of feature names
feature_importance = pd.DataFrame()
feature_importance['fea_name'] = train_features
feature_importance['fea_imp'] = clf.feature_importance()
feature_importance = feature_importance.sort_values('fea_imp', ascending=False)

# (In the original run, fea_name held 14 aggregate features:
#  file_id_api_nunique, file_id_api_count, file_id_tid_max, file_id_tid_mean,
#  file_id_tid_min, file_id_tid_std, file_id_index_mean, file_id_tid_nunique,
#  file_id_index_nunique, file_id_index_std, file_id_index_max,
#  file_id_tid_count, file_id_index_count, file_id_index_min)
plt.figure(figsize=(20, 10), dpi=100)
ax = sns.barplot(x=feature_importance['fea_name'], y=feature_importance['fea_imp'])
# Rotate the tick labels rather than hard-coding them, so they always match the sorted data
plt.xticks(rotation=45, fontsize=15)
plt.yticks(fontsize=15)
plt.xlabel('fea_name', fontsize=18)
plt.ylabel('fea_imp', fontsize=18)
# plt.tight_layout()
plt.savefig('D:/A_graduation_project/pictures/2_baseline1/特征重要性')
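As an aside, LightGBM also ships its own plotting helper, lightgbm.plot_importance, which sorts the features and draws the bar chart in one call. A minimal sketch, assuming clf is the trained Booster from the snippet above:

import lightgbm as lgb
import matplotlib.pyplot as plt

# Built-in alternative to the manual pandas/seaborn plot above
ax = lgb.plot_importance(clf, importance_type='split', max_num_features=14, figsize=(20, 10))
plt.tight_layout()
plt.show()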

Official documentation

feature_importance(importance_type='split', iteration=None)
Get feature importances.

Parameters:
importance_type (str, optional (default='split')) – How the importance is calculated. If 'split', result contains numbers of times the feature is used in a model. If 'gain', result contains total gains of splits which use the feature.
iteration (int or None, optional (default=None)) – Limit number of iterations in the feature importance calculation. If None, if the best iteration exists, it is used; otherwise, all trees are used. If <= 0, all trees are used (no limits).
Returns:
result – Array with feature importances.
Return type:
numpy array
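To see the two importance_type modes side by side, here is a minimal self-contained sketch; the synthetic dataset and training parameters are illustrative assumptions, not from the original post:

import numpy as np
import lightgbm as lgb

# Illustrative synthetic data (assumption, not the post's dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.1 * X[:, 1] > 0).astype(int)

train_set = lgb.Dataset(X, label=y, feature_name=['f0', 'f1', 'f2', 'f3'])
booster = lgb.train({'objective': 'binary', 'verbose': -1}, train_set, num_boost_round=50)

# 'split': integer counts of how often each feature is used in a split
print(booster.feature_importance(importance_type='split'))
# 'gain': total split gain attributed to each feature (floats)
print(booster.feature_importance(importance_type='gain'))

'split' only counts how often a feature is used, so 'gain' usually gives a more faithful picture of how much each feature actually contributes to the model.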

————————————————
Source: https://blog.csdn.net/qq_41904729/article/details/117928981
