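Preface
Automated machine learning (AutoML) is now widely used for model building, experimentation, and production deployment across many (cross-)business scenarios.
Many open-source AutoML projects can be used out of the box; this article introduces several of the mainstream frameworks.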
- AutoGluon
- HyperGBM
- H2O AutoML
- LightAutoML
- FLAML
Note: all examples in this article use the tabular-playground-series-may-2021 dataset.
import pandas as pd

# Load the data; the first CSV column is used as the row index
train_data = pd.read_csv("../input/tabular-playground-series-may-2021/train.csv",index_col=0)
test_data = pd.read_csv("../input/tabular-playground-series-may-2021/test.csv",index_col=0)
target = 'target'  # name of the label column
reward_metric = 'auc'
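To compare the frameworks below on equal footing, one option is to score each framework's predicted class probabilities on a held-out part of train_data with a single metric. The following is a minimal sketch, assuming a multiclass target and multiclass log loss as the metric (neither is stated above). The function name multiclass_log_loss, the labels "A" and "B", and the toy probabilities are illustrative only.

```python
import math


def multiclass_log_loss(y_true, proba, labels, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of true labels
    proba:  list of rows, one probability per entry of `labels`
    labels: class labels, in the same order as the columns of `proba`
    """
    index = {label: i for i, label in enumerate(labels)}
    total = 0.0
    for truth, row in zip(y_true, proba):
        # Clip the probability to avoid log(0)
        p = min(max(row[index[truth]], eps), 1 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy data with two hypothetical classes
labels = ["A", "B"]
y_true = ["A", "B"]
uniform_score = multiclass_log_loss(y_true, [[0.5, 0.5], [0.5, 0.5]], labels)
perfect_score = multiclass_log_loss(y_true, [[1.0, 0.0], [0.0, 1.0]], labels)
```

Lower is better: a uniform guess over two classes scores log(2), and a perfect prediction scores close to zero.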
AutoGluon
AutoGluon: AutoML for Text, Image, and Tabular Data, by Amazon Web Services - Labs.
from autogluon.tabular import TabularPredictor

model_autogluon = TabularPredictor(label=target)
# time_limit is in seconds
model_autogluon.fit(train_data=train_data, time_limit=300)
# Show the trained models ranked by validation score
model_autogluon.leaderboard()
preds = model_autogluon.predict(test_data)
Run output:
HyperGBM
HyperGBM is a full-pipeline AutoML tool for tabular data, by DataCanvas.
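from hypergbm import make_experiment

# Build an experiment from a copy of the training data
exp = make_experiment(train_data.copy(),target=target)
# Run the experiment; the result is the trained estimator
estimator = exp.run()
preds = estimator.predict(test_data)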
Run output:
H2O AutoML
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML). by H2O.ai
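import h2o
from h2o.automl import H2OAutoML

h2o.init()  # start or connect to a local H2O cluster
h2o_train = h2o.H2OFrame(train_data.copy())
h2o_test = h2o.H2OFrame(test_data.copy())
# Treat the target as categorical so H2O runs classification
h2o_train[target] = h2o_train[target].asfactor()
# The id column is already the index (index_col=0), so every column except the target is a feature
feature_columns = [c for c in train_data.columns if c != target]
aml = H2OAutoML(max_runtime_secs=300)
aml.train(x=feature_columns,
          y=target,
          training_frame=h2o_train)
aml.leaderboard
# h2o_test is already an H2OFrame, so it is passed directly
preds = aml.predict(h2o_test)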
Run output:
LightAutoML
LightAutoML (LAMA) is an automatic model creation framework, by Sberbank AI Lab.
from lightautoml.automl.presets.tabular_presets import TabularAutoML
from lightautoml.tasks import Task

# timeout is in seconds
model_laml = TabularAutoML(task = Task('multiclass'), timeout = 300)
# fit_predict trains the model and returns out-of-fold predictions on the training data
oof_preds = model_laml.fit_predict(train_data=train_data, roles={'target': target})
preds = model_laml.predict(test_data)
Run output:
No output is produced by default.
FLAML
FLAML is a fast and lightweight AutoML library, by Microsoft.
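A minimal sketch of the same 300-second workflow with FLAML, assuming train_data, test_data and target are defined as at the top of this article. The helper names flaml_settings and run_flaml are hypothetical, and choosing log_loss as the metric is an assumption rather than a setting taken from this article. flaml is imported inside the function so that the helpers can be defined even where the package is not installed.

```python
def flaml_settings(time_budget=300):
    """Keyword arguments for flaml.AutoML.fit (illustrative values)."""
    return {
        "task": "classification",    # the target is treated as a class label
        "time_budget": time_budget,  # seconds, matching the other frameworks
        "metric": "log_loss",        # assumption: not specified in the article
    }


def run_flaml(train_data, test_data, target="target", time_budget=300):
    """Fit FLAML on a pandas DataFrame and predict labels for test_data."""
    from flaml import AutoML  # lazy import: only needed when actually training

    model_flaml = AutoML()
    # `dataframe` holds features and label together; `label` names the target column
    model_flaml.fit(dataframe=train_data, label=target,
                    **flaml_settings(time_budget))
    print(model_flaml.best_estimator)  # name of the best learner found
    return model_flaml.predict(test_data)
```

Calling preds = run_flaml(train_data, test_data, target) would then mirror the calls used for the other frameworks.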
Run output:
This article is still being updated.