Huawei Cloud AI Development Platform ModelArts: LightGBM Regression
Overview
A wrapper around the LightGBM regressor in the mmlspark Python package.
Input
| Parameter | Sub-parameter | Description |
|---|---|---|
| inputs | dataframe | `inputs` is a dictionary; `dataframe` is a pyspark DataFrame object. |
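The DataFrame passed as `inputs["dataframe"]` presumably has to contain every column named by the `input_features_str` and `label_col` parameters described later on this page. The sketch below is a plain-Python pre-flight check under that assumption. `build_input_features_str` and `missing_columns` are hypothetical helper names, not part of ModelArts or mmlspark; the list of available columns stands in for the `columns` attribute of a pyspark DataFrame.

```python
from typing import Iterable, List


def build_input_features_str(columns: Iterable[str]) -> str:
    """Join column names into the comma-separated form used by input_features_str."""
    cleaned = [c.strip() for c in columns]
    if any(not c for c in cleaned):
        raise ValueError("column names must be non-empty")
    if any("," in c for c in cleaned):
        raise ValueError("column names must not contain commas")
    return ",".join(cleaned)


def missing_columns(input_features_str: str, label_col: str,
                    available: Iterable[str]) -> List[str]:
    """Return the requested feature/label columns that are absent from `available`."""
    wanted = [c for c in input_features_str.split(",") if c] + [label_col]
    present = set(available)
    return [c for c in wanted if c not in present]


features = build_input_features_str(["column_a", "column_b"])
print(features)  # column_a,column_b

# With a real DataFrame, pass `dataframe.columns` as the last argument.
print(missing_columns(features, "target", ["column_a", "target"]))  # ['column_b']
```

An empty result from `missing_columns` means every requested column is present.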
Output
A Spark pipeline model.
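A fitted Spark pipeline model is normally applied to new data with its `transform` method. The sketch below shows one way to do that; `score_with_pipeline` is a hypothetical helper, and it assumes the returned pipeline already contains whatever stages it needs to build the feature vector from the raw input columns.

```python
def score_with_pipeline(pipeline_model, dataframe, prediction_col="prediction"):
    """Apply a fitted Spark PipelineModel and return only the prediction column.

    Assumption: `dataframe` provides the same input columns that were used
    for training, and `prediction_col` matches the operator's prediction_col.
    """
    scored = pipeline_model.transform(dataframe)  # PipelineModel.transform returns a DataFrame
    return scored.select(prediction_col)          # keep only the prediction column
```

With the sample on this page, `pipeline_model` would be `lightgbm_regressor____id___.get_outputs()['output_port_1']`.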
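Parameters
| Parameter | Sub-parameter | Description |
|---|---|---|
| input_features_str | - | Comma-separated string of input column names, for example "column_a" or "column_a,column_b". |
| label_col | - | Target (label) column. |
| regressor_feature_vector_col | - | Name of the feature vector column that the operator takes as input. Default: "model_features". |
| prediction_col | - | Name of the column that receives the predicted label. Default: "prediction". |
| objective | - | Objective function. Default: "regression". |
| max_depth | - | Maximum tree depth. Default: -1 (no limit). |
| num_iteration | - | Number of iterations. Default: 100. |
| learning_rate | - | Learning rate. Default: 0.1. |
| num_leaves | - | Number of leaves. Default: 31. |
| max_bin | - | Maximum number of bins. Default: 255. |
| bagging_fraction | - | Bagging fraction. Default: 1.0. |
| bagging_freq | - | Bagging frequency. Default: 0. |
| bagging_seed | - | Random seed used for bagging. Default: 3. |
| early_stopping_round | - | Number of rounds for early stopping. Default: 0. |
| feature_fraction | - | Feature fraction. Default: 1.0. |
| min_sum_hessian_in_leaf | - | Minimum sum of Hessians in one leaf. Value range: [0, 1]. Default: 1e-3. |
| boost_from_average | - | Whether to set the initial score to the mean of the labels to speed up convergence. Default: True. |
| boosting_type | - | Boosting type. Options: gbdt, gbrt, rf, dart, goss. Default: gbdt. |
| lambda_l1 | - | L1 regularization coefficient. Default: 0.0. |
| lambda_l2 | - | L2 regularization coefficient. Default: 0.0. |
| num_batches | - | If greater than 0, the dataset is split into separate batches during training. Default: 0. |
| parallelism | - | Parallelization method for tree learning. Supported values: data_parallel, voting_parallel. Default: "data_parallel". |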
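Because the operator wraps mmlspark's LightGBM regressor, the snake_case parameters in the table above presumably end up as camelCase parameters of `LightGBMRegressor`. The sketch below shows one plausible mapping. The mapping is an assumption (the wrapper's source is not shown here), and `PARAM_TO_MMLSPARK` and `to_mmlspark_kwargs` are hypothetical names. The import location of `LightGBMRegressor` differs between mmlspark releases, so the estimator call is left as a comment.

```python
# Assumed correspondence between this operator's parameters and the
# camelCase parameters of mmlspark's LightGBMRegressor.
PARAM_TO_MMLSPARK = {
    "label_col": "labelCol",
    "regressor_feature_vector_col": "featuresCol",
    "prediction_col": "predictionCol",
    "objective": "objective",
    "max_depth": "maxDepth",
    "num_iteration": "numIterations",
    "learning_rate": "learningRate",
    "num_leaves": "numLeaves",
    "max_bin": "maxBin",
    "bagging_fraction": "baggingFraction",
    "bagging_freq": "baggingFreq",
    "bagging_seed": "baggingSeed",
    "early_stopping_round": "earlyStoppingRound",
    "feature_fraction": "featureFraction",
    "min_sum_hessian_in_leaf": "minSumHessianInLeaf",
    "boost_from_average": "boostFromAverage",
    "boosting_type": "boostingType",
    "lambda_l1": "lambdaL1",
    "lambda_l2": "lambdaL2",
    "num_batches": "numBatches",
    "parallelism": "parallelism",
}


def to_mmlspark_kwargs(params):
    """Translate operator-style parameters to estimator keyword arguments.

    Keys without an entry in PARAM_TO_MMLSPARK (for example the
    platform-specific "b_output_action") are silently dropped.
    """
    return {PARAM_TO_MMLSPARK[k]: v for k, v in params.items()
            if k in PARAM_TO_MMLSPARK}


kwargs = to_mmlspark_kwargs({
    "label_col": "target",
    "num_iteration": 100,
    "lambda_l1": 0.0,
    "b_output_action": True,
})
print(kwargs)  # {'labelCol': 'target', 'numIterations': 100, 'lambdaL1': 0.0}

# regressor = LightGBMRegressor(**kwargs)  # requires pyspark and mmlspark
```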
Example
inputs = {
    "dataframe": None # @input {"label":"dataframe","type":"DataFrame"}
}
params = {
    "inputs": inputs,
    "b_output_action": True,
    "outer_pipeline_stages": None,
    "input_features_str": "", # @param {"label":"input_features_str","type":"string","required":"false","helpTip":""}
    "label_col": "", # @param {"label":"label_col","type":"string","required":"true","helpTip":""}
    "regressor_feature_vector_col": "model_features", # @param {"label":"regressor_feature_vector_col","type":"string","required":"false","helpTip":""}
    "prediction_col": "prediction", # @param {"label":"prediction_col","type":"string","required":"false","helpTip":""}
    "objective": "regression", # @param {"label":"objective","type":"string","required":"false","helpTip":""}
    "max_depth": -1, # @param {"label":"max_depth","type":"integer","required":"false","range":"[-1,2147483647]","helpTip":""}
    "num_iteration": 100, # @param {"label":"num_iteration","type":"integer","required":"false","range":"(0,2147483647]","helpTip":""}
    "learning_rate": 0.1, # @param {"label":"learning_rate","type":"number","required":"false","helpTip":""}
    "num_leaves": 31, # @param {"label":"num_leaves","type":"integer","required":"false","range":"(0,2147483647]","helpTip":""}
    "max_bin": 255, # @param {"label":"max_bin","type":"integer","required":"false","range":"(0,2147483647]","helpTip":""}
    "bagging_fraction": 1.0, # @param {"label":"bagging_fraction","type":"number","required":"false","helpTip":""}
    "bagging_freq": 0, # @param {"label":"bagging_freq","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
    "bagging_seed": 3, # @param {"label":"bagging_seed","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
    "early_stopping_round": 0, # @param {"label":"early_stopping_round","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
    "feature_fraction": 1.0, # @param {"label":"feature_fraction","type":"number","required":"false","helpTip":""}
    "min_sum_hessian_in_leaf": 1e-3, # @param {"label":"min_sum_hessian_in_leaf","type":"number","required":"false","helpTip":""}
    "boost_from_average": True, # @param {"label":"boost_from_average","type":"boolean","required":"false","helpTip":""}
    "boosting_type": "gbdt", # @param {"label":"boosting_type","type":"string","required":"false","helpTip":""}
    "lambda_l1": 0.0, # @param {"label":"lambda_l1","type":"number","required":"false","helpTip":""}
    "lambda_l2": 0.0, # @param {"label":"lambda_l2","type":"number","required":"false","helpTip":""}
    "num_batches": 0, # @param {"label":"num_batches","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
    "parallelism": "data_parallel" # @param {"label":"parallelism","type":"string","required":"false","helpTip":""}
}
lightgbm_regressor____id___ = MLSLightGbmRegression(**params)
lightgbm_regressor____id___.run()
# @output {"label":"pipeline_model","name":"lightgbm_regressor____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
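The `range` annotations in the sample give bounds for the integer parameters. The sketch below checks a `params` dictionary against those bounds before the operator is constructed. `INTEGER_BOUNDS` and `check_integer_ranges` are hypothetical names, and the source does not say whether the platform performs the same validation itself.

```python
INT32_MAX = 2147483647

# (lower bound, is the lower bound inclusive?) copied from the sample's
# range annotations; the upper bound is INT32_MAX (inclusive) in every case.
INTEGER_BOUNDS = {
    "max_depth": (-1, True),
    "num_iteration": (0, False),
    "num_leaves": (0, False),
    "max_bin": (0, False),
    "bagging_freq": (0, True),
    "bagging_seed": (0, True),
    "early_stopping_round": (0, True),
    "num_batches": (0, True),
}


def check_integer_ranges(params):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, (low, inclusive) in INTEGER_BOUNDS.items():
        if name not in params:
            continue
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name}: expected an integer, got {value!r}")
            continue
        too_low = value < low if inclusive else value <= low
        if too_low or value > INT32_MAX:
            bracket = "[" if inclusive else "("
            problems.append(f"{name}: {value} is outside {bracket}{low}, {INT32_MAX}]")
    return problems


print(check_integer_ranges({"max_depth": -1, "num_iteration": 100}))  # []
print(check_integer_ranges({"num_leaves": 0, "max_bin": 255}))
# ['num_leaves: 0 is outside (0, 2147483647]']
```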