Advanced Ensemble Learning
2022-08-06 09:32:00【Ding Jiaxiong】
16. Advanced Ensemble Learning
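16.1 XGBoost algorithm principles
16.1.1 Objective function
16.1.2 How the regression trees are built
16.1.3 Differences between XGBoost and GBDT
Difference 1
- XGBoost takes tree complexity into account while it grows each CART tree
- GBDT does not; it considers tree complexity only in the pruning step
Difference 2
- XGBoost fits a second-order (Taylor) expansion of the previous round's loss function, whereas GBDT fits a first-order expansion. As a result, XGBoost tends to be more accurate and needs fewer iterations to reach the same training result
Difference 3
- Both XGBoost and GBDT improve the model through successive iterations, but XGBoost can use multiple threads when searching for the best split point, which greatly speeds up training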
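Differences 1 and 2 can be made concrete with the standard second-order form of the XGBoost objective. This is a sketch in the usual notation, none of which appears in the notes above: $g_i$ and $h_i$ are assumed to be the first and second derivatives of the loss with respect to the previous round's prediction, $f_t$ is the tree added in round $t$, $T$ is its number of leaves, $w_j$ are the leaf weights, and $I_j$ is the set of samples that fall into leaf $j$. The $\gamma T$ and $\lambda$ terms are the tree-complexity penalty from Difference 1; the $h_i$ term is the second-order information from Difference 2.

```latex
\begin{aligned}
\mathrm{Obj}^{(t)} &\approx \sum_{i=1}^{n}\Big[\, g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \Big] + \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 \\
w_j^{*} &= -\frac{G_j}{H_j + \lambda}, \qquad G_j = \sum_{i \in I_j} g_i, \quad H_j = \sum_{i \in I_j} h_i \\
\mathrm{Gain} &= \tfrac{1}{2}\left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma
\end{aligned}
```

A split is only worth making when its gain is positive, which is how the complexity penalty enters tree construction rather than a separate pruning step.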
16.2 XGBoost API
pip3 install xgboost
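16.2.1 Parameters
General parameters
Control the overall behavior
booster [default=gbtree]
- Which booster to use: gbtree, gblinear, or dart
silent [default=0]
- 0 prints run-time messages; 1 is silent mode and prints nothing (recent XGBoost releases deprecate this in favor of verbosity)
nthread [default=maximum number of threads available]
- Number of parallel threads used to run XGBoost
num_pbuffer [set automatically by XGBoost; no need to set it manually]
- Size of the prediction buffer
num_feature [set automatically by XGBoost; no need to set it manually]
- Feature dimension used in boosting; set to the maximum feature dimension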
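Booster parameters
Control the booster (tree or regression) used at each step
Tree Booster
eta [default=0.3, alias: learning_rate]
- Step-size shrinkage applied in each update to prevent overfitting
gamma [default=0, alias: min_split_loss]
- Minimum loss reduction required to split a node
max_depth [default=6]
- Maximum depth of a tree
min_child_weight [default=1]
- Minimum sum of instance weights (Hessian) required in a leaf node
subsample [default=1]
- Fraction of the training rows randomly sampled for each tree
colsample_bytree [default=1]
- Fraction of the columns randomly sampled for each tree
colsample_bylevel [default=1]
- Fraction of the columns sampled for each split at each level of the tree
lambda [default=1, alias: reg_lambda]
- L2 regularization term on the weights; controls the regularization part of XGBoost
alpha [default=0, alias: reg_alpha]
- L1 regularization term on the weights
scale_pos_weight [default=1]
- Balances the positive and negative classes; a typical value is the number of negative samples divided by the number of positive samples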
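To show how gamma, lambda, and min_child_weight enter a split decision, here is a minimal pure-Python sketch of the usual second-order gain formula. It is an illustration under stated assumptions, not XGBoost's actual implementation: the function names are hypothetical, and the gradients and Hessians are assumed to be precomputed per sample.

```python
def leaf_weight(G, H, reg_lambda=1.0):
    """Leaf weight -G / (H + lambda) for gradient sum G and Hessian sum H."""
    return -G / (H + reg_lambda)


def split_gain(grads, hess, left_idx, reg_lambda=1.0, gamma=0.0,
               min_child_weight=1.0):
    """Gain of splitting a node into left/right children (hypothetical helper).

    Returns None when either child has a Hessian sum below min_child_weight.
    """
    left = set(left_idx)
    G_l = sum(g for i, g in enumerate(grads) if i in left)
    H_l = sum(h for i, h in enumerate(hess) if i in left)
    G_r = sum(grads) - G_l
    H_r = sum(hess) - H_l
    if H_l < min_child_weight or H_r < min_child_weight:
        return None  # split rejected: a child is too "light"

    def score(G, H):
        return G * G / (H + reg_lambda)

    # gamma is subtracted as the price of adding one more leaf
    return 0.5 * (score(G_l, H_l) + score(G_r, H_r)
                  - score(G_l + G_r, H_l + H_r)) - gamma


def suggested_scale_pos_weight(labels):
    """Typical scale_pos_weight: negatives / positives, for 0/1 labels."""
    neg = sum(1 for y in labels if y == 0)
    pos = sum(1 for y in labels if y == 1)
    return neg / pos


# Squared-error loss with an initial prediction of 0: g = -y, h = 1
grads = [-1.0, -1.0, -3.0, -3.0]
hess = [1.0, 1.0, 1.0, 1.0]
gain = split_gain(grads, hess, left_idx=[0, 1])
```

Raising gamma above the computed gain makes the same split unprofitable, and raising min_child_weight above a child's Hessian sum rejects it outright.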
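Linear Booster
lambda [default=0, alias: reg_lambda]
- L2 regularization penalty coefficient
alpha [default=0, alias: reg_alpha]
- L1 regularization penalty coefficient
lambda_bias [default=0, alias: reg_lambda_bias]
- L2 regularization on the bias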
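Learning task parameters
Control the training objective and how it is evaluated
objective [default=reg:linear; named reg:squarederror in recent XGBoost releases]
eval_metric [default=chosen according to the objective]
seed [default=0]
- Random number seed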
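The parameters above are typically passed to XGBoost as one dictionary. The sketch below is an assumption-laden illustration rather than a recommended configuration: build_params and train_demo are hypothetical names, nthread=4 and the synthetic data are made up for the example, and reg:squarederror is used in place of reg:linear on the assumption that a recent XGBoost release is installed.

```python
def build_params():
    """Collect the parameters discussed above into a dict for xgboost.train."""
    return {
        # general parameters
        "booster": "gbtree",
        "nthread": 4,                      # assumed value, for illustration only
        # tree booster parameters (defaults as listed in the notes)
        "eta": 0.3,
        "gamma": 0,
        "max_depth": 6,
        "min_child_weight": 1,
        "subsample": 1,
        "colsample_bytree": 1,
        "lambda": 1,
        "alpha": 0,
        # learning task parameters
        "objective": "reg:squarederror",   # assumption: recent XGBoost release
        "eval_metric": "rmse",
        "seed": 0,
    }


def train_demo(num_boost_round=20):
    """Train on synthetic regression data; requires numpy and xgboost."""
    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

    dtrain = xgb.DMatrix(X, label=y)
    booster = xgb.train(build_params(), dtrain, num_boost_round=num_boost_round)
    return booster.predict(dtrain)
```

Calling train_demo() returns the in-sample predictions; in practice a held-out validation set would be used to choose eta, max_depth, and the number of rounds.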
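16.3 LightGBM
16.3.1 Evolution
In January 2017, Microsoft open-sourced a new gradient boosting framework on GitHub
16.3.2 Principles (advantages)
Optimizations
- Histogram-based decision tree algorithm
- Histogram subtraction speed-up
- Leaf-wise tree growth with a depth limit
- Direct support for categorical features
- Direct support for efficient parallel training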
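The first two optimizations can be illustrated with a small pure-Python sketch. It assumes features have already been bucketed into integer bin ids, and it accumulates only gradient sums per bin; the function names and data are hypothetical, and a real implementation would also track Hessian sums and sample counts. The point is that once the parent histogram and one child histogram are known, the sibling's histogram follows by subtraction without scanning its rows.

```python
def build_histogram(bin_ids, grads, num_bins):
    """Sum the gradients that fall into each feature bin."""
    hist = [0.0] * num_bins
    for b, g in zip(bin_ids, grads):
        hist[b] += g
    return hist


def subtract_histogram(parent, child):
    """Sibling histogram = parent histogram - known child histogram."""
    return [p - c for p, c in zip(parent, child)]


# One feature, six samples, three bins (made-up values)
bin_ids = [0, 1, 1, 2, 0, 2]
grads = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0]
left_rows = [0, 1, 3]
right_rows = [2, 4, 5]

parent_hist = build_histogram(bin_ids, grads, 3)
left_hist = build_histogram([bin_ids[i] for i in left_rows],
                            [grads[i] for i in left_rows], 3)

# Right child obtained by subtraction, without visiting its rows
right_hist = subtract_histogram(parent_hist, left_hist)

# Direct computation, only to check the shortcut
right_direct = build_histogram([bin_ids[i] for i in right_rows],
                               [grads[i] for i in right_rows], 3)
```

Building the histogram for the smaller child and subtracting for the larger one is what makes this a speed-up: the cost scales with the number of bins rather than the number of rows in the larger child.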
16.4 LightGBM API
pip3 install lightgbm
16.4.1 Parameters
Control Parameters
Core Parameters
IO Parameters
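The notes list the three parameter groups without naming individual parameters, so the sketch below fills them in with a few commonly used LightGBM options as an assumption. The grouping in the comments, the chosen values, and the names build_lgb_params and train_lgb_demo are illustrative, not taken from the original. Because trees grow leaf-wise, num_leaves is capped at 2 ** max_depth so that the depth limit and the leaf count stay consistent.

```python
def build_lgb_params(max_depth=6):
    """Illustrative LightGBM parameter dict, grouped like the headings above."""
    # A tree of depth d has at most 2 ** d leaves
    num_leaves = min(31, 2 ** max_depth)
    return {
        # core parameters
        "objective": "regression",
        "boosting": "gbdt",
        "learning_rate": 0.1,
        "num_threads": 4,            # assumed value, for illustration only
        # control parameters
        "num_leaves": num_leaves,
        "max_depth": max_depth,
        "min_data_in_leaf": 20,
        # IO parameters
        "max_bin": 255,              # number of histogram bins per feature
    }


def train_lgb_demo(num_boost_round=20):
    """Train on synthetic regression data; requires numpy and lightgbm."""
    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

    train_set = lgb.Dataset(X, label=y)
    booster = lgb.train(build_lgb_params(), train_set,
                        num_boost_round=num_boost_round)
    return booster.predict(X)
```

Lowering max_depth in build_lgb_params shrinks num_leaves with it, which is one simple way to rein in overfitting with leaf-wise growth.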
16.4.2 Tuning suggestions
Copyright notice
Author: Ding Jiaxiong. Please include the original link when reprinting, thank you.
https://en.chowdera.com/2022/218/202208060925359231.html