Fix the bug that made the Chapter 15 random forest accuracy too low #17

Open
wants to merge 1 commit into
base: master

Commits on Sep 23, 2024

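  1. # Fix the bug that made the Chapter 15 random forest accuracy too low:

    1. Feature sampling now samples without replacement;
    2. The code in this series often confuses "loss" and "gain": the CART decision tree uses a loss (Gini impurity) to choose the best feature and split point, whereas RandomForest uses "gain". Initializing min_gain to 0 in the RandomForest class leaves the CART decision tree almost unable to train, so it is changed to min_gain=float("inf");
    3. The utils.py file was missing, so it is copied over from Chapter 11;
    4. cart.py is kept consistent with the decision tree changes from the previous two commits (the Chapter 7 decision tree and the Chapter 11 GBDT);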
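
Items 1 and 2 can be illustrated with a small sketch. This is not the code from this PR: `gini`, `best_split`, and `init_loss` are hypothetical names, and the sketch assumes the tree keeps a candidate split only when its weighted Gini impurity is strictly lower than the starting value. Under that assumption, a starting value of 0 can never be beaten, while `float("inf")` lets the first valid split through.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, max_features, rng, init_loss=float("inf")):
    """Return (feature, threshold, loss) minimising weighted Gini impurity.

    init_loss is the starting value that a candidate split has to beat.
    """
    n_features = len(X[0])
    # Sampling WITHOUT replacement: every chosen feature index is distinct.
    features = rng.sample(range(n_features), max_features)
    best = (None, None, init_loss)
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            loss = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if loss < best[2]:  # lower loss is better
                best = (f, t, loss)
    return best

X = [[1.0, 5.0], [2.0, 4.0], [3.0, 1.0], [4.0, 2.0]]
y = [0, 0, 1, 1]
rng = random.Random(0)
found = best_split(X, y, max_features=2, rng=rng)                 # starts at inf
stuck = best_split(X, y, max_features=2, rng=rng, init_loss=0.0)  # starts at 0
```

Here `found` holds a real split, whereas `stuck` keeps its `None` placeholders because no impurity is strictly below 0. Sampling with replacement (for example `rng.choices`) could return the same feature index twice, which shrinks the set of distinct features each tree sees.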
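    # Optimizations:
    1. Training now runs in parallel across multiple processes; otherwise it is too slow to debug conveniently, and parallel training is a characteristic of random forests anyway;
    2. When comparing accuracy against sklearn.RandomForestClassifier, sklearn's parameters did not match our custom parameters, which made it hard to compare the two, so two parameters are added to the sklearn call;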
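
The multi-process training in optimization 1 might look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's implementation: `fit_one_tree` and `fit_forest` are hypothetical names, and the "tree" is replaced by a placeholder (the majority class of a bootstrap sample) so the example stays self-contained. The point is only that each tree depends on its own bootstrap sample and seed, so the trees can be fitted independently in worker processes.

```python
import random
from collections import Counter
from concurrent.futures import ProcessPoolExecutor

def fit_one_tree(args):
    """Hypothetical stand-in for fitting one CART tree on a bootstrap sample."""
    X, y, seed = args
    rng = random.Random(seed)  # per-tree seed keeps runs reproducible
    # Bootstrap: draw len(y) row indices with replacement.
    idx = [rng.randrange(len(y)) for _ in range(len(y))]
    # Placeholder "model": the majority class of the bootstrap sample.
    return Counter(y[i] for i in idx).most_common(1)[0][0]

def fit_forest(X, y, n_trees, n_jobs):
    """Fit n_trees independent models across n_jobs worker processes."""
    tasks = [(X, y, seed) for seed in range(n_trees)]
    with ProcessPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(fit_one_tree, tasks))

if __name__ == "__main__":
    # The guard is needed on platforms that start workers by re-importing this module.
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0, 0, 0, 1]
    print(fit_forest(X, y, n_trees=8, n_jobs=2))
```

Passing the seed with each task, instead of sharing one random generator, keeps the result independent of how tasks are scheduled across processes.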
    lzy committed Sep 23, 2024
    9cb98ff