Summary - Python Machine Learning: Core Algorithms for Predictive Analysis

Summary

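This chapter introduced the ensemble-method packages available in Python. The example code showed how to use these methods to build models for different types of problems. The chapter covered regression, binary classification, and multiclass classification problems, and discussed variations such as encoding categorical attributes and stratified sampling. These examples cover many of the problem types you are likely to encounter in practice.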
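As a reminder of what the two variations mentioned above involve, here is a minimal sketch using only the standard library. The helper names `one_hot_encode` and `stratified_split`, and the toy data, are hypothetical illustrations and not the chapter's own code; a real project would more likely rely on the equivalent scikit-learn utilities.

```python
import random
from collections import defaultdict


def one_hot_encode(values):
    """Turn a categorical attribute into 0/1 indicator columns (one per category)."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split row indices so each class keeps roughly its share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)  # random choice within each class
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy data: one three-valued attribute and an imbalanced label.
sex = ["M", "F", "I", "M", "F", "I", "M", "F", "I", "M"]
labels = ["rare"] * 10 + ["common"] * 90

categories, encoded = one_hot_encode(sex)
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)

rare_in_test = sum(1 for i in test_idx if labels[i] == "rare")
print(categories, encoded[0])
print(len(train_idx), len(test_idx), rare_in_test)
```

Sampling within each class separately is what keeps the rare class represented in the held-out set; a plain random split could, by chance, leave it with very few rare examples.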

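The examples also demonstrated the important characteristics of ensemble methods that make them a first choice for data scientists. Ensemble methods are relatively easy to use. They do not have many parameters to tune. They provide information about attribute importance, which helps with comparison and analysis in the early stages of model development, and they often deliver the best performance.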
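The attribute-importance point can be illustrated with a short sketch. It assumes scikit-learn and NumPy are installed and uses synthetic data invented for this example (not one of the chapter's data sets); `feature_importances_` is the fitted attribute that scikit-learn's `RandomForestRegressor` exposes.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption for illustration): only the first two
# columns influence the target; the remaining two are pure noise.
rng = np.random.RandomState(0)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances, normalized so that they sum to 1.
importances = model.feature_importances_
ranking = np.argsort(importances)[::-1]  # most important attribute first
for idx in ranking:
    print(f"attribute {idx}: importance {importances[idx]:.3f}")
```

Note that almost everything here is left at its default setting, which reflects the "few parameters to tune" observation above.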

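This chapter demonstrated the use of the relevant Python packages. The background in Chapter 6 helps in understanding the parameter settings and how to adjust them. Looking at the parameter settings in the example code can help you get started with these packages.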

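The end of the chapter compared the algorithms. Ensemble methods usually gave the best performance. Penalized regression methods are usually faster than ensemble methods and, in some cases, achieve similar performance.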
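A comparison of this kind can be sketched as follows, assuming scikit-learn and NumPy are installed. The data is synthetic and linear by construction, which favors the penalized model, so the block illustrates how training time and error might be measured and does not reproduce the chapter's results. The model settings are arbitrary choices for the example.

```python
import time

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic linear problem (an assumption for illustration only).
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=500)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

models = {
    "penalized regression (ElasticNet)": ElasticNet(alpha=0.01),
    "ensemble (GradientBoostingRegressor)": GradientBoostingRegressor(
        n_estimators=200, random_state=0
    ),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start  # training time in seconds
    mse = mean_squared_error(y_test, model.predict(X_test))
    results[name] = (elapsed, mse)
    print(f"{name}: fit {elapsed:.3f} s, test MSE {mse:.3f}")
```

Timings depend on the machine and on the data size, so the numbers are worth reading as a rough guide only.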

References

1. sklearn documentation for RandomForestRegressor, http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html

2. Leo Breiman. (2001). “Random Forests.” Machine Learning, 45 (1): 5–32. doi:10.1023/A:1010933404324

3. J. H. Friedman. “Greedy Function Approximation: A Gradient Boosting Machine,” https://statweb.stanford.edu/~jhf/ftp/trebst.pdf

4. sklearn documentation for RandomForestRegressor, http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html

5. L. Breiman, “Bagging predictors,” http://statistics.berkeley.edu/sites/default/files/tech-reports/421.pdf

6. Tin Ho. (1998). “The Random Subspace Method for Constructing Decision Forests.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 20 (8): 832–844. doi:10.1109/34.709601

7. J. H. Friedman. “Greedy Function Approximation: A Gradient Boosting Machine,” https://statweb.stanford.edu/~jhf/ftp/trebst.pdf

8. J. H. Friedman. “Stochastic Gradient Boosting,” https://statweb.stanford.edu/~jhf/ftp/stobst.pdf

9. sklearn documentation for GradientBoostingRegressor, http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html

10. J. H. Friedman. “Greedy Function Approximation: A Gradient Boosting Machine,” https://statweb.stanford.edu/~jhf/ftp/trebst.pdf

11. J. H. Friedman. “Stochastic Gradient Boosting,” https://statweb.stanford.edu/~jhf/ftp/stobst.pdf

12. J. H. Friedman. “Stochastic Gradient Boosting,” https://statweb.stanford.edu/~jhf/ftp/stobst.pdf

13. sklearn documentation for RandomForestClassifier, http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

14. sklearn documentation for GradientBoostingClassifier, http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html