Summary - Machine Learning in Python: Essential Techniques for Predictive Analysis

Summary

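This chapter worked through case studies that apply penalized regression, together with some general-purpose tools, to predictive modeling problems. The cases cover the problem types most often met in practice: regression, binary classification, and multiclass classification. Each was solved with one of several Python implementations of penalized regression. The chapter also demonstrated three supporting techniques: coding categorical variables as numeric attributes, solving a multiclass problem with a set of binary classifiers, and extending linear methods so that they can predict nonlinear relationships between the attributes and the outcome.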
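The three supporting techniques can be sketched in a few lines of plain Python. This is a minimal illustration, not the chapter's own code: the function names (`one_hot_encode`, `one_vs_rest_labels`, `expand_polynomial`) and the sample values are assumptions made up for this sketch, and polynomial terms are only one possible way to extend a linear method.

```python
def one_hot_encode(values, drop_last=True):
    """Code a categorical attribute as 0/1 columns, one column per level.

    With drop_last=True the last level (in sorted order) becomes the
    all-zeros baseline, which avoids a redundant column when the model
    also fits an intercept.
    """
    levels = sorted(set(values))
    kept = levels[:-1] if drop_last else levels
    rows = [[1.0 if v == level else 0.0 for level in kept] for v in values]
    return rows, kept


def one_vs_rest_labels(labels):
    """Turn a multiclass target into one binary (0/1) target per class.

    A separate binary classifier is then trained on each target, and the
    class whose classifier gives the highest score is predicted.
    """
    classes = sorted(set(labels))
    return {c: [1.0 if y == c else 0.0 for y in labels] for c in classes}


def expand_polynomial(column, degree=2):
    """Basis expansion: replace x by [x, x**2, ..., x**degree] so that a
    model linear in the new attributes can follow a curved relationship."""
    return [[x ** d for d in range(1, degree + 1)] for x in column]


# Hypothetical data, chosen only to show the shape of each result.
rows, kept = one_hot_encode(["M", "F", "I", "M"])
targets = one_vs_rest_labels(["a", "b", "c", "a"])
powers = expand_polynomial([2.0, 3.0], degree=3)
```

The outputs of all three helpers are ordinary numeric columns, so they can be passed to any penalized regression routine without changing the routine itself.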

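The chapter also covered ways to evaluate model performance. Regression problems are the easiest to evaluate, because their errors are naturally expressed as numbers. Classification performance can be quantified as well, and the chapter did so in three ways: as misclassification error rate, as area under the receiver operating characteristic (ROC) curve, and as economic cost. Choose the measure that most closely reflects the real objective, whether that is a business goal, a scientific goal, or something else.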
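The measures named above can each be computed directly from labels and model scores. The sketch below is an illustration under stated assumptions, not the chapter's code: the function names, the 0.5 threshold, the per-error costs, and the sample data are all hypothetical, and AUC is computed by the pairwise-ranking definition rather than by tracing the curve.

```python
def mean_squared_error(actual, predicted):
    """Regression error: average squared difference."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def misclassification_rate(labels, scores, threshold=0.5):
    """Fraction of cases where (score >= threshold) disagrees with the label."""
    wrong = sum(1 for y, s in zip(labels, scores) if (s >= threshold) != (y == 1))
    return wrong / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def total_cost(labels, scores, threshold, cost_fp, cost_fn):
    """Economic cost: each false positive and false negative has its own price."""
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return fp * cost_fp + fn * cost_fn


# Hypothetical labels and scores, used only to exercise the functions.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
error_rate = misclassification_rate(labels, scores)
auc = roc_auc(labels, scores)
cost_high = total_cost(labels, scores, threshold=0.5, cost_fp=1.0, cost_fn=5.0)
cost_low = total_cost(labels, scores, threshold=0.3, cost_fp=1.0, cost_fn=5.0)
```

With these made-up costs, lowering the threshold trades one expensive false negative for one cheap false positive, which is why a cost-based measure can favor a different operating point than the raw error rate does.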
