Impact of subsampling and pruning on random forests


* Corresponding author 1 LSTA - Laboratoire de Statistique Théorique et Appliquée

Abstract: Random forests are ensemble learning methods introduced by Breiman (2001) that operate by averaging several decision trees built on a randomly selected subspace of the data set. Despite their widespread use in practice, the respective roles of the different mechanisms at work in Breiman's forests are not yet fully understood, nor is the tuning of the corresponding parameters. In this paper, we study the influence of two parameters, namely the subsampling rate and the tree depth, on the performance of Breiman's forests. More precisely, we show that fully developed subsampled forests and pruned forests without subsampling have similar performance, as long as their respective parameters are well chosen. Moreover, experiments show that a proper tuning of subsampling or pruning leads in most cases to an improvement over the error of Breiman's original forests.
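To make the two parameters concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional regression toy problem, uses a depth cap as a stand-in for pruning, and draws subsamples without replacement. All names (`build_tree`, `fit_forest`, `subsample_size`, `max_depth`, the sine target) are hypothetical. Because the toy has a single feature and no feature randomization, a forest fitted without subsampling contains identical trees, so one tree is enough in that case.

```python
import math
import random
from statistics import fmean

def sse(values):
    """Sum of squared deviations from the mean (CART-style split cost)."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(points, max_depth=None):
    """Fit a 1-D regression tree on (x, y) pairs sorted by x.
    max_depth=None grows the tree fully; an integer caps the depth."""
    ys = [y for _, y in points]
    if max_depth == 0 or points[0][0] == points[-1][0]:
        return fmean(ys)  # leaf: mean response in the cell
    best = None
    for k in range(1, len(points)):
        if points[k - 1][0] == points[k][0]:
            continue  # cannot split between equal x values
        cost = sse(ys[:k]) + sse(ys[k:])
        if best is None or cost < best[0]:
            best = (cost, k)
    k = best[1]
    threshold = (points[k - 1][0] + points[k][0]) / 2
    child_depth = None if max_depth is None else max_depth - 1
    return (threshold,
            build_tree(points[:k], child_depth),
            build_tree(points[k:], child_depth))

def predict_tree(tree, x):
    """Follow splits until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

def fit_forest(data, n_trees, subsample_size, max_depth, rng):
    """Each tree sees subsample_size points drawn without replacement."""
    return [build_tree(sorted(rng.sample(data, subsample_size)), max_depth)
            for _ in range(n_trees)]

def predict_forest(forest, x):
    """Forest prediction: average of the tree predictions."""
    return fmean([predict_tree(t, x) for t in forest])

def mse(forest, test):
    return fmean([(predict_forest(forest, x) - y) ** 2 for x, y in test])

def draw(n, rng):
    """Hypothetical data: noisy sine on [0, 1]."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, math.sin(6 * x) + rng.gauss(0, 0.3)))
    return data

rng = random.Random(0)
train, test = draw(200, rng), draw(200, rng)

# Fully developed trees, each built on a subsample of 40 of the 200 points.
subsampled = fit_forest(train, n_trees=20, subsample_size=40,
                        max_depth=None, rng=rng)
# Depth-capped tree built on the whole sample (no subsampling).
pruned = fit_forest(train, n_trees=1, subsample_size=len(train),
                    max_depth=3, rng=rng)

print("subsampled, fully grown:", mse(subsampled, test))
print("depth-capped, full sample:", mse(pruned, test))
```

Varying `subsample_size` in the first forest and `max_depth` in the second gives a rough feel for how either parameter controls the size of the cells over which responses are averaged. The values 40 and 3 are arbitrary choices for illustration, not tuned settings from the paper.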

Keywords: random forests, randomization, parameter tuning, sub-sampling, tree depth

Authors: Roxane Duroux, Erwan Scornet


