Decision Trees
- non-parametric supervised learning method
- can learn classification and regression models
- white-box estimator: the learned decision rules can be read and inspected directly
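One way to see what a fitted tree has learned is to print it as nested if/else rules. The sketch below assumes scikit-learn is installed and uses its bundled iris dataset. The dataset, `max_depth=2`, and `random_state=0` are illustrative choices, not part of these notes.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rules short (depth 2 is an arbitrary choice).
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# export_text renders the fitted tree as indented threshold rules.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each printed line is either a threshold test on one feature or a leaf with its predicted class.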
Tree Algorithms
Sklearn Implementation of Trees
scikit-learn uses an optimized version of the CART algorithm; however, its implementation does not currently support categorical variables.
- Classification - sklearn.tree.DecisionTreeClassifier
- Regression - sklearn.tree.DecisionTreeRegressor
Both estimators take nearly the same parameters. The main differences are the values accepted by criterion (the measure used to pick splits) and class_weight, which only the classifier accepts.
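As a rough illustration of how similar the two estimators are, the sketch below fits each on a dataset bundled with scikit-learn and compares their parameter names. The datasets and `random_state=0` are assumptions made for the example.

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X_cls, y_cls = load_iris(return_X_y=True)      # classification data
X_reg, y_reg = load_diabetes(return_X_y=True)  # regression data

clf = DecisionTreeClassifier(random_state=0).fit(X_cls, y_cls)
reg = DecisionTreeRegressor(random_state=0).fit(X_reg, y_reg)

# get_params() returns a dict of constructor parameters; compare the names.
clf_params = set(clf.get_params())
reg_params = set(reg.get_params())
shared = clf_params & reg_params
only_one = clf_params ^ reg_params

print("shared:", sorted(shared))
print("in only one estimator:", sorted(only_one))
```

Both objects follow the usual estimator interface, so `fit` and `predict` are called the same way on either one.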
Parameters
- splitter
  - Strategy used to choose the split at each node
  - best (default), random
- max_depth (int)
  - Maximum depth of the tree
  - Default: None. The tree is then expanded until all leaves are pure or contain fewer than min_samples_split samples.
- min_samples_split (int, float)
  - The minimum number of samples required to split an internal node; a float is read as a fraction of the training samples
  - Default: 2
- min_samples_leaf (int, float)
  - The minimum number of samples required at a leaf node; a float is read as a fraction of the training samples
  - Default: 1
- criterion
  - Classification
    - gini (default)
    - entropy
    - log_loss
  - Regression
    - squared_error (default)
    - friedman_mse
    - absolute_error
    - poisson
Note: Default values are marked with (default).
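The parameters above can be tried together in a short sketch. It assumes scikit-learn's bundled iris dataset, and the values (`max_depth=3`, `min_samples_split=10`, `min_samples_leaf=5`, `criterion="entropy"`) are arbitrary picks for illustration. It then checks that the fitted tree respects the depth and leaf-size limits.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

clf = DecisionTreeClassifier(
    criterion="entropy",   # one of the classifier's split-quality measures
    splitter="best",       # evaluate candidate splits and keep the best one
    max_depth=3,           # stop growing below this depth
    min_samples_split=10,  # a node needs at least 10 samples to be split
    min_samples_leaf=5,    # every leaf must keep at least 5 samples
    random_state=0,
).fit(X, y)

# In the low-level tree_ structure, leaves have children_left == -1.
tree = clf.tree_
is_leaf = tree.children_left == -1
leaf_sizes = tree.n_node_samples[is_leaf]

print("depth:", clf.get_depth())
print("leaves:", clf.get_n_leaves())
print("smallest leaf:", leaf_sizes.min())
```

Tightening `max_depth` or raising `min_samples_leaf` gives a smaller tree, which is a common way to limit overfitting.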