A decision tree regressor.
Parameters:
    criterion : string, optional (default="mse")
    splitter : string, optional (default="best")
    max_features : int, float, string or None, optional (default=None)
    max_depth : int or None, optional (default=None)
    min_samples_split : int, optional (default=2)
    min_samples_leaf : int, optional (default=1)
    min_weight_fraction_leaf : float, optional (default=0.)
    max_leaf_nodes : int or None, optional (default=None)
    random_state : int, RandomState instance or None, optional (default=None)
    output_transformer : scikit-learn transformer or None, optional (default=None)
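As a quick illustration of the main hyperparameters, the sketch below builds a small, regularized tree; the specific values (max_depth=4, min_samples_leaf=5, max_features=0.5) are illustrative choices, not recommendations.

>>> from sklearn.tree import DecisionTreeRegressor
>>> regressor = DecisionTreeRegressor(
...     max_depth=4,           # stop growing after 4 levels
...     min_samples_leaf=5,    # each leaf must cover at least 5 samples
...     max_features=0.5,      # consider half of the features at each split
...     random_state=0,        # make the randomized feature search reproducible
... )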
See also
DecisionTreeClassifier
References
[R19] http://en.wikipedia.org/wiki/Decision_tree_learning
[R20] L. Breiman, J. Friedman, R. Olshen, and C. Stone, "Classification and Regression Trees", Wadsworth, Belmont, CA, 1984.
[R21] T. Hastie, R. Tibshirani and J. Friedman, "Elements of Statistical Learning", Springer, 2009.
[R22] L. Breiman and A. Cutler, "Random Forests", http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm
Examples
>>> from sklearn.datasets import load_boston
>>> from sklearn.cross_validation import cross_val_score
>>> from sklearn.tree import DecisionTreeRegressor
>>> boston = load_boston()
>>> regressor = DecisionTreeRegressor(random_state=0)
>>> cross_val_score(regressor, boston.data, boston.target, cv=10)
array([ 0.61..., 0.57..., -0.34..., 0.41..., 0.75...,
        0.07..., 0.29..., 0.33..., -1.42..., -1.77...])
Attributes:
    feature_importances_ : Return the feature importances.
    tree_ : Tree object. The underlying Tree object.
    max_features_ : int. The inferred value of max_features.
Methods:
    fit(X, y[, sample_weight, check_input]) : Build a decision tree from the training set (X, y).
    fit_transform(X[, y]) : Fit to data, then transform it.
    get_params([deep]) : Get parameters for this estimator.
    predict(X) : Predict class or regression value for X.
    score(X, y[, sample_weight]) : Returns the coefficient of determination R^2 of the prediction.
    set_params(**params) : Set the parameters of this estimator.
    transform(*args, **kwargs) : DEPRECATED: Support to use estimators as feature selectors will be removed in version 0.19.
feature_importances_
Return the feature importances.
The importance of a feature is computed as the (normalized) total reduction of the criterion brought by that feature. It is also known as the Gini importance.
Returns:
    feature_importances_ : array, shape = [n_features]
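A minimal sketch of inspecting the importances on synthetic data (make_regression and the variable names are illustrative, not part of this class's API):

>>> from sklearn.datasets import make_regression
>>> from sklearn.tree import DecisionTreeRegressor
>>> X, y = make_regression(n_samples=200, n_features=5, n_informative=2, random_state=0)
>>> reg = DecisionTreeRegressor(random_state=0).fit(X, y)
>>> reg.feature_importances_.shape    # one importance per feature
(5,)
>>> reg.feature_importances_.sum()    # normalized: the importances sum to 1.0
1.0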
fit(X, y[, sample_weight, check_input])
Build a decision tree from the training set (X, y).
Parameters:
    X : array-like, shape = [n_samples, n_features]
    y : array-like, shape = [n_samples] or [n_samples, n_outputs]
    sample_weight : array-like, shape = [n_samples] or None
    check_input : boolean (default=True)
Returns:
    self : object
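A sketch of fitting with per-sample weights; the toy arrays are invented for illustration:

>>> import numpy as np
>>> from sklearn.tree import DecisionTreeRegressor
>>> X = np.array([[0.0], [1.0], [2.0], [3.0]])
>>> y = np.array([0.0, 1.0, 2.0, 3.0])
>>> w = np.array([1.0, 1.0, 1.0, 10.0])   # emphasize the last sample
>>> reg = DecisionTreeRegressor(random_state=0).fit(X, y, sample_weight=w)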
fit_transform(X[, y])
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
    X : numpy array of shape [n_samples, n_features]
    y : numpy array of shape [n_samples]
Returns:
    X_new : numpy array of shape [n_samples, n_features_new]
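Because the transform side of this estimator is deprecated (see transform below), fit_transform on a tree only applies to scikit-learn versions before 0.19; the sketch assumes such a version and is equivalent to fit(X, y) followed by transform(X):

>>> import numpy as np
>>> from sklearn.tree import DecisionTreeRegressor
>>> X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
>>> y = np.array([0.0, 1.0, 2.0, 3.0])
>>> # Pre-0.19 only: fit the tree, then keep X's most important columns
>>> X_new = DecisionTreeRegressor(random_state=0).fit_transform(X, y)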
get_params([deep])
Get parameters for this estimator.
Parameters:
    deep : boolean, optional
        If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:
    params : mapping of string to any
        Parameter names mapped to their values.
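For illustration, a parameter set in the constructor is returned by get_params (a minimal sketch):

>>> from sklearn.tree import DecisionTreeRegressor
>>> reg = DecisionTreeRegressor(max_depth=3)
>>> reg.get_params()["max_depth"]
3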
predict(X)
Predict class or regression value for X.
For a classification model, the predicted class for each sample in X is returned. For a regression model, the predicted value based on X is returned.
Parameters:
    X : array-like of shape = [n_samples, n_features]
Returns:
    y : array of shape = [n_samples] or [n_samples, n_outputs]
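A minimal sketch on toy data (values invented for illustration); each leaf predicts the mean of the training targets that fell into it:

>>> import numpy as np
>>> from sklearn.tree import DecisionTreeRegressor
>>> X = np.array([[0.0], [1.0], [2.0]])
>>> y = np.array([0.5, 1.5, 2.5])
>>> reg = DecisionTreeRegressor(random_state=0).fit(X, y)
>>> reg.predict([[1.0]])    # -> array([1.5]), the matching leaf's training mean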
score(X, y[, sample_weight])
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters:
    X : array-like, shape = (n_samples, n_features)
    y : array-like, shape = (n_samples) or (n_samples, n_outputs)
    sample_weight : array-like, shape = [n_samples], optional
Returns:
    score : float
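The definition above can be checked by hand; in this sketch u and v are computed directly from the formula (toy data invented for illustration):

>>> import numpy as np
>>> from sklearn.tree import DecisionTreeRegressor
>>> X = np.array([[0.0], [1.0], [2.0], [3.0]])
>>> y = np.array([0.0, 1.0, 2.0, 2.0])
>>> reg = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X, y)
>>> y_pred = reg.predict(X)
>>> u = ((y - y_pred) ** 2).sum()      # residual sum of squares
>>> v = ((y - y.mean()) ** 2).sum()    # total sum of squares
>>> bool(np.isclose(1 - u / v, reg.score(X, y)))
True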
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns:
    self
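A minimal sketch of the nested form, using a Pipeline whose step names ("scale", "tree") are invented for illustration:

>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.tree import DecisionTreeRegressor
>>> pipe = Pipeline([("scale", StandardScaler()), ("tree", DecisionTreeRegressor())])
>>> pipe = pipe.set_params(tree__max_depth=3)   # <component>__<parameter>
>>> pipe.get_params()["tree__max_depth"]
3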
transform(*args, **kwargs)
DEPRECATED: Support to use estimators as feature selectors will be removed in version 0.19. Use SelectFromModel instead.
Reduce X to its most important features.
Uses coef_ or feature_importances_ to determine the most important features. For models with a coef_ for each class, the absolute sum over the classes is used.
Parameters:
    X : array or scipy sparse matrix of shape [n_samples, n_features]
Returns:
    X_r : array of shape [n_samples, n_selected_features]
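A sketch of the replacement the deprecation notice points to, SelectFromModel (synthetic data; the importance threshold is left at its default):

>>> from sklearn.datasets import make_regression
>>> from sklearn.feature_selection import SelectFromModel
>>> from sklearn.tree import DecisionTreeRegressor
>>> X, y = make_regression(n_samples=200, n_features=5, n_informative=2, random_state=0)
>>> selector = SelectFromModel(DecisionTreeRegressor(random_state=0)).fit(X, y)
>>> X_r = selector.transform(X)   # keeps features whose importance exceeds the default threshold
>>> X_r.shape[0]
200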