Polynomial Regression
<aside>
💡 Polynomial regression = Polynomial transformation + Linear Regression
</aside>
Polynomial Features
- Polynomial features are new features created by raising existing features to a power.
- The “degree” of the polynomial controls how many features are added, e.g. a degree of 3 adds two new variables for each input variable (its square and its cube).
- The PolynomialFeatures transformer converts an input data matrix into a new matrix containing polynomial features up to the given degree, as in the sketch below.
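A minimal sketch of what the transformer produces, assuming a single input feature and made-up sample values:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One input feature, two samples (illustrative values)
X = np.array([[2.0], [3.0]])
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
print(poly.get_feature_names_out())  # ['1' 'x0' 'x0^2']
print(X_poly)                        # [[1. 2. 4.]
                                     #  [1. 3. 9.]]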
Training
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
# Two steps:
# 1. Polynomial features of the desired degree (here degree=2)
# 2. Linear regression
poly_model = Pipeline([
    ('polynomial_transform', PolynomialFeatures(degree=2)),
    ('linear_regression', LinearRegression())])
# Train with feature matrix X_train and label vector y_train
poly_model.fit(X_train, y_train)
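Once fitted, the pipeline applies the same polynomial transformation to new data before predicting, so no manual transformation is needed. A sketch, assuming a held-out feature matrix X_test with the same columns as X_train:

# Predict on new data; the pipeline transforms X_test internally
y_pred = poly_model.predict(X_test)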
Interaction Features
- It is also common to add new variables that represent the interaction between features, e.g. a new column that represents one variable multiplied by another. This can be repeated for every pair of input variables, creating a new “interaction” variable for each pair.
- The “interaction_only” argument restricts the output to the raw values (degree 1) and the interaction terms (pairs of distinct features multiplied together); pure powers such as squares are excluded.
from sklearn.preprocessing import PolynomialFeatures
poly_transform = PolynomialFeatures(degree=2, interaction_only=True)
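A minimal sketch of the resulting features for two input variables (illustrative values); note that the squared terms are absent:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])  # two input features, one sample
poly = PolynomialFeatures(degree=2, interaction_only=True)
print(poly.fit_transform(X))         # [[1. 2. 3. 6.]]
print(poly.get_feature_names_out())  # ['1' 'x0' 'x1' 'x0 x1']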
Hyperparameter Tuning
- Hyperparameters are parameters that are not directly learnt within estimators.
- In sklearn, they are passed as arguments to the constructor of the estimator classes.
Setting hyperparameters
Select the hyperparameter values that result in the best cross-validation scores, as sketched below.
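One common approach in sklearn is GridSearchCV, which fits the estimator for every candidate value and keeps the one with the best cross-validation score. A sketch that tunes the degree of the poly_model pipeline defined above; the candidate degrees are illustrative:

from sklearn.model_selection import GridSearchCV

# Nested pipeline hyperparameters are addressed as
# '<step_name>__<parameter_name>'
param_grid = {'polynomial_transform__degree': [1, 2, 3, 4]}

search = GridSearchCV(poly_model, param_grid, cv=5,
                      scoring='neg_mean_squared_error')
search.fit(X_train, y_train)
print(search.best_params_)  # degree with the best cross-validation score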