Linear Regression
Baseline Linear Regression Model
<aside>
💡 DummyRegressor
</aside>
from sklearn.dummy import DummyRegressor

# Baseline that always predicts the mean of y_train
dummy_regr = DummyRegressor(strategy="mean")
dummy_regr.fit(X_train, y_train)
dummy_regr.predict(X_test)        # constant predictions
dummy_regr.score(X_test, y_test)  # R^2 of the baseline
SGDRegressor Estimator
- Implements stochastic gradient descent
- Suited for large training sets (> 10k samples)
- Sensitive to feature scaling
from sklearn.linear_model import SGDRegressor
linear_regressor = SGDRegressor(random_state=42)
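Because SGD is sensitive to feature scaling, a common pattern is to chain a StandardScaler in front of the estimator. A minimal sketch, using synthetic `make_regression` data (an assumption, not from the notes):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration
X, y = make_regression(n_samples=500, n_features=5, noise=5.0, random_state=42)

# Scaling first keeps SGD's gradient steps well-conditioned
model = make_pipeline(StandardScaler(), SGDRegressor(random_state=42))
model.fit(X, y)
r2 = model.score(X, y)  # R^2 on the training data
```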
Hyperparameters
Provides greater control over the optimization process through its hyperparameter settings.
- Loss:
    - loss='squared_error': ordinary least squares (default)
    - loss='huber': Huber loss for robust regression
- Regularisation:
    - penalty='l2': L2 norm penalty on coef_ (default)
    - penalty='l1': L1 norm penalty on coef_; leads to sparse solutions
    - penalty='elasticnet': convex combination of L2 and L1
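The sparsity induced by penalty='l1' shows up clearly when most features are uninformative. The dataset below is synthetic and the alpha value is chosen only for illustration (both are assumptions):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# 10 features, but only 2 carry signal; the rest are noise columns
X, y = make_regression(n_samples=1000, n_features=10, n_informative=2,
                       noise=1.0, random_state=42)
X = StandardScaler().fit_transform(X)

# A fairly strong L1 penalty drives the uninformative coefficients toward zero
l1_model = SGDRegressor(penalty="l1", alpha=0.1, random_state=42).fit(X, y)
small_coefs = int(np.sum(np.abs(l1_model.coef_) < 1.0))  # count of near-zero weights
```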
- Learning Rate:
    - invscaling: (default for regression) the learning rate in the $t^{th}$ iteration or time step is calculated as:
$$
\eta^t = \frac{\eta_0}{t^{power_t}}
$$
    - constant: the learning rate is kept fixed at $\eta_0$.
    - adaptive:
        - When the stopping criterion is reached, the learning rate is divided by 5, and the training loop continues.
        - The algorithm stops when the learning rate goes below $10^{-6}$.
    - optimal:
        - Used as the default setting for classification problems.
        - The learning rate in the $t^{th}$ iteration or time step is calculated as:
$$
\eta^t = \frac{1}{\alpha(t_0 + t)}
$$
        - Here:
            - $\alpha$ is the regularization rate,
            - $t$ is the time step (there are a total of n_samples * n_iter time steps),
            - $t_0$ is determined by a heuristic proposed by Léon Bottou such that the expected initial updates are comparable with the expected size of the weights (assuming the norm of the training samples is approximately 1).
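The two schedules above can be evaluated directly. The sketch below uses sklearn's default values eta0=0.01 and power_t=0.25 for invscaling; t0 in the optimal schedule is normally set by the Bottou heuristic, so it is fixed to 1.0 here purely for illustration:

```python
def invscaling_lr(t, eta0=0.01, power_t=0.25):
    """eta_t = eta0 / t**power_t (sklearn's default schedule for regression)."""
    return eta0 / t ** power_t

def optimal_lr(t, alpha=1e-4, t0=1.0):
    """eta_t = 1 / (alpha * (t0 + t)); t0 fixed to 1.0 only for illustration."""
    return 1.0 / (alpha * (t0 + t))

# Both schedules decay as training proceeds:
lr_start = invscaling_lr(1)      # eta0 itself: 0.01
lr_later = invscaling_lr(10000)  # 0.01 / 10000**0.25 = 0.001
```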
- Stopping Criteria: SGDRegressor provides two stopping criteria to stop the algorithm when a given level of convergence is reached:
    - early_stopping=True
        - The input data is split into a training set and a validation set based on the validation_fraction parameter.
        - The model is fitted on the training set, and the stopping criterion is based on the prediction score (using the scoring method) computed on the validation set.
    - early_stopping=False
        - The model is fitted on the entire input data, and the stopping criterion is based on the objective function computed on the training data.

In both cases, the criterion is evaluated once per epoch, and the algorithm stops when the criterion does not improve n_iter_no_change times in a row. The improvement is evaluated with absolute tolerance tol. In any case, the algorithm stops after a maximum number of iterations, max_iter.
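A sketch of both criteria in code, on synthetic data (an assumption); after fitting, the n_iter_ attribute reports how many epochs actually ran before a criterion fired:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration
X, y = make_regression(n_samples=2000, n_features=10, noise=5.0, random_state=42)
X = StandardScaler().fit_transform(X)

# Criterion on a held-out validation split (10% of the input data):
val_model = SGDRegressor(early_stopping=True, validation_fraction=0.1,
                         n_iter_no_change=5, tol=1e-3, max_iter=1000,
                         random_state=42).fit(X, y)

# Criterion on the training objective itself:
train_model = SGDRegressor(early_stopping=False,
                           n_iter_no_change=5, tol=1e-3, max_iter=1000,
                           random_state=42).fit(X, y)

epochs_val = val_model.n_iter_      # epochs until the validation score plateaued
epochs_train = train_model.n_iter_  # epochs until the training loss plateaued
```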
Shuffle training data after each epoch
from sklearn.linear_model import SGDRegressor
linear_regressor = SGDRegressor(shuffle=True)
Learning Rate