Linear Regression
Baseline Linear Regression Model
<aside>
💡 DummyRegressor
</aside>
from sklearn.dummy import DummyRegressor

# Baseline that always predicts the mean of y_train
dummy_regr = DummyRegressor(strategy="mean")
dummy_regr.fit(X_train, y_train)
dummy_regr.predict(X_test)        # constant predictions
dummy_regr.score(X_test, y_test)  # R^2 of the baseline
SGDRegressor Estimator
- Implements stochastic gradient descent
- Suited for large training sets (> 10k samples)
- Sensitive to feature scaling
from sklearn.linear_model import SGDRegressor
linear_regressor = SGDRegressor(random_state=42)
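Because SGD is sensitive to feature scaling, a common pattern is to chain a StandardScaler in front of the estimator. A minimal sketch, using synthetic `make_regression` data (an assumption, not from the notes):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration
X, y = make_regression(n_samples=500, n_features=5, noise=5.0, random_state=42)

# Scaling first keeps SGD's gradient steps well-conditioned
model = make_pipeline(StandardScaler(), SGDRegressor(random_state=42))
model.fit(X, y)
r2 = model.score(X, y)  # R^2 on the training data
```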
Hyperparameters
Provides greater control over the optimization process through its hyperparameter settings.
- Loss:
    - loss='squared_error': ordinary least squares (default)
    - loss='huber': Huber loss for robust regression
- Regularisation:
    - penalty='l2': L2 norm penalty on coef_ (default)
    - penalty='l1': L1 norm penalty on coef_; leads to sparse solutions
    - penalty='elasticnet': convex combination of L2 and L1
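The sparsity induced by penalty='l1' shows up clearly when most features are uninformative. The dataset below is synthetic and the alpha value is chosen only for illustration (both are assumptions):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# 10 features, but only 2 carry signal; the rest are noise columns
X, y = make_regression(n_samples=1000, n_features=10, n_informative=2,
                       noise=1.0, random_state=42)
X = StandardScaler().fit_transform(X)

# A fairly strong L1 penalty drives the uninformative coefficients toward zero
l1_model = SGDRegressor(penalty="l1", alpha=0.1, random_state=42).fit(X, y)
small_coefs = int(np.sum(np.abs(l1_model.coef_) < 1.0))  # count of near-zero weights
```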
- Learning Rate:
    - invscaling: (default for regression) the learning rate in the $t^{th}$ iteration or time step is calculated as:
$$
\eta^t = \frac{\eta_0}{t^{power_t}}
$$
    - constant: the learning rate is kept fixed at $\eta_0$.
    - adaptive:
        - When the stopping criterion is reached, the learning rate is divided by 5, and the training loop continues.
        - The algorithm stops when the learning rate goes below $10^{-6}$.
    - optimal:
        - Used as the default setting for classification problems.
        - The learning rate in the $t^{th}$ iteration or time step is calculated as:
$$
\eta^t = \frac{1}{\alpha(t_0 + t)}
$$
        - Here:
            - $\alpha$ is the regularization rate,
            - $t$ is the time step (there are a total of n_samples * n_iter time steps),
            - $t_0$ is determined by a heuristic proposed by Léon Bottou such that the expected initial updates are comparable with the expected size of the weights (assuming the norm of the training samples is approximately 1).
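The two schedules above can be evaluated directly. The sketch below uses sklearn's default values eta0=0.01 and power_t=0.25 for invscaling; t0 in the optimal schedule is normally set by the Bottou heuristic, so it is fixed to 1.0 here purely for illustration:

```python
def invscaling_lr(t, eta0=0.01, power_t=0.25):
    """eta_t = eta0 / t**power_t (sklearn's default schedule for regression)."""
    return eta0 / t ** power_t

def optimal_lr(t, alpha=1e-4, t0=1.0):
    """eta_t = 1 / (alpha * (t0 + t)); t0 fixed to 1.0 only for illustration."""
    return 1.0 / (alpha * (t0 + t))

# Both schedules decay as training proceeds:
lr_start = invscaling_lr(1)      # eta0 itself: 0.01
lr_later = invscaling_lr(10000)  # 0.01 / 10000**0.25 = 0.001
```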
- Stopping Criteria: SGDRegressor provides two stopping criteria to stop the algorithm when a given level of convergence is reached:
    - early_stopping=True
        - The input data is split into a training set and a validation set based on the validation_fraction parameter.
        - The model is fitted on the training set, and the stopping criterion is based on the prediction score (using the scoring method) computed on the validation set.
    - early_stopping=False
        - The model is fitted on the entire input data, and the stopping criterion is based on the objective function computed on the training data.

In both cases, the criterion is evaluated once per epoch, and the algorithm stops when the criterion does not improve n_iter_no_change times in a row. The improvement is evaluated with absolute tolerance tol. In any case, the algorithm stops after a maximum number of iterations, max_iter.
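A sketch of both criteria in code, on synthetic data (an assumption); after fitting, the n_iter_ attribute reports how many epochs actually ran before a criterion fired:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration
X, y = make_regression(n_samples=2000, n_features=10, noise=5.0, random_state=42)
X = StandardScaler().fit_transform(X)

# Criterion on a held-out validation split (10% of the input data):
val_model = SGDRegressor(early_stopping=True, validation_fraction=0.1,
                         n_iter_no_change=5, tol=1e-3, max_iter=1000,
                         random_state=42).fit(X, y)

# Criterion on the training objective itself:
train_model = SGDRegressor(early_stopping=False,
                           n_iter_no_change=5, tol=1e-3, max_iter=1000,
                           random_state=42).fit(X, y)

epochs_val = val_model.n_iter_      # epochs until the validation score plateaued
epochs_train = train_model.n_iter_  # epochs until the training loss plateaued
```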
Shuffle training data after each epoch
from sklearn.linear_model import SGDRegressor
linear_regressor = SGDRegressor(shuffle=True)
Learning Rate