Regularization
1
Module 4 Assignment — Regularization
Muhammad U. Mirza
College of Professional Studies, Northeastern University Toronto
ALY6015 - Intermediate Analytics
Dr. Matthew Goodwin
February 4, 2024
Introduction
In this statistical analysis report, I explore the application of Ridge and LASSO
regression techniques alongside stepwise selection to predict graduation rates using the College
dataset from the ISLR package. Regularization methods like Ridge and LASSO help prevent
overfitting by penalizing the magnitude of coefficients, while stepwise selection iteratively
refines models by criteria such as the Akaike Information Criterion (AIC). Overall, this
comprehensive approach aims to generate more precise and insightful predictions for graduation
rates in the College dataset.
Analysis
Split the data into a train and test set
The College dataset contains 777 observations and 18 variables. To evaluate predictive
performance on data the model has not seen, I divided the College dataset into a training set,
which constitutes 70% (543 observations) of the data, and a test set, which makes up the
remaining 30% (234 observations). This split, guided by the Feature_Selection_R.pdf document,
is crucial for judging how well the model generalizes. I set a random seed so the split, and
therefore the results, are reproducible across runs.
For regression analysis in glmnet, the datasets were converted to matrix format using the
model.matrix function. This step separated the predictor variables into train_x and test_x, and the
response variable, Grad.Rate, into train_y and test_y. This transformation is essential, as glmnet
requires numerical inputs and a clear delineation between predictors and response. This
methodical preparation of the data ensures that the analysis is structured and poised for the
modeling phase.
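The preparation described above can be sketched in R as follows. This is an illustrative reconstruction, not the original script: the seed value and intermediate object names are assumptions, while the variable names train_x, test_x, train_y, and test_y follow the text.

```r
# Sketch of the data preparation step, assuming the ISLR and glmnet packages.
library(ISLR)

set.seed(123)                                   # illustrative seed for reproducibility
n <- nrow(College)                              # 777 observations
train_idx <- sample(seq_len(n), size = round(0.7 * n))

train <- College[train_idx, ]
test  <- College[-train_idx, ]

# model.matrix() expands any factor predictors (e.g. Private) into numeric
# dummy columns; the intercept column is dropped because glmnet adds its own.
train_x <- model.matrix(Grad.Rate ~ ., data = train)[, -1]
test_x  <- model.matrix(Grad.Rate ~ ., data = test)[, -1]
train_y <- train$Grad.Rate
test_y  <- test$Grad.Rate
```

The matrix conversion is what lets glmnet treat every predictor, including categorical ones, as a numeric column.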
Ridge Regression
Ridge Regression combats multicollinearity in datasets with highly correlated predictors
by incorporating an L2 regularization penalty into the loss function. This shrinkage of coefficient
magnitudes helps mitigate overfitting, enhancing model interpretability and reducing the undue
impact of any single predictor. Additionally, Ridge Regression stabilizes the model by improving
its generalization capability, thereby decreasing the variability in the predictions it generates.
Use the cv.glmnet function to estimate the lambda.min and lambda.1se values. Compare
and discuss the values.
To accurately determine the optimal regularization strength for our Ridge regression
model, I employed the cv.glmnet function, utilizing a 10-fold cross-validation method. This
technique involves dividing the dataset into ten parts, training the model on nine, and testing it
on the tenth, repeatedly, to ensure robust estimation.
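A minimal sketch of this cross-validation step, assuming the training matrices defined earlier (in glmnet, alpha = 0 selects the ridge, i.e. L2, penalty; the seed value is again an illustrative assumption):

```r
library(glmnet)

# 10-fold cross-validated ridge regression.
set.seed(123)                        # illustrative; fold assignment is random
cv_ridge <- cv.glmnet(train_x, train_y, alpha = 0, nfolds = 10)

log(cv_ridge$lambda.min)             # lambda that minimizes mean CV error
log(cv_ridge$lambda.1se)             # largest lambda within 1 SE of that minimum
```

Both lambda values are stored on the returned object, so they can be reused directly when fitting the final models.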
The analysis yielded two critical lambda values: lambda.min and lambda.1se. The
lambda.min is the value of lambda that minimizes the mean cross-validated prediction error.
The lambda.1se, on the other hand, is a more conservative choice: the largest, and therefore
most heavily regularized, lambda whose error lies within one standard error of that minimum,
yielding a simpler model at little cost in accuracy. Comparing the two illustrates the balance
we seek between model complexity and predictive accuracy, with lambda.min favoring precision
and lambda.1se favoring simplicity and robustness.
Figure 1: Lambda min and 1se (Ridge Regression)
The calculated logged values, -2.612328 for lambda.min and 0.2717177 for lambda.1se,
span roughly three units on the log scale: lambda.min applies a comparatively light penalty that
prioritizes fit, while lambda.1se applies a markedly heavier penalty that trades a little accuracy
for better generalization.
Plot the results from the cv.glmnet function and provide an interpretation. What does this
plot tell us?
Figure 2: Ridge Regression Plot
Figure 2 plots the model's mean cross-validated squared error against the regularization
strength (log lambda), with dotted vertical lines marking lambda.min (left) and lambda.1se
(right). Because ridge regression shrinks coefficients toward zero but never sets them exactly
to zero, all predictors remain in the model across the entire lambda path; what changes between
the two lines is how strongly the coefficients are shrunk. At lambda.min the error is lowest and
the coefficients are least constrained, while at lambda.1se the heavier penalty yields a more
stable, more generalizable model at the cost of slightly higher error. This graphical analysis
guides the selection of a regularization parameter that generalizes well to new data.
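A figure of this kind comes directly from glmnet's built-in plot method, assuming the cross-validation fit from the previous step is stored in an object such as cv_ridge:

```r
# Mean squared CV error vs. log(lambda), with one-standard-error bars.
# Dotted vertical lines mark lambda.min and lambda.1se automatically.
plot(cv_ridge)
```

The counts printed along the top axis are the number of non-zero coefficients at each lambda; for a ridge fit this stays at the full predictor count across the whole path.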
Fit a Ridge regression model against the training set and report on the coefficients. Is there
anything interesting?
Figure 3: Ridge regression model with lambda.min
Figure 4: Ridge regression with lambda.1se
Upon fitting the Ridge regression model to the training data, I examined the coefficients'
magnitudes under both lambda.min and lambda.1se from the cross-validation process. At
lambda.min, the value associated with the least prediction error, all predictors are retained
with non-zero coefficients, as ridge shrinks coefficients but never eliminates them. This
yields a model that fits the training data as closely as possible without undue penalization.
Figure 5: Ridge regression lambda.min model coefficients
Figure 6: Ridge regression lambda.1se model coefficients
Conversely, the model using lambda.1se, the largest lambda within one standard error of the
minimum error, showed uniformly smaller coefficient magnitudes, reflecting the heavier
regularization.
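The two fits and their coefficient tables can be obtained along these lines, a sketch assuming the objects defined earlier (alpha = 0 again selects the ridge penalty):

```r
# Ridge fits at the two cross-validated lambda choices.
ridge_min <- glmnet(train_x, train_y, alpha = 0, lambda = cv_ridge$lambda.min)
ridge_1se <- glmnet(train_x, train_y, alpha = 0, lambda = cv_ridge$lambda.1se)

coef(ridge_min)   # coefficients under the error-minimizing penalty
coef(ridge_1se)   # smaller-magnitude coefficients under the heavier penalty

# Test-set RMSE for each choice, to compare generalization.
pred_min <- predict(ridge_min, newx = test_x)
pred_1se <- predict(ridge_1se, newx = test_x)
sqrt(mean((test_y - pred_min)^2))
sqrt(mean((test_y - pred_1se)^2))
```

Comparing the two RMSE values on held-out data is a direct way to check whether the extra shrinkage at lambda.1se actually improves generalization.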