Module 4 Assignment
College of Professional Studies, Northeastern University
ALY6015, 21626
Harpreet Sharma
February 5th, 2024

Table of Contents
Introduction
Analysis
    Ridge Regression
        Figure 1: Ridge Regression with Cross-validation
        Figure 2: The plot of Cross-validation result of Ridge Regression
        Figure 3: Coefficients of the lambda min model
        Figure 4: Coefficients of the lambda 1se model
    Lasso Regression
        Figure 5: Lasso Regression with Cross-validation
        Figure 6: The plot of the Cross-validation result of the Lasso Regression
        Figure 7: Coefficients of the lambda min model
        Figure 8: Coefficients of the lambda 1se model
Conclusion/Interpretation
References
Appendices

Introduction

This report builds regularized regression models using Ridge and Lasso on the College dataset from the ISLR library, which comprises 777 records and 18 variables. Regularization methods such as Ridge and Lasso are used to address multicollinearity and overfitting in predictive modeling. The objective is to predict graduation rates from the other variables in the dataset.

Analysis

1. The dataset is split into a training set and a testing set (see Appendix A). This split is needed to evaluate the performance of the models on unseen data.

Ridge Regression

2. Ridge regression with cross-validation is performed on the training data to find the optimal regularization parameter (see Appendix B).

Figure 1
Ridge Regression with Cross-validation

As shown in Figure 1, lambda min (1.775) is the value that minimizes the cross-validated MSE, giving the best estimated predictive accuracy with less shrinkage of the coefficients. Lambda 1se (16.558) is a larger value whose cross-validated error is still within one standard error of the minimum; it gives a more heavily regularized model that trades a little accuracy for simplicity.
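
The split and the choice of lambda min and lambda 1se described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from Appendix A or B: it uses synthetic data rather than the College dataset, a closed-form ridge fit in NumPy, and hypothetical helper names (ridge_fit, cv_ridge, pick_lambdas). The penalty here is not scaled the way a given R package may scale it, so the lambda values are not comparable with those in Figure 1.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge fit on centered data; returns (intercept, coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

def cv_ridge(X, y, lambdas, k=10, seed=0):
    """k-fold CV: mean MSE and its standard error across folds, per lambda."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    mse = np.empty((len(lambdas), k))
    for j, held_out in enumerate(folds):
        fit_mask = np.ones(len(y), dtype=bool)
        fit_mask[held_out] = False
        for i, lam in enumerate(lambdas):
            b0, beta = ridge_fit(X[fit_mask], y[fit_mask], lam)
            resid = y[held_out] - (b0 + X[held_out] @ beta)
            mse[i, j] = np.mean(resid ** 2)
    return mse.mean(axis=1), mse.std(axis=1, ddof=1) / np.sqrt(k)

def pick_lambdas(lambdas, mean_mse, se_mse):
    """lambda min minimizes CV error; lambda 1se is the largest lambda
    whose CV error is within one standard error of that minimum."""
    i_min = int(np.argmin(mean_mse))
    threshold = mean_mse[i_min] + se_mse[i_min]
    lam_1se = max(l for l, m in zip(lambdas, mean_mse) if m <= threshold)
    return lambdas[i_min], lam_1se

# Synthetic stand-in data (assumption: NOT the College dataset).
rng = np.random.default_rng(42)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ np.array([3.0, -2.0, 0.0, 0.0, 1.0]) + rng.normal(scale=2.0, size=n)

# 70/30 train/test split (the split ratio is an assumption).
perm = rng.permutation(n)
n_train = int(0.7 * n)
train, test = perm[:n_train], perm[n_train:]

lambdas = np.logspace(-2, 3, 30)
mean_mse, se_mse = cv_ridge(X[train], y[train], lambdas)
lam_min, lam_1se = pick_lambdas(lambdas, mean_mse, se_mse)
print("lambda min:", lam_min, "lambda 1se:", lam_1se)
```

By construction lambda 1se is never smaller than lambda min, which matches the ordering of the two values reported in Figure 1.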

3. Plot of the Results

Figure 2
The plot of Cross-validation result of Ridge Regression

As shown in Figure 2, the x-axis displays the log of λ and the y-axis represents the mean-squared error. The numbers above the plot indicate the number of variables with non-zero coefficients. The two dashed lines mark the two lambda values: lambda min on the far left and lambda 1se on the right.

4. Fitting a Ridge Regression Model

A ridge regression model is fit against the training set, and the following coefficients are obtained (see Appendix C).
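
For reference, the textbook form of the ridge objective that such a fit minimizes at a fixed λ is sketched below. The notation (y_i, x_ij, β_0, β_j, n, p) is assumed here rather than taken from the report, and software implementations may scale the loss term differently (for example by 1/(2n)), so the λ values reported in Figure 1 need not correspond to this parameterization.

```latex
\hat{\beta}^{\text{ridge}}_{\lambda}
  = \arg\min_{\beta_0,\,\beta}
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
    + \lambda \sum_{j=1}^{p} \beta_j^{2}
```

Because the penalty is quadratic, increasing λ shrinks every β_j toward zero without setting any of them exactly to zero, so all predictors remain in the fitted model.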

Figure 3
Coefficients of the lambda min model

Figure 4
Coefficients of the lambda 1se model

As shown in Figures 3 and 4, the ridge regression models shrink the coefficients toward zero while retaining every predictor, which illustrates the compromise between model complexity and predictive accuracy.

5. Performance of Fit Model against the Ridge Training Set by RMSE

The Ridge regression model with lambda min has an RMSE of approximately 12.54 on the training set, while the model with lambda 1se has an RMSE of approximately 13.05 (see Appendix D). The lower RMSE of the lambda min model indicates a slightly better fit to the training data than the lambda 1se model.

6. Performance of Fit Model against the Ridge Test Set by RMSE

The RMSE for the Ridge regression model with lambda min on the test set is approximately 13.02, while for the model with lambda 1se it is approximately 12.97 (see Appendix E). The lambda 1se model therefore performs slightly better on the test data than the lambda min model. Neither model appears to be overfit: the test RMSE is slightly higher than the training RMSE for lambda min (13.02 versus 12.54) and slightly lower for lambda 1se (12.97 versus 13.05), so in both cases the two values are close. This indicates that the models generalize well to unseen data.

Lasso Regression

7. Lasso regression with cross-validation is performed on the training data to find the optimal regularization parameters (see Appendix F).

Figure 5
Lasso Regression with Cross-validation

As shown in Figure 5, lambda min (0.0734) is the value that minimizes the cross-validated MSE, giving the best estimated predictive accuracy but potentially a more complex model. Lambda 1se (1.3122) is a larger value whose cross-validated error is still within one standard error of the minimum; it gives a more heavily regularized model that balances simplicity and accuracy.
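
To make the lasso penalty and the RMSE comparison from steps 5 and 6 concrete, here is a minimal sketch under stated assumptions. It is not the code from the appendices: it runs on synthetic data rather than the College dataset, implements lasso by plain coordinate descent with soft-thresholding for the objective (1/(2n))·RSS + λ·Σ|β_j| (this scaling is an assumption), and all helper names (soft_threshold, lasso_fit, rmse) are hypothetical.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n))*RSS + lam*||beta||_1 on centered data.
    Returns (intercept, coefficients)."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.zeros(p)
    col_sq = (Xc ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that excludes predictor j's current contribution
            r_j = yc - Xc @ beta + Xc[:, j] * beta[j]
            rho = Xc[:, j] @ r_j / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return y_mean - x_mean @ beta, beta

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in data (assumption: NOT the College dataset).
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ np.array([3.0, -2.0, 0.0, 0.0, 1.0]) + rng.normal(scale=2.0, size=n)
X_train, y_train, X_test, y_test = X[:140], y[:140], X[140:], y[140:]

# Compare a weakly and a strongly penalized fit (both lambdas are illustrative).
results = {}
for lam in (0.01, 0.5):
    b0, beta = lasso_fit(X_train, y_train, lam)
    results[lam] = {
        "nonzero": int(np.count_nonzero(beta)),
        "train_rmse": rmse(y_train, b0 + X_train @ beta),
        "test_rmse": rmse(y_test, b0 + X_test @ beta),
    }
    print(lam, results[lam])
```

Unlike ridge, the soft-thresholding step can set coefficients exactly to zero, so a larger lambda can remove predictors from the model altogether. Comparing train_rmse with test_rmse for each lambda mirrors the overfitting check applied to the ridge models above.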