Use the College dataset (https://rdrr.io/cran/ISLR/man/College.html) from the ISLR library to build regularized regression models with Ridge and LASSO (least absolute shrinkage and selection operator). Use Grad.Rate as the response variable in every model.
-
Split the data into training and test sets; refer to the Feature_Selection_R.pdf document for information on how to split a dataset.
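The split can be sketched as follows. This is a minimal example, not the method from the referenced PDF: the 70/30 proportion, the seed, and the object names (train_x, test_y, and so on) are assumptions chosen for illustration. Because glmnet takes a numeric predictor matrix rather than a data frame, the sketch also builds that matrix with model.matrix().

```r
library(ISLR)   # provides the College data frame

set.seed(123)   # arbitrary seed (assumption), used only for reproducibility

# Assumed 70/30 split; the assignment does not fix the proportion
n <- nrow(College)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))

train <- College[train_idx, ]
test  <- College[-train_idx, ]

# glmnet expects a numeric matrix of predictors and a response vector.
# model.matrix() expands the factor Private into a dummy column;
# [, -1] drops the intercept column it adds.
train_x <- model.matrix(Grad.Rate ~ ., data = train)[, -1]
test_x  <- model.matrix(Grad.Rate ~ ., data = test)[, -1]
train_y <- train$Grad.Rate
test_y  <- test$Grad.Rate
```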
Ridge Regression
-
Use the cv.glmnet function to estimate the lambda.min and lambda.1se values. Compare and discuss the values.
-
Plot the results from the glmnet function and provide an interpretation. What does this plot tell us?
-
Fit a Ridge regression model against the training set and report on the coefficients. Is there anything interesting?
-
Determine the performance of the fit model against the training set by calculating the root mean square error (RMSE): sqrt(mean((actual - predicted)^2)).
-
Determine the performance of the fit model against the test set by calculating the root mean square error (RMSE). Is your model overfit?
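One possible shape for the Ridge steps above is sketched here. It is an illustration under assumptions, not a model answer: the seed, the 70/30 split, the 10-fold setting, the choice to fit at lambda.min rather than lambda.1se, and the helper rmse() are all choices made for this example.

```r
library(ISLR)
library(glmnet)

set.seed(123)  # arbitrary seed (assumption)
train_idx <- sample(seq_len(nrow(College)), size = floor(0.7 * nrow(College)))

x <- model.matrix(Grad.Rate ~ ., data = College)[, -1]  # drop intercept column
y <- College$Grad.Rate
train_x <- x[train_idx, ];  test_x <- x[-train_idx, ]
train_y <- y[train_idx];    test_y <- y[-train_idx]

# Hypothetical helper implementing sqrt(mean((actual - predicted)^2))
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# alpha = 0 selects the ridge penalty; cross-validate over the lambda path
cv_ridge <- cv.glmnet(train_x, train_y, alpha = 0, nfolds = 10)
cv_ridge$lambda.min   # lambda with the lowest cross-validated error
cv_ridge$lambda.1se   # largest lambda within one standard error of that minimum

# Cross-validated mean squared error across the lambda sequence
plot(cv_ridge)

# Fit at one lambda (lambda.min here; swap in lambda.1se to compare)
ridge_fit <- glmnet(train_x, train_y, alpha = 0, lambda = cv_ridge$lambda.min)
coef(ridge_fit)

# RMSE on both sets; a test RMSE far above the training RMSE suggests overfitting
ridge_rmse_train <- rmse(train_y, as.vector(predict(ridge_fit, newx = train_x)))
ridge_rmse_test  <- rmse(test_y,  as.vector(predict(ridge_fit, newx = test_x)))
c(train = ridge_rmse_train, test = ridge_rmse_test)
```

Calling plot() on the cv.glmnet object shows the cross-validation curve; calling it on a glmnet fit over the full lambda path shows the coefficient paths instead, which may be the plot the task has in mind.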
LASSO
-
Use the cv.glmnet function to estimate the lambda.min and lambda.1se values. Compare and discuss the values.
-
Plot the results from the glmnet function and provide an interpretation. What does this plot tell us?
-
Fit a LASSO regression model against the training set and report on the coefficients. Do any coefficients reduce to zero? If so, which ones?
-
Determine the performance of the fit model against the training set by calculating the root mean square error (RMSE): sqrt(mean((actual - predicted)^2)).
-
Determine the performance of the fit model against the test set by calculating the root mean square error (RMSE). Is your model overfit?
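The LASSO steps differ from Ridge mainly in setting alpha = 1. The sketch below is again a set of assumptions rather than a prescribed solution: the seed, split proportion, use of lambda.min, and the names rmse() and zeroed are illustrative. It also shows one way to list the coefficients that shrink to exactly zero.

```r
library(ISLR)
library(glmnet)

set.seed(123)  # arbitrary seed (assumption)
train_idx <- sample(seq_len(nrow(College)), size = floor(0.7 * nrow(College)))

x <- model.matrix(Grad.Rate ~ ., data = College)[, -1]  # drop intercept column
y <- College$Grad.Rate
train_x <- x[train_idx, ];  test_x <- x[-train_idx, ]
train_y <- y[train_idx];    test_y <- y[-train_idx]

# Hypothetical helper implementing sqrt(mean((actual - predicted)^2))
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# alpha = 1 selects the LASSO penalty
cv_lasso <- cv.glmnet(train_x, train_y, alpha = 1, nfolds = 10)
cv_lasso$lambda.min
cv_lasso$lambda.1se

plot(cv_lasso)

# Fit at one lambda (lambda.min here; lambda.1se gives a sparser model)
lasso_fit <- glmnet(train_x, train_y, alpha = 1, lambda = cv_lasso$lambda.min)

# Convert the sparse coefficient matrix to a named numeric vector
lasso_coef <- as.matrix(coef(lasso_fit))[, 1]
lasso_coef

# Names of the predictors whose coefficients were shrunk to exactly zero
zeroed <- names(lasso_coef)[lasso_coef == 0]
zeroed

lasso_rmse_train <- rmse(train_y, as.vector(predict(lasso_fit, newx = train_x)))
lasso_rmse_test  <- rmse(test_y,  as.vector(predict(lasso_fit, newx = test_x)))
c(train = lasso_rmse_train, test = lasso_rmse_test)
```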
Comparison
-
Which model performed better and why? Is that what you expected?
-
Refer to the Intermediate_Analytics_Feature_Selection_R.pdf document for how to perform stepwise selection and then fit a model. Did this model perform better or as well as Ridge regression or LASSO? Which method do you prefer and why?
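For the stepwise comparison, the referenced PDF is the authority on the intended procedure. As a stand-in, the sketch below uses base R's step() function with backward elimination by AIC, which is an assumption about the method and may differ from what the PDF describes. The seed, split, and rmse() helper are likewise illustrative.

```r
library(ISLR)

set.seed(123)  # arbitrary seed (assumption)
train_idx <- sample(seq_len(nrow(College)), size = floor(0.7 * nrow(College)))
train <- College[train_idx, ]
test  <- College[-train_idx, ]

# Hypothetical helper implementing sqrt(mean((actual - predicted)^2))
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Start from the full linear model and drop terms while AIC improves
full_fit <- lm(Grad.Rate ~ ., data = train)
step_fit <- step(full_fit, direction = "backward", trace = 0)

# Predictors retained by the stepwise search
formula(step_fit)

step_rmse_train <- rmse(train$Grad.Rate, predict(step_fit, newdata = train))
step_rmse_test  <- rmse(test$Grad.Rate,  predict(step_fit, newdata = test))
c(train = step_rmse_train, test = step_rmse_test)
```

Using the same seed and split proportion as the Ridge and LASSO fits keeps the test-set RMSE values directly comparable across the three approaches.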
SAMPLE ASSIGNMENT