Use the College dataset (https://rdrr.io/cran/ISLR/man/College.html) from the ISLR library to build regularized regression models using Ridge and LASSO (least absolute shrinkage and selection operator). Predict Grad.Rate with each model.

  1. Split the data into training and test sets – refer to the Feature_Selection_R.pdf document for information on how to split a dataset.
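
One possible way to do the split is sketched below. The 70/30 ratio, the seed value, and all object names (train_idx, train_x, and so on) are assumptions chosen for illustration; the referenced PDF may use a different approach. The sketch also builds the numeric predictor matrices that glmnet expects.

```r
# Sketch: 70/30 train/test split of the College data.
# Assumptions (not from the assignment): the 70% ratio and the seed 123.
library(ISLR)   # provides the College data frame

set.seed(123)   # make the random split reproducible
n         <- nrow(College)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))

train <- College[train_idx, ]
test  <- College[-train_idx, ]

# glmnet works on a numeric predictor matrix and a response vector.
# model.matrix() expands the factor Private into a dummy column;
# [, -1] drops the intercept column it adds.
train_x <- model.matrix(Grad.Rate ~ ., data = train)[, -1]
test_x  <- model.matrix(Grad.Rate ~ ., data = test)[, -1]
train_y <- train$Grad.Rate
test_y  <- test$Grad.Rate
```

Building the matrices after the split is fine here because Private is the only factor and both levels occur often; with rarer factor levels it may be safer to build one matrix from the full data and then subset its rows.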

Ridge Regression

  1. Use the cv.glmnet function to estimate the lambda.min and lambda.1se values. Compare and discuss the values.

  2. Plot the results from the glmnet function and provide an interpretation. What does this plot tell us?

  3. Fit a Ridge regression model against the training set and report on the coefficients. Is there anything interesting?

  4. Determine the performance of the fitted model against the training set by calculating the root mean square error (RMSE): sqrt(mean((actual - predicted)^2)).

  5. Determine the performance of the fit model against the test set by calculating the root mean square error (RMSE). Is your model overfit?
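
The five Ridge steps above could be sketched roughly as follows. This is a minimal outline, not a model answer: the 70/30 split, the seed, the choice of 10 folds, the decision to refit at lambda.1se rather than lambda.min, and the helper rmse() are all assumptions made for illustration.

```r
# Sketch of the Ridge workflow (alpha = 0 selects the ridge penalty in glmnet).
library(ISLR)
library(glmnet)

set.seed(123)   # assumed seed; affects both the split and the CV folds
x <- model.matrix(Grad.Rate ~ ., data = College)[, -1]   # numeric predictors
y <- College$Grad.Rate
train_idx <- sample(seq_len(nrow(x)), size = floor(0.7 * nrow(x)))
train_x <- x[train_idx, ];  train_y <- y[train_idx]
test_x  <- x[-train_idx, ]; test_y  <- y[-train_idx]

# Hypothetical helper implementing the RMSE formula from step 4.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Step 1: cross-validate over a path of lambda values.
cv_ridge <- cv.glmnet(train_x, train_y, alpha = 0, nfolds = 10)
cv_ridge$lambda.min   # lambda with the lowest mean cross-validated error
cv_ridge$lambda.1se   # largest lambda whose error is within 1 SE of that minimum

# Step 2: cross-validated error against log(lambda), with both lambdas marked.
plot(cv_ridge)

# Step 3: refit at a single lambda and inspect the coefficients.
ridge_fit <- glmnet(train_x, train_y, alpha = 0, lambda = cv_ridge$lambda.1se)
coef(ridge_fit)

# Steps 4 and 5: RMSE on the training and test sets.
ridge_train_rmse <- rmse(train_y, as.numeric(predict(ridge_fit, newx = train_x)))
ridge_test_rmse  <- rmse(test_y,  as.numeric(predict(ridge_fit, newx = test_x)))
c(train = ridge_train_rmse, test = ridge_test_rmse)
```

A test RMSE that is much larger than the training RMSE would suggest overfitting; similar values suggest the model generalizes reasonably well.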

LASSO

  1. Use the cv.glmnet function to estimate the lambda.min and lambda.1se values. Compare and discuss the values.

  2. Plot the results from the glmnet function and provide an interpretation. What does this plot tell us?

  3. Fit a LASSO regression model against the training set and report on the coefficients. Do any coefficients reduce to zero? If so, which ones?

  4. Determine the performance of the fitted model against the training set by calculating the root mean square error (RMSE): sqrt(mean((actual - predicted)^2)).

  5. Determine the performance of the fit model against the test set by calculating the root mean square error (RMSE). Is your model overfit?
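
The LASSO steps follow the same pattern with alpha = 1. The sketch below is illustrative only; the split ratio, seed, fold count, the use of lambda.1se for the final fit, and the names rmse and zeroed are assumptions. It also shows one way to list the coefficients that the penalty shrinks exactly to zero.

```r
# Sketch of the LASSO workflow (alpha = 1 selects the lasso penalty in glmnet).
library(ISLR)
library(glmnet)

set.seed(123)   # assumed seed; affects both the split and the CV folds
x <- model.matrix(Grad.Rate ~ ., data = College)[, -1]   # numeric predictors
y <- College$Grad.Rate
train_idx <- sample(seq_len(nrow(x)), size = floor(0.7 * nrow(x)))
train_x <- x[train_idx, ];  train_y <- y[train_idx]
test_x  <- x[-train_idx, ]; test_y  <- y[-train_idx]

# Hypothetical helper implementing the RMSE formula from step 4.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Step 1: cross-validate over a path of lambda values.
cv_lasso <- cv.glmnet(train_x, train_y, alpha = 1, nfolds = 10)
cv_lasso$lambda.min
cv_lasso$lambda.1se

# Step 2: the top axis of this plot counts the nonzero coefficients per lambda.
plot(cv_lasso)

# Step 3: refit at a single lambda and find coefficients that are exactly zero.
lasso_fit <- glmnet(train_x, train_y, alpha = 1, lambda = cv_lasso$lambda.1se)
b      <- as.matrix(coef(lasso_fit))[, 1]   # named numeric vector, incl. intercept
zeroed <- names(b)[b == 0]                  # predictors dropped by the penalty
zeroed

# Steps 4 and 5: RMSE on the training and test sets.
lasso_train_rmse <- rmse(train_y, as.numeric(predict(lasso_fit, newx = train_x)))
lasso_test_rmse  <- rmse(test_y,  as.numeric(predict(lasso_fit, newx = test_x)))
c(train = lasso_train_rmse, test = lasso_test_rmse)
```

Which coefficients end up in zeroed depends on the split and on the lambda chosen; lambda.1se is the more heavily penalized option and typically zeroes out at least as many predictors as lambda.min.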

Comparison

  1. Which model performed better and why? Is that what you expected?

  2. Refer to the Intermediate_Analytics_Feature_Selection_R.pdf document for how to perform stepwise selection and then fit a model. Did this model perform better or as well as Ridge regression or LASSO? Which method do you prefer and why?
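
For the stepwise comparison, one possible baseline uses base R's step() on a linear model, as sketched below. The referenced PDF may describe a different procedure; the AIC-based step() call, the split, the seed, and the helper rmse() are assumptions for illustration. Using the same seed and split as the Ridge and LASSO fits keeps the RMSE values comparable.

```r
# Sketch: stepwise selection with base R's step(), which compares models by AIC.
library(ISLR)

set.seed(123)   # assumed seed; reuse the one from the Ridge/LASSO split
train_idx <- sample(seq_len(nrow(College)), size = floor(0.7 * nrow(College)))
train <- College[train_idx, ]
test  <- College[-train_idx, ]

# Hypothetical helper implementing the RMSE formula used earlier.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Start from the full model; direction = "both" lets step() drop terms
# and re-add previously dropped ones. trace = 0 suppresses the step log.
full_model <- lm(Grad.Rate ~ ., data = train)
step_fit   <- step(full_model, direction = "both", trace = 0)

formula(step_fit)   # predictors retained by the stepwise search

step_train_rmse <- rmse(train$Grad.Rate, predict(step_fit, newdata = train))
step_test_rmse  <- rmse(test$Grad.Rate,  predict(step_fit, newdata = test))
c(train = step_train_rmse, test = step_test_rmse)
```

Comparing the test RMSE values across the three models gives a like-for-like measure of predictive performance for the Comparison questions.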
