## Machine Learning CS688: Python Code

### 1 (40 points) Regularized Logistic Regression

Let $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ be the training examples, where $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$. The negative log-likelihood of regularized logistic regression, denoted by $L(w)$, is

$$L(w) = C \sum_{i=1}^{n} \ln\left(1 + \exp(-y_i w^\top x_i)\right) + \frac{1}{2}\|w\|^2,$$

where $C$ is a parameter that controls the balance between the loss and the regularization. The optimal $w \in \mathbb{R}^d$ is obtained by minimizing $L(w)$.

- Show that $w_k = w_l$ for the optimal solution $w$ if two attributes $k$ and $l$ are identical, i.e., $x_{i,k} = x_{i,l}$ for every training example $x_i$.
- Train and test a regularized logistic regression model on two data sets, namely the breast cancer and sonar data sets, which are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#breast-cancer and https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#sonar. We will use the scaled version for our experiment. A copy of them is also enclosed in this homework. Use the provided training/testing split. In particular, the file “xxx-scale-train-indices.txt” contains the indices of examples in the original file for training, and “xxx-scale-test-indices.txt” contains the indices of examples in the original file for testing.
  - Use 5-fold cross validation to decide the best value of the parameter $C$. The candidate values for $C$ are 0.1, 1, 10, 100, 1000. For each $C$, report the training error and the validation error. Choose the $C$ that yields the lowest validation error.
  - Use the selected best $C$ to train a logistic regression model on the whole training data, then evaluate and report its performance (by error rate) on the testing data.
  - Report the results on the two data sets.

Note: To train a regularized logistic regression model with the liblinear library, you can use the option “-s 0”.
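The cross-validation procedure in Problem 1 can be sketched in plain NumPy as follows. This is only an illustration under several assumptions: it runs on synthetic data rather than the breast cancer or sonar files, it omits a bias term, and it minimizes $L(w)$ with a damped Newton method as a stand-in for liblinear. The helper names (`objective`, `fit_logreg`, `error_rate`, `cross_validate`) are hypothetical and not part of any library.

```python
import numpy as np

def objective(w, X, y, C):
    """L(w) = C * sum_i ln(1 + exp(-y_i w^T x_i)) + 0.5 * ||w||^2."""
    margins = y * (X @ w)
    return C * np.sum(np.logaddexp(0.0, -margins)) + 0.5 * (w @ w)

def fit_logreg(X, y, C, max_iter=100, tol=1e-8):
    """Minimize L(w) with a damped Newton method (assumed stand-in for liblinear)."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        margins = y * (X @ w)
        s = 0.5 * (1.0 - np.tanh(0.5 * margins))  # sigmoid(-margins), computed stably
        grad = w - C * (X.T @ (y * s))
        if np.linalg.norm(grad) <= tol * max(1.0, C):
            break
        hess = np.eye(w.size) + C * (X.T * (s * (1.0 - s))) @ X
        step = np.linalg.solve(hess, grad)
        # Backtracking line search keeps the Newton step from overshooting.
        t, f0 = 1.0, objective(w, X, y, C)
        while t > 1e-10 and objective(w - t * step, X, y, C) > f0 - 1e-4 * t * (grad @ step):
            t *= 0.5
        w = w - t * step
    return w

def error_rate(w, X, y):
    """Fraction of examples whose predicted sign differs from the label."""
    pred = np.where(X @ w >= 0, 1.0, -1.0)
    return float(np.mean(pred != y))

def cross_validate(X, y, C_values, k=5, seed=0):
    """Return {C: (mean training error, mean validation error)} over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    results = {}
    for C in C_values:
        train_errs, val_errs = [], []
        for i in range(k):
            val = folds[i]
            trn = np.concatenate([folds[j] for j in range(k) if j != i])
            w = fit_logreg(X[trn], y[trn], C)
            train_errs.append(error_rate(w, X[trn], y[trn]))
            val_errs.append(error_rate(w, X[val], y[val]))
        results[C] = (float(np.mean(train_errs)), float(np.mean(val_errs)))
    return results

# Synthetic stand-in data (assumption: the real experiment loads the scaled files).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = np.where(X @ w_true + 0.5 * rng.normal(size=200) >= 0, 1.0, -1.0)

results = cross_validate(X, y, [0.1, 1, 10, 100, 1000])
best_C = min(results, key=lambda C: results[C][1])  # lowest validation error
for C, (tr, va) in results.items():
    print(f"C={C}: train error={tr:.3f}, validation error={va:.3f}")
print("selected C:", best_C)
```

In the actual experiment, the final step would be to refit on the whole training split with `best_C` and report `error_rate` on the testing split. Appending a copy of an existing column to `X` and refitting is also a quick numerical sanity check of the first bullet: the two corresponding weights should come out equal.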

### 2 (30 points) Support Vector Machine

- Repeat the same experiments as in Problem 1 using a linear SVM. To train a linear SVM with liblinear, you can use the option “-s 3”.
- Repeat the same experiments as in Problem 1 using a kernel SVM. To train a kernel SVM with libsvm, you can use the option “-s 0”. Use the option “-t” to choose different types of kernels. Try the polynomial kernel and the RBF kernel with default parameter values.
- Compare the test errors given by logistic regression, the linear SVM, and the kernel SVMs.

### 3 (30 points) The Constrained Version of Ridge Regression

In class, we have mentioned the constrained version of ridge regression:

$$\min_{w \in \mathbb{R}^d} \|\Phi w - y\|^2 \quad \text{s.t.} \quad \|w\|_2 \le s, \tag{1}$$

where $\Phi \in \mathbb{R}^{n \times d}$ and $y \in \mathbb{R}^n$. Answer the following questions.

- (10 points) Prove that this problem is a convex optimization problem.
- (20 points) Does strong duality hold? If yes, derive the KKT conditions regarding the optimal solution $w^*$ for the above problem.
- (Extra credit: 20 points) Does a closed-form solution exist? If yes, derive the closed-form solution. If not, can you propose an algorithm for computing the optimal solution (describe the key steps of your algorithm)?
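One possible numerical approach to problem (1), offered as a sketch rather than the expected answer, is to search over the Lagrange multiplier. It assumes the constraint is read as $\|w\|_2 \le s$ with $s > 0$ and relies on the stationarity condition $\Phi^\top(\Phi w - y) + \lambda w = 0$, so that $w(\lambda) = (\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top y$ for $\lambda > 0$. Since $\|w(\lambda)\|_2$ decreases as $\lambda$ grows, bisection can find the $\lambda$ at which the constraint becomes tight. The function name `constrained_ridge` and the synthetic data are hypothetical.

```python
import numpy as np

def constrained_ridge(Phi, y, s, tol=1e-10, max_iter=200):
    """Sketch: solve min ||Phi w - y||^2 s.t. ||w||_2 <= s by bisection on lambda.

    Returns (w, lam), where lam is the multiplier of the norm constraint.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    d = Phi.shape[1]
    A = Phi.T @ Phi
    b = Phi.T @ y

    # Case 1: the (minimum-norm) least-squares solution is already feasible.
    w0 = np.linalg.lstsq(Phi, y, rcond=None)[0]
    if np.linalg.norm(w0) <= s:
        return w0, 0.0

    def w_of(lam):
        # Ridge solution for a fixed multiplier lam > 0.
        return np.linalg.solve(A + lam * np.eye(d), b)

    # Case 2: the constraint is active; bracket lambda, then bisect.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(w_of(hi)) > s:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(w_of(mid)) > s:
            lo = mid  # norm still too large: need more shrinkage
        else:
            hi = mid  # feasible: try less shrinkage
        if hi - lo <= tol * max(1.0, hi):
            break
    return w_of(hi), hi

# Synthetic example (assumed data, for illustration only).
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 4))
y_reg = Phi @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=50)

w_tight, lam_tight = constrained_ridge(Phi, y_reg, s=1.0)
w_loose, lam_loose = constrained_ridge(Phi, y_reg, s=1e6)
print("active constraint:  ||w|| =", np.linalg.norm(w_tight), "lambda =", lam_tight)
print("inactive constraint: ||w|| =", np.linalg.norm(w_loose), "lambda =", lam_loose)
```

The two calls illustrate complementary slackness: when the constraint is active, $\lambda > 0$ and $\|w\|_2 = s$ up to the bisection tolerance, and when it is inactive, $\lambda = 0$ and the least-squares solution is returned unchanged.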