Case Study: Support Vector Machines Based on Tail Risk Measures

Case study background and problem formulations

Instructions for optimization with PSG Run-File, PSG MATLAB Toolbox, PSG MATLAB Subroutines and PSG R.

Problem 1a: Nu-SVM
Minimize quadratic + cvar_risk
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
——————————————————————–
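The objective above can be sketched in plain Python for a fixed candidate classifier. This is only an illustration under stated assumptions, not the PSG implementation: the loss of a sample is taken to be its negative margin, the quadratic term is written with a 1/2 coefficient, all scenarios are equally likely, and the names `cvar` and `nu_svm_objective` are hypothetical helpers.

```python
import math

def cvar(losses, alpha):
    """Sample CVaR at level alpha (0 <= alpha < 1), equally weighted scenarios.

    Uses the Rockafellar-Uryasev form: VaR plus the scaled mean excess over VaR.
    """
    n = len(losses)
    ordered = sorted(losses)
    k = max(math.ceil(alpha * n), 1)      # 1-based index of the alpha-quantile
    var = ordered[k - 1]
    excess = sum(max(l - var, 0.0) for l in ordered)
    return var + excess / ((1.0 - alpha) * n)

def nu_svm_objective(w, b, features, labels, alpha):
    """Quadratic term plus CVaR of the negative margins (assumed loss definition)."""
    losses = [
        -y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        for x, y in zip(features, labels)
    ]
    quadratic = 0.5 * sum(wi * wi for wi in w)   # assumed 1/2 scaling
    return quadratic + cvar(losses, alpha)

# Made-up, linearly separable toy data.
features = [[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0], [-2.0, 0.0]]
labels = [1, 1, -1, -1]
objective = nu_svm_objective([1.0, 0.0], 0.0, features, labels, alpha=0.5)
print(objective)
```

An optimizer would search over `w` and `b` to minimize this value; the sketch only evaluates it at one point.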
Data and solution in Run-File Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec)
Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | 0.0 | 0.03
Data and solution in MATLAB Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | 0.0 | 0.04
Problem 1a’: Cross Validation for Nu-SVM
2-fold cross-validation
Minimize quadratic + cvar_risk
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
——————————————————————–
Download Problem Data

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Cycle statement, Data, Solution) | 25 | 150 | -0.00249 | 0.02
Dataset2 | 25 | 150 | -0.00175 | 0.04
Cross-Validation shows model performance on two pairs of in-sample and out-of-sample datasets.
For the first pair of datasets (file “solution_problem_1.txt”):

In-sample CVaR = cvar_risk(0.5,cutout(1,2,matrix_prior_scenarios)) = -9.9e-003
Out-of-sample CVaR = cvar_risk(0.5,takein(1,2,matrix_prior_scenarios)) = 1.85e-002

For the second pair of datasets (file “solution_problem_2.txt”):

In-sample CVaR = cvar_risk(0.5,cutout(2,2,matrix_prior_scenarios)) = -7.02e-003
Out-of-sample CVaR = cvar_risk(0.5,takein(2,2,matrix_prior_scenarios)) = 9.15e-003

The in-sample and out-of-sample CVaR values differ substantially in both pairs (for example, -9.9e-003 versus 1.85e-002 in the first pair), which indicates significant over-fitting of the model.
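The fold bookkeeping behind these numbers can be sketched as follows. The helpers `take_in` and `cut_out` are hypothetical Python stand-ins written for this illustration, assuming that takein keeps only fold k of the scenario rows and cutout keeps everything except fold k, with folds taken as contiguous blocks; the toy data and the placeholder threshold "fit" are made up and are not the Nu-SVM solver.

```python
import math

def cvar(losses, alpha):
    """Sample CVaR at level alpha for equally weighted scenarios."""
    n = len(losses)
    ordered = sorted(losses)
    var = ordered[max(math.ceil(alpha * n), 1) - 1]
    return var + sum(max(l - var, 0.0) for l in ordered) / ((1.0 - alpha) * n)

def fold_bounds(k, n_folds, size):
    """Start/stop row indices of fold k (1-based) among n_folds contiguous folds."""
    return (k - 1) * size // n_folds, k * size // n_folds

def take_in(k, n_folds, rows):
    """Keep only fold k (assumed out-of-sample part)."""
    start, stop = fold_bounds(k, n_folds, len(rows))
    return rows[start:stop]

def cut_out(k, n_folds, rows):
    """Drop fold k and keep the rest (assumed in-sample part)."""
    start, stop = fold_bounds(k, n_folds, len(rows))
    return rows[:start] + rows[stop:]

def cross_validate(rows, n_folds, fit, loss, alpha):
    """Return (in-sample CVaR, out-of-sample CVaR) for each fold."""
    results = []
    for k in range(1, n_folds + 1):
        train, test = cut_out(k, n_folds, rows), take_in(k, n_folds, rows)
        model = fit(train)
        in_sample = cvar([loss(model, r) for r in train], alpha)
        out_of_sample = cvar([loss(model, r) for r in test], alpha)
        results.append((in_sample, out_of_sample))
    return results

def fit_threshold(train):
    """Placeholder 'training': threshold at the mean feature value."""
    return sum(x for x, _ in train) / len(train)

def negative_margin(threshold, row):
    """Loss of one (feature, label) row for a 1-D threshold classifier."""
    x, y = row
    return -y * (x - threshold)

rows = [(-2.0, -1), (1.5, 1), (-1.0, -1), (2.5, 1),
        (-0.5, -1), (0.5, 1), (-3.0, -1), (3.0, 1)]
results = cross_validate(rows, 2, fit_threshold, negative_margin, alpha=0.5)
for k, (ins, outs) in enumerate(results, start=1):
    print(f"fold {k}: in-sample CVaR = {ins:.4f}, out-of-sample CVaR = {outs:.4f}")
```

A large gap between the two values in each fold is the over-fitting signal discussed above.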

Problem 1b: Nu-SVM with VaR Measure
Minimize quadratic + var_risk
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Var_risk = Value-at-Risk specified by a matrix of scenarios
——————————————————————–
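The difference between the two risk measures can be seen on a small sample of losses: VaR is a quantile of the loss distribution, while CVaR averages the tail at and beyond that quantile, so CVaR is never smaller than VaR at the same level. The sketch below assumes equally weighted scenarios and the lower-quantile convention for VaR; `sample_var` and `sample_cvar` are hypothetical helper names, not PSG functions, and the losses are made up.

```python
import math

def sample_var(losses, alpha):
    """Sample VaR: the alpha-quantile of the losses (lower-quantile convention)."""
    ordered = sorted(losses)
    k = max(math.ceil(alpha * len(ordered)), 1)   # 1-based quantile index
    return ordered[k - 1]

def sample_cvar(losses, alpha):
    """Sample CVaR: VaR plus the scaled mean excess over VaR."""
    n = len(losses)
    var = sample_var(losses, alpha)
    excess = sum(max(l - var, 0.0) for l in losses)
    return var + excess / ((1.0 - alpha) * n)

# Made-up losses with one extreme scenario in the tail.
losses = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 6.0]
var_value = sample_var(losses, 0.75)
cvar_value = sample_cvar(losses, 0.75)
print(var_value, cvar_value)
```

The single extreme loss moves CVaR but leaves VaR unchanged, which is one reason the VaR-based and CVaR-based classifiers can differ.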
Data and solution in Run-File Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec)
Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -707.78409 | 0.04
Data and solution in MATLAB Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -707.238 | 0.03
Problem 2a: Extended Nu-SVM
Minimize cvar_risk
Subject to
Quadratic = 1 (unity constraint)
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
——————————————————————–

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 25 | 1,000 | 0.056529 | 0.40
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data
Problem 2b: Extended Nu-SVM with VaR Measure
Minimize var_risk
Subject to
Quadratic = 1 (unity constraint)
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Var_risk = Value-at-Risk specified by a matrix of scenarios
——————————————————————–

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 25 | 1,000 | -1053.893608 | 5.08
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data
Problem 3a: Robust Nu-SVM
Minimize quadratic + max_cvar_risk
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Max_cvar_risk = Maximum of Conditional Value-at-Risk functions specified by a set of matrices of scenarios
——————————————————————–
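The worst-case term can be sketched as the largest CVaR taken over several sets of scenario losses. This is a minimal illustration assuming equally weighted scenarios within each set and losses already evaluated for a fixed classifier; `cvar` and `max_cvar` are hypothetical helper names, and the numbers are made up.

```python
import math

def cvar(losses, alpha):
    """Sample CVaR at level alpha for equally weighted scenarios."""
    n = len(losses)
    ordered = sorted(losses)
    var = ordered[max(math.ceil(alpha * n), 1) - 1]
    return var + sum(max(l - var, 0.0) for l in ordered) / ((1.0 - alpha) * n)

def max_cvar(loss_sets, alpha):
    """Worst case over scenario sets: the largest CVaR among them."""
    return max(cvar(losses, alpha) for losses in loss_sets)

# Two made-up scenario sets for the same classifier.
loss_sets = [
    [1.0, 2.0, 3.0, 4.0],    # CVaR at 0.5 is 3.5
    [0.0, 0.0, 10.0, 0.0],   # CVaR at 0.5 is 5.0
]
worst = max_cvar(loss_sets, 0.5)
print(worst)
```

Because a maximum of convex functions is convex, this term keeps the convexity of the individual CVaR functions.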

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 25 | 134 | 0.0 | 0.08
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data
Problem 3b: Robust Nu-SVM with VaR Measure
Minimize quadratic + max_var_risk
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
Max_var_risk = Maximum of Value-at-Risk functions specified by a set of matrices of scenarios
——————————————————————–

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 25 | 134 | -1,434.071147 | 0.55
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data
Problem 3c: Regularized Weighted Difference of CVaRs
Minimize quadratic + cvar_difference
——————————————————————–
Quadratic = quadratic function specified by a unit matrix
cvar_difference = weighted difference of two CVaR functions specified by a set of matrices of scenarios
——————————————————————–
Data and solution in Run-File Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec)
Dataset1 (Problem Statement, Data, Solution) | 7 | 230 | -0.0010565 | 0.02
Data and solution in MATLAB Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Matlab code, Data, Solution) | 7 | 230 | -0.0010565 | 0.01
Problem 4a (Primal): Nu-SVM with CVaR in Objective and L_Infinity Norm in Constraint
Minimize cvar_risk
Subject to
L_Infinity_Norm <=1
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
L_infinity_norm = L_infinity norm
——————————————————————–
Data and solution in Run-File Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec)
Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -0.318631 | 0.04
Data and solution in MATLAB Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -0.318631 | 0.02
Problem 4b (Dual): Nu-SVM with L1 Norm in Objective and Envelope Constraint
Maximize -L1_Norm
Subject to
Linear = 0,
Envelope Constraint
——————————————————————–
L1_norm = L1 norm
Envelope Constraint = CVaR envelope set of constraints
——————————————————————–
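The L1 norm appears in this dual objective because it is the dual of the L_infinity norm used in the primal constraint of Problem 4a: the largest inner product of a vector z with any w in the unit L_infinity ball equals the L1 norm of z. The small check below illustrates that fact on made-up numbers; the function names are hypothetical helpers and are not part of PSG.

```python
def l1_norm(z):
    """Sum of absolute values of the components."""
    return sum(abs(v) for v in z)

def linf_norm(w):
    """Largest absolute value among the components."""
    return max(abs(v) for v in w)

def dual_of_linf(z):
    """Maximize <w, z> over ||w||_inf <= 1; the maximizer is w = sign(z)."""
    w = [1.0 if v >= 0 else -1.0 for v in z]
    value = sum(wi * zi for wi, zi in zip(w, z))
    return value, w

z = [0.5, -2.0, 1.5]
value, w_star = dual_of_linf(z)
print(value, l1_norm(z), linf_norm(w_star))
```

The equal objective values reported for Problems 4a and 4b are consistent with this primal-dual pairing.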

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 1024 | 1,000 | -0.318631 | 0.01
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data
Problem 5a (Primal): Nu-SVM with CVaR in Objective and Deltoidal (Mixture of L1 and L_Infinity) Norm in Constraint
Minimize cvar_risk
Subject to
Deltoidal_norm <=1
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
Deltoidal_norm = Mixture of L1 and L_infinity norm
——————————————————————–
Data and solution in Run-File Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec)
Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -0.119276 | 0.04
Data and solution in MATLAB Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -0.119276 | 0.04
Problem 5b (Dual): Nu-SVM with Dual Deltoidal Norm in Objective and Envelope Constraint
Maximize -Dual_Deltoidal_Norm
Subject to
Linear = 0,
Envelope Constraint
——————————————————————–
Dual_deltoidal_norm = Norm dual to mixture of L1 and L_infinity norms
Envelope constraint = CVaR envelope set of constraints
——————————————————————–

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 1025 | 1,000 | -0.119278 | 0.28
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data
Problem 6a (Primal): Nu-SVM with CVaR in Objective and CVaR Norm in Constraint
Minimize cvar_risk
Subject to
CVaR_Norm <=1
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
CVaR_norm = Conditional Value-at-Risk specified on the components of a point
——————————————————————–
Data and solution in Run-File Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec)
Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -0.074699 | 0.02
Data and solution in MATLAB Environment

Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec)
Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -0.074699 | 0.02
Problem 6b (Dual): Nu-SVM with Dual CVaR Norm in Objective and Envelope Constraint
Maximize -Dual_CVaR_Norm
Subject to
Linear = 0,
Envelope Constraint
——————————————————————–
Dual_CVaR_norm = maximum of L1 and L_Infinity norms
Envelope constraint = CVaR envelope set of constraints
——————————————————————–

 | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 1025 | 1,000 | -0.073899 | 0.08
Environments
Run-File | Problem Statement, Data, Solution
Matlab Toolbox | Data
Matlab Subroutines | Matlab Code, Data
R | R Code, Data

 

CASE STUDY SUMMARY
This case study illustrates the application of the CVaR methodology to the Support Vector Machine (SVM) classification problem.
Given training data consisting of feature vectors and their class labels, the basic idea of SVM is to find an optimal separating hyperplane in the feature space that maximizes the margin between the two classes. Cortes and Vapnik (1995) proposed solving the SVM classification problem with quadratic programming. An alternative formulation, known as nu-SVM, was suggested by Scholkopf et al. (2000). Takeda and Sugiyama (2008) proposed using the CVaR risk measure in classification and formulated the SVM learning problem as a CVaR minimization problem. Wang (2009) proposed a robust nu-Support Vector Machine based on worst-case CVaR minimization.
Tsyurmasto and Uryasev (2012) proposed Support Vector Machines based on Value-at-Risk (VaR) measures. They obtained new SVM classifiers based on the VaR risk measure, corresponding to the following CVaR-based SVMs: Nu-SVM, Extended Nu-SVM, and Robust Nu-SVM.
The case study contains the following problem formulations: 1) regularized CVaR, 2) regularized VaR, 3) CVaR minimization with unity constraint, 4) VaR minimization with unity constraint, 5) regularized robust CVaR minimization, 6) regularized robust VaR minimization. Formulations 1, 2, 5, and 6 include an additional quadratic regularization term.

 

References
• Tsyurmasto, P., Uryasev, S. (2012): Support Vector Machine Based on Value-at-Risk Measure. Working Paper.
• Cortes, C. and V. Vapnik (1995): Support-vector networks, Machine Learning 20, 273-297.
• Scholkopf, B., Smola, A., Williamson, R., and P. Bartlett (2000): New support vector algorithms, Neural Computation 12, 1207-1245.
• Takeda, A. and M. Sugiyama (2008): Nu-support vector machine as conditional value-at-risk minimization, in Proceedings 25th International Conference on Machine Learning, Morgan Kaufmann, Montreal, Quebec, Canada, 1056-1063.
• Tsyurmasto, P., Uryasev, S. (2012): Advanced Risk Measures in Estimation and Classification. Conference Proceedings, Vilnius, Lithuania, July, 2012.
• Wang, Y. (2009): Robust nu-Support Vector Machine Based on Worst-case Conditional Value-at-Risk Minimization. College of Finance, Zhejiang Gongshang University, Hangzhou 3100018, Zhejiang, China. Optimization Methods and