Case Study: Classification by Buffered AUC (bAUC) Maximization

Case study background and problem formulations

Instructions for optimization with PSG Run-File, PSG MATLAB Toolbox, PSG MATLAB Subroutines and PSG R.

——————————————————————–———————————————————————————
At the link below, a Python script is available for calculating bAUC and plotting the bROC curve as described in “Maximization of AUC and Buffered AUC in Binary Classification”, by Matthew Norton & Stan Uryasev.
The Python script has the following dependencies (any recent version of each package should work):
NumPy, Gurobi, gurobipy, Matplotlib, scikit-learn
Click Here For Python Script
——————————————————————–———————————————————————————
At the link below, a MATLAB script is available that demonstrates how to optimize and calculate bAUC, using the same artificial data set as in Section 4.5.2 of “Maximization of AUC and Buffered AUC in Binary Classification.”
Click Here for MATLAB Script
——————————————————————–———————————————————————————
Additional instructions for optimization with PSG Run-File, PSG MATLAB Toolbox and PSG MATLAB Subroutines.
——————————————————————–———————————————————————————
PROBLEM1: problem_bAUC_1
Minimizing Pm_pen (Buffered Probability of Exceedance)
——————————————————————–——————
Pm_pen = Partial Moment Penalty for Loss
——————————————————————–——————
        | # of Variables | # of Scenarios         | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 6              | 3990*2788 = 11,124,120 | 0.7354          | 0.13
Environments
Run-File: Problem Statement, Data, Solution
Matlab Toolbox: Data
Matlab Subroutines: Matlab Code, Data
R: R Code, Data
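
The scenario count above is the product 3990*2788, which suggests one scenario per pair of observations taken from the two classes. The following Python sketch illustrates that construction on toy data and evaluates a partial moment penalty for a fixed weight vector. It is not the PSG problem statement: the loss L = -w·(x_pos - x_neg), the threshold of -1, and the names pairwise_differences and partial_moment_penalty are assumptions made for illustration, following the bAUC formulation in the referenced paper.

```python
def pairwise_differences(pos, neg):
    """Build one scenario per (positive, negative) pair: d = x_pos - x_neg."""
    return [[p - n for p, n in zip(x_pos, x_neg)] for x_pos in pos for x_neg in neg]


def partial_moment_penalty(w, scenarios, threshold=-1.0):
    """Average of max(0, L - threshold) over scenarios, with assumed loss L = -w.d."""
    total = 0.0
    for d in scenarios:
        loss = -sum(wi * di for wi, di in zip(w, d))
        total += max(0.0, loss - threshold)
    return total / len(scenarios)


# Toy data (hypothetical): 2 positive and 3 negative observations, 2 features each.
pos = [[2.0, 1.0], [3.0, 0.5]]
neg = [[1.0, 1.0], [0.5, 2.0], [2.5, 0.0]]

scenarios = pairwise_differences(pos, neg)
n_scenarios = len(scenarios)  # 2 * 3 = 6 scenarios
value = partial_moment_penalty([1.0, 0.0], scenarios)
print(n_scenarios, value)
```

With the full data set the same pairing would produce 11,124,120 rows, which is why the scenario matrix is large even though the number of variables is small.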
PROBLEM2: problem_bAUC_2
Minimizing C1*Pm_pen_g + C2*Pm_pen (cost function)
——————————————————————–——————
Pm_pen = Partial Moment Penalty for Loss
Pm_pen_g = Partial Moment Penalty for Gain
——————————————————————–——————
        | # of Variables | # of Scenarios         | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 1              | 3990*2788 = 11,124,120 | 1.7397          | <1
Environments
Run-File: Problem Statement, Data, Solution
Matlab Toolbox: Data
Matlab Subroutines: Matlab Code, Data
R: R Code, Data
PROBLEM3: problem_bAUC_3
Solve the problem for 7 values of alpha
Minimizing C1*Pm_pen_g + C2*Pm_pen (cost function)
subject to
Pm_pen <= 1-alpha (constraint on bAUC)
——————————————————————–——————
Pm_pen = Partial Moment Penalty for Loss
Pm_pen_g = Partial Moment Penalty for Gain
——————————————————————–——————

        | # of Variables | # of Scenarios         | Objective Value | Solving Time, PC 3.14GHz (sec)
Dataset | 7              | 3990*2788 = 11,124,120 | 0.7354          | 0.22
Environments
Run-File: Problem Statement, Data, Solution
Matlab Toolbox: Data
Matlab Subroutines: Matlab Code, Data
R: R Code, Data

 

CASE STUDY SUMMARY
This case study considers the buffered version (bAUC) of a classification criterion, the Area Under the Receiver Operating Characteristic Curve (AUC). Two optimization settings are covered: 1) maximizing bAUC and then finding the intercept by minimizing a cost function; 2) constructing an “Efficient Frontier” by minimizing the cost function subject to a constraint on bAUC, for several constraint levels.
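
As an illustration of how bAUC relates to AUC, the sketch below computes both from classifier scores. It assumes the ranking error for a pair is xi = score_neg - score_pos, that AUC = 1 - P(xi >= 0), and that bAUC = 1 - min over a >= 0 of E[max(0, a*xi + 1)], following the bPOE minimization formula in the referenced paper. The function names are hypothetical, and this is not the Python script linked above.

```python
def ranking_errors(pos_scores, neg_scores):
    """xi = score_neg - score_pos for every (positive, negative) pair."""
    return [n - p for p in pos_scores for n in neg_scores]


def empirical_auc(pos_scores, neg_scores):
    """Fraction of pairs ranked correctly; ties count as errors (assumption)."""
    xi = ranking_errors(pos_scores, neg_scores)
    return sum(1 for e in xi if e < 0) / len(xi)


def empirical_bauc(pos_scores, neg_scores):
    """1 - min_{a >= 0} mean(max(0, a*xi + 1)).

    The objective is convex and piecewise linear in a, with kinks only at
    a = -1/xi for xi < 0, so checking a = 0 and those kinks is sufficient.
    """
    xi = ranking_errors(pos_scores, neg_scores)

    def expected_hinge(a):
        return sum(max(0.0, a * e + 1.0) for e in xi) / len(xi)

    candidates = [0.0] + [-1.0 / e for e in xi if e < 0]
    return 1.0 - min(expected_hinge(a) for a in candidates)


# Toy scores (hypothetical): one of the four pairs is misranked.
auc = empirical_auc([3.0, 2.0], [1.0, 2.5])
bauc = empirical_bauc([3.0, 2.0], [1.0, 2.5])
print(auc, bauc)  # 0.75 0.5
```

In this toy example bAUC is lower than AUC because it accounts for the size of the ranking errors, not only their count. The pairwise loop is quadratic in the sample size, so it is meant for small illustrations only.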
References
• Norton, M. and S. Uryasev. Maximization of AUC and Buffered AUC in Classification. Research Report 2014-2, ISE Dept., University of Florida, October 2014.