# Case Study: Minimization of Kantorovich-Rubinstein distance between two distributions (kantor, ksm_avg, cardn, linear)

Back to main page

** Case study background and formulation of problems**

**PROBLEM1: problem_Kantorovich_minimize**

Minimize Kantor (minimize Kantorovich-Rubinstein distance)

subject to

Linear = 1 (sum of probabilities constraint)

Box constraints (lower bounds on probabilities)

——————————————————————–——————

Kantor = Kantorovich- Rubinstein distance

Linear = linear function

Box constraints = constraints on individual decision variables

——————————————————————–——————

Problem statement without initial point.

# of Variables |
# of Scenarios |
Objective Value |
Solving Time, PC 3.14GHz (sec) |
||||

Dataset |
80 | N/A | 4.706041e-03 | 0.02 | |||
---|---|---|---|---|---|---|---|

Environments |
|||||||

Run-File | Problem Statement | Data | Solution | ||||

Matlab Toolbox | Data | ||||||

Matlab Subroutines | Matlab Code | Data | |||||

R | R Code | Data |

Problem statement with initial point 0.

# of Variables |
# of Scenarios |
Objective Value |
Solving Time, PC 3.14GHz (sec) |
||||

Dataset | 80 | N/A | 3.342386e-03 | 0.29 | |||
---|---|---|---|---|---|---|---|

Environments |
|||||||

Run-File | Problem Statement | Data | Solution | ||||

Matlab Toolbox | Data | ||||||

Matlab Subroutines | Matlab Code | Data | |||||

R | R Code | Data |

**PROBLEM2: problem_KSMavg_with_Cardinality**

Minimize KSM_avg (minimize Average Kolmogorov-Smirnov distance)

subject to

Cardn ≤ Const (constraint on cardinality)

Linear = 1 (sum of probabilities constraint)

Box constraints (lower bounds on probabilities)

——————————————————————–——————

KSM_avg = Average Kolmogorov-Smirnov distance

Cardn = Cardinality

Linear = linear function

Box constraints = constraints on individual decision variables

——————————————————————–——————

Problem statement with LINEARIZE=1 option.

# of Variables |
# of Scenarios |
Objective Value |
Solving Time, PC 3.14GHz (sec) |
||||

Dataset |
448 | N/A | 3.908609e-03 | 144.8 | |||
---|---|---|---|---|---|---|---|

Environments |
|||||||

Run-File | Problem Statement | Data | Solution | ||||

Matlab Toolbox | Data | ||||||

Matlab Subroutines | Matlab Code | Data | |||||

R | R Code | Data |

Problem statement with MIP=1 for constraint on cardinality function.

# of Variables |
# of Scenarios |
Objective Value |
Solving Time, PC 3.14GHz (sec) |
||||

Dataset |
448 | N/A | 3.023553e-03 | 3600.3 | |||
---|---|---|---|---|---|---|---|

Environments |
|||||||

Run-File | Problem Statement | Data | Solution | ||||

Matlab Toolbox | Data | ||||||

Matlab Subroutines | Matlab Code | Data | |||||

R | R Code | Data |

**PROBLEM3: problem_KSMavg_with_Cvar_inequality**

Minimize KSM_avg (minimize Average Kolmogorov-Smirnov distance)

subject to

pcvars ≥ Consts (constraints on CVaRs)

Linear = 1 (sum of probabilities constraint)

Box constraints (lower bounds on probabilities)

——————————————————————–——————

KSM_avg = Average Kolmogorov-Smirnov distance

pcvars = CVaRs for discrete distribution as a function of atom probabilities with fixed atom locations

Linear = linear function

Box constraints = constraints on individual decision variables

——————————————————————–——————

Problem statement with INEQUALITIES.

# of Variables |
# of Scenarios |
Objective Value |
Solving Time, PC 3.14GHz (sec) |
||||

Dataset |
112 | N/A | 1.265e-03 | 0.14 | |||
---|---|---|---|---|---|---|---|

Environments |
|||||||

Run-File | Problem Statement | Data | Solution | ||||

Matlab Toolbox | Data | ||||||

Matlab Subroutines | Matlab Code | Data | |||||

R | R Code | Data |

**PROBLEM4: problem_KSMavg_Cvar_equality**

Minimize KSM_avg (minimize Average Kolmogorov-Smirnov distance)

subject to

pcvars = Consts (constraints on CVaRs)

Linear = 1 (sum of probabilities constraint)

Box constraints (lower bounds on probabilities)

——————————————————————–——————

KSM_avg = Average Kolmogorov-Smirnov distance

pcvars = CVaRs for discrete distribution as a function of atom probabilities with fixed atom locations

Linear = linear function

Box constraints = constraints on individual decision variables

——————————————————————–——————

Problem statement with EQUALITIES.

# of Variables |
# of Scenarios |
Objective Value |
Solving Time, PC 3.14GHz (sec) |
||||

Dataset |
112 | N/A | 1.2651e-03 | 0.16 | |||
---|---|---|---|---|---|---|---|

Environments |
|||||||

Run-File | Problem Statement | Data | Solution | ||||

Matlab Toolbox | Data | ||||||

Matlab Subroutines | Matlab Code | Data | |||||

R | R Code | Data |

**CASE STUDY SUMMARY**

This case study demonstrates how to approximate a fixed one-dimensional discrete distribution by some other one-dimensional discrete distribution with variable probabilities and variable locations of atoms.To find the best approximations we minimized Kantorovich-Rubinstein and Average Kolmogorov-Smirnov distances between distributions.

PROBLEM1 minimizes Kantorovich-Rubinstein distance and finds the best approximating discrete distribution with a fixed number of atoms and variable probabilities and locations of the atoms. We considered two cases: without and with initial approximating distribution (initial point in the optimization problem).

PROBLEM 2 minimizes Average Kolmogorov-Smirnov distance and finds an approximating distribution with fixed locations and variable probabilities of the atoms. In this case, locations of atoms of the approximating distribution coincide with the location of atoms of the target distribution. To constraint the number of atoms with nonzero probabilities in the approximating distribution, we have imposed a cardinality constraint. We considered two options for the cardinality constraint: 1) “linearize=1”; 2) “mip=1”. These two options initiate different optimization approaches with cardinality constraints.

PROBLEM 3 and PROBLEM 4 minimize Average Kolmogorov-Smirnov distance and find an approximating distribution with fixed locations and variable probabilities of atoms. The number of atoms of the approximating distribution is 4 times smaller than the number of atoms of the target distribution. To assure that the right tail of the approximating distribution correctly matches the right tail of the target distribution we have imposed two CVaR constraints. We want to minimize distance between distributions and match two CVaRs of approximating and target distributions. We have considered two cases: 1) Problem 3: CVaRs of approximating distribution are larger than CVaRs of target distribution (in this case tails of approximating distribution are heavier than tail of target distribution). This problem is convex and optimality of the solution in guaranteed; 2) Problem 4: CVaRs of approximating distribution are equal to CVaRs of target distribution. This problem may have multiple extremums and solver finds some extremum(may be a local one).

**References**