Sunday, August 15, 2021

Stable Marriage Problem

An inter-cast marriage ceremony in Lalitpur [6]



The Stable Marriage Problem is a fascinating problem. In this problem, we want to assign ("marry") men to women much like the assignment problem. (I am just following the conventions in the literature here.) There is a twist, however. We want to require that the matchings are stable: there does not exist a combination \((m,w)\) such that \(m\) and \(w\) both prefer each other above their current partner [1].

There are famous algorithms for this problem [2]. But here, I want to look at it as an integer programming problem.

Let's have a look at a tiny data set from [4].



I am not sure what the function of the last column in these two tables is. It may be to indicate unassigned persons. So this would say: persons don't like to be single. I translated this into:
 

----     61 SET m  men

m1,    m2,    m3


----     61 SET w  women

w1,    w2,    w3


----     61 PARAMETER mpref  preferences of men for women (ranking,1=best)

            w1          w2          w3

m1           1           2           3
m2           3           1           2
m3           2           3           1


----     61 PARAMETER wpref  preferences of women for men (ranking,1=best)

            m1          m2          m3

w1           3           2           1
w2           1           3           2
w3           2           1           3


Here the preferences are represented by numbers. The numbers should be interpreted with a bit of care as they are ordinal. This is related to what in economics is called ordinal utility. Here 1 is best, followed by 2, etc. (My choice here is a bit unfortunate as we shall see later on.)
 
In [3], the following MIP model is proposed:


This is not exactly standard notation (it is explained in [3]).  Another notation I see being used is:

Stable Marriage Problem
\[\begin{align}&\sum_m \color{darkred}x_{m,w} = 1 && \forall w \\&\sum_w \color{darkred}x_{m,w} = 1 && \forall m \\&\sum_{m'\lt_w m} \color{darkred}x_{m',w} -\sum_{w'\gt_m w} \color{darkred}x_{m,w'} \le 0  && \forall m,w\\ & \color{darkred}x_{m,w}\in \{0,1\}  \end{align}\]

This needs some explanation. 
  • \(\displaystyle\sum_{m'\lt_w m} \color{darkred}x_{m',w}=1\) indicates an inferior man \(m'\) compared to \(m\) (according to \(w\)) has been assigned to \(w\),
  • \(\displaystyle\sum_{w'\gt_m w} \color{darkred}x_{m,w'}=1\) indicates a superior woman \(w'\) compared to \(w\) (according to \(m\)) has been assigned to \(m\),
  • the complete constraint \[\sum_{m'\lt_w m} \color{darkred}x_{m',w} -\sum_{w'\gt_m w} \color{darkred}x_{m,w'} \le 0 \>\> \forall m,w\] can be interpreted as: if \(w\) marries someone less desirable than \(m\) then \(m\) must be matched to a woman better than \(w\). If this is not the case then both \(m\) and \(w\) are better off marying each other. That would not be "stable."
  • A feasible solution to these equations gives us a stable matching.

It may be interesting to see how I implemented this in GAMS. Again I use the approach where we precalculate data to simplify the constraints. First I calculated the parameters worseMen and betterWomen.


*----------------------------------------------------------------
* derived data
*----------------------------------------------------------------

sets
  worseMen(w,m,mm)     
'worse men than m for w'
  betterWomen(m,w,ww)  
'better women than w for m'
;
worseMen(w,m,mm) = wpref(w,mm)>wpref(w,m);
betterWomen(m,w,ww) = mpref(m,ww)<mpref(m,w);

 

Note that worse corresponds to \(\gt\) and better to \(\lt\). This is due to the way I encoded the ranking. The output looks like:


----     61 SET worseMen  worse men than m for w

               m1          m2          m3

w1.m2         YES
w1.m3         YES         YES
w2.m1                     YES         YES
w2.m3                     YES
w3.m1                                 YES
w3.m2         YES                     YES


----     61 SET betterWomen  better women than w for m

               w1          w2          w3

m1.w2         YES
m1.w3         YES         YES
m2.w1                     YES         YES
m2.w3                     YES
m3.w1                                 YES
m3.w2         YES                     YES


With this we can write:


* if m marries someone worse than w then w must marry someone
* better than m
* otherwise we can improve both w and m

stability(m,w)..
 
sum(betterWomen(m,w,ww),x(m,ww)) =g= sum(worseMen(w,m,mm),x(mm,w));



After adding a dummy objective function, we have a model that can deliver a single solution. 


Enumerating stable matchings


Let's go back to the example in [4]. It says:


Let's see if we can reproduce this. There are at least three ways to attack this:
  • Solve the problem. Add a constraint that forbids the current solution and repeat until things become infeasible.
  • For large problems with many solutions, we can use a MIP solver with a solution pool.
  • Or we can use a Constraint Programming based solver that allows producing multiple or all solutions.

For this small problem, I decided to use use the first approach. Note that I assume that each person has to be assigned. This corresponds with the solutions \(x^1\), \(x^2\), and \(x^3\): we basically have permuted identity matrices as solutions. 

Here is the output of my enumeration scheme:


----    127 PARAMETER solutions  enumerated solutions

INDEX 1 = sol1

            w1          w2          w3

m1           1
m2                       1
m3                                   1

INDEX 1 = sol2

            w1          w2          w3

m1                       1
m2                                   1
m3           1


So my little experiment can not reproduce the results. I produce the first two solutions, \(x^1\) and \(x^2\), but not the third one, \(x^3\). We are on an unequal footing here: there are three authors plus say three reviewers, so at least six people have looked at this carefully and disagree with me, versus just me with my little model. So to get a handle on this, let's proceed slowly and analyze this a bit further. 

When plugging in solution \(x^3\) (by fixing), we see some problems in equation stability:


---- stability  =G=  stability constraint

stability(m1,w2)..  x(m1,w1) - x(m2,w2) - x(m3,w2) =G= 0 ; (LHS = -1, INFES = 1 ****)
     
stability(m1,w3)..  x(m1,w1) + x(m1,w2) - x(m3,w3) =G= 0 ; (LHS = 0)
     
stability(m2,w1)..  - x(m1,w1) + x(m2,w2) + x(m2,w3) =G= 0 ; (LHS = 0)
     
stability(m2,w3)..  - x(m1,w3) + x(m2,w2) - x(m3,w3) =G= 0 ; (LHS = -1, INFES = 1 ****)
     
stability(m3,w1)..  - x(m1,w1) - x(m2,w1) + x(m3,w3) =G= 0 ; (LHS = -1, INFES = 1 ****)
     
stability(m3,w2)..  - x(m2,w2) + x(m3,w1) + x(m3,w3) =G= 0 ; (LHS = 0)

So introducing the match \((m_1,w_2)\) seems beneficial for both \(m_1\) and \(w_2\). We can see this from:


We see that if \(m_1\) and \(w_2\) leave their current partners (indicated in orange) and marry each other (the green cells), they both improve their situation. This means \(x^3\) is not a stable assignment. Note that we can use a similar argument for the assignments \((m_2,w_3)\) and \((m_3,w_1)\).


Alternative formulation


I have used the stability formulation from [3]: \[\sum_{m'\lt_w m} \color{darkred}x_{m',w} -\sum_{w'\gt_m w} \color{darkred}x_{m,w'} \le 0 \>\> \forall m,w\] In [4] an alternative form of this constraint is used: \[\sum_{w'\gt_m w} \color{darkred}x_{m,w'}+\sum_{m'\gt_w m} \color{darkred}x_{m',w} +  \color{darkred}x_{m,w} \ge 1\>\> \forall m,w\] This is the same constraint, as can be seen from the identity: \[\sum_{m'\lt_w m} \color{darkred}x_{m',w} + \color{darkred}x_{m,w} + \sum_{m'\gt_w m} \color{darkred}x_{m',w}=1\]


Specification of preferences 


In the GAMS models, I specified the preferences as ordinal numbers (1 is the best, 2 second best etc.). In many papers, the preferences are specified as lists: the man/woman number on the left is most attractive. Of course, we can translate between these representations. E.g.:


set
  m
'men'        /m1*m3/
  w
'women'      /w1*w3/
  p
'preference' /p1*p8/
;
alias (m,mm),(w,ww);

* in the next tables the numbers correspond to women and men
* the position indicates the preference

table mpref0(m,p) 'preferences of men for women (leftmost is preferred)'
   
p1 p2 p3
m1   1  2  3
m2   2  3  1
m3   3  1  2
;

table wpref0(w,p) 'preferences of women for men (leftmost is preferred)'
   
p1 p2 p3
w1   3  2  1
w2   1  3  2
w3   2  1  3
;

* get the rankings
parameters
   mpref(m,w)
'preferences of men for women (ranking,1=best)'
   wpref(w,m)
'preferences of women for men (ranking,1=best)'
;
mpref(m,w) =
sum(p$(mpref0(m,p)=ord(w)),ord(p));
wpref(w,m) =
sum(p$(wpref0(w,p)=ord(m)),ord(p));

option mpref:0,wpref:0;
display mpref,wpref;


This yields:

----     67 PARAMETER mpref  preferences of men for women (ranking,1=best)

            w1          w2          w3

m1           1           2           3
m2           3           1           2
m3           2           3           1


----     67 PARAMETER wpref  preferences of women for men (ranking,1=best)

            m1          m2          m3

w1           3           2           1
w2           1           3           2
w3           2           1           3


The summation in the code is not really to add things up but rather to select a single value. It is not unusual to use a SUM for this, almost as a substitute for a loop.

Note that it is not possible to have set elements inside a table in GAMS. So a table like this:


gives some problems. We can use integers for the body of the table like I did. Or we can use a set initialization. Such a representation would look like:


set mpref0(m,p,w) 'preferences of men for women (leftmost is preferred)' /
  
m1.(p1.w1, p2.w2, p3.w3)
  
m2.(p1.w2, p2.w3, p3.w1)
  
m3.(p1.w3, p2.w1, p3.w2)
/;


* get the rankings
parameters
   mpref(m,w)
'preferences of men for women (ranking,1=best)'
;
mpref(m,w) =
sum(mpref0(m,p,w),ord(p));

option mpref:0;
display mpref;



Below, I'll show how we can specify these preferences in R. 

R package matchingMarkets


The R package matchingMarkets [5] can also enumerate Stable Marriage assignments. Here the preference lists are specified columnwise. The package has a function for the assignment of students to colleges, so we call men students, and women colleges.


library(matchingMarkets)
# men are students
# women are colleges
s.prefs <- matrix(c("w1","w2","w3", "w2","w3","w1", "w3","w1","w2"), 3,3)
colnames(s.prefs) <- c("m1","m2","m3")
c.prefs <- matrix(c("m3","m2","m1", "m1","m3","m2", "m2","m1","m3"), 3,3)
colnames(c.prefs) <- c("w1","w2","w3")
hri(s.prefs = s.prefs,c.prefs = c.prefs)
#> $s.prefs.smi
#>      m1   m2   m3  
#> [1,] "w1" "w2" "w3"
#> [2,] "w2" "w3" "w1"
#> [3,] "w3" "w1" "w2"
#> 
#> $c.prefs.smi
#>      w1   w2   w3  
#> [1,] "m3" "m1" "m2"
#> [2,] "m2" "m3" "m1"
#> [3,] "m1" "m2" "m3"
#> 
#> $matchings
#>   matching college slots student sOptimal cOptimal sRank cRank
#> 1        1      w1    w1      m1        1        0     1     3
#> 2        1      w2    w2      m2        1        0     1     3
#> 3        1      w3    w3      m3        1        0     1     3
#> 4        2      w1    w1      m3        0        1     2     1
#> 5        2      w2    w2      m1        0        1     2     1
#> 6        2      w3    w3      m2        0        1     2     1
#> 
#> attr(,"class")
#> [1] "hri"


Here we see that the example problem yields two assignments.


Conclusion


Even a small \(3 \times 3\) problem can be difficult to analyze manually without a model. Using a model, I can't reproduce Example 2 in [4]. That situation is always a bit unsettling: did I make a mistake (either in interpreting the paper, or in my analysis) or is the paper just sloppy? After spending a bit more time on this, I now very much suspect the example in the paper is wrong. Better reproducibility would have been a bonus.
 

References


  1. Stable marriage problem, https://en.wikipedia.org/wiki/Stable_marriage_problem
  2. Stable Marriage Problem, https://www.geeksforgeeks.org/stable-marriage-problem/. This contains an implementation of the Gale-Shapley algorithm.
  3. John H. Vande Vate, Linear Programming brings Marital Bliss, Operations Research Letters 8 (1989) pp. 147-153. I used the stability constraint as formulated in this paper.
  4. Alvin E. Roth, Uriel G. Rothblum and John H. Vande Vate, Stable Matchings, Optimal Assignments, and Linear Programming, Mathematics of Operations Research, Vol. 18, No. 4 (1993), pp. 803-828. We look at example 2 of this paper.
  5. matchingMarkets: Analysis of Stable Matchings in R, https://matchingmarkets.org/
  6. An inter-cast marriage ceremony in Lalitpur, https://commons.wikimedia.org/wiki/File:Marriage_Ceremony_06.JPG. The picture is from this link.

Appendix A: GAMS code for experiments with 3x3 data set


This model was used to write the above post. It contains all the experiments.


$ontext

 
Stable Marriage Problem

 
1. John H. Vande Vate
    
LINEAR PROGRAMMING BRINGS MARITAL BLISS
    
Operations Research Letters 8 (1989) pp. 147-153

 
2. Alvin E. Roth, Uriel G. Rothblum and John H. Vande Vate
    
Stable Matchings, Optimal Assignments, and Linear Programming
    
Mathematics of Operations Research
    
Vol. 18, No. 4 (1993), pp. 803-828



$offtext


*----------------------------------------------------------------
* data
*----------------------------------------------------------------


set
  m
'men'    /m1*m3/
  w
'women'  /w1*w3/
;
alias (m,mm),(w,ww);


table mpref(m,w) 'preferences of men for women (ranking,1=best)'
   
w1 w2 w3
m1   1  2  3
m2   3  1  2
m3   2  3  1
;

table wpref(w,m) 'preferences of women for men (ranking,1=best)'
   
m1 m2 m3
w1   3  2  1
w2   1  3  2
w3   2  1  3
;


abort$(card(m)<>card(w)) "Equal-sized sets expected",m,w;

scalar n 'size of problem';
n =
card(w);

*----------------------------------------------------------------
* derived data
*----------------------------------------------------------------

sets
  worseMen(w,m,mm)     
'worse men than m for w'
  betterWomen(m,w,ww)  
'better women than w for m'
;
worseMen(w,m,mm) = wpref(w,mm)>wpref(w,m);
betterWomen(m,w,ww) = mpref(m,ww)<mpref(m,w);

option mpref:0,wpref:0;
display m,w,mpref,wpref,worseMen,betterWomen;


*----------------------------------------------------------------
* MIP model
*----------------------------------------------------------------

binary variable x(m,w) 'assignment';

variable z 'dummy objective';

equations
   obj           
'dummy objective'
   assign1(w)    
'assignment constraint'
   assign2(m)    
'assignment constraint'
   stability(m,w)
'stability constraint'
;


obj.. z =e= 0;

assign1(w)..
sum(m, x(m,w)) =e= 1;
assign2(m)..
sum(w, x(m,w)) =e= 1;

* formulation from [1]
* if m marries someone worse than w then w must marry someone better than m
* otherwise we can improve both w and m
stability(m,w)..
 
sum(betterWomen(m,w,ww),x(m,ww)) =g= sum(worseMen(w,m,mm),x(mm,w));

model StableMarriage /all/;
solve StableMarriage using mip minimizing z;

option x:0;
display x.l;

*----------------------------------------------------------------
* generate all feasible solutions
*----------------------------------------------------------------

sets
   sol
'max number of solutions' /sol1*sol5/
   s(sol)
'dynamic subset'
;

equation cut(sol) 'no good constraints';

parameter solutions(sol,m,w) 'enumerated solutions';

cut(s)..
sum((m,w), solutions(s,m,w)*x(m,w)) =l= n-1;


model Enumerate /all/;

s(sol) =
no;
solutions(sol,m,w) = 0;
loop(sol,
   
solve Enumerate using mip minimizing z;
   
break$(Enumerate.ModelStat <> %modelStat.optimal% and
        Enumerate.ModelStat <> %modelStat.integerSolution%);

    solutions(sol,m,w) = round(x.l(m,w));
    s(sol) =
yes;
);

option solutions:0:1:1;
display solutions;


*----------------------------------------------------------------
* check solution from [2]
*----------------------------------------------------------------


parameter sol2(m,w) 'solution 3 from [2]' /
      
m2.w1 1
      
m3.w2 1
      
m1.w3 1
/;
option sol2:0;
display sol2;

x.fx(m,w) = sol2(m,w);

solve StableMarriage using mip minimizing z;


Appendix B: GAMS code for enumerating stable solutions for 8x8 data set


In this GAMS model, I streamlined the solve loop a bit. I removed all printing (GAMS prints a lot by default) and used a faster GAMS-solver communication. For normal models, this makes not much of a difference, but when we have a loop with many solves it does.


$ontext

 
Stable Marriage Problem

 
Slightly optimized version for slightly larger data set.
 
For even larger data sets use the solution pool.

 
For this example we get 9 stable solutions.

$offtext

*----------------------------------------------------------------
* data
*----------------------------------------------------------------

set
  m
'men'    /m1*m8/
  w
'women'  /w1*w8/
;
alias (m,mm),(w,ww);

table mpref(m,w) 'preferences of men for women (ranking,1=best)'
   
w1 w2 w3 w4 w5 w6 w7 w8
m1   3  4  8  7  1  5  2  6
m2   6  1  2  5  4  8  3  7
m3   3  6  7  4  2  5  8  1
m4   5  2  1  4  8  6  3  7
m5   4  2  5  8  3  6  1  7
m6   1  7  8  6  4  2  3  5
m7   8  1  5  6  2  4  3  7
m8   8  6  1  3  4  7  5  2
;


table wpref(w,m) 'preferences of women for men (ranking,1=best)'
   
m1 m2 m3 m4 m5 m6 m7 m8
w1   5  6  2  8  1  4  3  7
w2   7  6  3  8  4  2  5  1
w3   1  4  8  5  2  3  7  6
w4   6  4  3  5  7  8  2  1
w5   6  7  4  2  8  1  3  5
w6   8  1  6  4  3  5  7  2
w7   4  3  8  7  2  6  1  5
w8   3  5  6  2  4  7  1  8
;

abort$(card(m)<>card(w)) "Equal-sized sets expected",m,w;

scalar n 'size of problem';
n =
card(w);

*----------------------------------------------------------------
* derived data
*----------------------------------------------------------------

sets
  worseMen(w,m,mm)     
'worse men than m for w'
  betterWomen(m,w,ww)  
'better women than w for m'
;
worseMen(w,m,mm) = wpref(w,mm)>wpref(w,m);
betterWomen(m,w,ww) = mpref(m,ww)<mpref(m,w);

option mpref:0,wpref:0;
display m,w,mpref,wpref,worseMen,betterWomen;

*----------------------------------------------------------------
* MIP enumeration model
*----------------------------------------------------------------

sets
   sol
'max number of solutions' /sol1*sol100/
   s(sol)
'dynamic subset'
;

binary variable x(m,w) 'assignment';

variable z;

equations
   obj           
'dummy objective'
   assign1(w)    
'assignment constraint'
   assign2(m)    
'assignment constraint'
   stability(m,w)
'stability constraint'
   cut(sol)      
'no good constraints'
;

parameter solutions(sol,m,w) 'enumerated solutions';

obj.. z =e= 0;

assign1(w)..
sum(m, x(m,w)) =e= 1;
assign2(m)..
sum(w, x(m,w)) =e= 1;

stability(m,w)..
 
sum(betterWomen(m,w,ww),x(m,ww)) =g= sum(worseMen(w,m,mm),x(mm,w));

cut(s)..
sum((m,w), solutions(s,m,w)*x(m,w)) =l= n-1;

model Enumerate /all/;
Enumerate.solprint = %solprint.Silent%;
Enumerate.solvelink = 5;

s(sol) =
no;
solutions(sol,m,w) = 0;
loop(sol,
   
solve Enumerate using mip minimizing z;
   
break$(Enumerate.ModelStat <> %modelStat.optimal% and
        Enumerate.ModelStat <> %modelStat.integerSolution%);

    solutions(sol,m,w) = round(x.l(m,w));
    s(sol) =
yes;
);

option solutions:0:1:1;
display solutions;



The output of this model looks like:

----    110 PARAMETER solutions  enumerated solutions

INDEX 1 = sol1

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                   1
m2                                                                       1
m3           1
m4                                                                                               1
m5                       1
m6                                                           1
m7                                                                                   1
m8                                               1

INDEX 1 = sol2

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                                           1
m2                                   1
m3                                                                                               1
m4                                                                       1
m5                                                                                   1
m6           1
m7                       1
m8                                               1

INDEX 1 = sol3

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                                                                               1
m2                                   1
m3                                                           1
m4                                                                       1
m5                                                                                   1
m6           1
m7                       1
m8                                               1

INDEX 1 = sol4

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                                                                               1
m2                                   1
m3                       1
m4                                                                       1
m5           1
m6                                                           1
m7                                                                                   1
m8                                               1

INDEX 1 = sol5

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                   1
m2                                                                       1
m3                                                           1
m4                                                                                               1
m5                                                                                   1
m6           1
m7                       1
m8                                               1

INDEX 1 = sol6

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                                                                               1
m2                                   1
m3           1
m4                                                                       1
m5                       1
m6                                                           1
m7                                                                                   1
m8                                               1

INDEX 1 = sol7

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                   1
m2                                                                       1
m3                       1
m4                                                                                               1
m5           1
m6                                                           1
m7                                                                                   1
m8                                               1

INDEX 1 = sol8

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                   1
m2                                                                       1
m3           1
m4                                                                                               1
m5                                                                                   1
m6                                                           1
m7                       1
m8                                               1

INDEX 1 = sol9

            w1          w2          w3          w4          w5          w6          w7          w8

m1                                                                                               1
m2                                   1
m3           1
m4                                                                       1
m5                                                                                   1
m6                                                           1
m7                       1
m8                                               1

No comments:

Post a Comment