Saturday, October 21, 2023

Scheduling Team Meetings

This is a simple problem from [1]:

I'm rusty on constraint optimization and am looking for help in this particular
use case. There are individuals who are each member to several teams. This is a
fixed many-to-many relationship and is determined a-priori. There are 3 time
slots where the teams can be scheduled to conduct a business meeting, but if an
individual is a member of more than one team which are both meeting at a given
time slot, they'll only be able to attend one. The objective is to schedule the
teams into the time slots, minimizing the number of overlaps of individuals.
For beginners, it is often a good idea to split the task in two:
  1. Formulate a mathematical model (on a piece of paper)
  2. Implement the model in code
If you did a good job on (1), (2) becomes usually a breeze. Once more experienced, you don't need to explicitly do step (1). This also depends on the "distance" between the tool and the math. Using modeling tools (closer to math) can help with this. 

We start with the indices. We have: \[\begin{align}&i&&\text{individuals}\\&t && \text{teams} \\  &s &&\text{time slots}\end{align}\] Next we have the data: \[\color{darkblue}m_{i,t} = \begin{cases}1 & \text{if individual $i$ is part of team $t$}\\0 & \text{otherwise}\end{cases}\]

I generated some random data. In GAMS, we often use sets instead of binary parameters. Here is my random data:

----     23 SET member  team membership

                   team1       team2       team3       team4       team5       team6

individual1          YES                                                         YES
individual2                                  YES
individual3                                  YES                     YES
individual4                                                          YES         YES
individual5                                  YES
individual6          YES                     YES
individual7                                                                      YES
individual8                      YES                     YES
individual10         YES         YES                                 YES
individual11         YES         YES                                 YES         YES
individual12                                                         YES         YES
individual13         YES                                                         YES
individual14                     YES         YES         YES
individual15         YES         YES
individual17         YES                     YES                     YES         YES
individual18                                             YES
individual19         YES                                 YES
individual20                                 YES                     YES

We are missing some individuals: they are not in any of the teams. This is just a result of how the random set member was generated. We could remove those from the set \(i\), but I decided to leave them in. It is a good check to see if the model is robust enough to handle this.


The binary decision variables can be: \[\color{darkred}x_{t,s} = \begin{cases}1 &\text{if team $t$ is meeting during time slot $s$}\\ 0 & \text{otherwise}\end{cases}\] and  \[\color{darkred}y_{i,s} = \begin{cases}1 &\text{if individual $i$ has a meeting during time slot $s$}\\ 0 & \text{otherwise}\end{cases}\]

Each team meeting is scheduled exactly once. This is simply \[\sum_s \color{darkred}x_{t,s}=1\>\>\forall t\] 

Minimizing the number of times an individual is double booked is a bit difficult to use as an objective. This is because checking whether a discrete variable is 2 or more requires some thought. It is easier to maximize the number of meetings each individual can attend. We can say: \[\begin{align} \max &\sum_{i,s}\color{darkred}y_{i,s} \\ & \color{darkred}y_{i,s}\le \sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}&&\forall i,s \end{align}\]

This will implicitly minimize the double bookings!

To summarize, my complete model is: \[\begin{align}\max &\sum_{i,s}\color{darkred}y_{i,s} \\ & \sum_s \color{darkred}x_{t,s}=1&&\forall t \\ & \color{darkred}y_{i,s}\le \sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}&&\forall i,s \\ & \color{darkred}x_{t,s} \in \{0,1\} \\ &\color{darkred}y_{i,s} \in \{0,1\} \end{align}\]

We could relax \(\color{darkred}y_{i,s}\) to be continuous between 0 and 1. The results look like:


----     46 VARIABLE x.L  schedule team meetings at time slot

        timeslot1   timeslot2   timeslot3

team1           1
team2                       1
team3           1
team4                                   1
team5                                   1
team6                       1


----     46 VARIABLE y.L  individual is in meeting at time slot

               timeslot1   timeslot2   timeslot3

individual1            1           1
individual2            1
individual3            1                       1
individual4                        1           1
individual5            1
individual6            1
individual7                        1
individual8                        1           1
individual10           1           1           1
individual11           1           1           1
individual12                       1           1
individual13           1           1
individual14           1           1           1
individual15           1           1
individual17           1           1           1
individual18                                   1
individual19           1                       1
individual20           1                       1

With a bit of postprocessing, we can see how many double bookings we have:

----     51 PARAMETER meetingcount  number of meetings

               timeslot1   timeslot2   timeslot3

individual1            1           1
individual2            1
individual3            1                       1
individual4                        1           1
individual5            1
individual6            2
individual7                        1
individual8                        1           1
individual10           1           1           1
individual11           1           2           1
individual12                       1           1
individual13           1           1
individual14           1           1           1
individual15           1           1
individual17           2           1           1
individual18                                   1
individual19           1                       1
individual20           1                       1

Here meeting count is calculated as  \(\sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x^*_{t,s}\).

We see three instances where an individual has to miss a meeting. The double bookings are nicely distributed over different individuals. This is more or less by accident. We can observe solutions with one single person having a lot of missed meetings.

We note that individuals missing in the team membership data, \(\color{darkblue}m_{i,t}\), are indeed not in the solution.
 

Minimizing overbookings


The previous model maximizes the number of meetings each individual can attend. Here, I want to minimize the number of times an individual has to miss a meeting because 2 or more of its meetings are scheduled on the same day. In terms of the original post: minimize the number of overlaps.

We can use something like: 

 \[\begin{align}\min &\sum_{i,s}\color{darkred}c^{\ge 2}_{i,s} \\ & \sum_s \color{darkred}x_{t,s}=1&&\forall t \\ & \color{darkred}c^1_{i,s}+\color{darkred}c^{\ge 2}_{i,s}= \sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}&&\forall i,s \\ & \color{darkred}x_{t,s} \in \{0,1\} \\ &\color{darkred}c^1_{i,s} \in [0,1], \color{darkred}c^{\ge 2}_{i,s}\ge 0 \end{align}\]

So we split the meet count \(\sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}\) into two parts. \(\color{darkred}c^1_{i,s}\) holds the part up to 1, while \(\color{darkred}c^2_{i,s}\) takes care of the counts greater than or equal to 2. We minimize the second part. That part represents the number of overlaps. The variables \(\color{darkred}c^1_{i,s}\) and \(\color{darkred}c^{\ge 2}_{i,s}\) are automatically integer-valued, so I made them continuous in the model. 

A slight improvement is to drop \(\color{darkred}c^1_{i,s}\) and write:

\[\begin{align}\min &\sum_{i,s}\color{darkred}c^{\ge 2}_{i,s} \\ & \sum_s \color{darkred}x_{t,s}=1&&\forall t \\ & 1 +\color{darkred}c^{\ge 2}_{i,s}\ge \sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}&&\forall i,s \\ & \color{darkred}x_{t,s} \in \{0,1\} \\ & \color{darkred}c^{\ge 2}_{i,s}\ge 0 \end{align}\]

A simpler interpretation of \(\color{darkred}c^{\ge 2}_{i,s}\) is: \[\color{darkred}c^{\ge 2}_{i,s} = \max\{0,\sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}-1\}\]

There are many optimal solutions, so don't expect to get exactly the same schedule.  However, you should see the same number of missed meetings. Here is a solution:


----     86 VARIABLE x.L  schedule team meetings at time slot

        timeslot1   timeslot2   timeslot3

team1                                   1
team2                       1
team3                       1
team4           1
team5           1
team6                       1


----     86 VARIABLE y2.L  rest of y (part that represents >=2)

               timeslot2

individual11           1
individual14           1
individual17           1


----     86 PARAMETER meetingcount  number of meetings

               timeslot1   timeslot2   timeslot3

individual1                        1           1
individual2                        1
individual3            1           1
individual4            1           1
individual5                        1
individual6                        1           1
individual7                        1
individual8            1           1
individual10           1           1           1
individual11           1           2           1
individual12           1           1
individual13                       1           1
individual14           1           2
individual15                       1           1
individual17           1           2           1
individual18           1
individual19           1                       1
individual20           1           1


Indeed, we have again 3 double bookings.

Conclusions


  • Writing down the mathematical model is a useful exercise. Coding is much easier afterward. A math model is a blue print for the code.
  • Instead of minimizing double bookings, we can maximize the number of meetings attended. Both give similar solutions.
  • This is a nice little example with some interesting modeling angles. With a little bit of thought, we can create quite compact models. 
  • Of course this problem is small enough for complete enumeration. Question: how many ways are there to assign 6 team meetings to three days? 
  • The models have a lot of symmetry. E.g., we can fix the first team (or the largest team) to time slot 1.

References


Appendix: GAMS Model

 

 

$onText

 

 

I'm rusty on constraint optimization and am looking for help in this particular

use case. There are individuals who are each member to several teams. This is a

fixed many-to-many relationship and is determined a-priori. There are 3 time

slots where the teams can be scheduled to conduct a business meeting, but if an

individual is a member of more than one team which are both meeting at a given

time slot, they'll only be able to attend one. The objective is to schedule the

teams into the time slots, minimizing the number of overlaps of individuals.

 

https://or.stackexchange.com/questions/11115/google-or-tools-assignment-minimizing-overlapping-members

 

$offText

 

 

*---------------------------------------------------------------

* data

*---------------------------------------------------------------

 

 

set

  'individuals' /individual1*individual20/

  'teams' /team1*team6/

  'time slot' /timeslot1*timeslot3/

  member(i,t) 'team membership'

;

member(i,t) = uniform(0,1)<0.25;

display member;

 

* we miss some individuals in member set

parameter numindiv(*) 'number of individuals';

numindiv('in set i') = card(i);

numindiv('team members') = sum(i$sum(member(i,t),1),1);

option numindiv:0;

display numindiv;

 

 

*---------------------------------------------------------------

* model 1: maximize number meetings for each individual

*---------------------------------------------------------------

 

binary Variables

   x(t,s) 'schedule team meetings at time slot'

   y(i,s) 'individual is in meeting at time slot'

;

 

variable total 'total number of meeting participation';

 

equations

    obj 'objective'

    schedule_teams(t) 'exactly one time'

    schedule_indiv(i,s) 'bound'

;

 

schedule_teams(t).. sum(s,x(t,s)) =e= 1;

schedule_indiv(i,s).. y(i,s) =l= sum(member(i,t),x(t,s));

obj.. total =e= sum((i,s),y(i,s));

 

model schedule /all/;

solve schedule maximizing total using mip;

 

option x:0,y:0;

display x.l,y.l;

 

parameter meetingcount(i,s) 'number of meetings';

meetingcount(i,s) = sum(member(i,t), x.l(t,s));

option meetingcount:0;

display meetingcount;

  

*---------------------------------------------------------------

* model 2: minimize number of double bookings

*---------------------------------------------------------------

 

positive variables

   y1(i,s) 'first part of y (0 or 1)'

   y2(i,s) 'rest of y (part that represents >=2)'

;

y1.up(i,s) = 1;

variable overbookings;

 

Equations

   nummeet 'number of meetings'

   obj2 'minimize double bookings'

;

nummeet(i,s).. y1(i,s) + y2(i,s) =e= sum(member(i,t),x(t,s));

obj2.. overbookings =e= sum((i,s),y2(i,s));

 

model schedule2 /obj2,nummeet,schedule_teams/;

solve schedule2 minimizing overbookings using mip;

 

meetingcount(i,s) = sum(member(i,t), x.l(t,s));

option y2:0;

display x.l,y2.l,meetingcount;

 

 

*---------------------------------------------------------------

* model 3: variant of model 2

*---------------------------------------------------------------

 

Equations

   nummeet2 'number of meetings (bound)'

;

nummeet2(i,s).. 1 + y2(i,s) =g= sum(member(i,t),x(t,s));

 

model schedule3 /obj2,nummeet2,schedule_teams/;

solve schedule3 minimizing overbookings using mip;

 

meetingcount(i,s) = sum(member(i,t), x.l(t,s));

display x.l,y2.l,meetingcount;

 

2 comments:

  1. Thanks
    I solved it in Googlw OR-Tools
    https://or.stackexchange.com/a/11165/4775

    ReplyDelete
    Replies
    1. Your code has an upper bound of 2 on the number of duplicates. (I.e. this problem with just 1 time period will be infeasible). I assume this code was copy-pasted which is almost always a recipe for disaster. Better than copy-paste is to copy the concepts but enter the code by typing. It forces you to pay attention to the details.

      Delete