The kidney exchange problem

Suppose someone needs a kidney transplant and a family member is willing to donate one. If the donor and recipient are incompatible (because of blood types, tissue mismatch, and so on), the transplant cannot happen. Now suppose two donor-recipient pairs A and B are in this situation, but donor A is compatible with recipient B and donor B is compatible with recipient A. Then two transplants can take place in a two-way swap in which donor A gives to recipient B and donor B gives to recipient A. More generally, with

$n$ such incompatible donor-recipient pairs you can sometimes do an

$n$ -way swap. In 2009, a 13-way swap took place. A shorter swap played a prominent role in a 2008 episode of the television show Grey's Anatomy. This problem is of particular interest to the Operations Research community, because the Alliance for Paired Donation (together with teams from Boston College, Stanford, and MIT) was a finalist for the very prestigious 2014 Franz Edelman Award for Achievement in Operations Research and the Management Sciences.

Network Model

The problem can be modeled by using a network as follows. Each node corresponds to an incompatible donor-recipient pair, and there is a directed arc from node

$i$ to node

$j$ if the donor from node

$i$ is compatible with the recipient from node

$j$ . Each arc has a weight that represents the quality of the match, and the goal is to find a maximum weight node-disjoint union of directed cycles. You want the union to be node-disjoint so that no kidney is donated more than once, and you want cycles so that the donor from node

$i$ gives up a kidney if and only if the recipient from node

$i$ receives a kidney.

Without any other constraints, the problem could be solved as a linear assignment problem, perhaps by using the LINEAR_ASSIGNMENT statement in PROC OPTNET or the SOLVE WITH NETWORK statement in PROC OPTMODEL. But doing so would allow arbitrarily long cycles in the solution. Because of practical considerations (such as travel) and to mitigate risk, each cycle must have no more than

$L$ links. The kidney exchange problem is to find a maximum-weight node-disjoint union of short directed cycles.

Let us generate an example data set to represent an exchange network with (approximately) 100 incompatible donor-recipient pairs. The following DATA step creates a random graph on nodes with link probability and Uniform(0,1) weight:

/* random graph on n nodes with arc probability p and uniform(0,1) weight */
%let n = 100;
%let p = 0.02;
data ArcData;
   do i = 0 to &n - 1;
      do j = 0 to &n - 1;
         if i eq j then continue;
         else if ranuni(1) < &p then do;
            weight = ranuni(2);
            output;
         end;
      end;
   end;
run;

Integer Programming Model

Define an index set

$M = \{1,\dots,|N|/2\}$ of candidate disjoint unions of short cycles (called matchings). Let

$x_{ijm}$ be a binary variable, which, if set to 1, indicates that arc

$(i,j)$ is in a matching

$m$ . Let

$y_{im}$ be a binary variable that, if set to 1, indicates that node

$i$ is covered by matching

$m$ . In addition, let

$s_i$ be a binary slack variable that, if set to 1, indicates that node

$i$ is not covered by any matching.

The kidney donor exchange can be formulated as a MILP as follows:

$\begin{align*} &\text{maximize} & \sum_{(i,j) \in A} \sum_{m \in M} w_{ij} x_{ijm}\\ &\text{subject to} & \sum_{m \in M} y_{im} + s_i & = 1 && i \in N && \text{(Packing)}\\ & & \sum_{(i,j) \in A} x_{ijm} & = y_{im} && i \in N, \ m \in M && \text{(Donate)} \\ & & \sum_{(i,j) \in A} x_{ijm} & = y_{jm} && j \in N, \ m \in M && \text{(Receive)} \\ & & \sum_{(i,j) \in A} x_{ijm} & \leq L && m \in M && \text{(Cardinality)} \\ & & x_{ijm} & \in \left\{0,1\right\} && (i,j) \in A, \ m \in M \\ & & y_{im} & \in \left\{0,1\right\} && i \in N, \ m \in M \\ & & s_{i} & \in \left\{0,1\right\} && i \in N \end{align*}$
In this formulation, the Packing constraints ensure that each node is covered by at most one matching. The Donate and Receive constraints enforce the condition that if node

$i$ is covered by matching

$m$ , then the matching

$m$ must use exactly one arc that leaves node

$i$ (Donate) and one arc that enters node

$i$ (Receive). Conversely, if node

$i$ is not covered by matching

$m$ , then no arcs that enter or leave node

$i$ can be used by matching

$m$ . The Cardinality constraints enforce the condition that the number of arcs in matching

$m$ must not exceed

$L$ . Because it is not necessary to cover each incompatible donor-recipient pair (node), the Packing constraints can be modeled by using set partitioning constraints and the slack variable

$s$ .

Using PROC OPTMODEL, we can set up and solve the problem as follows:

%let max_length = 10;
proc optmodel;
   set <num,num> ARCS;
   num weight {ARCS};
   read data ArcData into ARCS=[i j] weight;
   set NODES = union {<i,j> in ARCS} {i,j};
   set MATCHINGS = 1..card(NODES)/2;
 
   /* UseNode[i,m] = 1 if node i is used in matching m, 0 otherwise */
   var UseNode {NODES, MATCHINGS} binary;
   /* UseArc[i,j,m] = 1 if arc (i,j) is used in matching m, 0 otherwise */
   var UseArc {ARCS, MATCHINGS} binary;
   /* Slack[i]=1 if node i is not covered by any matching */
   var Slack {NODES} binary;
 
   /* maximize total weight of arcs used */
   max TotalWeight
      = sum {<i,j> in ARCS, m in MATCHINGS} weight[i,j] * UseArc[i,j,m];
 
   /* each node appears in at most one matching */
   con node_packing {i in NODES}:
      sum {m in MATCHINGS} UseNode[i,m] + Slack[i] = 1;
   /* at most one donee for each donor */
   con donate {i in NODES, m in MATCHINGS}:
      sum {<(i),j> in ARCS} UseArc[i,j,m] = UseNode[i,m];
   /* at most one donor for each donee */
   con receive {j in NODES, m in MATCHINGS}:
      sum {<i,(j)> in ARCS} UseArc[i,j,m] = UseNode[j,m];
   /* exclude long matchings */
   con cardinality {m in MATCHINGS}:
      sum {<i,j> in ARCS} UseArc[i,j,m] <= &max_length;
 
   solve with milp / maxtime=600;

Solving this problem with direct MILP techniques can be difficult. After ten minutes, we stop the solver with a good solution, but with no proof of optimality.

NOTE: The presolved problem has 5848 variables, 2728 constraints, 
      and 21459 constraint coefficients.
NOTE: The MILP solver is called.
          Node  Active    Sols    BestInteger      BestBound      Gap    Time
-- snip --
             0       1       4      3.0229574     29.8635658   89.88%       3
NOTE: The MILP solver added 64 cuts with 3134 cut coefficients at the root.
            11      11       6      8.5362428     29.8635658   71.42%       5
            86      79      12     25.9768908     29.8635658   13.01%      12
-- snip --
         13300    9177      14     26.0202871     29.3516198   11.35%     596
         13379    9231      14     26.0202871     29.3516198   11.35%     599
NOTE: CPU time limit reached.
NOTE: Objective of the best integer solution found = 26.020287142.

Dynamic Column Generation (Decomposition)

If we relax the Packing constraint, we can see that the resulting model has a decomposable structure (separable for each matching

$m$ ). In addition, the matching identifier is arbitrary (each matching problem is identical). This is a good situation for using the DECOMP feature (we have seen this type of structure before, in the Wedding Planner Problem). Let us explicitly define the (subproblem) blocks of the decomposition by using the .block constraint suffix, and solve using DECOMP, as follows:

   /* decompose by matching (aggregate formulation) */
   for {i in NODES, m in MATCHINGS} donate[i,m].block = m;
   for {j in NODES, m in MATCHINGS} receive[j,m].block = m;
   for {m in MATCHINGS} cardinality[m].block = m;
   solve with milp / presolver=basic decomp=();

In this case, the PRESOLVER=BASIC option ensures that the model maintains its specified symmetry, enabling the algorithm to use the aggregate formulation and Ryan-Foster branching. Now we can solve the problem to optimality quickly (in 17 seconds).

NOTE: All blocks are identical and the master model is set partitioning.
NOTE: The Decomposition algorithm is using an aggregate formulation and 
      Ryan-Foster branching.
NOTE: The problem has a decomposable structure with 48 blocks. 
      The largest block covers 2.06% of the constraints in the problem.
NOTE: The decomposition subproblems cover 9216 (99.32%) variables 
      and 6096 (98.98%) constraints.
      Iter         Best       Master         Best       LP       IP  CPU Real
                  Bound    Objective      Integer      Gap      Gap Time Time
-- snip --
         2     388.4996       9.2503       9.2503   97.62%   97.62%    0    0
         3     359.1937      10.9240      10.9240   96.96%   96.96%    0    0
-- snip --
        44      26.7804      26.7804      23.4755    0.00%   12.34%    4    3
NOTE: Starting branch and bound.
         Node  Active   Sols         Best         Best      Gap    CPU   Real
                                  Integer        Bound            Time   Time
            0       1     10      23.4755      26.7804   12.34%      4      4
            5       7     11      24.7054      26.5360    6.90%     10      8
-- snip --
           20       0     14      26.0203      26.0203    0.00%     20     17
NOTE: The Decomposition algorithm used 4 threads.
NOTE: The Decomposition algorithm time is 17.27 seconds.
NOTE: Optimal within relative gap.
NOTE: Objective = 26.020287142.

Static Column Generation

The DECOMP method is an automated implementation of a specific variant of a general algorithm known as dynamic column generation. At each major step of the algorithm, an integer program is solved to generate candidate columns that satisfy the subproblem constraints (for each matching). Then a master formulation takes combinations of these column to find a solution that satisfies the master constraints (in this case, the Packing constraints). In certain situations, if the number of feasible columns is small enough and there is an efficient method for generating all of these columns in advance, then using static column generation can work very well. The feasible columns of the Kidney Exchange Problem are (short) cycles in a graph. For reasonably sized graphs, all of these cycles can be generated efficiently in advance.

As part of its solver toolbox, PROC OPTMODEL provides a suite of network algorithms. One such algorithm can generate all short cycles in a graph, as follows.

   /* generate all cycles with 2 <= length <= max_length */
   solve with NETWORK /
      graph_direction = directed
      links           = (include=ARCS)
      cycle           = (mode=all_cycles minlength=2 maxlength=&max_length)
      out             = (cycles=ID_ORDER_NODE);

This step is very fast (2 seconds).

NOTE: The number of nodes in the input graph is 97.
NOTE: The number of links in the input graph is 194.
NOTE: Processing cycle detection.
NOTE: The graph has 224 cycles.
NOTE: Processing cycle detection used 2.15 (cpu: 2.17) seconds.

Given the set of cycles, we can now formulate a mixed integer linear program (MILP) to maximize the total cycle weight. Let

$C$ be the set of cycles of appropriate length,

$N_c$ be the set of nodes in cycle

$c$ ,

$A_c$ be the set of links in cycle

$c$ , and

$w_{ij}$ be the link weight for link

$(i,j)$ . Define a binary decision variable

$x_{c}$ . Set

$x_{c}$ to 1 if cycle

$c$ is used in the solution; otherwise, set it to 0. Then the following MILP defines the problem that you want to solve (to maximize the quality of the kidney exchange):

$\begin{align*} &\text{maximize} &\sum_{c \in C} \left(\sum_{(i,j) \in A_c} w_{ij}\right) x_c \\ &\text{subject to} &\sum_{c \in C: i \in N_c} x_c & \le 1 && i \in N && \text{(Packing)} \\ & & x_c & \in \{0,1\} && c \in C \end{align*}$
The constraint (Packing) ensures that each node (incompatible pair) in the graph is intersected at most once. That is, a donor can donate a kidney only once.

This is done in PROC OPTMODEL as follows:

   /* extract <cid,from,to> triples from <cid,order,node> triples */
   set <num,num,num> ID_FROM_TO init {};
   num last init ., from, to;
   for {<cid,order,node> in ID_ORDER_NODE} do;
      from = last;
      to   = node;
      last = to;
      if order ne 1 then ID_FROM_TO = ID_FROM_TO union {<cid,from,to>};
   end;
   set CYCLES = setof {<c,i,j> in ID_FROM_TO} c;
   set ARCS_c {c in CYCLES} = setof {<(c),i,j> in ID_FROM_TO} <i,j>;
   set NODES_c {c in CYCLES} = union {<i,j> in ARCS_c[c]} {i,j};
   set NODES = union {c in CYCLES} NODES_c[c];
   num cycle_weight {c in CYCLES} = sum {<i,j> in ARCS_c[c]} weight[i,j];
 
   /* UseCycle[c] = 1 if cycle c is used, 0 otherwise */
   var UseCycle {CYCLES} binary;
   /* maximize total weight of cycles used */
   max TotalWeight
      = sum {c in CYCLES} cycle_weight[c] * UseCycle[c];
   /* each node appears in at most one cycle */
   con node_packing {i in NODES}:
      sum {c in CYCLES: i in NODES_c[c]} UseCycle[c] <= 1;
 
   solve with milp;

The standard MILP solver solves this problem easily (in 0 seconds).

NOTE: The presolved problem has 224 variables, 28 constraints, 
      and 1266 constraint coefficients.
NOTE: The MILP solver is called.
          Node  Active    Sols    BestInteger      BestBound      Gap    Time
-- snip --
             0       1       6     23.3432816     26.0202969   10.29%       0
             0       1       7     26.0202872     26.0202872    0.00%       0
NOTE: The MILP solver added 9 cuts with 1111 cut coefficients at the root.
NOTE: Optimal.
NOTE: Objective = 26.020287205.

In 2012, Alvin E. Roth and Lloyd S. Shapley won the Nobel Prize in Economics for their contributions to the theory of stable matchings and the design of markets. An application of their work was the Kidney Exchange Problem described in this blog. A full presentation of the work done by the Alliance for Paired Donation (together with Alvin Roth and teams from Boston College, Stanford, and MIT) can be viewed here.

Acknowledgement: Rob Pratt

2 Comments

Matthew Saltzman on June 27, 2015 11:24 am

Apparently, in real life there are also some altruistic donors who aren't coupled with corresponding incompatible recipients. These donors would appear in your graph as nodes with no inbound arcs and with outbound arcs to compatible recipients. Exchanges involving these donors don't (can't) require closed cycles; the last recipient in the chain doesn't need to be coupled with a donor, so these exchanges would appear in a solution as (short) paths.

- Rob Pratt on June 28, 2015 2:36 pm
  
  You can model this situation by introducing dummy arcs with zero weight back to the altruistic node. The resulting problem then still involves only cycles. Similarly, a recipient without a donor can be modeled by a node that has outbound arcs with zero weight (but only to altruistic nodes).

Blogs