Suppose someone needs a kidney transplant and a family member is willing to donate one. If the donor and recipient are incompatible (because of blood types, tissue mismatch, and so on), the transplant cannot happen. Now suppose two donor-recipient pairs A and B are in this situation, but donor A is compatible with recipient B and donor B is compatible with recipient A. Then two transplants can take place in a two-way swap in which donor A gives to recipient B and donor B gives to recipient A. More generally, with such incompatible donor-recipient pairs you can sometimes do an -way swap. In 2009, a 13-way swap took place. A shorter swap played a prominent role in a 2008 episode of the television show Grey's Anatomy. This problem is of particular interest to the Operations Research community, because the Alliance for Paired Donation (together with teams from Boston College, Stanford, and MIT) was a finalist for the very prestigious 2014 Franz Edelman Award for Achievement in Operations Research and the Management Sciences.
Without any other constraints, the problem could be solved as a linear assignment problem, perhaps by using the LINEAR_ASSIGNMENT statement in PROC OPTNET or the SOLVE WITH NETWORK statement in PROC OPTMODEL. But doing so would allow arbitrarily long cycles in the solution. Because of practical considerations (such as travel) and to mitigate risk, each cycle must have no more than links. The kidney exchange problem is to find a maximum-weight node-disjoint union of short directed cycles.
Let us generate an example data set to represent an exchange network with (approximately) 100 incompatible donor-recipient pairs. The following DATA step creates a random graph on nodes with link probability and Uniform(0,1) weight:
The kidney donor exchange can be formulated as a MILP as follows:
In this formulation, the Packing constraints ensure that each node is covered by at most one matching. The Donate and Receive constraints enforce the condition that if node is covered by matching , then the matching must use exactly one arc that leaves node (Donate) and one arc that enters node (Receive). Conversely, if node is not covered by matching , then no arcs that enter or leave node can be used by matching . The Cardinality constraints enforce the condition that the number of arcs in matching must not exceed . Because it is not necessary to cover each incompatible donor-recipient pair (node), the Packing constraints can be modeled by using set partitioning constraints and the slack variable .
Using PROC OPTMODEL, we can set up and solve the problem as follows:
Solving this problem with direct MILP techniques can be difficult. After ten minutes, we stop the solver with a good solution, but with no proof of optimality.
In this case, the PRESOLVER=BASIC option ensures that the model maintains its specified symmetry, enabling the algorithm to use the aggregate formulation and Ryan-Foster branching. Now we can solve the problem to optimality quickly (in 17 seconds).
As part of its solver toolbox, PROC OPTMODEL provides a suite of network algorithms. One such algorithm can generate all short cycles in a graph, as follows.
This step is very fast (2 seconds).
The constraint (Packing) ensures that each node (incompatible pair) in the graph is intersected at most once. That is, a donor can donate a kidney only once.
This is done in PROC OPTMODEL as follows:
The standard MILP solver solves this problem easily (in 0 seconds).
Acknowledgement: Rob Pratt
Network Model
The problem can be modeled by using a network as follows. Each node corresponds to an incompatible donor-recipient pair, and there is a directed arc from node to node if the donor from node is compatible with the recipient from node . Each arc has a weight that represents the quality of the match, and the goal is to find a maximum weight node-disjoint union of directed cycles. You want the union to be node-disjoint so that no kidney is donated more than once, and you want cycles so that the donor from node gives up a kidney if and only if the recipient from node receives a kidney.Without any other constraints, the problem could be solved as a linear assignment problem, perhaps by using the LINEAR_ASSIGNMENT statement in PROC OPTNET or the SOLVE WITH NETWORK statement in PROC OPTMODEL. But doing so would allow arbitrarily long cycles in the solution. Because of practical considerations (such as travel) and to mitigate risk, each cycle must have no more than links. The kidney exchange problem is to find a maximum-weight node-disjoint union of short directed cycles.
Let us generate an example data set to represent an exchange network with (approximately) 100 incompatible donor-recipient pairs. The following DATA step creates a random graph on nodes with link probability and Uniform(0,1) weight:
/* random graph on n nodes with arc probability p and uniform(0,1) weight */ %let n = 100; %let p = 0.02; data ArcData; do i = 0 to &n - 1; do j = 0 to &n - 1; if i eq j then continue; else if ranuni(1) < &p then do; weight = ranuni(2); output; end; end; end; run; |
Integer Programming Model
Define an index set of candidate disjoint unions of short cycles (called matchings). Let be a binary variable, which, if set to 1, indicates that arc is in a matching . Let be a binary variable that, if set to 1, indicates that node is covered by matching . In addition, let be a binary slack variable that, if set to 1, indicates that node is not covered by any matching.The kidney donor exchange can be formulated as a MILP as follows:
In this formulation, the Packing constraints ensure that each node is covered by at most one matching. The Donate and Receive constraints enforce the condition that if node is covered by matching , then the matching must use exactly one arc that leaves node (Donate) and one arc that enters node (Receive). Conversely, if node is not covered by matching , then no arcs that enter or leave node can be used by matching . The Cardinality constraints enforce the condition that the number of arcs in matching must not exceed . Because it is not necessary to cover each incompatible donor-recipient pair (node), the Packing constraints can be modeled by using set partitioning constraints and the slack variable .
Using PROC OPTMODEL, we can set up and solve the problem as follows:
%let max_length = 10; proc optmodel; set <num,num> ARCS; num weight {ARCS}; read data ArcData into ARCS=[i j] weight; set NODES = union {<i,j> in ARCS} {i,j}; set MATCHINGS = 1..card(NODES)/2; /* UseNode[i,m] = 1 if node i is used in matching m, 0 otherwise */ var UseNode {NODES, MATCHINGS} binary; /* UseArc[i,j,m] = 1 if arc (i,j) is used in matching m, 0 otherwise */ var UseArc {ARCS, MATCHINGS} binary; /* Slack[i]=1 if node i is not covered by any matching */ var Slack {NODES} binary; /* maximize total weight of arcs used */ max TotalWeight = sum {<i,j> in ARCS, m in MATCHINGS} weight[i,j] * UseArc[i,j,m]; /* each node appears in at most one matching */ con node_packing {i in NODES}: sum {m in MATCHINGS} UseNode[i,m] + Slack[i] = 1; /* at most one donee for each donor */ con donate {i in NODES, m in MATCHINGS}: sum {<(i),j> in ARCS} UseArc[i,j,m] = UseNode[i,m]; /* at most one donor for each donee */ con receive {j in NODES, m in MATCHINGS}: sum {<i,(j)> in ARCS} UseArc[i,j,m] = UseNode[j,m]; /* exclude long matchings */ con cardinality {m in MATCHINGS}: sum {<i,j> in ARCS} UseArc[i,j,m] <= &max_length; solve with milp / maxtime=600; |
NOTE: The presolved problem has 5848 variables, 2728 constraints, and 21459 constraint coefficients. NOTE: The MILP solver is called. Node Active Sols BestInteger BestBound Gap Time -- snip -- 0 1 4 3.0229574 29.8635658 89.88% 3 NOTE: The MILP solver added 64 cuts with 3134 cut coefficients at the root. 11 11 6 8.5362428 29.8635658 71.42% 5 86 79 12 25.9768908 29.8635658 13.01% 12 -- snip -- 13300 9177 14 26.0202871 29.3516198 11.35% 596 13379 9231 14 26.0202871 29.3516198 11.35% 599 NOTE: CPU time limit reached. NOTE: Objective of the best integer solution found = 26.020287142.
Dynamic Column Generation (Decomposition)
If we relax the Packing constraint, we can see that the resulting model has a decomposable structure (separable for each matching ). In addition, the matching identifier is arbitrary (each matching problem is identical). This is a good situation for using the DECOMP feature (we have seen this type of structure before, in the Wedding Planner Problem). Let us explicitly define the (subproblem) blocks of the decomposition by using the .block constraint suffix, and solve using DECOMP, as follows:/* decompose by matching (aggregate formulation) */ for {i in NODES, m in MATCHINGS} donate[i,m].block = m; for {j in NODES, m in MATCHINGS} receive[j,m].block = m; for {m in MATCHINGS} cardinality[m].block = m; solve with milp / presolver=basic decomp=(); |
NOTE: All blocks are identical and the master model is set partitioning. NOTE: The Decomposition algorithm is using an aggregate formulation and Ryan-Foster branching. NOTE: The problem has a decomposable structure with 48 blocks. The largest block covers 2.06% of the constraints in the problem. NOTE: The decomposition subproblems cover 9216 (99.32%) variables and 6096 (98.98%) constraints. Iter Best Master Best LP IP CPU Real Bound Objective Integer Gap Gap Time Time -- snip -- 2 388.4996 9.2503 9.2503 97.62% 97.62% 0 0 3 359.1937 10.9240 10.9240 96.96% 96.96% 0 0 -- snip -- 44 26.7804 26.7804 23.4755 0.00% 12.34% 4 3 NOTE: Starting branch and bound. Node Active Sols Best Best Gap CPU Real Integer Bound Time Time 0 1 10 23.4755 26.7804 12.34% 4 4 5 7 11 24.7054 26.5360 6.90% 10 8 -- snip -- 20 0 14 26.0203 26.0203 0.00% 20 17 NOTE: The Decomposition algorithm used 4 threads. NOTE: The Decomposition algorithm time is 17.27 seconds. NOTE: Optimal within relative gap. NOTE: Objective = 26.020287142.
Static Column Generation
The DECOMP method is an automated implementation of a specific variant of a general algorithm known as dynamic column generation. At each major step of the algorithm, an integer program is solved to generate candidate columns that satisfy the subproblem constraints (for each matching). Then a master formulation takes combinations of these column to find a solution that satisfies the master constraints (in this case, the Packing constraints). In certain situations, if the number of feasible columns is small enough and there is an efficient method for generating all of these columns in advance, then using static column generation can work very well. The feasible columns of the Kidney Exchange Problem are (short) cycles in a graph. For reasonably sized graphs, all of these cycles can be generated efficiently in advance.As part of its solver toolbox, PROC OPTMODEL provides a suite of network algorithms. One such algorithm can generate all short cycles in a graph, as follows.
/* generate all cycles with 2 <= length <= max_length */ solve with NETWORK / graph_direction = directed links = (include=ARCS) cycle = (mode=all_cycles minlength=2 maxlength=&max_length) out = (cycles=ID_ORDER_NODE); |
NOTE: The number of nodes in the input graph is 97. NOTE: The number of links in the input graph is 194. NOTE: Processing cycle detection. NOTE: The graph has 224 cycles. NOTE: Processing cycle detection used 2.15 (cpu: 2.17) seconds.Given the set of cycles, we can now formulate a mixed integer linear program (MILP) to maximize the total cycle weight. Let be the set of cycles of appropriate length, be the set of nodes in cycle , be the set of links in cycle , and be the link weight for link . Define a binary decision variable . Set to 1 if cycle is used in the solution; otherwise, set it to 0. Then the following MILP defines the problem that you want to solve (to maximize the quality of the kidney exchange):
The constraint (Packing) ensures that each node (incompatible pair) in the graph is intersected at most once. That is, a donor can donate a kidney only once.
This is done in PROC OPTMODEL as follows:
/* extract <cid,from,to> triples from <cid,order,node> triples */ set <num,num,num> ID_FROM_TO init {}; num last init ., from, to; for {<cid,order,node> in ID_ORDER_NODE} do; from = last; to = node; last = to; if order ne 1 then ID_FROM_TO = ID_FROM_TO union {<cid,from,to>}; end; set CYCLES = setof {<c,i,j> in ID_FROM_TO} c; set ARCS_c {c in CYCLES} = setof {<(c),i,j> in ID_FROM_TO} <i,j>; set NODES_c {c in CYCLES} = union {<i,j> in ARCS_c[c]} {i,j}; set NODES = union {c in CYCLES} NODES_c[c]; num cycle_weight {c in CYCLES} = sum {<i,j> in ARCS_c[c]} weight[i,j]; /* UseCycle[c] = 1 if cycle c is used, 0 otherwise */ var UseCycle {CYCLES} binary; /* maximize total weight of cycles used */ max TotalWeight = sum {c in CYCLES} cycle_weight[c] * UseCycle[c]; /* each node appears in at most one cycle */ con node_packing {i in NODES}: sum {c in CYCLES: i in NODES_c[c]} UseCycle[c] <= 1; solve with milp; |
NOTE: The presolved problem has 224 variables, 28 constraints, and 1266 constraint coefficients. NOTE: The MILP solver is called. Node Active Sols BestInteger BestBound Gap Time -- snip -- 0 1 6 23.3432816 26.0202969 10.29% 0 0 1 7 26.0202872 26.0202872 0.00% 0 NOTE: The MILP solver added 9 cuts with 1111 cut coefficients at the root. NOTE: Optimal. NOTE: Objective = 26.020287205.In 2012, Alvin E. Roth and Lloyd S. Shapley won the Nobel Prize in Economics for their contributions to the theory of stable matchings and the design of markets. An application of their work was the Kidney Exchange Problem described in this blog. A full presentation of the work done by the Alliance for Paired Donation (together with Alvin Roth and teams from Boston College, Stanford, and MIT) can be viewed here.
Acknowledgement: Rob Pratt
2 Comments
Apparently, in real life there are also some altruistic donors who aren't coupled with corresponding incompatible recipients. These donors would appear in your graph as nodes with no inbound arcs and with outbound arcs to compatible recipients. Exchanges involving these donors don't (can't) require closed cycles; the last recipient in the chain doesn't need to be coupled with a donor, so these exchanges would appear in a solution as (short) paths.
You can model this situation by introducing dummy arcs with zero weight back to the altruistic node. The resulting problem then still involves only cycles. Similarly, a recipient without a donor can be modeled by a node that has outbound arcs with zero weight (but only to altruistic nodes).