
Network Model
The problem can be modeled by using a network as follows. Each node corresponds to an incompatible donor-recipient pair, and there is a directed arc from node i to node j if the donor from node i is compatible with the recipient from node j. Each arc has a weight that represents the quality of the match, and the goal is to find a maximum weight node-disjoint union of directed cycles. You want the union to be node-disjoint so that no kidney is donated more than once, and you want cycles so that the donor from node i gives up a kidney if and only if the recipient from node i receives a kidney.Without any other constraints, the problem could be solved as a linear assignment problem, perhaps by using the LINEAR_ASSIGNMENT statement in PROC OPTNET or the SOLVE WITH NETWORK statement in PROC OPTMODEL. But doing so would allow arbitrarily long cycles in the solution. Because of practical considerations (such as travel) and to mitigate risk, each cycle must have no more than L links. The kidney exchange problem is to find a maximum-weight node-disjoint union of short directed cycles.
Let us generate an example data set to represent an exchange network with (approximately) 100 incompatible donor-recipient pairs. The following DATA step creates a random graph on nodes with link probability and Uniform(0,1) weight:
/* random graph on n nodes with arc probability p and uniform(0,1) weight */ %let n = 100; %let p = 0.02; data ArcData; do i = 0 to &n - 1; do j = 0 to &n - 1; if i eq j then continue; else if ranuni(1) < &p then do; weight = ranuni(2); output; end; end; end; run; |
Integer Programming Model
Define an index set M={1,…,|N|/2} of candidate disjoint unions of short cycles (called matchings). Let xijm be a binary variable, which, if set to 1, indicates that arc (i,j) is in a matching m. Let yim be a binary variable that, if set to 1, indicates that node i is covered by matching m. In addition, let si be a binary slack variable that, if set to 1, indicates that node i is not covered by any matching.The kidney donor exchange can be formulated as a MILP as follows:
maximize∑(i,j)∈A∑m∈Mwijxijmsubject to∑m∈Myim+si=1i∈N(Packing)∑(i,j)∈Axijm=yimi∈N, m∈M(Donate)∑(i,j)∈Axijm=yjmj∈N, m∈M(Receive)∑(i,j)∈Axijm≤Lm∈M(Cardinality)xijm∈{0,1}(i,j)∈A, m∈Myim∈{0,1}i∈N, m∈Msi∈{0,1}i∈N
In this formulation, the Packing constraints ensure that each node is covered by at most one matching. The Donate and Receive constraints enforce the condition that if node i is covered by matching m, then the matching m must use exactly one arc that leaves node i (Donate) and one arc that enters nodei (Receive). Conversely, if node i is not covered by matching m, then no arcs that enter or leave node i can be used by matching m. The Cardinality constraints enforce the condition that the number of arcs in matching m must not exceed L. Because it is not necessary to cover each incompatible donor-recipient pair (node), the Packing constraints can be modeled by using set partitioning constraints and the slack variable s.
Using PROC OPTMODEL, we can set up and solve the problem as follows:
%let max_length = 10; proc optmodel; set <num,num> ARCS; num weight {ARCS}; read data ArcData into ARCS=[i j] weight; set NODES = union {<i,j> in ARCS} {i,j}; set MATCHINGS = 1..card(NODES)/2; /* UseNode[i,m] = 1 if node i is used in matching m, 0 otherwise */ var UseNode {NODES, MATCHINGS} binary; /* UseArc[i,j,m] = 1 if arc (i,j) is used in matching m, 0 otherwise */ var UseArc {ARCS, MATCHINGS} binary; /* Slack[i]=1 if node i is not covered by any matching */ var Slack {NODES} binary; /* maximize total weight of arcs used */ max TotalWeight = sum {<i,j> in ARCS, m in MATCHINGS} weight[i,j] * UseArc[i,j,m]; /* each node appears in at most one matching */ con node_packing {i in NODES}: sum {m in MATCHINGS} UseNode[i,m] + Slack[i] = 1; /* at most one donee for each donor */ con donate {i in NODES, m in MATCHINGS}: sum {<(i),j> in ARCS} UseArc[i,j,m] = UseNode[i,m]; /* at most one donor for each donee */ con receive {j in NODES, m in MATCHINGS}: sum {<i,(j)> in ARCS} UseArc[i,j,m] = UseNode[j,m]; /* exclude long matchings */ con cardinality {m in MATCHINGS}: sum {<i,j> in ARCS} UseArc[i,j,m] <= &max_length; solve with milp / maxtime=600; |
NOTE: The presolved problem has 5848 variables, 2728 constraints, and 21459 constraint coefficients. NOTE: The MILP solver is called. Node Active Sols BestInteger BestBound Gap Time -- snip -- 0 1 4 3.0229574 29.8635658 89.88% 3 NOTE: The MILP solver added 64 cuts with 3134 cut coefficients at the root. 11 11 6 8.5362428 29.8635658 71.42% 5 86 79 12 25.9768908 29.8635658 13.01% 12 -- snip -- 13300 9177 14 26.0202871 29.3516198 11.35% 596 13379 9231 14 26.0202871 29.3516198 11.35% 599 NOTE: CPU time limit reached. NOTE: Objective of the best integer solution found = 26.020287142.
Dynamic Column Generation (Decomposition)
If we relax the Packing constraint, we can see that the resulting model has a decomposable structure (separable for each matching m). In addition, the matching identifier is arbitrary (each matching problem is identical). This is a good situation for using the DECOMP feature (we have seen this type of structure before, in the Wedding Planner Problem). Let us explicitly define the (subproblem) blocks of the decomposition by using the .block constraint suffix, and solve using DECOMP, as follows:/* decompose by matching (aggregate formulation) */ for {i in NODES, m in MATCHINGS} donate[i,m].block = m; for {j in NODES, m in MATCHINGS} receive[j,m].block = m; for {m in MATCHINGS} cardinality[m].block = m; solve with milp / presolver=basic decomp=(); |
NOTE: All blocks are identical and the master model is set partitioning. NOTE: The Decomposition algorithm is using an aggregate formulation and Ryan-Foster branching. NOTE: The problem has a decomposable structure with 48 blocks. The largest block covers 2.06% of the constraints in the problem. NOTE: The decomposition subproblems cover 9216 (99.32%) variables and 6096 (98.98%) constraints. Iter Best Master Best LP IP CPU Real Bound Objective Integer Gap Gap Time Time -- snip -- 2 388.4996 9.2503 9.2503 97.62% 97.62% 0 0 3 359.1937 10.9240 10.9240 96.96% 96.96% 0 0 -- snip -- 44 26.7804 26.7804 23.4755 0.00% 12.34% 4 3 NOTE: Starting branch and bound. Node Active Sols Best Best Gap CPU Real Integer Bound Time Time 0 1 10 23.4755 26.7804 12.34% 4 4 5 7 11 24.7054 26.5360 6.90% 10 8 -- snip -- 20 0 14 26.0203 26.0203 0.00% 20 17 NOTE: The Decomposition algorithm used 4 threads. NOTE: The Decomposition algorithm time is 17.27 seconds. NOTE: Optimal within relative gap. NOTE: Objective = 26.020287142.
Static Column Generation
The DECOMP method is an automated implementation of a specific variant of a general algorithm known as dynamic column generation. At each major step of the algorithm, an integer program is solved to generate candidate columns that satisfy the subproblem constraints (for each matching). Then a master formulation takes combinations of these column to find a solution that satisfies the master constraints (in this case, the Packing constraints). In certain situations, if the number of feasible columns is small enough and there is an efficient method for generating all of these columns in advance, then using static column generation can work very well. The feasible columns of the Kidney Exchange Problem are (short) cycles in a graph. For reasonably sized graphs, all of these cycles can be generated efficiently in advance.As part of its solver toolbox, PROC OPTMODEL provides a suite of network algorithms. One such algorithm can generate all short cycles in a graph, as follows.
/* generate all cycles with 2 <= length <= max_length */ solve with NETWORK / graph_direction = directed links = (include=ARCS) cycle = (mode=all_cycles minlength=2 maxlength=&max_length) out = (cycles=ID_ORDER_NODE); |
NOTE: The number of nodes in the input graph is 97. NOTE: The number of links in the input graph is 194. NOTE: Processing cycle detection. NOTE: The graph has 224 cycles. NOTE: Processing cycle detection used 2.15 (cpu: 2.17) seconds.Given the set of cycles, we can now formulate a mixed integer linear program (MILP) to maximize the total cycle weight. Let C be the set of cycles of appropriate length, Nc be the set of nodes in cycle c, Ac be the set of links in cycle c, and wij be the link weight for link (i,j). Define a binary decision variable xc. Set xc to 1 if cycle c is used in the solution; otherwise, set it to 0. Then the following MILP defines the problem that you want to solve (to maximize the quality of the kidney exchange):
maximize∑c∈C(∑(i,j)∈Acwij)xcsubject to∑c∈C:i∈Ncxc≤1i∈N(Packing)xc∈{0,1}c∈C
The constraint (Packing) ensures that each node (incompatible pair) in the graph is intersected at most once. That is, a donor can donate a kidney only once.
This is done in PROC OPTMODEL as follows:
/* extract <cid,from,to> triples from <cid,order,node> triples */ set <num,num,num> ID_FROM_TO init {}; num last init ., from, to; for {<cid,order,node> in ID_ORDER_NODE} do; from = last; to = node; last = to; if order ne 1 then ID_FROM_TO = ID_FROM_TO union {<cid,from,to>}; end; set CYCLES = setof {<c,i,j> in ID_FROM_TO} c; set ARCS_c {c in CYCLES} = setof {<(c),i,j> in ID_FROM_TO} <i,j>; set NODES_c {c in CYCLES} = union {<i,j> in ARCS_c[c]} {i,j}; set NODES = union {c in CYCLES} NODES_c[c]; num cycle_weight {c in CYCLES} = sum {<i,j> in ARCS_c[c]} weight[i,j]; /* UseCycle[c] = 1 if cycle c is used, 0 otherwise */ var UseCycle {CYCLES} binary; /* maximize total weight of cycles used */ max TotalWeight = sum {c in CYCLES} cycle_weight[c] * UseCycle[c]; /* each node appears in at most one cycle */ con node_packing {i in NODES}: sum {c in CYCLES: i in NODES_c[c]} UseCycle[c] <= 1; solve with milp; |
NOTE: The presolved problem has 224 variables, 28 constraints, and 1266 constraint coefficients. NOTE: The MILP solver is called. Node Active Sols BestInteger BestBound Gap Time -- snip -- 0 1 6 23.3432816 26.0202969 10.29% 0 0 1 7 26.0202872 26.0202872 0.00% 0 NOTE: The MILP solver added 9 cuts with 1111 cut coefficients at the root. NOTE: Optimal. NOTE: Objective = 26.020287205.In 2012, Alvin E. Roth and Lloyd S. Shapley won the Nobel Prize in Economics for their contributions to the theory of stable matchings and the design of markets. An application of their work was the Kidney Exchange Problem described in this blog. A full presentation of the work done by the Alliance for Paired Donation (together with Alvin Roth and teams from Boston College, Stanford, and MIT) can be viewed here.
Acknowledgement: Rob Pratt
2 Comments
Apparently, in real life there are also some altruistic donors who aren't coupled with corresponding incompatible recipients. These donors would appear in your graph as nodes with no inbound arcs and with outbound arcs to compatible recipients. Exchanges involving these donors don't (can't) require closed cycles; the last recipient in the chain doesn't need to be coupled with a donor, so these exchanges would appear in a solution as (short) paths.
You can model this situation by introducing dummy arcs with zero weight back to the altruistic node. The resulting problem then still involves only cycles. Similarly, a recipient without a donor can be modeled by a node that has outbound arcs with zero weight (but only to altruistic nodes).