So you have decided to use Benders' decomposition. Be prepared for what comes next!!!

9th January 2015


Benders' decomposition is a well known decomposition approach that is widely applied to optimisation problems. When J. F. Benders conceived of this idea, the original approach was proposed to solve mixed-integer programming problems with complicating variables. These complicating variables are those, that when fixed, render the remaining problem "easy". As you may presume, there is a specific structure that is particularly amenable to this decomposition approach.

In general, it is not too difficult to find suitable problems to which Benders' decomposition can be applied. However, efficiently solving these problems using this particular decomposition approach is not as straight forward as it may first appear. For this blog I would like to discuss some of my experiences applying Benders' decomposition and provide some tips that could help for other applications.

My applications

My first project as a PhD student involved the development of the recoverable robust tail assignment problem. This was an interesting project to investigate, involving a mathematical formulation that integrated the planning and recovery tail assignment problems. Upon writing down my first attempt at a problem formulation and showing it to my supervisor, Gary Froyland, his first comments were "It looks like you could apply Benders' decomposition". This was the start of my love-hate relationship with this decomposition approach.

That one comment by my supervisor had a very large flow-on effect. One unintended outcome was that I ended up making Benders' decomposition a focus of my PhD thesis and subsequently applied it to other projects I have been involved in. The work completed as part of my thesis involved the application of Benders' decomposition in conjunction with column generation. I don't state this because it is particularly novel or ground-breaking, but because this integration of solution approaches greatly affects the resulting implementation. In another project while working as a postdoc at UNSW I developed a model that was just too large to fit into the variable structures provided by MATLAB. So I decided the best solution to that problem was to apply Benders' decomposition, hence reducing the number of variables in each individual subproblem. I felt that in this case it was absolutely necessary to use some decomposition technique, but I am not too sure whether that should have been Benders'. However, it worked and was successful.

I have to say that I have a strong love-hate relationship with Benders' decomposition. I have used it successfully in many applications, but it always takes a lot of work to be successful. When reading a textbook it can appear like Benders' decomposition is the solution to all your troubles. I have found that deciding to use Benders' is just the start, it is what happens next that will decided whether it is a valuable solution approach. But I hope that I will be able to provide a few tips on what can be done in order to make the job a little easier.

The main issues

There are two main issues that I see with Benders' decomposition: the convergence of the resulting solution algorithm and the implementation. There is a lot that can be done in regards to the former of these issues, which will form the main part of this blog. The latter, however, is a little more tricky. I was once told by a professor that coding and debugging a problem solved by Benders' decomposition is incredibly difficult. This is because you start with an implementation to solve the problem, which is needs to be debugged, then you break it into two parts -- the master and subproblem -- each which need to be debugged, finally the linking between the two also needs to be debugged. If you decided to solve the master and subproblem by column generation, debugging becomes even more onerous. The lesson here is: be very careful when you are implementing a solution algorithm using Benders' decomposition and have very good debugging and version control practices.

Regarding the first of the Benders' decomposition issues presented previously, there are many strategies that can be employed. In the following sections I will briefly discuss some of the techniques successfully applied for my applications. Please note: none of these are a silver bullet. It is likely that some, all or none of these will help with your application.

Two- and Three-phase methods

I really like this algorithm. The three-phase method was developed and implemented across a number of applications, most notably Cordeau et. al. (2001), Mercier et. al. (2005) and Papadakos (2009). It is a really simple and effective idea that helps improve the rate of convergence for the Benders' decomposition solution algorithm. The idea is as follows: 1) relax all integrality constraints in the master and subproblems, solve the relaxed problem to optimality, 2) reintroduce the integrality constraints to the master problem, solve this new problem, retaining all cuts from Phase 1, to optimality, 3) reintroduce integrality constraints to the subproblems and then solve each. At the end of Phase 3 there are two decisions, i) if the subproblems are feasible, then the algorithm has completed, ii) if the subproblems are infeasible, then you add a cut forbidding the master problem solution, remove integrality constraints from the subproblems and return to Phase 2.

The three-phase method is a heuristic approach, since only feasibility of the subproblems is requried. However, it is possible to compute an optimality gap at the end of Phase 3 to determine the accuracy.

The two-phase method was an approach that was applied by myself. The difference from the three-phase method is that integral feasibility of the subproblems is not required. For the application I was considering, integrality in the final phase was not necessary. So it really depends on whether your application as to whether the final stage is required or not.

Pareto-optimal cuts

This is a little gem that works very well without much additional implementation work. The Pareto-optimal cut approach that I have applied is the Magnanti-Wong method (Magnanti and Wong (1981)). The Magnanti-Wong method exploits the fact that for a degenerate subproblem there exists multiple optimal dual solutions. Hence, it is possible to select the dual solution that is the closest to the interior of the master problem polyhedron. Many applications employing Benders' decomposition have implemented a method to generate Pareto optimal cuts. The most prominent examples are those that have also employed the Three-phase method (Cordeau et. al. (2001), Mercier et. al. (2005) and Papadakos (2009)). Additionally, Santoso et. al. (2005) demonstrate its use for a supply chain network design application.

For my applications employing Benders' decomposition, I have generally found the Magnanti-Wong method very useful in improving the convergence solution algorithm. The biggest improvement is seen early in the solution process, where large increases in the lower bound (in a minimisation problem) is observed between iterations. This enhancement approach is generally the first that I will consider when applying Benders' decomposition to a problem.

Trust Region

The trust region approach is a very simple idea that can result in significant reductions in the solution runtime. In general, this approach creates a more restricted version of the original problem through the introduction of additional constraints. The constraints restrict the feasible region to a neighbourhood surrounding a given feasible solution. However, it must be stated that this approach can result in the algorithm terminating with local optimal solutions. Hence, care must be taken when applying this enhancement.

A good example applying a trust region to enhance Benders' decomposition is given by Santoso et. al. (2005). In paper, the trust region is applied in one of two proposed approaches, namely with an additional constraint limiting the "switching" of binary variables. The authors state that to avoid the issues of local optimality the restriction provided by the trust region must be relaxed over the course of the solution algorithm.

My research has also involved employing a trust region approach. In my paper Maher et. al. (2014), I discuss another implementation issue regarding suboptimal master problem solutions. In this case, I state that suboptimal master problem solutions do not affect the accuracy of the solution approach. In fact, it can aid in reducing the number of master problem iterations.

Parallelisation

The final enhancement approach that I would like to discuss it parallelisation of the solution algorithm. For a long time (it can't be too long since I have not long completed my PhD), I was opposed to the use of parallel programming techniques. My criticism was that people reported the use of parallelisation as a contribution of the research. I did not really agree, since this was a computational approach and not really progressing the mathematical programming world. I especially felt this in regards to Benders' decomposition because the solution algorithm is very amenable to parallelisation. Namely, each of the subproblems can be solved on separate compute nodes, drastically reducing the overall solution runtimes.

I was convinced, however, that the use of parallel programming in mathematical programming applications is the way the world is going (or has gone). So, I decided to use parallelisation for a Benders' decomposition application. The end result was expected, and it could be argued that this enhancement was necessary to achieve any meaningful results.

In general, the parallelisation of Benders' decomposition is fairly straight forward. However, the implementation still calls for problem specific enhancements. Questions such as the following need to be asked: in what order do you solve the subproblems? do you only resolve the master problem when all subproblems have been solved? do you concurrently solve the master and subproblems? The method of application can greatly affect the performance of the algorithm with many possible gains to be made.

The wrap-up

My opinion is that enhancements to Benders' decomposition is almost always necessary. That said, it may not be necessary to implement each and every enhancement, so it will pay to choose wisely. In one of my papers I have implemented each of the discussed enhancement techniques, but I did not start with them all. I also have to point to Santoso et. al. (2005), because the authors document a number of different enhancement approaches, some of which are not discussed here. The work of Santoso et. al. (2005) demonstrates the difficulty associated with applying Benders' decomposition and the work that is required to achieve an efficient solution approach. One important question remains, is Benders' decomposition the best approach available, or should other decomposition techniques be investigated?

  • J.-F. Cordeau, G. Stojkovic, F. Soumis, and J. Desrosiers. Benders’ decomposition for simultaneous aircraft routing and crew scheduling. Transportation Science, 35(4):375-388, 2001.
  • T. Magnanti and R. Wong. Accelerating Benders’ decomposition: algorithmic enhancement and model selection criteria. Operations Research, 29(3):464–484, 1981.
  • S. J. Maher, G. Desaulniers and F. Soumis. Recoverable robust single day aircraft maintenance routing problem. Computers & Operations Research, 51:130-145, 2014.
  • A. Mercier, J. Cordeau, and F. Soumis. A computational study of Benders’ decomposition for the integrated aircraft routing and crew scheduling problem. Computers & Operations Research, 32(6):1451–1476, 2005.
  • N. Papadakos. Integrated airline scheduling. Computers & Operations Research, 36(1):176–195, 2009.
  • T. Santoso, S. Ahmed, M. Goetschalckx, and A. Shapiro. A stochastic programming approach for supply chain network design under uncertainty. European Journal of Operational Research, 167(1):96–115, 2005.