ML4CO Competition logo

Introduction

The Machine Learning for Combinatorial Optimization (ML4CO) NeurIPS 2021 competition aims at improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components with machine learning models. The competition's main scientific question is the following: is machine learning a viable option for improving traditional combinatorial optimization solvers on specific problem distributions, when historical data is available?

NeurIPS logo

While most combinatorial optimization solvers are presented as general-purpose, one-size-fits-all algorithms, the ML4CO competition focuses on the design of application-specific algorithms from historical data. This general problem captures a highly practical scenario relevant to many application areas, where a practitioner repeatedly solves problem instances from a specific distribution, with redundant patterns and characteristics. For example, managing a large-scale energy distribution network requires solving very similar CO problems on a daily basis, with a fixed power grid structure while only the demand changes over time. This change of demand is hard to capture with hand-engineered expert rules, and ML-enhanced approaches offer a possible solution for detecting typical patterns in the demand history. Other examples include crew scheduling problems that have to be solved daily or weekly with minor variations, or vehicle routing where the traffic conditions change over time, but the overall transportation network does not.

SCIP logo

The competition features three challenges for ML, each corresponding to a specific control task arising in the open-source solver SCIP, and exposed through a unified OpenAI Gym-style API based on the Python library Ecole. For each challenge, participants will be evaluated on three problem benchmarks originating from diverse application areas, each represented as a collection of mixed-integer linear program (MILP) instances. Benchmark-specific submissions (algorithms or trained ML models) are encouraged, but generic submissions that work across all three benchmarks are of course allowed.

Note: while we encourage solutions derived from the machine learning and reinforcement learning paradigms, any algorithmic solution respecting the competition's API is accepted.
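
Concretely, each environment is driven through the familiar reset/step loop. Below is a minimal sketch using Ecole's generic Branching environment and a trivial policy; the competition's task-specific environments (shipped in the starter kit) follow the same pattern, though their observations and actions differ, and the instance path is a placeholder.

```python
import ecole

# Gym-style episode loop (sketch): the competition environments follow
# the same reset/step protocol, with task-specific observations/actions.
env = ecole.environment.Branching()

# "instance.mps" is a placeholder path to any MILP instance file.
obs, action_set, reward_offset, done, info = env.reset("instance.mps")
while not done:
    action = action_set[0]  # trivial policy: always pick the first candidate
    obs, action_set, reward, done, info = env.step(action)
```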

Ecole logo

Important dates: the competition is scheduled to start on July 1st, 2021, when we release the training datasets, and will end on October 31st, 2021.

This event is sponsored by Compute Canada, Calcul Québec and Westgrid, who graciously provide the infrastructure and compute resources to run the competition.

Compute Canada logo
Calcul Québec logo
Westgrid logo

Challenges

We propose three distinct challenges, each corresponding to a specific control task arising in traditional solvers. Participants can compete in any subset of those challenges, and one winner will be declared for each.

  • Primal task

    Produce feasible solutions, in order to minimize the primal integral over time.

  • Dual task

    Select branching variables, in order to minimize the dual integral over time.

  • Configuration task

    Choose the best solver parameters, in order to minimize the primal-dual integral over time.

Primal-dual bound solving curves
Primal and dual bounds evolution vs solving time (for a minimization MILP instance). Note that this figure shows the primal and dual integrals minus \(T\mathbf{c}^\top \mathbf{x}^\star\), which takes a constant value for a particular instance.

Primal task - Finding feasible solutions

The primal task deals with finding good primal solutions at the root node of the branch-and-bound tree. To that end, the environment (SCIP solver) will not perform any branching, but will instead enter an infinite loop at the root node, which collects the agent's solutions, evaluates their feasibility, and updates the overall best solution reached so far, thus lowering the current primal bound (upper bound). The metric of interest for this task is the primal integral, which takes into account the speed at which the primal bound decreases over time. To model a realistic scenario, each problem instance will have been preprocessed by SCIP (problem reduction, cutting planes, etc.), and the root linear program (LP) relaxation will have been solved before the participants are asked to produce feasible solutions. To prevent SCIP from searching for primal solutions by itself, all primal heuristics will be deactivated. In order to compute this metric unambiguously, even when no solution has been found yet, we provide an initial primal bound (trivial solution value) for each instance, which is to be given to SCIP at the beginning of the solving process in the form of a user objective limit.

Environment
The solver is stopped at the root node after the root LP has been solved and enters an infinite loop. At each transition, the environment evaluates the given solution candidate and updates the primal bound. Nothing else happens. The solver remains in the SOLVING stage until the episode terminates (time limit reached, or when the problem is solved).
Action
A solution candidate for the current problem (an assignment of values to the variables \(\mathbf{x}\)).
Metric
Primal integral.
Time limit
\(T=5\) minutes.
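
As a purely hypothetical illustration of what an agent might submit here, the sketch below proposes the root-LP solution with its integer variables rounded. The observation fields `lp_solution` and `integer_mask` are made-up names, not the competition's actual observation schema, and the resulting candidate may well be infeasible (the environment evaluates feasibility before updating the primal bound).

```python
import numpy as np

def rounding_policy(observation):
    """Naive primal policy (sketch): round the root-LP solution.

    `observation.lp_solution` and `observation.integer_mask` are
    hypothetical field names, used here for illustration only.
    """
    x = np.asarray(observation.lp_solution, dtype=float)
    is_integer = np.asarray(observation.integer_mask, dtype=bool)
    x[is_integer] = np.round(x[is_integer])  # may break feasibility
    return x  # the environment evaluates this candidate's feasibility
```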

Dual task - Branching

The dual task deals with obtaining tight optimality guarantees (dual bounds) via branching. Making good branching decisions is regarded as a critical component of modern branch-and-bound solvers, yet it has received little theoretical understanding to this day (see Lodi et al., 2017). In this task, the environment will run a full-fledged branch-and-cut algorithm with SCIP, and the participants will control only the solver's branching decisions. The metric of interest is the dual integral, which considers the speed at which the dual bound increases over time. Here again, all primal heuristics will be deactivated, so that the focus is solely on proving optimality via branching.

Environment
Traditional branch-and-bound algorithm. The solver stops after each node is processed (LP solved), receives a branching decision, performs branching, and selects the next open node to process. The solver remains in the SOLVING stage until the episode terminates (time limit reached, or when the problem is solved).
Action
One of the current node's branching candidates (non-fixed integer variables). Only single-variable branching is allowed.
Metric
Dual integral.
Time limit
\(T=15\) minutes.
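
To make the control loop concrete, here is a minimal Ecole sketch in which the agent branches on the variable with the highest strong-branching score, a classical but computationally expensive expert that learned policies typically try to imitate or outperform. The instance path is a placeholder, and the competition environment may differ in its defaults.

```python
import ecole
import numpy as np

# Branching environment observed through strong-branching scores.
env = ecole.environment.Branching(
    observation_function=ecole.observation.StrongBranchingScores()
)

scores, action_set, reward_offset, done, info = env.reset("instance.mps")
while not done:
    # Branch on the candidate variable with the highest score.
    action = action_set[np.argmax(scores[action_set])]
    scores, action_set, reward, done, info = env.step(action)
```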

Configuration task - Choosing solver parameters

The configuration task deals with deciding on a good parameterization of the solver for a given problem instance. The environment required for this task is more straightforward than for the two previous ones, since it involves only a single decision from the agent (i.e., a contextual bandit problem). Participants are allowed to tune any of the existing parameters of SCIP (except time-related parameters). They can choose between providing a fixed set of parameters that works well on average for each problem benchmark, or producing instance-specific parameterizations based on each instance's characteristics. The metric of interest for this task is the primal-dual gap integral, which combines improvements from both the dual and the primal side over time. In order to compute this metric unambiguously, even when no primal or dual bound exists, we provide both an initial primal bound (trivial solution value) and an initial dual bound (pre-computed root LP solution value) for each instance.

Environment
The solver loads a problem instance and immediately stops. It then receives a parameterization, applies it, and proceeds with the solving process until termination (time limit reached, or the problem is solved). During the first and only transition, the solver is in the PROBLEM stage. As such, participants will not be able to extract advanced features from the solver, such as those relying on the LP being solved.
Action
A set of SCIP parameters, in the form of a key/value dictionary.
Metric
Primal-dual gap integral.
Time limit
\(T=15\) minutes.
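
A minimal sketch of this one-shot interaction, using Ecole's Configuring environment. The two parameters below are arbitrary examples of valid SCIP settings, not a recommended configuration, and the instance path is a placeholder.

```python
import ecole

# Configuring environment: one decision per episode (contextual bandit).
env = ecole.environment.Configuring()

obs, action_set, reward_offset, done, info = env.reset("instance.mps")

# Example parameterization (arbitrary values, for illustration only).
action = {
    "presolving/maxrounds": 0,      # disable presolving rounds
    "separating/maxroundsroot": 1,  # limit cutting-plane rounds at the root
}

# The solver applies the parameters and runs until termination.
obs, action_set, reward, done, info = env.step(action)
```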

Metrics

Each of the three challenges, the primal task, the dual task and the configuration task, is associated with a specific evaluation metric that reflects a different objective. Here we describe how each metric is computed over a single problem instance, while the final goal of the participants is to optimize this metric in expectation over a hidden collection of test instances.

Because our evaluation metrics are time-dependent, all evaluations will be run on the same hardware setup, including one CPU core and one GPU card, which will be communicated at the start of the competition. For practical reasons, for each task, a maximum time budget \(T\) is given to process each test instance (5, 15 and 15 minutes respectively for the primal, dual and configuration tasks), after which the environment necessarily terminates. Participants should therefore focus on taking both good and fast decisions.

In the following we consider instances of mixed-integer linear programs (MILPs) expressed as follows: \[ \begin{eqnarray} \underset{\mathbf{x}}{\operatorname{arg\,min}} \quad \mathbf{c}^\top\mathbf{x} && \\ \text{subject to} \quad \mathbf{A}\mathbf{x} & \leq & \mathbf{b} \text{,} \\ \mathbf{x} & \in & \mathbb{Z}^p \times \mathbb{R}^{n-p} \text{,} \end{eqnarray} \] where \(\mathbf{c} \in \mathbb{R}^n\) denotes the coefficients of the linear objective, \(\mathbf{A} \in \mathbb{R}^{m \times n}\) and \(\mathbf{b} \in \mathbb{R}^m\) respectively denote the coefficients and upper bounds of the linear constraints, while \(n\) is the total number of variables, \(p \leq n\) is the number of integer-constrained variables, and \(m\) is the number of linear constraints.
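
As a toy instantiation of this formulation, the snippet below builds and solves a two-variable MILP with PySCIPOpt (SCIP's Python interface, which the rules explicitly permit):

```python
from pyscipopt import Model

# Toy MILP: minimize -x - 2y subject to x + y <= 3, with x, y integer in [0, 3].
m = Model("toy")
x = m.addVar(name="x", vtype="I", lb=0, ub=3)
y = m.addVar(name="y", vtype="I", lb=0, ub=3)
m.setObjective(-x - 2 * y, sense="minimize")
m.addCons(x + y <= 3)
m.optimize()
print(m.getObjVal(), m.getVal(x), m.getVal(y))  # -6.0 0.0 3.0
```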

Primal integral

This objective measures the area under the curve of the solver's primal bound (a.k.a. global upper bound), which corresponds to the value of the best feasible solution found so far. As better feasible solutions are found over time, the primal bound decreases. With a time limit \(T\), the primal integral is expressed as: \[ \int_{t=0}^{T} \mathbf{c}^\top \mathbf{x}^\star_t\;\mathrm{d}t - T\mathbf{c}^\top\mathbf{x}^\star \text{,} \] where \(\mathbf{x}^\star_t\) is the best feasible solution found at time \(t\) (so that \(\mathbf{c}^\top \mathbf{x}^\star_t\) is the primal bound at time \(t\)), and \(T\mathbf{c}^\top\mathbf{x}^\star\) is an instance-specific constant that depends on the optimal solution \(\mathbf{x}^\star\). The primal integral is to be minimized, and takes an optimal value of 0.

Note: to compute this metric unambiguously, a trivial initial solution \(\mathbf{x}^\star_0\) is always provided to SCIP at the beginning of the solving process. Also, the constant term \(\mathbf{c}^\top\mathbf{x}^\star\) can be safely ignored at training time when participants train their control policy, as the learning problem is equivalent. At test time, however, when we evaluate the participants' submissions, this constant term (or a proper substitute) will be incorporated into the reported metric.
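
Since the primal bound is a step function of time (it changes only when a new incumbent is found), the integral reduces to a finite sum. A small sketch, with made-up event times and values:

```python
def primal_integral(event_times, primal_bounds, T, optimal_value):
    """Integrate the piecewise-constant primal bound over [0, T], then
    subtract the constant term T * c'x*.

    event_times[i] is the time at which primal_bounds[i] became the
    incumbent value; event_times[0] must be 0.0, anchored by the
    provided trivial initial solution.
    """
    area = 0.0
    for i, value in enumerate(primal_bounds):
        t_next = event_times[i + 1] if i + 1 < len(event_times) else T
        area += value * (t_next - event_times[i])
    return area - T * optimal_value

# Made-up example: incumbents 10 -> 7 -> 5 found at t = 0, 60 and 120 s,
# with T = 300 s and an optimal value of 5.
print(primal_integral([0.0, 60.0, 120.0], [10.0, 7.0, 5.0], 300.0, 5.0))  # 420.0
```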

Primal integral illustration
Dual integral

This objective measures the area over the curve of the solver's dual bound (a.k.a. global lower bound), which usually corresponds to the solution of a valid relaxation of the MILP. By branching, the LP relaxations at the leaves of the branch-and-bound tree get tightened, and the dual bound increases over time. With a time limit \(T\), the dual integral is expressed as: \[ T\mathbf{c}^\top\mathbf{x}^\star - \int_{t=0}^{T} \mathbf{z}^\star_t\;\mathrm{d}t \text{,} \] where \(\mathbf{z}^\star_t\) is the best dual bound at time \(t\), and \(T\mathbf{c}^\top\mathbf{x}^\star\) is an instance-specific constant that depends on the optimal solution value \(\mathbf{c}^\top\mathbf{x}^\star\). The dual integral is to be minimized, and takes an optimal value of 0.

Note: in the context of branching, this metric is unambiguous to compute, as the root node of the tree always provides an initial dual bound \(\mathbf{z}^\star_0\) at the beginning of branching. Here again, the constant term \(\mathbf{c}^\top\mathbf{x}^\star\) can be safely ignored for training, but it (or a proper substitute) will be incorporated into the metrics that we report when evaluating the participants' submissions.

Dual integral illustration
Primal-dual gap integral

This objective measures the area between two curves, the solver's primal bound and the solver's dual bound. As such, this metric benefits both from improvements obtained on the primal side (finding good feasible solutions) and on the dual side (producing a tight optimality certificate). With a time limit \(T\), the primal-dual gap integral is expressed as: \[ \int_{t=0}^{T} \left( \mathbf{c}^\top \mathbf{x}^\star_t - \mathbf{z}^\star_t \right) \mathrm{d}t \text{.} \] The primal-dual gap integral is to be minimized, and takes an optimal value of 0.

Note: in the context of algorithm configuration, an initial value is required for the two curves at time \(t=0\). Therefore we always provide both an initial trivial solution \(\mathbf{x}^\star_0\) and a valid initial dual bound \(\mathbf{z}^\star_0\) to SCIP for this task.
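
The same step-function reasoning applies here, now with two curves. A sketch that merges the primal and dual event times (all numbers are made up, and every event time is assumed to lie within \([0, T]\)):

```python
import numpy as np

def primal_dual_gap_integral(times_p, primal, times_d, dual, T):
    """Integrate primal(t) - dual(t) over [0, T] for two step functions.

    times_p[0] and times_d[0] must both be 0.0: the provided initial
    trivial solution and initial dual bound anchor the curves at t = 0.
    """
    breakpoints = sorted(set(times_p) | set(times_d) | {T})
    area = 0.0
    for t0, t1 in zip(breakpoints, breakpoints[1:]):
        p = primal[np.searchsorted(times_p, t0, side="right") - 1]
        d = dual[np.searchsorted(times_d, t0, side="right") - 1]
        area += (p - d) * (t1 - t0)
    return area

# Made-up example over T = 60 s: primal 10 -> 6 at t = 30, dual 0 -> 4 at t = 20.
print(primal_dual_gap_integral([0.0, 30.0], [10.0, 6.0],
                               [0.0, 20.0], [0.0, 4.0], 60.0))  # 320.0
```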

Primal-dual gap integral illustration

Datasets

For each individual challenge, participants will be evaluated on three problem benchmarks from diverse application areas. Participants can provide a different decision-making code (algorithmic solution or trained ML model) for each of the benchmarks, or a single code that works for all benchmarks.

A problem benchmark consists of a collection of mixed-integer linear program (MILP) instances in the standard MPS file format. Each benchmark is split into a training and a test set, originating from the same problem distribution. While the training instances are made public at the beginning of the competition for participants to train their models, the test instances will be kept hidden for evaluation purposes and will only be revealed at the end of the competition.

Note: for each problem benchmark we suggest a pre-established split of the training instances into two distinct collections: train and valid. This is only a suggestion, and all instance files in both collections can be considered training instances. There is no restriction on how participants use the provided instances during training.
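
For reference, the MPS instances can be loaded either through Ecole (for direct use with its environments) or with PySCIPOpt for standalone inspection; the file path below is a placeholder.

```python
import ecole
from pyscipopt import Model

# Placeholder path to one of the provided training instances.
path = "train/instance_1.mps.gz"

# Option 1: load through Ecole, usable directly with env.reset(...).
instance = ecole.scip.Model.from_file(path)

# Option 2: load with PySCIPOpt to inspect the problem on its own.
m = Model()
m.readProblem(path)
print(m.getNVars(), m.getNConss())
```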

The first two problem benchmarks are inspired by real-life applications of large-scale systems at Google, while the third benchmark is an anonymous problem inspired by a real-world, large-scale industrial application.

Problem benchmark 1: Balanced Item Placement
This problem deals with spreading items (e.g., files or processes) across containers (e.g., disks or machines) so as to utilize them evenly. Items can have multiple copies, but at most one copy can be placed in a single bin. The number of items that can be moved is constrained, modeling the real-life situation of a live system for which some placement already exists. Each problem instance is modeled as a MILP, using a multi-dimensional multi-knapsack formulation. This dataset contains 10000 training instances (pre-split into 9900 train and 100 valid instances).
Problem benchmark 2: Workload Apportionment
This problem deals with apportioning workloads (e.g., data streams) across as few workers (e.g., servers) as possible. The apportionment is required to be robust to the failure of any one worker. Each problem instance is modeled as a MILP, using a bin-packing-with-apportionment formulation. This dataset contains 10000 training instances (pre-split into 9900 train and 100 valid instances).
Problem benchmark 3: Anonymous Problem
The MILP instances corresponding to this benchmark are assembled from a public dataset, whose origin is kept secret to prevent cheating. Reverse-engineering for the purpose of recovering the test set is explicitly forbidden (see rules). This dataset contains 118 training instances (pre-split into 98 train and 20 valid instances).

Rules

The spirit of the competition is as follows:

  • we encourage contributions from both the OR and ML communities by explicitly allowing non-ML submissions;
  • we encourage student participation with a dedicated student leaderboard;
  • we encourage original contributions by explicitly prohibiting the use of highly-engineered commercial solvers.

Participants must adhere to the following rules:

Eligibility
  1. Participants can work in teams.
  2. Participants can submit codes based on a machine learning model, a human-designed algorithm, or any combination of the two.
  3. Participants are not allowed to use commercial software in their code. The use of solvers such as CPLEX, Gurobi, or FICO Xpress is explicitly forbidden.
  4. Student-only teams are eligible for a separate student leaderboard, in addition to the main leaderboard of the competition. Anyone who has already obtained a PhD is not considered a student.
Data
  1. The use of supplementary existing open-source datasets, e.g., for pre-training, is permitted provided they are credited.
  2. The use of private, proprietary datasets is not permitted.
  3. In order to prevent cheating, the evaluation data sets will be kept hidden from the participants during the competition.
  4. Participants are free and encouraged to extract original features from the solver for training and evaluation, through the Python interface of SCIP, PySCIPOpt.
Submission
  1. The submitted codes must be written in Python, and must run in the evaluation environment that we provide.
  2. Parallel implementations using multiple processes or multiple threads are allowed, although the hardware on which the codes will be evaluated is single-core.
  3. The evaluation environment does not have access to any external network, to prevent any information leak.
  4. Participants may make use of open-source libraries, given proper attribution. At the end of the competition, we encourage all code to be open-sourced so that the results can be reproduced.
  5. The evaluation environment will provide a basic ML ecosystem (TensorFlow, PyTorch, scikit-learn). Additional libraries will be added to the evaluation environment upon reasonable requests.
  6. At the end of the competition, the winners are expected to provide a description of the method they used.
Cheating prevention
  1. Participants are forbidden to interact with the SCIP solver in any way other than the one intended, that is, extracting information or providing a control decision related to the task at hand. Any interaction with SCIP that results in changes to the solver's internal state will be considered cheating.
  2. Some datasets have been anonymized, as they rely on publicly accessible data. Any attempt at reverse-engineering for the purpose of recovering information about the test set is forbidden.
  3. Any instance of cheating, defined as violation of the rules above or any other attempt to circumvent the spirit and intent of the competition, as determined by the organizers, will result in the invalidation of all of the offending team’s submissions and a disqualification from the competition.

Getting started

Registration form: register your team and subscribe to the competition's mailing list.

GitHub repository: get started and access the datasets, starter packages, and technical documentation.

Leaderboard

Student leaderboard
To be announced.
Global leaderboard
To be announced.