Introduction
The Machine Learning for Combinatorial Optimization (ML4CO) NeurIPS 2021 competition aims at improving stateoftheart combinatorial optimization solvers by replacing key heuristic components with machine learning models. The competition's main scientific question is the following: is machine learning a viable option for improving traditional combinatorial optimization solvers on specific problem distributions, when historical data is available ?
While most combinatorial optimization solvers are presented as generalpurpose, onesizefitsall algorithms, the ML4CO competition focuses on the design of applicationspecific algorithms from historical data. This general problem captures a highly practical scenario relevant to many application areas, where a practitioner repeatedly solves problem instances from a specific distribution, with redundant patterns and characteristics. For example, managing a largescale energy distribution network requires solving very similar CO problems on a daily basis, with a fixed power grid structure while only the demand changes over time. This change of demand is hard to capture by handengineered expert rules, and MLenhanced approaches offer a possible solution to detect typical patterns in the demand history. Other examples include crew scheduling problems that have to be solved daily or weekly with minor variations, or vehicle routing where the traffic conditions change over time, but the overall transportation network does not.
The competition features three challenges for ML, each corresponding to a specific control task arising in the opensource solver SCIP, and exposed through a unified OpenAIgym API based on the Python library Ecole. For each challenge, participants will be evaluated on three problem benchmarks originating from diverse application areas, each represented as a collection of mixedinteger linear program (MILP) instances. Benchmarkspecific submissions are encouraged (algorithms or trained ML models), but of course generic submissions that work for all three benchmarks are allowed.
Note: while we encourage solutions derived from the machine learning and reinforcement learning paradigms, any algorithmic solution respecting the competition's API is accepted.
Important dates: the competition is scheduled to start on July 1st, 2021, when we release the training datasets, and will end on October 31st, 2021.
This event is sponsored by Compute Canada, Calcul Québec and Westgrid who graciously provide the infrastructure and compute ressources to run the competition.
Challenges
We propose three distinct challenges, each corresponding to a specific control task arising in traditional solvers. Participants can compete in any subset of those challenges, and one winner will be declared for each.

Primal task
Produce feasible solutions, in order to minimize the primal integral over time.

Dual task
Select branching variables, in order to minimize dual the dual integral over time.

Configuration task
Choose the best solver parameters, in order to minimize the primaldual integral over time.
Primal and dual bounds evolution vs solving time
(for a minimization MILP instance).
Note that this figure
shows the primal and dual integrals minus
\(T\mathbf{c}^\top \mathbf{x}^\star\), which takes a
constant value for a particular instance.
Primal task  Finding feasible solutions
The primal task deals with finding good primal solutions at the root node of the branchandbound tree. To that end, the environment (SCIP solver) will not perform any branching but will enter an infinite loop at the root node, which will collect the agent's solutions, evaluate their feasibility, and update the overall best solution reached so far, thus lowering the current primal bound (upper bound). The metric of interest for this task is the primal integral, which takes into account the speed at which the primal bound decreases over time. To model a realistic scenario, each problem instance will have been preprocessed by SCIP (problem reduction, cutting planes etc.), and the root linear program (LP) relaxation will have been solved before the participants are asked to produce feasible solutions. To prevent SCIP from searching for primal solutions by itself, all primal heuristics will be deactivated. In order to compute this metric unambiguously, even when no solution has been found yet, we provide an initial primal bound (trivial solution value) for each instance, which is to be given to SCIP at the beginning of the solving process in the form of a user objective limit.
 Environment
 The solver is stopped at the root node after the root LP has been solved and enters an infinite loop. At each transition, the environment evaluates the given solution candidate and updates the primal bound. Nothing else happens. The solver remains in the SOLVING stage until the episode terminates (time limit reached, or when the problem is solved).
 Action
 A solution candidate for the current problem (variable assignment values, x).
 Metric
 Primal integral.
 Time limit
 \(T=5\) minutes.
 Relevant literature:

 Learning Combinatorial Optimization Algorithms over Graphs , Dai et al., NeurIPS'17
 Reinforcement Learning for Solving the Vehicle Routing Problem , Nazari et al., NeurIPS'18
 Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search , Li et al., NeurIPS'18
 Solving Mixed Integer Programs Using Neural Networks , Nair et al., 2020
Dual task  Branching
The dual task deals with obtaining tight optimality guarantees (dual bounds) via branching. Making good branching decisions is regarded as a critical component of modern branchandbound solvers, yet has received little theoretical understanding to this day (see Lodi et al., 2017). In this task, the environment will run a fullfledged branchandcut algorithm with SCIP, and the participants will only control the solver's branching decisions. The metric of interest is the dual integral, which considers the speed at which the dual bound increases over time. Also, all primal heuristics will be deactivated, so that the focus is only on proving optimality via branching.
 Environment
 Traditional branchandbound algorithm. The solver stops after each node is processed (LP solved), receives a branching decision, performs branching, and selects the next open node to process. The solver remains in the SOLVING stage until the episode terminates (time limit reached, or when the problem is solved).
 Action
 One of the current node's branching candidates (nonfixed integer variables). Only singlevariable branching is allowed.
 Metric
 Dual integral.
 Time limit
 \(T=15\) minutes.
 Relevant literature:

 Learning to Branch in Mixed Integer Programming , Khalil et al., AAAI'16
 On Learning and Branching: a Survey , Lodi et al., TOP, 2017
 Learning to Branch , Balcan et al., ICML'18
 Exact Combinatorial Optimization with Graph Convolutional Neural Networks , Gasse et al., NeurIPS'19
 Hybrid Models for Learning to Branch , Gupta et al., NeurIPS'20
 Solving Mixed Integer Programs Using Neural Networks , Nair et al., 2020
Configuration task  Choosing solver parameters
The configuration task deals with deciding on a good parameterization of the solver for a given problem instance. The environment required for this task is more straightforward than for the two previous ones since it involves only a single decision for the agents (i.e., contextual bandit problem). Participants are allowed to tune any of the existing parameters of SCIP (except parameters regarding the computation of time). They can choose between providing a fixed set of parameters that work well on average for each problem benchmark or producing instancespecific parameterizations based on each instance's characteristics. The metric of interest for this task is the primaldual gap integral, which combines both improvements from the dual and from the primal side over time. In order to compute this metric unambiguously, even when no primal or dual bound exists, we provide both an initial primal bound (trivial solution value) and an initial dual bound (precomputed root LP solution value) for each instance.
 Environment
 The solver loads a problem instance and immediately stops. It then receives a parameterization, applies it, and pursues with the solving process until it terminates (time limit reached, or when the problem is solved). During the first and only transition, the solver is in the PROBLEM stage. As such, participants will not be able to extract advanced features from the solver, such as those relying on the LP being solved.
 Action
 A set of SCIP parameters, in the form of a key/value dictionary.
 Metric
 Primaldual gap integral.
 Time limit
 \(T=15\) minutes.
 Relevant literature

 Sequential modelbased optimization for general algorithm configuration , Hutter et al., LION'11
Metrics
Each of the three challenges, the primal task, the dual task and the configuration task, is associated with a specific evaluation metric that reflects a different objective. Here we describe how each metric is computed over a single problem instance, while the final goal of the participants is to optimize this metric in expectation over a hidden collection of test instances.
Because our evaluation metrics are timedependent, all evaluations will be run on the same hardware setup, including one CPU core and one GPU card, which will be communicated at the start of the competition. For practical reasons, for each task, a maximum time budget \(T\) is given to process each test instance (5, 15 and 15 minutes respectively for the primal, dual and configuration tasks), after which the environment necessarily terminates. Participants should therefore focus on taking both good and fast decisions.
In the following we consider instances of mixedinteger linear programs (MILPs) expressed as follows: \[ \begin{eqnarray} \underset{\mathbf{x}}{\operatorname{arg\,min}} \quad \mathbf{c}^\top\mathbf{x} && \\ \text{subject to} \quad \mathbf{A}^\top\mathbf{x} & \leq & \mathbf{b} \text{,} \\ \mathbf{x} & \in & \mathbb{Z}^p \times \mathbb{R}^{np} \text{,} \end{eqnarray} \] where \(\mathbf{c} \in \mathbb{R}^n\) denotes the coefficients of the linear objective, \(\mathbf{A} \in \mathbb{R}^{m \times n}\) and \(\mathbf{b} \in \mathbb{R}^m\) respectively denote the coefficients and upper bounds of the linear constraints, while \(n\) is the total number of variables, \(p \leq n\) is the number of integerconstrained variables, and \(m\) the number of linear constraints.
 Primal integral

This objective measures the area under the curve of the solver's primal bound (a.k.a. global upper bound), which corresponds to the value of the best feasible solution found so far. By providing better feasible solutions over time, the value of the primal bound decreases. With a time limit \(T\), the primal integral expresses as: \[ \int_{t=0}^{T} \mathbf{c}^\top \mathbf{x}^\star_t\;\mathrm{d}t  T\mathbf{c}^\top\mathbf{x}^\star \text{,} \] where \(\mathbf{x}^\star_t\) is the best feasible solution found at time \(t\) (so that \(\mathbf{c}^\top \mathbf{x}^\star_t\) is the primal bound at time \(t\)), and \(T\mathbf{c}^\top\mathbf{x}^\star\) is an instancespecific constant that depends on the optimal solution \(\mathbf{x}^\star\). The primal integral is to be minimized, and takes an optimal value of 0.
Note: to compute this metric unambiguously, a trivial initial solution \(x^\star_0\) is always provided to SCIP at the beginning of the solving process. Also, the constant term \(\mathbf{c}^\top\mathbf{x}^\star\) can be safely ignored at training time when participants train their control policy, as the learning problem is equivalent. At test time however, when we will evaluate the participant submissions, this constant term (or a proper substitute) will be incorporated in the reported metric.
 Dual integral

This objective measures the area over the curve of the solver's dual bound (a.k.a. global lower bound), which usually corresponds to a solution of a valid relaxation of the MILP. By branching, the LP relaxations corresponding to the branchandbound tree leaves get tightened, and the dual bound increases over time. With a time limit \(T\), the dual integral expresses as: \[ T\mathbf{c}^\top\mathbf{x}^\star  \int_{t=0}^{T} \mathbf{z}^\star_t\;\mathrm{d}t \text{,} \] where \(\mathbf{z}^\star_t\) is the best dual bound at time \(t\), and \(T\mathbf{c}^\top\mathbf{x}^\star\) is an instancespecific constant that depends on the optimal solution value \(\mathbf{c}^\top\mathbf{x}^\star\). The dual integral is to be minimized, and takes an optimal value of 0.
Note: in the context of branching, this metric is unambiguous to compute, as the root node of the tree always provides an initial dual bound \(\mathbf{z}^*_0\) at the beginning of branching. Here again, the constant term \(\mathbf{c}^\top\mathbf{x}^\star\) can be safely ignored for training, but it will be incorporated (or a proper substitute) in the metrics that we will report when evaluating the participants submissions.
 Primaldual gap integral

This objective measures the area between two curves, the solver's primal bound and the solver's dual bound. As such, this metric benefits both from improvements obtained from the primal side (finding good feasible solutions), and on the dual side (producing a tight optimality certificate). With a time limit \(T\), the primaldual gap integral expresses as: \[ \int_{t=0}^{T} \mathbf{c}^\top \mathbf{x}^\star_t  \mathbf{z}^\star_t\;\mathrm{d}t \text{.} \] The primaldual gap integral is to be minimized, and take an optimal value of 0.
Note: in the context of algorithm configuration, an initial value is required for the two curves at time \(t=0\). Therefore we always provide both an initial trivial solution \(x^\star_0\) and a valid initial dual bound \(z^\star_0\) to SCIP for this task.
Datasets
For each individual challenge, participants will be evaluated on three problem benchmarks from diverse application areas. Participants can provide a different decisionmaking code (algorithmic solution or trained ML model) for each of the benchmarks, or a single code that works for all benchmarks.
A problem benchmark consists in a collection of mixedinteger linear program (MILP) instances in the standard MPS file format. Each benchmark is split into a training and a test set, originating from the same problem distribution. While the training instances are made public at the beginning of the competition for participants to train their models, the test instances will be kept hidden for evaluation purposes and will only be revealed at the end of the competition.
Note: for each problem benchmark we suggest a preestablished split of the training instances into two distinct collections: train and valid. This is only a suggestion, and all instance files in both those collections can be likely considered as training instances. There is no restriction regarding how participants should use the provided instances during training.
The first two problem benchmarks are inspired by reallife applications of largescale systems at Google, while the third benchmark is an anonymous problem inspired by a realworld, largescale industrial application.
 Problem benchmark 1: Balanced Item Placement
 This problem deals with spreading items (e.g., files or processes) across containers (e.g., disks or machines) utilizing them evenly. Items can have multiple copies, but at most, one copy can be placed in a single bin. The number of items that can be moved is constrained, modeling the reallife situation of a live system for which some placement already exists. Each problem instance is modeled as a MILP, using a multidimensional multiknapsack formulation. This dataset contains 10000 training instances (presplit into 9900 train and 100 valid instances).
 Problem benchmark 2: Workload Apportionment
 This problem deals with apportioning workloads (e.g., data streams) across as few workers (e.g., servers) as possible. The apportionment is required to be robust to any one worker's failure. Each instance problem is modeled as a MILP, using a binpacking with apportionment formulation. This dataset contains 10000 training instances (presplit into 9900 train and 100 valid instances).
 Problem benchmark 3: Anonymous Problem
 The MILP instances corresponding to this benchmark are assembled from a public dataset, whose origin is kept secret to prevent cheating. Reverseengineering for the purpose of recovering the test set is explicitly forbidden (see rules). This dataset contains 118 training instances (presplit into 98 train and 20 valid instances).
Rules
The spirit of the competition is as follows:
 we encourage contributions from both the OR and ML communities by explicitly allowing nonML submissions;
 we encourage student participation with a dedicated student leaderboard;
 we encourage original contributions by specifically making the use of highlyengineered commercial solvers prohibited.
Participants must adhere to the following rules:
 Eligibility

 Participants can work in teams.
 Participants can submit codes based on a machine learning model, a humandesigned algorithm, or any combination of the two.
 Participants are not allowed to use commercial software in their code. The use of solvers such as CPLEX, Gurobi, or FICO XPress is explicitly forbidden.
 Studentonly teams are eligible to a separate student leaderboard, in addition to the main leaderboard of the competition. A graduated PhD is not considered a student.
 Data

 The use of supplementary existing opensource datasets, e.g., for pretraining, is permitted provided they are credited.
 The use of private, proprietary datasets is not permitted.
 In order to prevent cheating, the evaluation data sets will be kept hidden to the participants during the competition.
 Participants are free and encouraged to extract original features from the solver for training and evaluation, though the Python interface of SCIP, PySCIPOpt.
 Submission

 The submitted codes must be written in Python, and must run in the evaluation environment which we provide.
 Parallel implementations using multiple processes or multiple threads are allowed, although the hardware on which the codes will be evaluated is singlecore.
 The evaluation environment does not have access to any external network, to prevent any information leak.
 Users may make use of open source libraries given proper attribution. At the end of the competition, we encourage all code to be opensourced so the results can be reproduced.
 The evaluation environment will provide a basic ML ecosystem (Tensorflow, Pytorch, Scikitlearn). Additional libraries will be added to the evaluation environment upon reasonable requests.
 At the end of the competition, the winners are expected to provide a description of the method they used.
 Cheating prevention

 Participants are forbidden to interact with the SCIP solver in any other way than the one intended, that is, extracting information or providing a control decision related to the task at hand. Any interaction with SCIP that results in changes of the solver's internal state will be considered cheating.
 Some datasets have been anonymized, as they rely on publicly accessible data. Any attempt at reverseengineering for the purpose of recovering information about the test set is forbidden.
 Any instance of cheating, defined as violation of the rules above or any other attempt to circumvent the spirit and intent of the competition, as determined by the organizers, will result in the invalidation of all of the offending team’s submissions and a disqualification from the competition.
Getting started
Registration form: register your team and subscribe to the competition's mailing list.
GitHub repository: get started and access the datasets, started packages, and technical documentation.
Leaderboard
 Student leaderboard
 To be announced.
 Global leaderboard
 To be announced.