*Result*: Strong Optimal Classification Trees.
*Further Information*
*Decision trees are among the most interpretable and popular machine learning models that are used routinely in applications ranging from revenue management to medicine. Traditional heuristic methods, although fast, lack modeling flexibility for incorporating constraints such as fairness and do not guarantee optimality. Recent efforts aim to overcome these limitations using mixed-integer optimization (MIO) for better modeling flexibility and optimality, but speed remains an issue. In "Strong Optimal Classification Trees," Aghaei, Gómez, and Vayanos use integer optimization and polyhedral theory to create an MIO-based formulation with a strong LO relaxation resulting in a 29% speed-up in training time compared with state-of-the-art MIO-based formulations, as well as up to an 8% improvement in out-of-sample accuracy. Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning optimal binary classification trees with univariate splits. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and the tremendous improvements in mixed-integer optimization (MIO) technology. Yet, existing MIO-based approaches from the literature do not leverage the power of MIO to its full extent: they rely on weak formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose an intuitive flow-based MIO formulation for learning optimal binary classification trees. Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees. Moreover, we show that our formulation has a stronger linear optimization relaxation than existing methods in the case of binary data. We exploit the decomposable structure of our formulation and max-flow/min-cut duality to derive a Benders' decomposition method to speed-up computation. We propose a tailored procedure for solving each decomposed subproblem that provably generates facets of the feasible set of the MIO as constraints to add to the main problem. We conduct extensive computational experiments on standard benchmark data sets on which we show that our proposed approaches are 29 times faster than state-of-the-art MIO-based techniques and improve out-of-sample performance by up to 8%. [ABSTRACT FROM AUTHOR]
Copyright of Operations Research is the property of INFORMS: Institute for Operations Research & the Management Sciences and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*
*Full text is not displayed to guests* *Login for full access*