*Result*: An adaptive model-based deep reinforcement learning approach for matching-while-learning problem in shared manufacturing platforms.
*Further Information*
*Shared manufacturing (SMfg) is an innovative production paradigm that integrates distributed manufacturing resources and services via a cloud platform. Acting as a market intermediary, the SMfg platform dynamically matches manufacturing demands with qualified resources. However, the matching process encounters distinctive challenges arising from manufacturing complexity and data scarcity, which limit the direct application of existing matching methods from e-retailing and e-hailing contexts. Considering these challenges, we study the problem of matching limited resources with sequential orders on SMfg platforms under incomplete information, while learning from realised matching outcomes to improve future decisions. We formulate the problem as a Markov decision process (MDP) and develop an adaptive model-based deep reinforcement learning (DRL) method for this matching-while-learning problem. The main contributions are: (1) a complex dynamic matching formulation integrating unknown arrival probabilities, uncertain matching compatibility, and queued manufacturing capacities into a structured MDP; (2) a domain-specific heuristic rule that reduces the dimensionality of the state and action spaces, enabling scalable DRL; and (3) an adaptive environment model to alleviate data scarcity, with a scheme for generating simulated transition sequences of adaptive length. Numerical experiments based on a real-world case study demonstrate that the proposed approach significantly outperforms existing algorithms in SMfg, particularly in large-scale scenarios. Highlights: (1) A novel model-based deep reinforcement learning method for dynamic matching in SMfg. (2) A heuristic rule with guaranteed performance for aggregating state and action spaces. (3) An augmented environment model with adaptive adjustment of simulated trajectory length. (4) Achieving higher revenue compared to multiple state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
Copyright of International Journal of Production Research is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*