*Result*: Evolve-learn-adjust for <italic>seru</italic> production system.
*Further Information*
*The <italic>seru</italic> production system (SPS) is an innovative production mode offering good flexibility and high efficiency. SPS comprises two coupled NP-hard subproblems, <italic>seru</italic> formation and <italic>seru</italic> scheduling, which makes it difficult for existing methods to solve efficiently. We develop an innovative paradigm, called evolve-learn-adjust, that combines the strengths of evolutionary algorithms, deep reinforcement learning (DRL) and problem-specific heuristics to address SPS efficiently. Specifically, we employ a multi-population evolutionary algorithm to optimise <italic>seru</italic> formation, making the optimisation process more stable. We then design a novel Transformer-based DRL model to learn the optimal <italic>seru</italic> scheduling policy, enabling the method to adapt to unseen large-scale instances. Finally, a problem-specific heuristic adjusts the solution generated by the DRL, improving overall solution quality at low computational cost. To further accelerate training convergence while achieving superior generalisation, we develop a multi-environment parallel training strategy for the DRL. Extensive experiments validate the effectiveness of the paradigm, particularly on large-scale instances. On real-world instances derived from JD.com, with up to 50 workers and 3000 batches, our method surpasses existing algorithms with performance improvements ranging from 90.25% to 97.61%. Finally, we leverage explainable artificial intelligence to transform the DRL agent's ‘black-box’ policy into trustworthy managerial insights. [ABSTRACT FROM AUTHOR]
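The three-stage structure of the evolve-learn-adjust paradigm can be illustrated with a minimal toy sketch. Everything here is an assumption for illustration only: the fitness model (makespan with unit worker speed), the greedy policy standing in for the Transformer-based DRL agent, and all function names are hypothetical and do not reflect the authors' implementation.

```python
import random

random.seed(0)

def evolve_learn_adjust(num_workers, num_serus, batches,
                        pop_size=8, generations=30):
    """Toy sketch: (1) a multi-population EA evolves a worker-to-seru
    formation, (2) a policy (greedy stand-in for the DRL agent) schedules
    batches, (3) a problem-specific heuristic pass adjusts the schedule."""

    def makespan(formation, schedule):
        # Completion time of the busiest seru, assuming unit worker speed.
        loads = [0.0] * num_serus
        workers = [max(1, formation.count(s)) for s in range(num_serus)]
        for s, b in zip(schedule, batches):
            loads[s] += b / workers[s]
        return max(loads)

    def greedy_schedule(formation):
        # Stage 2 stand-in: send each batch to the currently least-loaded seru.
        loads = [0.0] * num_serus
        workers = [max(1, formation.count(s)) for s in range(num_serus)]
        sched = []
        for b in batches:
            s = min(range(num_serus), key=lambda x: loads[x])
            loads[s] += b / workers[s]
            sched.append(s)
        return sched

    # Stage 1: two sub-populations of formations, evolved independently
    # (a crude stand-in for the multi-population EA).
    pops = [[[random.randrange(num_serus) for _ in range(num_workers)]
             for _ in range(pop_size)] for _ in range(2)]
    fit = lambda f: makespan(f, greedy_schedule(f))
    for _ in range(generations):
        for pop in pops:
            for i, ind in enumerate(pop):
                child = ind[:]  # mutate one worker's seru assignment
                child[random.randrange(num_workers)] = random.randrange(num_serus)
                if fit(child) <= fit(ind):
                    pop[i] = child
    best = min((ind for pop in pops for ind in pop), key=fit)

    # Stage 2: schedule batches with the (stand-in) learned policy.
    sched = greedy_schedule(best)

    # Stage 3: adjust — reassign a batch whenever that lowers the makespan.
    base = makespan(best, sched)
    for j in range(len(sched)):
        for s in range(num_serus):
            trial = sched[:]
            trial[j] = s
            if makespan(best, trial) < base:
                sched, base = trial, makespan(best, trial)
    return best, sched, base

formation, schedule, ms = evolve_learn_adjust(6, 3, [5, 3, 8, 2, 7, 4, 6, 1])
```

In this sketch the EA's fitness already calls the scheduler, mirroring the coupling between formation and scheduling that makes the joint problem hard; the real method replaces the greedy stand-in with a trained Transformer policy.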
Copyright of International Journal of Production Research is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*