*Result*: An end-to-end decentralised scheduling framework based on deep reinforcement learning for dynamic distributed heterogeneous flowshop scheduling.
*Further Information*
*Heterogeneity among factories in distributed manufacturing significantly expands the solution space, complicating optimisation. Traditional centralised scheduling methods lack the scalability to adapt to varying factory scales. This paper proposes an end-to-end decentralised scheduling framework based on deep reinforcement learning (DRL) for the dynamic distributed heterogeneous permutation flowshop scheduling problem (DDHPFSP) with random job arrivals. The framework utilises a multi-agent architecture in which each factory operates as an independent agent, enabling efficient, robust, and scalable scheduling. Specifically, the DDHPFSP is formulated as a partially observable Markov decision process (POMDP), with a state space reflecting heterogeneity and permutation characteristics and a new tailored reward function addressing sparse rewards and high reward variance. An end-to-end policy network with a dual-layer architecture is developed, incorporating a feature extraction network to capture intrinsic relationships between jobs and heterogeneous factories, enhancing the agent's self-learning and policy evolution. Moreover, a backward swap search (BSS) method based on greedy heuristics optimises the pre-scheduling plan during the online phase with minimal computation time. Experimental results demonstrate that the framework outperforms the best comparison methods by 39.76% on 540 baseline instances and by 59.95% on 2430 generalisation instances. Furthermore, the framework's effectiveness improves by 68.9% with the introduction of the BSS method. [ABSTRACT FROM AUTHOR]
Copyright of International Journal of Production Research is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*
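To make the permutation flowshop setting in the abstract more concrete, the following is a minimal Python sketch of how the makespan of one job sequence on a single factory might be computed, together with one greedy pass of adjacent swaps that runs from the back of the sequence to the front. It is an illustration only and not the authors' BSS method, whose details are not given in the abstract. The names `makespan`, `backward_swap_pass`, and `proc` are hypothetical, and using makespan as the objective is an assumption. Factory heterogeneity would correspond to each factory having its own `proc` matrix.

```python
from typing import List, Sequence


def makespan(order: Sequence[int], proc: Sequence[Sequence[float]]) -> float:
    """Completion time of the last job on the last machine.

    proc[j][m] is the (assumed) processing time of job j on machine m.
    All jobs visit the machines in the same order (permutation flowshop).
    """
    if not order:
        return 0.0
    n_machines = len(proc[order[0]])
    finish = [0.0] * n_machines  # finish[m]: when machine m becomes free
    for j in order:
        prev = 0.0  # completion time of job j on the previous machine
        for m in range(n_machines):
            start = max(finish[m], prev)
            finish[m] = start + proc[j][m]
            prev = finish[m]
    return finish[-1]


def backward_swap_pass(order: Sequence[int],
                       proc: Sequence[Sequence[float]]) -> List[int]:
    """One greedy pass of adjacent swaps, scanning from back to front.

    Hypothetical illustration: a swap is kept only if it strictly
    reduces the makespan of the sequence.
    """
    best = list(order)
    best_val = makespan(best, proc)
    for i in range(len(best) - 1, 0, -1):
        cand = best[:]
        cand[i - 1], cand[i] = cand[i], cand[i - 1]
        val = makespan(cand, proc)
        if val < best_val:
            best, best_val = cand, val
    return best
```

A single pass costs one makespan evaluation per adjacent pair, which is in keeping with the abstract's emphasis on low online computation time, although the actual search rule used in the paper may differ.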