*Result*: The beer game bullwhip effect mitigation: a deep reinforcement learning approach.
*Further Information*
*This article investigates the application of reinforcement learning (RL) methods to optimise a four-echelon linear supply chain model with stochastic demand. The proposed supply chain configuration is largely based on the production-distribution supply chain of the MIT Supply Chain Beer Game. We show that RL can significantly improve ordering efficiency and overall supply chain performance. The model environment is adapted to the OpenAI 'gymnasium' interface, and reward shaping (reward engineering) is used in the model training process. The algorithm employs two reward function components: costs and an order variance metric. We evaluate the effectiveness of RL against Order-Up-To inventory management policies for several supply chain configurations and assess the impact on overall supply chain stability. An algorithm based on recurrent proximal policy optimisation (RPPO) is effective for the beer game setup and outperforms Order-Up-To approaches. This RL algorithm generates different ordering patterns and tends to narrow the agent's action space, and thus mitigates the bullwhip effect more effectively. Our findings suggest that the bullwhip effect is reduced even if only one agent in the supply chain uses the algorithm as an ordering policy. [ABSTRACT FROM AUTHOR]
Copyright of International Journal of Production Research is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*
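The abstract names an Order-Up-To baseline and a reward built from two components, costs and an order variance metric, but gives no formulas. The following is only a minimal sketch of how those pieces are commonly written, not the authors' implementation. The function names, the cost rates `h` and `b`, and the variance weight `w` are hypothetical choices made for illustration.

```python
from statistics import pvariance


def order_up_to(target_level, inventory_position):
    """Order-Up-To rule: order the shortfall below the target level, never a negative amount."""
    return max(0, target_level - inventory_position)


def bullwhip_ratio(orders, demand):
    """Common bullwhip measure: variance of orders placed over variance of demand received."""
    return pvariance(orders) / pvariance(demand)


def shaped_reward(holding, backlog, recent_orders, h=0.5, b=1.0, w=0.1):
    """Two-component reward: negative inventory cost minus a weighted order variance penalty.

    h, b and w are illustrative assumptions, not values taken from the article.
    """
    cost = h * holding + b * backlog
    return -(cost + w * pvariance(recent_orders))
```

A ratio above 1 from `bullwhip_ratio` indicates that an echelon amplifies the variability of the demand it receives, which is the effect the variance penalty in `shaped_reward` is meant to discourage.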