Title:
Actor-critic driven deep reinforcement learning for optimising agri-food supply chain.
Authors:
Shukla, Aditya; Kakde, Shubham Tanaji; Mitra, Rony (mitrarony92@gmail.com); Mandal, Jasashwi; Tiwari, Manoj Kumar
Source:
International Journal of Production Research. Dec 2025, Vol. 63 Issue 24, p9913-9932. 20p.
Database:
Business Source Premier

*Further Information*

The agri-food supply chain is a complex network encompassing various stakeholders, from farmers to consumers, with multifaceted interactions and dependencies. Traditional supply chain management approaches often struggle to adapt to dynamic environments and to optimise decision-making processes. To tackle these challenges, deep reinforcement learning is employed by integrating value-based and policy-based models, enhanced by advanced learning techniques. This paper explores applying Deep Reinforcement Learning (DRL) approaches, including Q-learning, Deep Q-Learning (DQL), and the Actor-Critic method, to optimise the efficiency of the agri-food supply chain. The actor-critic model significantly enhances decision-making across various supply chain stages, improving efficiency and increasing profit margins. A specific case of sugar processing and distribution, grounded in real-world conditions, is used to validate the model. The DRL methods optimise sugar production, storage, and distribution, ensuring timely deliveries and enhancing profitability. The models address fluctuating demand and transportation logistics challenges, resulting in a more streamlined and responsive sugar distribution network. The findings reveal that the Actor-Critic and DQL methods significantly outperform traditional Q-learning in terms of product profitability, offering unique advantages in handling complex state-action spaces. [ABSTRACT FROM AUTHOR]
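Since this record contains only the abstract, the authors' actual formulation is not reproduced here. As a rough, illustrative sketch of the Actor-Critic idea the abstract names, the following minimal tabular one-step actor-critic is applied to a hypothetical single-product inventory toy problem; all state/action sizes, prices, costs, and the demand distribution are assumptions for illustration, not values from the paper.

```python
# Minimal sketch, NOT the authors' implementation: tabular one-step
# actor-critic on a toy inventory problem. State = stock level,
# action = units to produce. All parameters below are assumed.
import numpy as np

rng = np.random.default_rng(0)

MAX_STOCK, MAX_PRODUCE = 10, 5          # assumed toy capacities
PRICE, PROD_COST, HOLD_COST = 4.0, 1.5, 0.2  # assumed economics

n_states, n_actions = MAX_STOCK + 1, MAX_PRODUCE + 1
theta = np.zeros((n_states, n_actions))  # actor: policy logits
V = np.zeros(n_states)                   # critic: state values
alpha_pi, alpha_v, gamma = 0.05, 0.1, 0.95

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(stock, produce):
    """Toy dynamics: produce up to capacity, then serve random demand."""
    stock = min(stock + produce, MAX_STOCK)
    demand = rng.integers(0, 6)          # assumed demand distribution
    sold = min(stock, demand)
    reward = PRICE * sold - PROD_COST * produce - HOLD_COST * (stock - sold)
    return stock - sold, reward

state = 0
for t in range(50_000):
    probs = softmax(theta[state])
    action = rng.choice(n_actions, p=probs)
    next_state, reward = step(state, action)

    # The critic's TD error drives both the value and the policy update.
    td_error = reward + gamma * V[next_state] - V[state]
    V[state] += alpha_v * td_error

    # Policy-gradient step for the actor (softmax log-likelihood gradient).
    grad = -probs
    grad[action] += 1.0
    theta[state] += alpha_pi * td_error * grad

    state = next_state

print("Learned production policy per stock level:",
      [int(np.argmax(theta[s])) for s in range(n_states)])
```

The split mirrors the abstract's distinction between value-based and policy-based models: the critic (V) is value-based and supplies the TD error, while the actor (theta) is policy-based and is nudged toward actions the critic scores above expectation.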

Copyright of International Journal of Production Research is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
