Bicocca Open Archive

Performance of deep reinforcement learning algorithms in two-echelon inventory control systems

Stranieri, F; Stella, F; Kouki, C

2024

Abstract

This study conducts a comprehensive analysis of deep reinforcement learning (DRL) algorithms applied to supply chain inventory management (SCIM), which consists of a sequential decision-making problem focussed on determining the optimal quantity of products to produce and ship across multiple capacitated local warehouses over a specific time horizon. In detail, we formulate the problem as a Markov decision process for a divergent two-echelon inventory control system characterised by stochastic and seasonal demand, also presenting a balanced allocation rule designed to prevent backorders in the first echelon. Through numerical experiments, we evaluate the performance of state-of-the-art DRL algorithms and static inventory policies in terms of both cost minimisation and training time while varying the number of local warehouses and product types and the length of replenishment lead times. Our results reveal that the Proximal Policy Optimization algorithm consistently outperforms other algorithms across all experiments, proving to be a robust method for tackling the SCIM problem. Furthermore, the (s, Q)-policy stands as a solid alternative, offering a compromise in performance and computational efficiency. Lastly, this study presents an open-source software library that provides a customisable simulation environment for addressing the SCIM problem, utilising a wide range of DRL algorithms and static inventory policies.
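For readers unfamiliar with the (s, Q)-policy named in the abstract, the following is a minimal single-item sketch of the textbook rule: when the inventory position (stock on hand plus orders in transit) falls to or below the reorder point s, a fixed batch of size Q is ordered. It is not taken from the paper or its software library. The function names, the periodic-review loop, the one-batch-per-period simplification, and the parameter values are all assumptions made for illustration.

```python
def sq_order_quantity(inventory_position, s, q):
    """Textbook (s, Q) rule: order a fixed batch q when the
    inventory position is at or below the reorder point s."""
    return q if inventory_position <= s else 0.0


def simulate_sq_policy(demands, s, q, lead_time, initial_stock):
    """Hypothetical single-warehouse, periodic-review simulation.

    Returns a list of (order_placed, on_hand_after_demand) per period.
    A negative on-hand value represents backorders.
    """
    on_hand = float(initial_stock)
    # Orders in transit, oldest first; one slot per lead-time period.
    pipeline = [0.0] * lead_time
    history = []
    for demand in demands:
        # Receive the order placed lead_time periods ago.
        if lead_time > 0:
            on_hand += pipeline.pop(0)
        # Review the inventory position and apply the (s, Q) rule.
        position = on_hand + sum(pipeline)
        order = sq_order_quantity(position, s, q)
        if lead_time > 0:
            pipeline.append(order)
        else:
            on_hand += order  # zero lead time: order arrives at once
        # Serve demand for this period.
        on_hand -= demand
        history.append((order, on_hand))
    return history


if __name__ == "__main__":
    # Illustrative values only; not data from the study.
    for period, (order, stock) in enumerate(
        simulate_sq_policy([3, 3, 3, 3], s=4, q=10, lead_time=1, initial_stock=10),
        start=1,
    ):
        print(f"period {period}: ordered {order}, on hand {stock}")
```

The rule depends on only two parameters, which fits the abstract's description of the policy as a compromise between performance and computational efficiency. The two-echelon, multi-product setting studied in the paper is considerably richer than this sketch.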

Subtype: Journal article - Scientific article
Keywords: artificial intelligence; deep learning; inventory control systems; inventory control policies; inventory management; reinforcement learning
Content language: English
Ahead-of-print or first online publication date: 1 Mar 2024
Publication date: 2024
Journal: INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH
Volume: 62
Issue: 17
First page: 6211
Last page: 6226
Article DOI: https://dx.doi.org/10.1080/00207543.2024.2311180
Full text: reserved
Citation: Stranieri, F., Stella, F., Kouki, C. (2024). Performance of deep reinforcement learning algorithms in two-echelon inventory control systems. INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 62(17), 6211-6226 [10.1080/00207543.2024.2311180].
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
Stranieri-2024-Int J of Prod Res-VoR.pdf	2.14 MB	Adobe PDF

Access: archive managers only. Attachment type: Publisher's Version (Version of Record, VoR). Licence: All rights reserved.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/465098
