Abstract
This study combines simulation-optimisation with a Reinforcement Learning (RL) model to analyse the routing behaviour of delivery vehicles (DVs). We conceptualise the system as a stochastic k-armed bandit problem, representing a sequential interaction between a learner (the DV) and its surrounding environment. Each DV is assigned a random number of customers and an initial delivery route. If a loading zone is unavailable, the RL model selects a delivery strategy for the DV, and its route is modified accordingly. The penalty is measured as the additional trucking and walking time incurred relative to the originally planned route. Our methodology is tested on a simulated network featuring realistic traffic conditions and a fleet of DVs employing four distinct last-mile delivery strategies. The results of our numerical experiments underscore the advantages of providing DVs with an RL-based decision support system for en-route decision-making, which in turn benefits the overall efficiency of the transport network.
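To make the bandit framing concrete, the following is a minimal sketch of a four-armed bandit in which each arm is a delivery strategy and the reward is the negative of the penalty (extra trucking plus walking time). It is an illustration only, not the authors' implementation: the abstract does not name the learning rule, so epsilon-greedy is assumed here, and the strategy labels, the mean penalties, the noise model, and every function and parameter name are hypothetical.

```python
import random

# Hypothetical labels: the abstract says only that there are four strategies.
STRATEGIES = ["strategy_0", "strategy_1", "strategy_2", "strategy_3"]


class EpsilonGreedyBandit:
    """Sample-average epsilon-greedy learner (an assumed learning rule)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the sample mean for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def penalty(extra_trucking_min, extra_walking_min):
    """Penalty as described in the abstract: extra trucking plus walking time."""
    return extra_trucking_min + extra_walking_min


def simulate(n_events=5000, seed=0):
    """Simulate 'loading zone unavailable' events with made-up penalties."""
    rng = random.Random(seed)
    mean_penalty = [12.0, 8.0, 5.0, 9.0]  # illustrative minutes, not from the paper
    bandit = EpsilonGreedyBandit(len(STRATEGIES), epsilon=0.1, seed=seed)
    for _ in range(n_events):
        arm = bandit.select()
        # Assumed split of the penalty into noisy trucking and walking parts.
        trucking = max(0.0, rng.gauss(0.6 * mean_penalty[arm], 1.0))
        walking = max(0.0, rng.gauss(0.4 * mean_penalty[arm], 1.0))
        bandit.update(arm, -penalty(trucking, walking))  # reward = -penalty
    return bandit


if __name__ == "__main__":
    learned = simulate()
    for name, value, count in zip(STRATEGIES, learned.values, learned.counts):
        print(f"{name}: estimated reward {value:.2f} over {count} pulls")
```

In this toy setting the learner concentrates its choices on the strategy with the lowest average penalty. The study itself evaluates strategies inside a traffic simulation rather than drawing penalties from fixed distributions.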
Original language | English |
---|---|
Article number | 2337216 |
Journal | Transportmetrica B: Transport Dynamics |
Volume | 12 |
Issue number | 1 |
DOIs | |
Publication status | Published - Apr 2024 |
Keywords
- last-mile delivery
- urban logistics
- reinforcement learning
- loading zone
- simulation-optimisation
Datasets
- Replication Data for: A reinforcement learning framework for improving parking decisions in last-mile delivery
  Muriel, J. (Creator), Zhang, L. (Creator), Fransoo, J. C. (Creator) & Villegas, J. (Creator), DataverseNL, 14 Nov 2024
  DOI: 10.34894/xpyg7a, https://dataverse.nl/citation?persistentId=doi:10.34894/XPYG7A
  Dataset