Published on:
The increasing heterogeneity and dynamism of cloud-edge-IoT infrastructures demand scalable, intelligent scheduling strategies that go beyond traditional centralised approaches. This paper presents DARO, a distributed, asynchronous scheduling framework that uses multi-agent reinforcement learning (MARL) to enable autonomous, cooperative task allocation across the cloud continuum. DARO integrates natively with Kubernetes and introduces a decentralised, auction-based mechanism in which node-local agents bid for incoming tasks based on partial observations and a learned policy. The agents are trained with a value decomposition method (QMIX) under the centralised training with decentralised execution (CTDE) paradigm. We formalise the problem as a decentralised partially observable Markov decision process (Dec-POMDP), design a multi-factor reward function to guide learning, and implement the system within a high-fidelity Kubernetes simulator. The experimental evaluation shows that the agents learn balanced and resource-efficient task placement strategies, indicating that DARO has strong potential as a robust scheduling layer for dynamic, large-scale distributed environments.
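To make the auction-based mechanism concrete, the following is a minimal sketch of a single bidding round, not the DARO implementation. Every name in it (`Task`, `NodeAgent`, `headroom_policy`, `run_auction`) is hypothetical, and the hand-written headroom heuristic merely stands in for the learned, QMIX-trained policy; no Kubernetes API is involved.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Observation = Tuple[float, float, float, float]


@dataclass
class Task:
    cpu: float  # requested CPU cores
    mem: float  # requested memory in GiB


@dataclass
class NodeAgent:
    name: str
    cpu_free: float
    mem_free: float

    def observe(self, task: Task) -> Observation:
        # Partial observation: this node's own free resources plus the task request.
        return (self.cpu_free, self.mem_free, task.cpu, task.mem)

    def bid(self, task: Task, policy: Callable[[Observation], float]) -> Optional[float]:
        # A node with no free capacity, or one that cannot fit the task, abstains.
        if self.cpu_free <= 0 or self.mem_free <= 0:
            return None
        if task.cpu > self.cpu_free or task.mem > self.mem_free:
            return None
        return policy(self.observe(task))


def headroom_policy(obs: Observation) -> float:
    # Assumption: stand-in for a learned policy. Bids the smallest fraction of
    # free capacity that would remain after placing the task.
    cpu_free, mem_free, cpu_req, mem_req = obs
    return min((cpu_free - cpu_req) / cpu_free, (mem_free - mem_req) / mem_free)


def run_auction(task: Task, agents: Sequence[NodeAgent],
                policy: Callable[[Observation], float]) -> Optional[str]:
    # Collect bids from every agent that chooses to participate.
    bids = [(agent.bid(task, policy), agent) for agent in agents]
    valid = [(b, a) for b, a in bids if b is not None]
    if not valid:
        return None  # no node can host the task
    # The highest bid wins; the winner's free resources are reduced accordingly.
    _, winner = max(valid, key=lambda pair: pair[0])
    winner.cpu_free -= task.cpu
    winner.mem_free -= task.mem
    return winner.name


agents = [NodeAgent("edge-1", cpu_free=2.0, mem_free=4.0),
          NodeAgent("cloud-1", cpu_free=16.0, mem_free=64.0)]
first = run_auction(Task(cpu=1.0, mem=2.0), agents, headroom_policy)
second = run_auction(Task(cpu=32.0, mem=1.0), agents, headroom_policy)
print(first, second)
```

Each agent scores the task from its own observation only, which mirrors the decentralised execution described above; under these assumptions a trained policy would simply replace `headroom_policy`.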
DOI: 10.5220/0014920000004039