A Gossip Protocol for Dynamic Resource Management in Large Cloud Environments


We address the problem of dynamic resource management for a large-scale cloud environment. Our contribution includes outlining a distributed middleware architecture and presenting one of its key elements: a gossip protocol that (1) ensures fair resource allocation among sites/applications, (2) dynamically adapts the allocation to load changes and (3) scales both in the number of physical machines and sites/applications. We formalize the resource allocation problem as that of dynamically maximizing the cloud utility under CPU and memory constraints. We first present a protocol that computes an optimal solution without considering memory constraints and prove correctness and convergence properties. Then, we extend that protocol to provide an efficient heuristic solution for the complete problem, which includes minimizing the cost for adapting an allocation. The protocol continuously executes on dynamic, local input and does not require global synchronization, as other proposed gossip protocols do. We evaluate the heuristic protocol through simulation and find its performance to be well-aligned with our design goals.