Abstract

Product Distribution (PD) theory is a new framework for
controlling Multi-Agent Systems (MASs). First we review
one motivation of PD theory, as the information-theoretic
extension of conventional full-rationality game theory to the
case of bounded rational agents. In this extension the equilibrium
of the game is the optimizer of a Lagrangian of the
probability distribution over the joint state of the agents. Accordingly
we can consider a team game having a shared
utility which is a performance measure of the behavior of
the MAS. For such a scenario the game is at equilibrium
(the Lagrangian is optimized) when the joint distribution
of the agents optimizes the system's expected performance.
One common way to find that equilibrium is to have each
agent run a reinforcement learning algorithm. Here we investigate
the alternative of exploiting PD theory to run gradient
descent on the Lagrangian. We present computer experiments
validating some of the predictions of PD theory
for how best to do that gradient descent. We also demonstrate
how PD theory can improve performance even when
we are not allowed to rerun the MAS from different initial
conditions, a requirement implicit in some previous work.
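
As a rough illustration of what gradient descent on the Lagrangian can look like, the sketch below uses a toy two-agent team game. It is not the update rule evaluated in the paper. It assumes the maximum-entropy form L(q) = E_q[G] - T S(q) commonly used in the PD literature, with a product distribution q = q1 x q2, and it keeps each q_i on the simplex through a softmax parametrization, which is a choice made here for simplicity. The cost matrix `G`, the temperature `T`, the step size `lr`, and the function names are all hypothetical.

```python
import numpy as np

def softmax(z):
    """Map unconstrained logits to a probability vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def lagrangian(G, q1, q2, T):
    """Assumed form: expected shared cost minus T times the entropy of q1 x q2."""
    expected_cost = q1 @ G @ q2
    entropy = -(q1 * np.log(q1)).sum() - (q2 * np.log(q2)).sum()
    return expected_cost - T * entropy

def descend(G, T=0.1, lr=0.5, steps=2000):
    """Gradient descent on the Lagrangian over softmax logits (illustrative only)."""
    th1 = np.zeros(G.shape[0])  # agent 1 starts from a uniform distribution
    th2 = np.zeros(G.shape[1])  # agent 2 starts from a uniform distribution
    history = []
    for _ in range(steps):
        q1, q2 = softmax(th1), softmax(th2)
        history.append(lagrangian(G, q1, q2, T))
        # dL/dq_i, up to an additive constant that cancels in the softmax chain rule
        g1 = G @ q2 + T * np.log(q1)
        g2 = G.T @ q1 + T * np.log(q2)
        # Chain rule through softmax: dL/dtheta_j = q_j * (g_j - sum_k q_k g_k)
        th1 = th1 - lr * q1 * (g1 - q1 @ g1)
        th2 = th2 - lr * q2 * (g2 - q2 @ g2)
    return softmax(th1), softmax(th2), history

# Hypothetical shared cost: rows are agent 1's moves, columns are agent 2's moves.
G = np.array([[1.0, 0.8, 0.9],
              [0.7, 0.0, 0.6],
              [0.9, 0.5, 0.8]])

q1, q2, history = descend(G)
print("agent 1 distribution:", np.round(q1, 3))
print("agent 2 distribution:", np.round(q2, 3))
print("Lagrangian: start %.3f, end %.3f" % (history[0], history[-1]))
```

In this toy game each agent only needs its own conditional expected cost (`G @ q2` for agent 1, `G.T @ q1` for agent 2), which is what makes the descent decentralizable; here those expectations are computed exactly rather than estimated from samples.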

Additional Information

Citation:
Chiu Fan Lee, David H. Wolpert,
"Product Distribution Theory for Control of Multi-Agent Systems,"
Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2 (AAMAS'04),
pp. 522-529,
2004