Compare human-guided RL agents

To experience the effect of human guidance in agent training, you can try out the example agents we provide, trained with baseline RL and various human-guided RL algorithms. In this example you will load the model weights of 4 agents. The first is an untrained random agent; the other three are each trained for 10 minutes using 1) DDPG, 2) Deep TAMER, and 3) GUIDE. Deep TAMER and GUIDE are state-of-the-art human-guided RL algorithms, and these models were guided by real human trainers. Observe the behavior of these agents by running the following commands:

Random:

python crew_algorithms/ddpg/eval.py envs=hide_and_seek_1v1 exp_path='examples/ddpg' eval_weights=[0]

DDPG:

python crew_algorithms/ddpg/eval.py envs=hide_and_seek_1v1 exp_path='examples/ddpg' eval_weights=[5]

Deep TAMER:

python crew_algorithms/deep_tamer/eval.py envs=hide_and_seek_1v1 exp_path='examples/deep_tamer' eval_weights=[5]

GUIDE:

python crew_algorithms/ddpg/eval.py envs=hide_and_seek_1v1 exp_path='examples/guide' eval_weights=[5]
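
To compare all four agents back to back, you can run the evaluations in sequence with a small shell script. This is a minimal sketch that simply composes the four commands above; it assumes you run it from the repository root with the examples/ checkpoints in place, and that eval_weights indexes the saved checkpoints (with [0] loading the initial, untrained weights, as the Random command suggests).

#!/usr/bin/env bash
# Evaluate all four example agents in sequence for a side-by-side comparison.
# Assumes it is run from the repository root with the examples/ weights present.
set -e

# Random: the untrained initial weights
python crew_algorithms/ddpg/eval.py envs=hide_and_seek_1v1 exp_path='examples/ddpg' eval_weights=[0]

# DDPG: baseline RL, trained for 10 minutes
python crew_algorithms/ddpg/eval.py envs=hide_and_seek_1v1 exp_path='examples/ddpg' eval_weights=[5]

# Deep TAMER: human-guided RL
python crew_algorithms/deep_tamer/eval.py envs=hide_and_seek_1v1 exp_path='examples/deep_tamer' eval_weights=[5]

# GUIDE: human-guided RL (evaluated with the DDPG eval script)
python crew_algorithms/ddpg/eval.py envs=hide_and_seek_1v1 exp_path='examples/guide' eval_weights=[5]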