Running Human-guided Reinforcement Learning

Human-guided Reinforcement Learning gives humans a direct way to interact with AI agents by providing feedback on the agents' behavior. It focuses on enabling more efficient agent learning by integrating human guidance. CREW provides interfaces for a variety of feedback types. For instance, Deep TAMER collects discrete, binary-valued human feedback and assigns it to state-action pairs. We provide two examples of human-guided RL: 1) GUIDE and 2) Deep TAMER. To run either example on the Bowling environment, navigate to crew-algorithms and activate the crew conda environment.

Running GUIDE

GUIDE is a general framework that can be combined with any off-policy RL algorithm. For instance, to run GUIDE with a DDPG backbone:

python crew_algorithms/ddpg envs=bowling collector.frames_per_batch=120 batch_size=120 train_batches=30 hf=True 

The script launches a server instance of the Bowling game. Next, manually open the game builds folder crew-dojo/Builds and launch the Bowling game file under Bowling-Standalone{platform}-Client. This starts a client instance of the game. Log in with any username and click connect, select the current match, and click Join. The client instance then shows a synchronized view of the game. In the upper-right corner, under Active Clients, click on the AI player. You can now provide feedback to the agent through the feedback buttons on the right-hand side. Feedback signals collected through this interface are sent to Python and used for training. GUIDE uses continuous feedback, so simply hover your mouse over the continuous feedback window to provide feedback at every time step.

You can also choose to provide feedback from a different machine connected to the same local network. Open a client instance on that machine and enter the IP address of the machine hosting the game in the login menu; the remaining steps are the same.
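Because GUIDE takes continuous feedback at every time step, one way to picture the data flow is a sample-and-hold alignment between the cursor readings and the agent's steps. The sketch below is only an illustration under stated assumptions and is not CREW's implementation: the function name feedback_per_step, the timestamp format, and the value range are all hypothetical.

```python
from bisect import bisect_right
from typing import List


def feedback_per_step(
    step_times: List[float],
    fb_times: List[float],
    fb_values: List[float],
    default: float = 0.0,
) -> List[float]:
    """Give each agent step the most recent feedback value (sample-and-hold).

    Assumptions (hypothetical, for illustration only):
    - fb_times is sorted in ascending order and matches fb_values in length.
    - Steps that occur before any feedback was received get `default`.
    """
    aligned = []
    for t in step_times:
        # Number of feedback readings with a timestamp <= t.
        i = bisect_right(fb_times, t)
        aligned.append(fb_values[i - 1] if i > 0 else default)
    return aligned


# Four agent steps, two cursor readings from the feedback window.
steps = [0.0, 0.5, 1.0, 1.5]
signal = feedback_per_step(steps, fb_times=[0.4, 1.2], fb_values=[0.3, -0.8])
print(signal)
```

Holding the last reading means every step receives a feedback value even when the cursor readings arrive at a different rate than the agent's steps.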

Running Deep TAMER

python crew_algorithms/deep_tamer envs=bowling
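As noted in the introduction, Deep TAMER assigns binary human feedback to state-action pairs. The sketch below illustrates that idea under simplifying assumptions: each feedback event labels every step that occurred within a fixed look-back window before it. The names Step, Feedback, and assign_feedback and the window bounds are hypothetical and are not taken from the crew_algorithms code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    t: float                    # time at which the action was taken (seconds)
    state: Tuple[float, ...]    # observation at that time
    action: int                 # discrete action index


@dataclass
class Feedback:
    t: float     # time at which the human pressed a feedback button
    value: int   # binary feedback: +1 (positive) or -1 (negative)


def assign_feedback(
    steps: List[Step],
    feedback: List[Feedback],
    min_delay: float = 0.2,   # assumed lower bound on human reaction time
    max_delay: float = 2.0,   # assumed upper bound of the look-back window
) -> List[Tuple[Tuple[float, ...], int, int]]:
    """Return (state, action, feedback value) triples.

    A feedback event is credited to every step whose age at the moment
    of feedback lies in [min_delay, max_delay].
    """
    labeled = []
    for fb in feedback:
        for step in steps:
            delay = fb.t - step.t
            if min_delay <= delay <= max_delay:
                labeled.append((step.state, step.action, fb.value))
    return labeled


trajectory = [
    Step(t=0.0, state=(0.0,), action=0),
    Step(t=1.0, state=(0.1,), action=1),
    Step(t=2.0, state=(0.2,), action=1),
]
labels = assign_feedback(trajectory, [Feedback(t=2.5, value=1)])
print(labels)
```

The resulting triples could then serve as training targets for a model that predicts human feedback from a state-action pair.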