TAMER Integration with Unity ML-Agents for 3D Mountain Car Simulation
Overview
This project, completed for an Upwork client, focused on integrating the TAMER (Training an Agent Manually via Evaluative Reinforcement) framework with Unity ML-Agents to create an advanced 3D version of the Mountain Car problem. TAMER is a human-in-the-loop learning approach in which the agent learns from human-provided feedback rather than a predefined reward function, making it possible to shape complex behaviors more intuitively. For more information, refer to the original TAMER paper by Knox and Stone.
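To make the core idea concrete, the snippet below is a minimal, illustrative sketch (not the project's actual code) of how TAMER replaces the environment reward: a model of the human's reinforcement signal, here a simple linear model per action, is fit online from feedback, and the agent acts greedily with respect to it.

```python
import numpy as np

class TamerModel:
    """Minimal TAMER-style model: learn H_hat(s, a), the human's expected
    feedback for a state-action pair, and act greedily on it."""

    def __init__(self, n_features: int, n_actions: int, lr: float = 0.01):
        self.weights = np.zeros((n_actions, n_features))  # one linear model per action
        self.lr = lr

    def predict(self, features: np.ndarray) -> np.ndarray:
        """Predicted human reinforcement for every action in this state."""
        return self.weights @ features

    def update(self, features: np.ndarray, action: int, human_reward: float) -> None:
        """SGD step pushing H_hat(s, a) toward the feedback the human just gave."""
        error = human_reward - self.weights[action] @ features
        self.weights[action] += self.lr * error * features

    def act(self, features: np.ndarray) -> int:
        """Greedy action selection with respect to the learned feedback model."""
        return int(np.argmax(self.predict(features)))
```

The full framework additionally spreads each feedback event over a short window of recent state-action pairs (credit assignment), which is omitted here for brevity.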
The objective was to train an AI-driven car to build momentum and escape a valley in an environment whose difficulty increases over time. The integration required developing custom Python trainers tailored to TAMER's specific settings, giving finer control over the reinforcement learning process.
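One way to drive the increasing difficulty from the Python side is sketched below, using mlagents_envs' EnvironmentParametersChannel to push a difficulty value into the running Unity scene. The parameter name "valley_difficulty" and the build name "MountainCar3D" are placeholders, and the project could equally have scaled difficulty inside Unity itself.

```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.environment_parameters_channel import (
    EnvironmentParametersChannel,
)

# Hypothetical difficulty ramp: the Unity scene is assumed to read a float
# parameter named "valley_difficulty" (e.g. to steepen the slope or spawn
# more obstacles) each time an episode starts.
params_channel = EnvironmentParametersChannel()
env = UnityEnvironment(file_name="MountainCar3D", side_channels=[params_channel])

for difficulty in (0.2, 0.5, 0.8, 1.0):
    params_channel.set_float_parameter("valley_difficulty", difficulty)
    env.reset()
    # ... run training episodes at this difficulty before moving on ...

env.close()
```

On the C# side, the scene would read this value through Academy.Instance.EnvironmentParameters and adjust the terrain or obstacles accordingly.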
Project Details
Challenge
The project addressed several key challenges:
- Framework Integration: Combining TAMER with Unity ML-Agents to leverage human-in-the-loop feedback for reinforcement learning.
- 3D Environment Development: Creating a realistic 3D Mountain Car simulation where the agent must build momentum to escape the valley.
- Custom Trainer Development: Developing bespoke Python trainers based on Unity ML-Agents to accommodate TAMER's unique requirements.
- Scalability and Complexity: Increasing the environmental difficulty to test and improve the agent's learning capabilities.
- Documentation and Reproducibility: Ensuring comprehensive documentation for future replication and extension of the project.
Solution
To tackle these challenges, the project was divided into structured steps:
- Environment Creation and Initial RL Agent Implementation:
  - Built the 3D Mountain Car environment using Unity, focusing on realistic physics and obstacle placement.
  - Developed the first reinforcement learning agent using Unity ML-Agents, training it to reach a target position within the valley.
  - Provided detailed documentation of the environment setup, agent configuration, and training procedures to facilitate future reproducibility and collaboration.
- TAMER Framework Integration and Custom Trainer Development:
  - Implemented the TAMER framework to incorporate human-provided reward signals, so the agent learns from evaluative feedback rather than the environment reward alone.
  - Adapted Unity ML-Agents' built-in trainers into custom Python trainers that meet TAMER's specific requirements, ensuring a clean integration and effective training (see the Python sketch after this list).
  - Increased the complexity of the environment with additional obstacles and dynamic elements, challenging the agent to adapt and build momentum under more demanding conditions.
  - Extended the documentation with instructions for converting classical RL algorithms into TAMER-compatible versions, making future modifications and extensions easier.
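The sketch below illustrates, under stated assumptions, how such a custom trainer can drive the built Unity environment through the mlagents_envs low-level API while substituting human feedback for the environment reward. It reuses the TamerModel sketch from the Overview and assumes a single agent with one discrete action branch (a continuous-control setup would pass ActionTuple(continuous=...) instead), a recent mlagents_envs release, and a placeholder poll_human_feedback() hook standing in for whatever key-press interface the human teacher uses; none of these names come from the project's actual code.

```python
import numpy as np
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.base_env import ActionTuple

def poll_human_feedback() -> float:
    """Placeholder hook: return the human's latest evaluative signal
    (e.g. +1 / -1 from a key press, 0 when no feedback was given)."""
    return 0.0

env = UnityEnvironment(file_name="MountainCar3D")     # placeholder build name
env.reset()

behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]
n_features = spec.observation_specs[0].shape[0]       # single vector observation assumed
n_actions = spec.action_spec.discrete_branches[0]     # one discrete action branch assumed
model = TamerModel(n_features, n_actions)             # sketch from the Overview section

prev_obs, prev_action = None, None
for _ in range(10_000):
    decision_steps, terminal_steps = env.get_steps(behavior_name)
    if len(decision_steps) > 0:
        # Credit incoming feedback to the most recent state-action pair
        # (a simplification of TAMER's credit assignment).
        feedback = poll_human_feedback()
        if feedback != 0.0 and prev_obs is not None:
            model.update(prev_obs, prev_action, feedback)

        obs = decision_steps.obs[0][0]                 # first obs tensor, single agent
        action = model.act(obs)
        env.set_actions(
            behavior_name,
            ActionTuple(discrete=np.array([[action]], dtype=np.int32)),
        )
        prev_obs, prev_action = obs, action
    env.step()

env.close()
```

Keeping the feedback hook separate from the stepping loop makes it easy to swap the environment reward back in when comparing the TAMER agent against a classical RL baseline.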
Technologies Used
- Unity ML-Agents: Core framework for developing and training reinforcement learning agents within Unity environments.
- TAMER Framework: A human-in-the-loop reinforcement learning framework that allows agents to learn from human-provided feedback.
- Python: Utilized for developing custom trainers and integrating TAMER with Unity ML-Agents.
- Custom Python Trainers: Tailored training scripts to accommodate TAMER's specific settings and requirements.
- Git: Version control system for managing project code and documentation.
- TensorBoard: Used for visualizing training progress and agent performance metrics.
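For TAMER-specific signals that ML-Agents does not record on its own, a small logging helper such as the hypothetical one below can write scalars next to the framework's standard summaries (it assumes PyTorch and TensorBoard are installed, as they are with recent ML-Agents releases; the metric names and log directory are illustrative).

```python
from torch.utils.tensorboard import SummaryWriter

# Illustrative run directory; keeping it under the same results folder as
# ML-Agents' own summaries lets both show up in one TensorBoard session.
writer = SummaryWriter(log_dir="results/tamer_mountain_car")

def log_tamer_metrics(step: int, cumulative_feedback: float, episode_length: int) -> None:
    """Record how much human feedback the agent has received and how long
    episodes are lasting at the given training step."""
    writer.add_scalar("Human/CumulativeFeedback", cumulative_feedback, step)
    writer.add_scalar("Environment/EpisodeLength", episode_length, step)

log_tamer_metrics(step=100, cumulative_feedback=12.0, episode_length=350)  # example call
writer.close()
```

Runs can then be inspected with `tensorboard --logdir results` alongside the usual ML-Agents statistics.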
Results
The integration of TAMER with Unity ML-Agents for the 3D Mountain Car simulation yielded significant outcomes:
- Effective Momentum Building: The agent successfully learned to build momentum to escape the valley, demonstrating the efficacy of TAMER in guiding reinforcement learning agents through human feedback.
- Scalable Complexity Handling: The environment's increased complexity allowed for robust testing and improvement of the agent's adaptability and learning capabilities.
- Custom Trainer Success: The custom Python trainers integrated TAMER cleanly with Unity ML-Agents and gave finer control over the training process.
- Comprehensive Documentation: Detailed documentation ensured that the project could be easily replicated and extended by other researchers or developers, promoting future collaboration and advancements.
Conclusion
This project demonstrated the successful integration of the TAMER framework with Unity ML-Agents to develop a 3D Mountain Car simulation. By focusing on momentum building within a dynamically increasing difficulty environment and developing custom Python trainers, the project provided valuable insights into human-in-the-loop reinforcement learning. The scalable environment design and comprehensive documentation lay the groundwork for future research and advancements in AI-driven simulations with human-in-the-loop approaches.
This project was developed by Louis Gauthier at Digiwave. For more information about our services, visit Digiwave's Portfolio.