Multi-Agent Battle Simulation in Unity ML-Agents
Overview
We developed this project for Hybrid Vision, leveraging Unity ML-Agents and reinforcement learning to transform an existing open-source multi-agent battle simulation. The simulation originally supported only a single combat-focused behavior; we expanded it to support dynamic behavior switching, creating a foundation for advancing AI safety and interpretability.
Project Details
Challenge
Hybrid Vision required an enhanced AI simulation that could model complex interactions and support anomaly detection within agent behaviors. Key challenges included:
- Expanding Behavior Capabilities: Integrating multiple behaviors into a previously combat-only environment.
- Unified Learning Model: Creating a single reinforcement learning model capable of handling diverse behaviors using a centralized control network.
- High-Fidelity Agent Control: Managing two types of complex bots, each with distinct models for high-level decision-making and low-level joint control.
- Behavior Anomaly Detection: Providing a foundation for future tools aimed at identifying unexpected or misbehaving agents within the simulation.
Solution
To address these challenges, we implemented a series of enhancements:
- Single Brain Hypernetwork: A unified neural network using a hypernetwork structure to support seamless behavior changes.
- Behavior Signal Integration: A flexible input signal that tells the policy which behavior to execute, allowing behaviors to be switched at runtime.
- High-Level Model Training and Expansion: Developed and trained a high-level decision-making model from scratch, incorporating new sensors to support a wider range of actions and ensure robust adaptability within the simulation environment.
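The write-up above does not include the model itself. As an illustration only, a single-brain policy conditioned through a hypernetwork, of the kind described, might look like the following PyTorch sketch; all class names, dimensions, and the one-hot encoding of the behavior signal are assumptions, not the delivered architecture:

```python
import torch
import torch.nn as nn

class HyperPolicy(nn.Module):
    """Illustrative single-brain policy: a small hypernetwork generates
    the weights of the action head from a one-hot behavior signal, so one
    network serves every behavior."""

    def __init__(self, obs_dim, act_dim, n_behaviors, hidden=128):
        super().__init__()
        self.act_dim, self.hidden = act_dim, hidden
        # Shared trunk processes observations for every behavior.
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Hypernetwork maps the behavior signal to head weights + biases.
        self.hyper = nn.Linear(n_behaviors, hidden * act_dim + act_dim)

    def forward(self, obs, behavior_onehot):
        h = self.trunk(obs)                          # (B, hidden)
        params = self.hyper(behavior_onehot)         # (B, hidden*act + act)
        w = params[:, : self.hidden * self.act_dim]
        b = params[:, self.hidden * self.act_dim :]
        w = w.view(-1, self.act_dim, self.hidden)    # (B, act, hidden)
        # Behavior-specific action head applied to the shared features.
        return torch.bmm(w, h.unsqueeze(-1)).squeeze(-1) + b

policy = HyperPolicy(obs_dim=32, act_dim=4, n_behaviors=3)
obs = torch.randn(8, 32)
signal = nn.functional.one_hot(torch.randint(0, 3, (8,)), 3).float()
out = policy(obs, signal)
print(out.shape)  # torch.Size([8, 4])
```

Because only the action head is generated per behavior while the trunk is shared, switching the behavior signal changes the agent's policy without loading a different model.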
Technologies Used
- Unity ML-Agents: Core framework for reinforcement learning-based simulation in Unity.
- TensorBoard: Visualization tool for tracking training progress and neural activations.
- Custom Unity Sensors: Enhanced sensors that capture detailed environmental data to inform agent decisions.
- ONNX Runtime, PyTorch: Utilized for post-simulation inference and training advanced models to study and evaluate agent behavior.
Development Process
The project was completed over approximately a month, following a structured approach:
- Design and Planning: Defined new behaviors, reward structure, and initial architecture for the project.
- Compatibility Fixes and Stability Improvements: Updated an older 2021 version of the project to ensure compatibility with the latest Unity version. This involved resolving crashes, fixing environmental issues, and adapting outdated code to support the latest Unity ML-Agents and reinforcement learning frameworks.
- Behavior Integration: Integrated multiple behaviors into the simulation environment, allowing agents to adapt dynamically.
- Training and Tuning: Retrained all models from scratch, since the older neural network checkpoints were incompatible with the updated framework, and tuned them extensively to meet the client's requirements. We leveraged PPO and imitation learning to optimize agent adaptability across multiple behaviors.
- Monitoring System Setup: Developed a logging system to capture essential data on behavior changes and neural activations for further analysis.
- Documentation and Handover: Delivered detailed documentation to enable Hybrid Vision to manage, analyze, and expand the simulation independently.
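ML-Agents writes its own TensorBoard summaries during training; a custom monitoring hook of the kind the process describes, capturing behavior changes and activation statistics for later analysis, could be sketched with PyTorch's SummaryWriter (all names and the data layout here are hypothetical):

```python
import torch
from torch.utils.tensorboard import SummaryWriter

# Hypothetical logging hook: record the active behavior and hidden-layer
# activation statistics each step so anomalies can be inspected later.
writer = SummaryWriter(log_dir="runs/battle_sim")

def log_step(step, behavior_id, hidden_activations):
    writer.add_scalar("behavior/active_id", behavior_id, step)
    writer.add_scalar("activations/mean",
                      hidden_activations.mean().item(), step)
    writer.add_histogram("activations/hidden", hidden_activations, step)

for step in range(100):
    h = torch.randn(128)           # stand-in for the policy's hidden layer
    log_step(step, step // 50, h)  # behavior switches halfway through
writer.close()
```

Scanning the logged behavior ID against activation statistics is one simple way to spot steps where an agent's internals diverge from what its assigned behavior would predict.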
Results
Our solution successfully delivered:
- Enhanced Behavior Switching: Agents demonstrated smooth transitions between behaviors, adapting effectively to real-time environmental cues.
- Unified Neural Model: Managed multiple behaviors within a single network using a hypernetwork approach, simplifying the overall architecture while supporting more interpretable and predictable agent behavior.
- In-Depth Monitoring: Comprehensive data logs provided detailed insights into agent interactions and supported AI safety evaluation.
- Scalable Framework: The flexible architecture allows for easy expansion, supporting additional behaviors or conditions as needed, with a focus on interpretability in AI-driven environments.
Conclusion
Through this project, we provided Hybrid Vision with a versatile simulation platform that supports complex behavior analysis and lays the groundwork for anomaly detection. By incorporating dynamic behavior switching, real-time monitoring, and a scalable neural network model, we have created a robust environment that meets the needs of today while allowing for future expansion and innovation.
This project was developed by Louis Gauthier at Digiwave. For more information about our services, visit Digiwave's Portfolio.