BCG X Datathon: Traffic Prediction for Delivery Optimization

Achieved 1st place at the BCG X Datathon, developing a traffic prediction model for delivery optimization using machine learning and business insights.

Date: December 6, 2024

Tags: AI/ML Solutions

Client: BCG X

Team: Louis Gauthier, Clément Florval

Overview

In December 2024, our team (Louis Gauthier and Clément Florval of Digiwave, together with three other students) took part in the BCG X Datathon, organized by Boston Consulting Group in collaboration with CentraleSupélec. We placed 1st by developing a predictive model for traffic conditions in Paris that helps LivraisonCo, a delivery company, optimize its delivery schedules and reduce costs. The project required combining business insights with machine learning techniques within a one-week deadline.

Project Details

Objective

The primary goal was to predict:

  • Hourly traffic flow (Débit horaire): The number of vehicles passing through a specific road segment in one hour.
  • Lane occupancy rate (Taux d'occupation): The percentage of time a road segment is occupied by vehicles.

These predictions enable LivraisonCo to define optimal delivery time slots and allocate resources effectively, thereby reducing delays and minimizing logistical costs.

Data Sources

We leveraged the following datasets:

  • Traffic Sensor Data: Historical traffic data provided by OpenData Paris for three key road segments in Paris:
    1. Champs-Élysées
    2. Rue de la Convention
    3. Rue Saint-Antoine
  • External Data:
    • Weather conditions
    • Public holidays and school vacations
    • Temporal features (e.g., hour of the day, day of the week)

The data goes back to 2010 (more than 14 years of hourly records) and required significant preprocessing to clean the records, align timestamps, and handle missing or inconsistent values.
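
The gap-handling step can be sketched as follows. This is a minimal illustration with pandas, not the team's actual preprocessing code: the function name `fill_hourly_gaps` and the `max_gap_hours` cap are assumptions introduced here, and time-based linear interpolation is only one of several reasonable choices.

```python
import pandas as pd


def fill_hourly_gaps(series: pd.Series, max_gap_hours: int = 6) -> pd.Series:
    """Put sensor readings on a complete hourly grid and interpolate gaps.

    `series` must be indexed by timestamps. `max_gap_hours` is a
    hypothetical cap: at most that many consecutive missing values are
    filled, so longer outages are only partially filled.
    """
    # Drop duplicated timestamps and sort so the index is monotonic.
    series = series[~series.index.duplicated(keep="first")].sort_index()

    # Build the full hourly index between the first and last reading.
    full_index = pd.date_range(
        series.index.min(), series.index.max(), freq=pd.Timedelta(hours=1)
    )

    # Reindexing exposes missing hours as NaN.
    regular = series.reindex(full_index)

    # Linear interpolation weighted by elapsed time between readings.
    return regular.interpolate(method="time", limit=max_gap_hours)


# Example: two missing hours between 01:00 and 04:00.
readings = pd.Series(
    [100.0, 200.0, 500.0],
    index=pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 04:00"]
    ),
)
filled = fill_hourly_gaps(readings)
```

Capping the number of filled values keeps long sensor outages visible instead of replacing them with a straight line, which may matter when the result is used for training.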

Methodology

Our approach was structured into five main phases, reflecting the presentation structure from the Datathon:

  1. Introduction and Context:

    • Defined the mission from LivraisonCo to optimize delivery slots using traffic predictions.
    • Identified the problem: reducing delays and minimizing logistical costs through accurate traffic forecasting.
  2. Data Collection and Preparation:

    • Data Sources: Utilized traffic sensor data from Paris and enriched it with external factors such as weather and public holidays.
    • Data Volume: Processed over 129,000 hourly data points spanning more than 14 years.
    • Data Cleaning: Handled the roughly 6% of values missing due to sensor issues by reindexing the series and interpolating the gaps.
    • Feature Engineering: Developed additional features like DayOfWeek, Hour, and Weather Conditions.
  3. Exploratory Data Analysis (EDA):

    • Traffic Patterns: Identified morning (7-10 AM) and evening (5-7 PM) rush-hour peaks, markedly lower traffic on weekends, and strong monthly seasonality.
    • Data Insights: Visualized traffic flow trends and detected anomalies, ensuring data quality for modeling.
  4. Model Development and Validation:

    • Predictive Modeling: Evaluated multiple models to identify the best-performing approach:
      • Amazon Chronos: A pretrained time-series forecasting model, used here with an enriched dataset that includes covariates such as holidays and peak hours.
      • Meta's Prophet: A robust model for forecasting time series data with strong seasonal effects.
      • XGBoost: A powerful gradient boosting framework known for its performance in regression tasks.
      • Baseline Model: A simple heuristic that predicts each hour's traffic as the value observed at the same hour one week earlier.
    • Training and Prediction: Trained each model to forecast traffic conditions over a 5-day horizon (120 hours).
    • Performance Metrics: Evaluated accuracy using root mean squared error (RMSE) for point forecasts and Weighted Quantile Loss (WQL) for quantile forecasts.
  5. Optimization and Implementation Roadmap:

    • Delivery Slot Optimization: Leveraged traffic predictions to define optimal delivery time slots, minimizing delays and logistical costs.
    • Resource Allocation: Developed strategies to allocate delivery personnel efficiently based on predicted traffic conditions.
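
To make the slot-optimization step concrete, here is a small sketch of one way to turn hourly predictions into a delivery window. The selection rule (lowest mean predicted flow), the function name `best_delivery_window`, and the 8:00 to 20:00 working hours are all assumptions for illustration; the source does not describe the rule the team actually proposed.

```python
from typing import Sequence, Tuple


def best_delivery_window(
    hourly_flow: Sequence[float],
    window_hours: int,
    earliest: int = 8,   # hypothetical start of the working day
    latest: int = 20,    # hypothetical end of the working day (exclusive)
) -> Tuple[int, float]:
    """Return (start_hour, mean_flow) of the quietest contiguous window.

    `hourly_flow[h]` is the predicted traffic flow for hour h of one day.
    """
    if len(hourly_flow) != 24:
        raise ValueError("expected 24 hourly predictions for one day")
    if window_hours <= 0 or earliest + window_hours > latest:
        raise ValueError("window does not fit inside the working hours")

    best_start, best_mean = earliest, float("inf")
    # Slide the window over every admissible start hour.
    for start in range(earliest, latest - window_hours + 1):
        mean_flow = sum(hourly_flow[start:start + window_hours]) / window_hours
        if mean_flow < best_mean:
            best_start, best_mean = start, mean_flow
    return best_start, best_mean


# Example: a flat day with a quiet period at 13:00 and 14:00.
flow = [100.0] * 24
flow[13], flow[14] = 10.0, 20.0
start_hour, mean_flow = best_delivery_window(flow, window_hours=2)
```

The same idea could be applied to the lane occupancy rate, or to a weighted mix of both targets, depending on which quantity best reflects delivery delays.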

Challenges and Solutions

  • Data Gaps: Overcame missing data by implementing interpolation techniques and ensuring continuity in the dataset.
  • Feature Integration: Successfully integrated multiple external factors, enhancing the model's predictive capabilities.
  • Model Selection: Evaluated multiple models (Chronos, Prophet, XGBoost) to determine the most accurate and reliable approach for traffic prediction.
  • Real-Time Application: Proposed the development of a dynamic dashboard for real-time traffic visualization and automated decision-making.
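
As an illustration of the feature integration described above, the sketch below derives calendar features for a single timestamp. `Hour` and `DayOfWeek` are named in the write-up; `IsWeekend`, `IsHoliday`, and `IsRushHour` are hypothetical names added here, and the rush-hour bounds simply reuse the peak windows from the exploratory analysis (7-10 AM and 5-7 PM).

```python
from datetime import date, datetime
from typing import Dict, FrozenSet


def temporal_features(ts: datetime, holidays: FrozenSet[date]) -> Dict[str, object]:
    """Build calendar features for one hourly observation."""
    is_weekday = ts.weekday() < 5  # Monday = 0 ... Sunday = 6
    in_morning_peak = 7 <= ts.hour < 10
    in_evening_peak = 17 <= ts.hour < 19
    return {
        "Hour": ts.hour,
        "DayOfWeek": ts.weekday(),
        "IsWeekend": not is_weekday,                 # hypothetical feature
        "IsHoliday": ts.date() in holidays,          # hypothetical feature
        "IsRushHour": is_weekday and (in_morning_peak or in_evening_peak),
    }


# Example with a single known French public holiday (Christmas Day).
holidays = frozenset({date(2024, 12, 25)})
wednesday_morning = temporal_features(datetime(2024, 12, 25, 8, 0), holidays)
saturday_evening = temporal_features(datetime(2024, 12, 7, 18, 0), holidays)
```

Weather conditions would be joined on the same hourly timestamp; they are left out here because the source does not specify the weather fields that were used.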

Results

Model Performance

Our best predictions, using Chronos, achieved:

  • RMSE: 129.69
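  • WQL: -0.0906 (WQL is non-negative by definition; the minus sign presumably comes from a higher-is-better reporting convention, as used by AutoGluon, in which error metrics are negated, so the loss itself would be 0.0906)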

These metrics demonstrated the model’s ability to capture traffic patterns effectively.
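
For readers unfamiliar with the two metrics, the sketch below shows how they can be computed. It assumes a common definition of WQL (twice the summed pinball loss divided by the sum of absolute actual values, averaged over quantile levels); the exact variant used in the project is not stated in the source, and the function names are illustrative.

```python
import math
from typing import Dict, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error of point forecasts."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pinball_loss(y: float, forecast: float, q: float) -> float:
    """Quantile loss: under-prediction costs q, over-prediction costs 1 - q."""
    if y >= forecast:
        return q * (y - forecast)
    return (1 - q) * (forecast - y)


def weighted_quantile_loss(
    y_true: Sequence[float],
    quantile_preds: Dict[float, Sequence[float]],
) -> float:
    """Average over quantile levels of 2 * sum(pinball) / sum(|y|).

    `quantile_preds` maps each quantile level (e.g. 0.1, 0.5, 0.9)
    to the forecasts for that level, aligned with `y_true`.
    """
    abs_total = sum(abs(y) for y in y_true)
    per_level = []
    for q, preds in quantile_preds.items():
        total = sum(pinball_loss(y, f, q) for y, f in zip(y_true, preds))
        per_level.append(2 * total / abs_total)
    return sum(per_level) / len(per_level)


# Example with two observations and a single median forecast.
actual = [100.0, 200.0]
point_error = rmse(actual, [110.0, 190.0])
quantile_error = weighted_quantile_loss(actual, {0.5: [110.0, 190.0]})
```

RMSE is expressed in the units of the target (vehicles per hour for traffic flow), while WQL is scale-free because it is normalized by the sum of absolute actual values.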

Key Achievements

  • 1st Place: Secured the top position at the BCG X Datathon, outperforming the other participating teams.
  • Strategic Roadmap: Delivered a comprehensive plan for integrating the predictive model into LivraisonCo’s operations, ensuring scalability and continuous improvement.

[Figure: true vs. predicted values]

Key Features

  • Advanced Data Integration: Combined historical traffic data with external factors (weather, public holidays, school vacations) to improve prediction accuracy.
  • Model Diversity: Evaluated multiple modeling approaches (Chronos, Prophet, XGBoost) to ensure the selection of the most effective model.
  • Baseline Comparison: Established a baseline model to provide context for the performance improvements achieved by advanced models.
  • Scalable Framework: Utilized AutoGluon for rapid model retraining and scalability to accommodate future data expansions.
  • Business-Centric Approach: Ensured that predictive insights directly translated into operational efficiencies and cost savings for LivraisonCo.
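
The baseline mentioned in the list above (same hour, previous week) is a seasonal naive forecast with a period of 168 hours. A minimal sketch, assuming the history is a plain list of hourly values with no gaps; the function name is illustrative:

```python
from typing import List, Sequence

HOURS_PER_WEEK = 7 * 24


def seasonal_naive_forecast(
    history: Sequence[float],
    horizon: int = 120,            # 5 days of hourly steps, as in the project
    season: int = HOURS_PER_WEEK,
) -> List[float]:
    """Forecast each future hour with the value observed one season earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    # Step h (0-based) reuses the value from exactly `season` hours before it;
    # the modulo repeats the last season if the horizon exceeds one season.
    return [last_season[h % season] for h in range(horizon)]


# Example: a synthetic series whose value equals its hour index.
history = [float(i) for i in range(200)]
forecast = seasonal_naive_forecast(history)
```

Because this baseline needs no training, it gives a quick reference point: a learned model is only worth deploying if it clearly beats these numbers on the same 120-hour horizon.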

Technologies Used

  • Python: For data preprocessing, modeling, and visualization.
  • AutoGluon: Machine learning framework for time-series predictions.
  • Chronos Models: Pretrained time-series models by Amazon optimized for our dataset.
  • Meta's Prophet: Time-series forecasting model with strong seasonal capabilities.
  • XGBoost: Gradient boosting framework for regression tasks.
  • Seaborn and Matplotlib: For data visualization.
  • Pandas and NumPy: For data manipulation and analysis.

Presentation Highlights

During our presentation at the Datathon, we outlined our approach and key findings:

  1. Introduction and Context:

    • Presented the mission from LivraisonCo to optimize delivery slots using traffic predictions.
    • Defined the problem: reducing delays and minimizing logistical costs through accurate traffic forecasting.
  2. Methodological Approach and EDA:

    • Detailed our data collection from Paris traffic sensors and external sources.
    • Showcased EDA results highlighting traffic patterns and data quality insights.
  3. Modeling and Results:

    • Introduced multiple models (Chronos, Prophet, XGBoost) and discussed their performance metrics.
    • Illustrated prediction accuracy with comparative visualizations of actual vs. predicted traffic data.
    • Highlighted the superior performance of the Chronos model over others and the baseline.
  4. Optimization Roadmap:

    • Outlined steps to translate traffic predictions into operational optimizations, including resource allocation and delivery scheduling.
    • Proposed a phased deployment strategy for integrating the model into LivraisonCo’s systems.
  5. Automation and Deployment:

    • Emphasized the importance of automating the predictive model through a dynamic dashboard.
    • Presented a roadmap for deploying the solution in three phases: automation, real-world testing, and full-scale deployment.
  6. Conclusion and Future Directions:

    • Highlighted the strategic benefits of our solution for LivraisonCo.
    • Discussed potential future enhancements, including real-time data integration and expansion to additional road segments.

Notebook and Repository

The main notebook used for this project is available here: Chronos Notebook.

Conclusion

Winning 1st place at the BCG X Datathon underscores our team’s expertise in data science and our ability to deliver impactful solutions under pressure. Our traffic prediction model not only met the technical requirements but also provided a strategic foundation for optimizing LivraisonCo’s delivery operations.

Future Directions

We proposed a roadmap to:

  • Develop models for resource allocation, delivery clustering, and delivery volume prediction.
  • Integrate real-time traffic data for dynamic scheduling.
  • Expand predictions to include more road segments and other cities.
  • Continuously evaluate and incorporate additional modeling techniques to further enhance prediction accuracy.

This project was developed by Louis Gauthier and Clément Florval of Digiwave, in collaboration with three other students from CentraleSupélec for the BCG X Datathon. For more information about our services, visit Digiwave's Portfolio.
