Project information

Summary

SOON-RL is a Reinforcement Learning (RL) project designed to optimize manufacturing workshop production within a multi-agent environment. The goal was to train an AI system to efficiently manage a sequence of specialized machines to fulfill production orders in the minimum amount of time, dynamically adapting to disruptions like machine failures.

I developed a custom simulation environment using Python, Gym, and SimPy. To support complex multi-agent training, I transitioned the architecture from Stable Baselines to Ray RLlib, allowing multiple machines to learn collaboratively. After rigorous hyperparameter and reward-shaping optimization with Ray Tune, the final trained agents learned to route production and reconfigure machines so that orders completed in the provable minimum number of production steps.
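The key property of the multi-agent setup described above is that each machine is its own agent: observations, actions, and rewards all flow as dictionaries keyed by agent id, which is the contract RLlib's MultiAgentEnv expects. The skeleton below is a minimal, dependency-free sketch of that contract (the class name, reward shaping, and episode length are illustrative, not the project's actual environment):

```python
class WorkshopMultiAgentEnv:
    """Toy skeleton mirroring the dict-keyed step() contract used by
    multi-agent RLlib environments: every machine agent gets its own
    observation and reward stream, keyed by its agent id."""

    def __init__(self, machine_ids, horizon=10):
        self.machine_ids = list(machine_ids)
        self.horizon = horizon  # illustrative fixed episode length
        self.t = 0

    def reset(self):
        self.t = 0
        return {m: self._observe(m) for m in self.machine_ids}

    def step(self, action_dict):
        # action_dict: machine id -> chosen discrete action
        self.t += 1
        obs = {m: self._observe(m) for m in self.machine_ids}
        # Hypothetical shaping: a per-step time penalty nudges agents
        # toward fulfilling orders in as few production steps as possible.
        rewards = {m: -1.0 for m in action_dict}
        dones = {m: self.t >= self.horizon for m in self.machine_ids}
        dones["__all__"] = self.t >= self.horizon  # RLlib-style global flag
        return obs, rewards, dones, {}

    def _observe(self, machine_id):
        # Placeholder observation; the real environment exposed storage
        # levels, recipes, and peer machine statuses (see below).
        return {"machine": machine_id, "t": self.t}
```

Keeping rewards per-agent rather than shared is what lets machines specialize while still being trained together under one RLlib trainer.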

Technical Details & Scenarios

The core challenge was orchestrating a sequence of machines (e.g., producing part A1 → B1 → C1) that needed to work collaboratively. Each machine agent had to decide between four discrete actions at any given time: Wait, Reconfigure, Fetch Parts, or Produce, based on real-time storage levels, recipes, and the statuses of other machines.
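The decision problem each agent faces can be made concrete with a small sketch of the action set and an action-masking rule (the observation layout, field names, and masking logic here are illustrative assumptions, not the project's exact implementation):

```python
from dataclasses import dataclass
from enum import IntEnum


class MachineAction(IntEnum):
    """The four discrete actions available to each machine agent."""
    WAIT = 0
    RECONFIGURE = 1
    FETCH_PARTS = 2
    PRODUCE = 3


@dataclass
class MachineObservation:
    # Hypothetical observation fields mirroring the description above.
    storage: dict      # part type -> units currently in local storage
    recipe: str        # part type this machine is configured to produce
    peer_status: dict  # machine id -> "idle" | "producing" | "down"


def valid_actions(obs, target_part, input_part):
    """Illustrative action masking: PRODUCE only makes sense when the
    machine is configured for the target part and holds the required
    input part in storage."""
    actions = {MachineAction.WAIT, MachineAction.RECONFIGURE,
               MachineAction.FETCH_PARTS}
    if obs.recipe == target_part and obs.storage.get(input_part, 0) > 0:
        actions.add(MachineAction.PRODUCE)
    return actions
```

In the A1 → B1 → C1 chain, for instance, the B1 machine can only produce once an A1 part has been fetched into its storage; otherwise it must wait, fetch, or reconfigure.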

To ensure the RL agents were robust, they were trained and evaluated across three increasingly complex scenarios:

  • Scenario 1 (Dynamic Reconfiguration): Fulfilling standard sequential orders, requiring agents to learn when to reconfigure themselves for a new part type as order requirements changed.
  • Scenario 2 (Fault Tolerance): Simulating sudden mechanical breakdowns mid-production. The surviving machines had to autonomously reconfigure their tasks to compensate for the offline machine and prevent bottlenecks.
  • Scenario 3 (Parallel Processing): Handling simultaneous orders of varying lengths. The agents learned to dynamically shift resources from a completed short order to assist with a longer, ongoing order to minimize total operational time.
Project Background & Evolution

This project began as my Bachelor's thesis, developed in C# using Unity and Unity ML-Agents with a custom UI for configuring workshop layouts. Following its success, I was brought on as a Research Assistant at HE-Arc to continue its development, migrating the framework to Python for deeper algorithmic control and advanced multi-agent scaling.
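The fault-tolerance behaviour from Scenario 2 can be illustrated with a toy workload-reassignment rule. In the project itself the agents learn this behaviour through training rather than follow a hand-written rule, so the function below is purely a sketch of the desired outcome (the data layout and least-loaded heuristic are assumptions for illustration):

```python
def reassign_on_failure(assignments, failed):
    """Toy recovery rule: when a machine fails mid-production, hand its
    workload to the least-loaded surviving machine, which reconfigures
    between recipes as needed to prevent a bottleneck.

    assignments: machine id -> list of part types it currently covers.
    """
    survivors = {m: list(parts) for m, parts in assignments.items()
                 if m != failed}
    lost = assignments[failed]
    # Least-loaded survivor absorbs the failed machine's part types.
    target = min(survivors, key=lambda m: len(survivors[m]))
    survivors[target].extend(lost)
    return survivors
```

For example, if the B1 machine breaks down while an idle machine exists, the idle machine takes over B1 production; the trained agents reached equivalent behaviour by observing peer statuses rather than by being told the rule.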