Open Access
ARTICLE
An Empirical Framework for Evaluating Reinforcement Learning in Automated Optimization Systems
Vol. 2, No. 1 (2025), Section: Articles
Abstract
The integration of Reinforcement Learning (RL) into automation represents a paradigm shift in solving complex optimization problems across various industries [6]. While RL has demonstrated significant potential, its practical application is often hampered by the lack of a standardized evaluation framework, making it difficult for practitioners to select appropriate algorithms for specific tasks [7]. This study introduces and executes a comprehensive empirical investigation that systematically evaluates the performance of leading RL algorithms across a diverse set of simulated automation environments [8]. We designed three high-fidelity simulation suites mimicking critical optimization tasks in manufacturing (production scheduling, inventory management), energy systems (microgrid management, HVAC control), and robotics (motion planning, multi-robot coordination) [9]. Within these environments, we benchmarked a portfolio of algorithms, including Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), Soft Actor-Critic (SAC), and Multi-Agent Deep Deterministic Policy Gradient (MADDPG), against key performance indicators: task efficiency, sample complexity, scalability, and robustness to environmental stochasticity [10]. Our results reveal a nuanced performance landscape in which no single algorithm dominates across all domains [11]. For instance, while PPO demonstrated superior stability and performance in the continuous control tasks prevalent in robotics and HVAC systems, DQN-based variants excelled in the discrete action spaces typical of scheduling and inventory problems [12]. Multi-agent algorithms showed substantial efficiency gains in cooperative tasks but suffered from higher training complexity [13]. The findings underscore a critical trade-off between algorithm complexity, sample efficiency, and task-specific performance [14].
This research provides a foundational empirical baseline, offering actionable insights for deploying RL in real-world automation and highlighting critical areas for future research, particularly in enhancing transfer learning, safety, and interpretability to bridge the persistent gap between simulation and practical deployment [15].
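The evaluation protocol described above can be illustrated in miniature. The sketch below is not the paper's actual framework; it is a simplified, self-contained stand-in: a toy discrete environment (`LineWorld`, loosely analogous to a scheduling task), a tabular Q-learning baseline in place of the deep RL algorithms, and a `benchmark` harness that reports two of the paper's KPIs, final task performance and sample complexity, across levels of environmental stochasticity. All names and thresholds here are illustrative assumptions.

```python
import random

class LineWorld:
    """Toy discrete environment: walk from state 0 to state n-1.

    `noise` flips the chosen action with that probability, standing in
    for the environmental stochasticity the benchmark varies.
    """
    def __init__(self, n=10, noise=0.0, seed=0):
        self.n, self.noise = n, noise
        self.rng = random.Random(seed)

    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):  # a in {0: left, 1: right}
        if self.rng.random() < self.noise:
            a = 1 - a  # stochastic action flip
        self.s = max(0, min(self.n - 1, self.s + (1 if a else -1)))
        done = self.s == self.n - 1
        return self.s, (1.0 if done else -0.01), done

def q_learning(env, episodes=300, alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning baseline; returns the per-episode return curve."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n)]
    returns = []
    for _ in range(episodes):
        s, done, total, steps = env.reset(), False, 0.0, 0
        while not done and steps < 200:
            a = rng.randrange(2) if rng.random() < eps \
                else max((0, 1), key=lambda x: q[s][x])
            s2, r, done = env.step(a)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s, total, steps = s2, total + r, steps + 1
        returns.append(total)
    return returns

def benchmark(agent_fn, noise_levels=(0.0, 0.2)):
    """KPI summary per noise level: final performance (mean return over
    the last 20 episodes) and sample complexity (first episode reaching
    90% of the best observed return)."""
    report = {}
    for p in noise_levels:
        rets = agent_fn(LineWorld(noise=p))
        best = max(rets)
        first_good = next((i for i, r in enumerate(rets)
                           if r >= 0.9 * best), len(rets))
        report[p] = {"final_return": sum(rets[-20:]) / 20,
                     "episodes_to_90pct": first_good}
    return report

print(benchmark(q_learning))
```

Swapping `q_learning` for a DQN or PPO agent with the same `agent_fn(env) -> returns` signature would reproduce the comparative structure of the study: each algorithm is scored on identical environments and KPIs, so differences in sample complexity and robustness become directly comparable.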
References
[1] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 2018.
[2] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[3] C. Li, P. Zheng, Y. Yin, B. Wang, and L. Wang, “Deep reinforcement learning in smart manufacturing: A review and prospects,” CIRP Journal of Manufacturing Science and Technology, vol. 40, pp. 75–101, 2023.
[4] A. Perera and P. Kamalaruban, “Applications of reinforcement learning in energy systems,” Renewable and Sustainable Energy Reviews, vol. 137, p. 110618, 2021.
[5] J. Kober, J. A. Bagnell, and J. Peters, “Reinforcement learning in robotics: A survey,” The International Journal of Robotics Research, vol. 32, no. 11, pp. 1238–1274, 2013.
[6] A. Esteso, D. Peidro, J. Mula, and M. Díaz-Madroñero, “Reinforcement learning applied to production planning and control,” International Journal of Production Research, vol. 61, no. 16, pp. 5772–5789, 2023.
[7] R. Nian, J. Liu, and B. Huang, “A review on reinforcement learning: Introduction and applications in industrial process control,” Computers & Chemical Engineering, vol. 139, p. 106886, 2020.
[8] R. N. Boute, J. Gijsbrechts, W. Van Jaarsveld, and N. Vanvuchelen, “Deep reinforcement learning for inventory control: A roadmap,” European Journal of Operational Research, vol. 298, no. 2, pp. 401–412, 2022.
[9] C. Blum and A. Roli, “Metaheuristics in combinatorial optimization: Overview and conceptual comparison,” ACM Computing Surveys, vol. 35, no. 3, pp. 268–308, 2003.
[10] Y. Li, “Deep reinforcement learning: An overview,” arXiv preprint arXiv:1701.07274, 2017.
[11] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 26–38, 2017.
[12] C. D. Hubbs, C. Li, N. V. Sahinidis, I. E. Grossmann, and J. M. Wassick, “A deep reinforcement learning approach for chemical production scheduling,” Computers & Chemical Engineering, vol. 141, p. 106982, 2020.
[13] D. Shi, W. Fan, Y. Xiao, T. Lin, and C. Xing, “Intelligent scheduling of discrete automated production line via deep reinforcement learning,” International Journal of Production Research, vol. 58, no. 11, pp. 3362–3380, 2020.
[14] F. Guo, Y. Li, A. Liu, and Z. Liu, “A reinforcement learning method to scheduling problem of steel production process,” in Journal of Physics: Conference Series, vol. 1486, no. 7. IOP Publishing, 2020, p. 072035.
[15] M. Mowbray, D. Zhang, and E. A. D. R. Chanona, “Distributional reinforcement learning for scheduling of chemical production processes,” arXiv preprint arXiv:2203.00636, 2022.
[16] N. N. Sultana, H. Meisheri, V. Baniwal, S. Nath, B. Ravindran, and H. Khadilkar, “Reinforcement learning for multi-product multi-node inventory management in supply chains,” arXiv preprint arXiv:2006.04037, 2020.
[17] B. J. De Moor, J. Gijsbrechts, and R. N. Boute, “Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management,” European Journal of Operational Research, vol. 301, no. 2, pp. 535–545, 2022.
[18] M. Khirwar, K. S. Gurumoorthy, A. A. Jain, and S. Manchenahally, “Cooperative multi-agent reinforcement learning for inventory management,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2023, pp. 619–634.
[19] R. Leluc, E. Kadoche, A. Bertoncello, and S. Gourvenec, “Marlim: Multi-agent reinforcement learning for inventory management,” arXiv preprint arXiv:2308.01649, 2023.
[20] O. Ogunfowora and H. Najjaran, “Reinforcement and deep reinforcement learning-based solutions for machine maintenance planning, scheduling policies, and optimization,” Journal of Manufacturing Systems, vol. 70, pp. 244–263, 2023.
[21] N. Yousefi, S. Tsianikas, and D. W. Coit, “Reinforcement learning for dynamic condition-based maintenance of a system with individually repairable components,” Quality Engineering, vol. 32, no. 3, pp. 388–408, 2020.
[22] ——, “Dynamic maintenance model for a repairable multi-component system using deep reinforcement learning,” Quality Engineering, vol. 34, no. 1, pp. 16–35, 2022.
[23] P. Andrade, C. Silva, B. Ribeiro, and B. F. Santos, “Aircraft maintenance check scheduling using reinforcement learning,” Aerospace, vol. 8, no. 4, p. 113, 2021.
[24] J. Thomas, M. P. Hernandez, A. K. Parlikad, and R. Piechocki, “Network maintenance planning via multi-agent reinforcement learning,” in 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2021, pp. 2289–2295.
[25] Z. J. Viharos and R. Jakab, “Reinforcement learning for statistical process control in manufacturing,” Measurement, vol. 182, p. 109616, 2021.
[26] A. Kuhnle, M. C. May, L. Schafer, and G. Lanza, “Explainable reinforcement learning in production control of job shop manufacturing system,” International Journal of Production Research, vol. 60, no. 19, pp. 5812–5834, 2022.
[27] M. Mowbray, R. Smith, E. A. Del Rio-Chanona, and D. Zhang, “Using process data to generate an optimal control policy via apprenticeship and reinforcement learning,” AIChE Journal, vol. 67, no. 9, p. e17306, 2021.
[28] Y. Li, J. Du, and W. Jiang, “Reinforcement learning for process control with application in semiconductor manufacturing,” IISE Transactions, pp. 1–15, 2023.
[29] D. Azuatalam, W.-L. Lee, F. de Nijs, and A. Liebman, “Reinforcement learning for whole-building HVAC control and demand response,” Energy and AI, vol. 2, p. 100020, 2020.
[30] D. Jang, L. Spangher, M. Khattar, U. Agwan, and C. Spanos, “Using meta reinforcement learning to bridge the gap between simulation and experiment in energy demand response,” in Proceedings of the Twelfth ACM International Conference on Future Energy Systems, 2021, pp. 483–487.
[31] M. Ahrarinouri, M. Rastegar, and A. R. Seifi, “Multiagent reinforcement learning for energy management in residential buildings,” IEEE Transactions on Industrial Informatics, vol. 17, no. 1, pp. 659–666, 2020.
[32] R. Lu, R. Bai, Z. Luo, J. Jiang, M. Sun, and H.-T. Zhang, “Deep reinforcement learning-based demand response for smart facilities energy management,” IEEE Transactions on Industrial Electronics, vol. 69, no. 8, pp. 8554–8565, 2021.
[33] R. Lu, Y.-C. Li, Y. Li, J. Jiang, and Y. Ding, “Multi-agent deep reinforcement learning based demand response for discrete manufacturing systems energy management,” Applied Energy, vol. 276, p. 115473, 2020.
[34] X. Zhang, R. Lu, J. Jiang, S. H. Hong, and W. S. Song, “Testbed implementation of reinforcement learning-based demand response energy management system,” Applied Energy, vol. 297, p. 117131, 2021.
[35] T. A. Nakabi and P. Toivanen, “Deep reinforcement learning for energy management in a microgrid with flexible demand,” Sustainable Energy, Grids and Networks, vol. 25, p. 100413, 2021.
[36] R. Hu and A. Kwasinski, “Energy management for microgrids using a reinforcement learning algorithm,” in 2021 IEEE Green Energy and Smart Systems Conference (IGESSC). IEEE, 2021, pp. 1–6.
[37] B. Zhang, Z. Chen, and A. M. Ghias, “Deep reinforcement learning-based energy management strategy for a microgrid with flexible loads,” in 2023 International Conference on Power Energy Systems and Applications (ICoPESA). IEEE, 2023, pp. 187–191.
[38] W. Zhang, H. Qiao, X. Xu, J. Chen, J. Xiao, K. Zhang, Y. Long, and Y. Zuo, “Energy management in microgrid based on deep reinforcement learning with expert knowledge,” in International Workshop on Automation, Control, and Communication Engineering (IWACCE 2022), vol. 12492. SPIE, 2022, pp. 275–284.
[39] A. Shojaeighadikolaei, A. Ghasemi, A. G. Bardas, R. Ahmadi, and M. Hashemi, “Weather-aware data-driven microgrid energy management using deep reinforcement learning,” in 2021 North American Power Symposium (NAPS). IEEE, 2021, pp. 1–6.
[40] Y. Du and F. Li, “Intelligent multi-microgrid energy management based on deep neural network and model-free reinforcement learning,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1066–1076, 2019.
[41] T. Yang, L. Zhao, W. Li, and A. Y. Zomaya, “Reinforcement learning in sustainable energy and electric systems: A survey,” Annual Reviews in Control, vol. 49, pp. 145–163, 2020.
[42] D. Cao, W. Hu, J. Zhao, G. Zhang, B. Zhang, Z. Liu, Z. Chen, and F. Blaabjerg, “Reinforcement learning and its applications in modern power and energy systems: A review,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 6, pp. 1029–1042, 2020.
[43] X. Chen, G. Qu, Y. Tang, S. Low, and N. Li, “Reinforcement learning for selective key applications in power systems: Recent advances and future challenges,” IEEE Transactions on Smart Grid, vol. 13, no. 4, pp. 2935–2958, 2022.
[44] K. Sivamayil, E. Rajasekar, B. Aljafari, S. Nikolovski, S. Vairavasundaram, and I. Vairavasundaram, “A systematic study on reinforcement learning based applications,” Energies, vol. 16, no. 3, p. 1512, 2023.
[45] X. Zhong, Z. Zhang, R. Zhang, and C. Zhang, “End-to-end deep reinforcement learning control for HVAC systems in office buildings,” Designs, vol. 6, no. 3, p. 52, 2022.
[46] S. Sierla, H. Ihasalo, and V. Vyatkin, “A review of reinforcement learning applications to control of heating, ventilation and air conditioning systems,” Energies, vol. 15, no. 10, p. 3526, 2022.
[47] H.-Y. Liu, B. Balaji, S. Gao, R. Gupta, and D. Hong, “Safe HVAC control via batch reinforcement learning,” in 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS). IEEE, 2022, pp. 181–192.
[48] X. Yuan, Y. Pan, J. Yang, W. Wang, and Z. Huang, “Study on the application of reinforcement learning in the operation optimization of HVAC system,” in Building Simulation, vol. 14. Springer, 2021, pp. 75–87.
[49] M. Biemann, F. Scheller, X. Liu, and L. Huang, “Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control,” Applied Energy, vol. 298, p. 117164, 2021.
[50] D. Zhou, R. Jia, and H. Yao, “Robotic arm motion planning based on curriculum reinforcement learning,” in 2021 6th International Conference on Control and Robotics Engineering (ICCRE). IEEE, 2021, pp. 44–49.
[51] T. Yu and Q. Chang, “Reinforcement learning based user-guided motion planning for human-robot collaboration,” arXiv preprint arXiv:2207.00492, 2022.
[52] Y. Cao, S. Wang, X. Zheng, W. Ma, X. Xie, and L. Liu, “Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot,” Aerospace Science and Technology, vol. 136, p. 108098, 2023.
[53] M. Schuck, J. Brüdigam, A. Capone, S. Sosnowski, and S. Hirche, “Dext-gen: Dexterous grasping in sparse reward environments with full orientation control,” arXiv preprint arXiv:2206.13966, 2022.
[54] S. Joshi, S. Kumra, and F. Sahin, “Robotic grasping using deep reinforcement learning,” in 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE). IEEE, 2020, pp. 1461–1466.
[55] D. Wang, H. Deng, and Z. Pan, “Mrcdrl: Multi-robot coordination with deep reinforcement learning,” Neurocomputing, vol. 406, pp. 68–76, 2020.
[56] X. Lan, Y. Qiao, and B. Lee, “Towards pick and place multi robot coordination using multi-agent deep reinforcement learning,” in 2021 7th International Conference on Automation, Robotics and Applications (ICARA). IEEE, 2021, pp. 85–89.