Pages 887-899
This article preprocesses environmental information and uses it as input to the Proximal Policy Optimization (PPO) algorithm. The algorithm is trained directly on a model vehicle in a real environment, enabling it to control the distance between the vehicle and surrounding objects. Training converges after approximately 200 episodes, demonstrating that PPO can tolerate, to some extent, the uncertainty, noise, and interference of a real training environment. Furthermore, tests of the trained model in different scenarios show that even when the preprocessed input does not provide a comprehensive view of the environment, PPO still achieves its control objectives and accomplishes challenging tasks.
Keywords: Reinforcement Learning; Proximal Policy Optimization; Autonomous Driving