Application of reinforcement learning in robotic disassembly operations

Qu, Mo (2023). Application of reinforcement learning in robotic disassembly operations. University of Birmingham. Ph.D.

Preview

Qu2023PhD.pdf
Text - Accepted Version
Available under License Creative Commons Attribution.
Download (6MB) | Preview

Abstract

Disassembly is a key step in remanufacturing. To increase the level of automation in disassembly, it is necessary to use robots that can learn to perform new tasks by themselves rather than having to be manually reprogrammed every time there is a different job. Reinforcement Learning (RL) is a machine learning technique that enables the robots to learn by trial and error rather than being explicitly programmed.

In this thesis, the application of RL to robotic disassembly operations has been studied. Firstly, a literature review on robotic disassembly and the application of RL in contact-rich tasks has been conducted in Chapter 2.

To physically implement RL in robotic disassembly, the task of removing a bolt from a door chain lock has been selected as a case study, and a robotic training platform has been built for this implementation in Chapter 3. This task is chosen because it can demonstrate the capabilities of RL to pathfinding and dealing with reaction forces without explicitly specifying the target coordinates or building a force feedback controller. The robustness of the learned policies against the imprecision of the robot is studied by a proposed method to actively lower the precision of the robots. It has been found that the robot can learn successfully even when the precision is lowered to as low as ±0.5mm. This work also investigates whether learned policies can be transferred among robots with different precisions. Experiments have been performed by training a robot with a certain precision on a task and replaying the learned skills on a robot with different precision. It has been found that skills learned by a low-precision robot can perform better on a robot with higher precision, and skills learned by a high-precision robot have worse performance on robots with lower precision, as it is suspected that the policies trained on high-precision robots have been overfitted to the precise robots.

In Chapter 4, the approach of using a digital-twin-assisted simulation-to-reality transfer to accelerate the learning performance of the RL has been investigated. To address the issue of identifying the system parameters, such as the stiffness and damping of the contact models, that are difficult to measure directly but are critical for building the digital twins of the environments, system identification method is used to minimise the discrepancy between the response generated from the physical and digital environments by using the Bees Algorithm. It is found that the proposed method effectively increases RL's learning performance. It is also found that it is possible to have worse performance with the sim-to-real transfer if the reality gap is not effectively addressed. However, increasing the size of the dataset and optimisation cycles have been demonstrated to reduce the reality gap and lead to successful sim-to-real transfers.

Based on the training task described in Chapters 4 and 5, a full factorial study has been conducted to identify patterns when selecting the appropriate hyper-parameters when applying the Deep Deterministic Policy Gradient (DDPG) algorithm to the robotic disassembly task. Four hyper-parameters that directly influence the decision-making Artificial Neural Network (ANN) update have been chosen for the study, with three levels assigned to each hyper-parameter. After running 241 simulations, it is found that for this particular task, the learning rates of the actor and critic networks are the most influential hyper-parameters, while the batch size and soft update rate have relatively limited influence.

Finally, the thesis is concluded in Chapter 6 with a summary of findings and suggested future research directions.

Type of Work:

Thesis (Doctorates > Ph.D.)

Award Type:

Doctorates > Ph.D.

Supervisor(s):