Learning forward-models for robot manipulation

Jacob Mathew, Michael (2020). Learning forward-models for robot manipulation. University of Birmingham. Ph.D.

JacobMathew2020PhD.pdf
Text
Available under License All rights reserved.

Abstract

Robots with the capability to dexterously manipulate objects have the potential to revolutionise automation. Human beings already have the ability to perform complex object manipulations, and they perfect these skills through repeated trials. An extensive human motor control literature provides evidence that the repetition of a task creates forward-models of that task in the brain. These forward-models are used to predict future states of the task, anticipate necessary control actions and adapt impedance quickly to match task requirements. Evidence from motor control, together with some promising results in robot manipulation research, clearly shows the need for forward-models for manipulation.

This study started from the premise that a robot needs forward-models to perform dexterous manipulation. Initially, planning a sequence of actions using a forward-model was identified as the most crucial problem in manipulation. Push manipulation planning using forward-models was the first step in this direction. However, unlike most methods in the robotic push manipulation literature, the approach incorporated the uncertainty of the forward-model when formulating the push plan. Incorporating uncertainty helps the robot to perform risk-aware actions and stay close to the known areas of the state space while manipulating the object. The forward-models of object dynamics were learned offline, and robot pushes were fixed-duration position-controlled actions. The experiments in simulation and on real robots were successful and yielded several further insights for better manipulation. Two of these insights were the need to learn the forward-model online and the importance of having a state-dependent stiffness controller.

The first part of the thesis presents a planner that makes use of an uncertain, learned, forward (dynamical) model to plan push manipulation. The forward-model of the system is learned by poking the object in random directions. The learned model is then utilised by a model predictive path integral controller to push the box to the required goal pose. By using path-integral control, the proposed planner can find efficient paths by sampling. The planner is agnostic to the forward-model used and produces successful results using a physics simulator, an Ensemble of Mixture Density Networks (Ensemble-MDN) or a Gaussian Process (GP). Both an Ensemble-MDN and a GP can encode uncertainty not only in the push outcome but also in the model itself. The work compares planning using each of these learned models to planning with a physics simulator. Two versions of the planner are implemented. The first version makes uncertainty-averse push actions by minimising uncertainty and goal costs together. Using multiple costs makes it difficult for the optimiser to find optimal push actions. Hence, the second version solves the problem in two stages. The first stage creates an uncertainty-averse path, and the second stage finds push actions to follow the path found.
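As a rough illustration of the sampling-based planning described above, the following is a minimal sketch of a model predictive path integral (MPPI) update with a combined goal and uncertainty cost, loosely in the spirit of the first planner version. It is not the thesis implementation: the 2D point state, the function names (`toy_forward_model`, `mppi_step`, `push_to_goal`), the cost weights and the toy uncertainty measure are all assumptions made for the example. In practice, the model argument would be replaced by a learned model such as an Ensemble-MDN or a GP that returns a prediction together with its uncertainty.

```python
import numpy as np


def toy_forward_model(state, action):
    """Hypothetical stand-in for a learned forward-model.

    Returns the predicted next state and a scalar uncertainty. Here the
    uncertainty simply grows with distance from the origin, mimicking a
    model trained only on data collected near the origin.
    """
    next_state = state + action
    uncertainty = 0.1 * float(np.sum(next_state ** 2))
    return next_state, uncertainty


def mppi_step(state, goal, nominal, model, n_samples=256, noise_std=0.2,
              temperature=0.5, uncertainty_weight=0.5, rng=None):
    """One MPPI update of the nominal action sequence (horizon x dim)."""
    rng = np.random.default_rng() if rng is None else rng
    horizon, dim = nominal.shape
    # Sample perturbations around the nominal action sequence.
    noise = rng.normal(0.0, noise_std, size=(n_samples, horizon, dim))
    costs = np.zeros(n_samples)
    for k in range(n_samples):
        s = state.copy()
        for t in range(horizon):
            s, unc = model(s, nominal[t] + noise[k, t])
            costs[k] += uncertainty_weight * unc  # uncertainty-averse term
        costs[k] += float(np.sum((s - goal) ** 2))  # terminal goal cost
    # Path-integral weighting: low-cost rollouts get exponentially more weight.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    return nominal + np.einsum("k,ktd->td", weights, noise)


def push_to_goal(start, goal, model, horizon=5, n_steps=30, seed=0):
    """Receding-horizon loop: replan, apply the first action, shift the plan."""
    rng = np.random.default_rng(seed)
    state = np.asarray(start, dtype=float)
    nominal = np.zeros((horizon, state.size))
    for _ in range(n_steps):
        nominal = mppi_step(state, goal, nominal, model, rng=rng)
        state, _ = model(state, nominal[0])
        nominal = np.vstack([nominal[1:], np.zeros((1, state.size))])
    return state
```

Calling `push_to_goal(np.zeros(2), np.array([1.0, 0.5]), toy_forward_model)` moves the toy state towards the goal. Because the uncertainty term penalises states far from the origin, the final state is expected to settle slightly short of the goal, which illustrates the trade-off between goal and uncertainty costs that motivates the two-stage version.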

The second part of the thesis describes a framework which can learn forward-models online and can perform state-dependent stiffness adaptation using these forward-models. The idea of the framework is again motivated by the human motor control literature. During the initial trials of a novel manipulation task, humans tend to keep their arms stiff to reduce the effects of any unforeseen disturbances on their ability to perform the task accurately. After a few repetitions, humans adapt the stiffness of their arms without any significant reduction in task performance. Research in human motor control strongly indicates that humans learn and continuously revise internal models of manipulation tasks to support such adaptive behaviour.

Drawing inspiration from these findings, the proposed framework supports online learning of a time-independent forward-model of a manipulation task from a small number of examples. The proposed framework consists of two parts. The first part creates forward-models of a task through online learning. The second part uses the measured inaccuracies in the predictions of this model to dynamically update the forward-model and modify the impedance parameters of a feedback controller during task execution. Furthermore, the framework includes a hybrid force-motion controller that enables the robot to be compliant in particular directions (if required) while adapting the impedance in other directions. These capabilities are illustrated and evaluated on continuous contact tasks such as polishing a board, pulling a non-linear spring and stirring porridge.
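The interplay between online model learning and stiffness adaptation can be sketched as follows. This is an illustrative toy under stated assumptions, not the framework from the thesis: the linear model with a normalised least-mean-squares update, the linear mapping from prediction error to stiffness, and all names and constants (`OnlineLinearModel`, `adapt_stiffness`, `run_demo`, `k_min`, `k_max`, `error_scale`) are hypothetical choices made for the example.

```python
import numpy as np


class OnlineLinearModel:
    """Hypothetical forward-model s_next ~ W @ [s, a], learned online."""

    def __init__(self, state_dim, action_dim, lr=0.5):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, s, a):
        return self.W @ np.concatenate([s, a])

    def update(self, s, a, s_next):
        """Normalised LMS update; returns the prediction error before it."""
        x = np.concatenate([s, a])
        err = s_next - self.W @ x
        self.W += self.lr * np.outer(err, x) / (1e-8 + x @ x)
        return err


def adapt_stiffness(pred_error, k_min=50.0, k_max=800.0, error_scale=0.02):
    """Map prediction error to stiffness: stiff when the model is wrong,
    compliant when its predictions are accurate (assumed linear mapping)."""
    ratio = np.clip(np.linalg.norm(pred_error) / error_scale, 0.0, 1.0)
    return float(k_min + ratio * (k_max - k_min))


def run_demo(n_steps=200, seed=0):
    """Learn a toy linear system online and log the commanded stiffness."""
    rng = np.random.default_rng(seed)
    A = np.array([[0.9, 0.1], [0.0, 0.8]])  # toy "true" task dynamics
    B = np.array([[0.0], [0.5]])
    model = OnlineLinearModel(state_dim=2, action_dim=1)
    stiffness_log = []
    for _ in range(n_steps):
        s = rng.normal(size=2)
        a = rng.normal(size=1)
        s_next = A @ s + B @ a
        err = model.update(s, a, s_next)
        stiffness_log.append(adapt_stiffness(err))
    return stiffness_log
```

In this toy run, the stiffness starts high while the model is still untrained and drops towards `k_min` as the prediction error shrinks, mirroring the stiff-then-compliant behaviour described for human learners.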

Type of Work:

Thesis (Doctorates > Ph.D.)

Award Type:

Doctorates > Ph.D.

Supervisor(s):