Foundations and Trends® in Robotics > Vol 1 > Issue 2

Tactile Guidance for Policy Adaptation

By Brenna D. Argall, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, brennadee.argall@epfl.ch | Eric L. Sauser, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, eric.sauser@epfl.ch | Aude G. Billard, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, aude.billard@epfl.ch

 
Suggested Citation
Brenna D. Argall, Eric L. Sauser and Aude G. Billard (2011), "Tactile Guidance for Policy Adaptation", Foundations and Trends® in Robotics: Vol. 1: No. 2, pp 79-133. http://dx.doi.org/10.1561/2300000012

Publication Date: 21 Apr 2011
© 2011 B. D. Argall, E. L. Sauser and A. G. Billard
 
Subjects
Estimation Methods, Human-Robot Interaction
 

In this article:
1 Introduction 
2 The Tactile Policy Correction Algorithm 
3 Empirical Validation 
4 Discussion and Conclusions 
References 

Abstract

Demonstration learning is a powerful and practical technique for developing robot behaviors. Even so, development remains a challenge, and possible demonstration limitations, for example correspondence issues between the robot and demonstrator, can degrade policy performance. This work presents an approach for policy improvement through a tactile interface located on the body of the robot. We introduce the Tactile Policy Correction (TPC) algorithm, which employs tactile feedback for the refinement of a demonstrated policy, as well as its reuse for the development of other policies. The TPC algorithm is validated on a humanoid robot performing grasp positioning tasks. The performance of the demonstrated policy is found to improve with tactile corrections. Tactile guidance is also shown to enable the development of policies able to successfully execute novel, undemonstrated tasks. We further show that different modalities, namely teleoperation and tactile control, provide information about allowable variability in the target behavior in different areas of the state space.

DOI:10.1561/2300000012
ISBN: 978-1-60198-436-4 (paperback), 60 pp., $55.00
ISBN: 978-1-60198-437-1 (e-book, PDF), 60 pp., $100.00

The development of behaviors for robot motion control is fundamental to robot operation in physical environments, yet it is challenged by many factors, such as sensor noise and approximate actuation models. Techniques like demonstration learning, which seed a training dataset with examples of behavior execution by a task expert, are both powerful and practical for the development of motion control behaviors. Further endowing a robot with the ability to continue learning from experience after demonstration can improve robustness to poor demonstrators or demonstration interfaces, and can also enable behavior adaptation to changes in the environment or task requirements. Tactile Guidance for Policy Adaptation introduces an approach for continuing motion control learning after demonstration that capitalizes on the availability of multiple sensor modalities through which a human teacher may transfer domain knowledge. Of particular note is that motion control corrections are provided through tactile sensors located on the body of the robot. The approach is validated on a high degree-of-freedom robot system, for which both demonstration and correction are challenging. Tactile Guidance for Policy Adaptation should be of interest to those considering the use of demonstration and machine learning for the development of robot behaviors, in particular for high degree-of-freedom humanoids, as well as to those interested in the transfer of task knowledge through multiple sensor modalities.

 