As autonomous robots move beyond research labs and production lines, they must work in more flexible and less well-defined environments. To avoid the need for exhaustive instruction and a stipulated preference ordering, a robot must choose between alternative actions, guided by goals. We describe a robot that learns these goals from humans by treating the timeliness and context of instructions and rewards as evidence of the contours and gradients of an unknown human utility function. This utility function in turn underlies a rational preference relation based on choice theory. We examine how the timing of requests, and the contexts in which they arise, can lead to actions that pre-empt those requests, using two methods we term contemporaneous entropy learning and context-sensitive learning. We report experiments on both methods that demonstrate their usefulness in guiding a robot's actions.