Explaining Robot Actions
Meghann Lomas, Robert Chevalier, E. Vincent Cross II, R. Christopher Garrett, John Hoare, and Michael Kopack
Lockheed Martin Advanced Technology Laboratories
3 Executive Campus, Suite 600, Cherry Hill, NJ 08002
1 856.792.9681
{mlomas, rchevali, ecross, rgarrett, jhoare, mkopack}@atl.lmco.com

ABSTRACT
To increase human trust in robots, we have developed a system that provides insight into robotic behaviors by enabling a robot to answer questions people pose about its actions (e.g., Q: "Why did you turn left there?" A: "I detected a person at the end of the hallway."). Our focus is on generating this explanation in human-understandable terms despite the mathematical, robot-specific representation and planning system used by the robot to make its decisions and execute its actions. We present our work to date on this topic, including system design and experiments, and discuss areas for future work.

Categories and Subject Descriptors
H.1.2 [User/Machine Systems]

General Terms
Design, Human Factors

Keywords
Robotic Actions, Natural Communications, Trust, Human-Robot Partnering, Explanations

1. INTRODUCTION
As robots become more frequently used in human-inhabited environments, it is increasingly important for people to maintain an appropriate level of trust in robots. Research has shown that no matter how capable an autonomous system is, if human operators do not trust the system, they will not use it [1]. Two key factors that contribute to human trust are predictability and a mechanism for social exchange, both of which are frequently reduced or lacking in robotic systems [2]. Without a mechanism for communicating information and intent, a robot may appear erratic and untrustworthy when, in fact, it is following a clear decision-making process. As in human partnerships, the ability to explain the information and logic behind decisions greatly increases trust by establishing an understanding of why a decision was made, and subsequently provides insight into future decisions.

The focus of our work is on enabling a robot to explain its actions in human-understandable concepts and terms, including what action it took, what information it had about the world at the time, and the logic behind its decision-making process. Because humans and robots use very different concepts when thinking about the world, robotic actions, information, and processes are rarely directly understandable by humans. Robots typically represent the world in coordinate-based terms (e.g., a grid representation of occupied areas, or a representation of detected color blobs in image coordinates); these representations do not align with human views of the world, which tend to be semantic representations of objects in space (e.g., the chair is near the desk). Robotic planning is done using mathematical formulas whose process and output are not readily expressed in semantic terminology. Additionally, the actions produced by the planner are influenced by many factors and based on continuous mathematical models, and so are not easily discretized.

To address these challenges, we have developed the Explaining Robot Actions (ERA) system, which includes a robotic world model capable of representing semantic and physically-based information, models of planning systems, and a query mechanism for obtaining the desired information from the robotic planning system to produce real-time, human-understandable answers to questions about a robot's behavior. For the purposes of this work, we focus on generating a semantic answer from robot-specific concepts and assume the use of a natural language system for parsing human questions.

2. RELATED WORK
To improve collaboration and co-existence between humans and robots, a considerable amount of work has gone into making robots expressive (e.g., in [3], gestures related to the robot's goals make the robot appear more intelligent and approachable). This supports increased engagement with people, but does not necessarily help clarify why or how a robot selects actions. The idea of explaining behaviors is less developed, but has been explored in artificial intelligence (AI) research and in robotics. The area of explainable AI has addressed the challenge of explaining the state of agents in a simulation, but focuses on after-action analysis to determine behavior models (e.g., [4]). For robotics, and most directly related to our work, robotic actions have been explained through visual timelines made up of action trees that describe the robot's recent actions [5]. Our approach differs from this work in that we enable verbal communication by enriching the robotic world representation and developing a mechanism for querying the representation and planning system for task-relevant information.

3. ERA SYSTEM
The ERA system (Figure 1) is designed to take in an already-parsed human query about the robot's behavior and output an answer in sentence form. The ERA system first determines what information is needed to answer the question. Based on the "subtext" of the question (i.e., the fundamental question being posed), ERA selects an answer template, which forms the framework for the robot's response. These templates are designed to be modular and combinable (e.g., "I am in X mode because Y," where Y is another template: "I detected object A at position Z."). To populate the template with the information specific to the question, ERA queries the world model and planning system, which have been integrated with and established as resources for the ERA component.

Figure 1: The ERA system determines what information is needed to answer the question and queries the two-layer world model and planner.
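To make the template mechanism concrete, the following is a minimal sketch in Python of how modular, combinable answer templates might be selected and populated. The names here (AnswerTemplate, the 'mode' and 'reason' slots, and the query keys) are illustrative assumptions, not the actual ERA implementation.

class AnswerTemplate:
    """A sentence frame whose slots are filled by querying resources.

    Slots may be plain query keys or nested AnswerTemplates, which is
    what makes templates modular and combinable ("I am in X mode
    because Y", where Y is itself a template).
    """
    def __init__(self, frame, slots):
        self.frame = frame      # e.g., "I am in '{mode}' mode because {reason}."
        self.slots = slots      # slot name -> query key or nested template

    def render(self, query):
        """Fill each slot, recursing into nested templates."""
        values = {}
        for name, source in self.slots.items():
            if isinstance(source, AnswerTemplate):
                values[name] = source.render(query)
            else:
                values[name] = query(source)  # ask a registered resource
        return self.frame.format(**values)

# A nested template: the "reason" slot is itself a template.
detection = AnswerTemplate(
    "I detected {obj} at {pos}",
    {"obj": "last_detection.label", "pos": "last_detection.position"})
mode_answer = AnswerTemplate(
    "I am in '{mode}' mode because {reason}.",
    {"mode": "planner.mode", "reason": detection})

# 'query' would look keys up in the registered world model and planner
# resources; a dictionary stands in for that registry here.
fake_resources = {
    "planner.mode": "searching for people",
    "last_detection.label": "a person",
    "last_detection.position": "[5.34, 20.11]",
}
print(mode_answer.render(fake_resources.get))
# -> I am in 'searching for people' mode because I detected a person at [5.34, 20.11].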
Because of their ubiquity and versatility, we assume the robot uses a cost map-based planner for low-level actions and a finite state machine high-level planner that selects the robot's mode based on world information (integration with other planners is left for future work). To represent the world, we use our two-layer world model, which incorporates a grid-based physical layer for planning and a semantic layer for a richer object representation and communication with people (see [6] for more details). This representation provides a translation that enables a semantic response understandable by people. Each resource registers the type of information it can provide (e.g., robot mode, object attributes), which allows ERA to select the information needed to complete the chosen template. Once completed, the sentence is output as an answer to the question.
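As an illustration of the resource registration just described, the sketch below shows one way such a registry might look. The names (ResourceRegistry, register, query) are assumptions made for illustration; the paper does not publish the actual ERA interfaces.

class ResourceRegistry:
    """Maps information types to the resources that can provide them."""
    def __init__(self):
        self._providers = {}

    def register(self, info_type, provider):
        """A resource declares the type of information it can supply,
        e.g., 'planner.mode' or 'object.attributes'."""
        self._providers[info_type] = provider

    def query(self, info_type):
        """Called when filling a template slot; returns None if no
        resource has registered this information type."""
        provider = self._providers.get(info_type)
        return provider() if provider else None

# The planner and world model register what they can report.
registry = ResourceRegistry()
registry.register("planner.mode", lambda: "searching for people")
registry.register("planner.goal", lambda: "room 20 at [5.34, 20.11]")

print(registry.query("planner.mode"))   # -> searching for people
print(registry.query("planner.goal"))   # -> room 20 at [5.34, 20.11]

Under these assumptions, a template's query function would simply be registry.query, connecting the two sketches above.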
4. PROOF-OF-CONCEPT EXPERIMENTS
To test and demonstrate the ERA system, we established a simulated search and rescue environment in Stage and tasked a robot with searching the environment and escorting people to the nearest exit. "Operators" asked the robot questions about its behavior through a GUI. Two proof-of-concept experiments were performed.

The first focused on retrieving information from the world model. For example, when queried "Tell me how you work," the robot described the type of planner it used, the grid types that combined to form the cost map, and the weights on each grid. These fields were populated with data registered by the world model and planning system. Similarly, when asked "Tell me what you've sensed," the robot responded with the information in the semantic layer of the world model.

In the second experiment, questions were asked that required the robot to intelligently query specific elements of the world model and planning system. Possible questions ranged from specific (e.g., "Why did you turn left going out of that room?") to general (e.g., "What are you doing?"). The robot used the ERA system and world model to respond semantically, e.g., "I am in 'searching for people' mode, heading to my goal, which is room 20 at [5.34, 20.11]." (N.B., in addition to supporting human teammates, this capability proved useful for developers producing and debugging the system and simulator.)

5. CONCLUSION AND FUTURE WORK
We have presented an approach for enabling a robot to explain its behavior in human-understandable concepts. Thus far, we have designed and developed a prototype system and algorithms that intelligently query semantic information from our layered world model and construct answers to a set of questions. While proof-of-concept experiments have shown the merits of this approach, future work includes:

1. Supporting more specific answers. Thus far, the responses from the ERA system have focused primarily on the robot's mode and world information. We have developed but not yet implemented algorithms that examine changes in the grids used to produce the cost map. These algorithms look for the cells that most influence the robot's route and determine which grid(s) contributed the values of those cells (see the sketch following this list). By understanding how those grids are constructed, we can determine what world information is most strongly affecting the robot's behavior.

2. Examining robot routes, not just modes or grids. We believe that examining multiple possible routes for the robot might provide a basis for explaining a route choice (e.g., "I took the left road because the right road is longer."). To do this, we plan to generate alternate paths and compare them.

3. Increasing the flexibility of the queries and responses. Humans tend to think of actions discretely (e.g., "left turn"), but the continuous nature of robotic motion means that these discrete actions are not easily associated with timestamps in the robot's operation. By integrating with a natural language understanding system, we plan to use context from the questions to pair a semantically described action with robotic motion.
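As a rough illustration of the grid-analysis idea in item 1, the sketch below attributes each route cell's cost to the weighted component grids of the cost map and reports the grid that dominates most often. The data layout (one 2-D array per named grid, combined with fixed weights) is an assumption for illustration, not the ERA implementation.

import numpy as np

# Assumed cost-map structure: cost = sum of weight[g] * grid[g] over
# the named component grids (e.g., obstacles, detected people).

def dominant_grid_along_route(grids, weights, route):
    """For each (row, col) cell on the route, find the component grid
    whose weighted value contributes most to that cell's cost, and
    return the grid that dominates most often along the route."""
    counts = {}
    for r, c in route:
        contributions = {name: weights[name] * g[r, c]
                         for name, g in grids.items()}
        top = max(contributions, key=contributions.get)
        counts[top] = counts.get(top, 0) + 1
    return max(counts, key=counts.get)

grids = {
    "obstacles": np.array([[0.0, 0.9], [0.1, 0.8]]),
    "people":    np.array([[0.7, 0.1], [0.6, 0.0]]),
}
weights = {"obstacles": 1.0, "people": 2.0}
route = [(0, 0), (1, 0)]
print(dominant_grid_along_route(grids, weights, route))  # -> people

Knowing which component grid dominates, the robot could then map that grid back to semantic terms, e.g., "I avoided that hallway because I detected people there."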
6. REFERENCES
[1] Duez, P.P., Zuliani, M.J., and Jamieson, G.A., "Trust by design: Information requirements for appropriate trust in automation." In Proceedings of the 2006 Conference of the Center for Advanced Studies on Collaborative Research, 2006.
[2] Lee, J.D. and See, K.A., "Trust in automation: Designing for appropriate reliance." Human Factors, Vol. 46, No. 1, Spring 2004, pp. 50-80.
[3] Takayama, L., Dooley, D., and Ju, W., "Expressing thought: Improving robot readability with animation principles." In Proceedings of the 6th International Conference on Human-Robot Interaction, pp. 69-76, Lausanne, Switzerland, 2011.
[4] Core, M., Lane, H., Van Lent, M., Gomboc, D., Solomon, S., and Rosenberg, M., "Building explainable artificial intelligence systems." In Proceedings of the National Conference on Artificial Intelligence, Vol. 21, 2006.
[5] Brooks, D., Shultz, A., Desai, M., Kovac, P., and Yanco, H.A., "Towards state summarization for autonomous robots." In Dialog with Robots: Papers from the AAAI Fall Symposium (FS-10-05), 2010.
[6] Lomas, M., Cross, E., Darvill, J., Garrett, R., Kopack, M., and Whitebread, K., "A robotic world model framework designed to facilitate human-robot communication." In Proceedings of the SIGdial 2011 Conference, Portland, OR, 2011.