The Department of Intelligence Science and Technology, Graduate School of Informatics, holds a monthly academic exchange event, the "Intelligence Science and Technology Colloquium," featuring invited experts from inside and outside the university.
★An attendance sheet will be circulated on the day; participants are kindly asked to fill it in.

Speaker: Shixiang (Shane) Gu
Host: Assoc. Prof. Makoto Yamada
Date & Time: Tuesday, November 5, 12:30-13:30

Venue: Research Building No. 7, Lecture Room 1 (Room 107)

Title: Model-Based Reinforcement Learning with Predictability Maximization

Abstract (the talk will be given in a mix of Japanese and English): Intelligence is often associated with the ability to optimize the environment to maximize one's objectives (e.g. survival). In particular, the ability to predict the future conditioned on one's own actions enables intelligent agents to efficiently evaluate possible futures and choose the best one to realize. Such model-based reinforcement learning (RL) algorithms have recently shown promising results in sample-efficient learning in robotics and gaming RL environments. However, standard model-based approaches naively try to predict everything about the world, including noise that is neither predictable nor controllable. In this talk, I will share my recent work (temporal difference models (TDM) and dynamics-aware discovery of skills (DADS)) and discuss how goal-conditioned Q-learning and empowerment (the ability to predictively change the world) relate to model-based RL and can learn abstracted Markov Decision Processes (MDPs) in which predictability is inherently maximized. I'll show that such approaches enable successful model-based planning in difficult environments where classic model-based planners fail, significantly outperforming model-free approaches in terms of sample efficiency. I'll end with a discussion of how reachability and empowerment/mutual information connect to each other, and of potential directions for future research.
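As a concrete illustration of the goal-conditioned Q-learning idea mentioned in the abstract, below is a minimal tabular sketch on a toy chain environment. The environment, hyperparameters, and sparse goal-reaching reward are illustrative assumptions for this note, not the setup used in TDM or DADS.

# Minimal sketch (Python/NumPy): tabular goal-conditioned Q-learning on a
# 10-state chain. A toy illustration of learning Q(s, g, a), not the
# speaker's TDM/DADS implementation; all settings here are assumptions.
import numpy as np

n_states, n_actions = 10, 2          # chain states 0..9; actions: 0 = left, 1 = right
gamma, alpha, eps = 0.95, 0.1, 0.2   # discount, learning rate, exploration rate
rng = np.random.default_rng(0)

# The Q-table is conditioned on the goal as well as the state: Q[s, g, a].
Q = np.zeros((n_states, n_states, n_actions))

def step(s, a):
    # Deterministic chain dynamics, clipped at both ends.
    return max(0, min(n_states - 1, s + (1 if a == 1 else -1)))

for episode in range(2000):
    s = int(rng.integers(n_states))
    g = int(rng.integers(n_states))  # sample a goal per episode
    for _ in range(30):
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s, g]))
        s2 = step(s, a)
        r = 1.0 if s2 == g else 0.0  # sparse reward: 1 only on reaching the goal
        target = r if s2 == g else r + gamma * np.max(Q[s2, g])
        Q[s, g, a] += alpha * (target - Q[s, g, a])
        s = s2
        if s == g:
            break

# Greedy behavior w.r.t. Q moves toward any queried goal, so the table
# implicitly encodes reachability between states.
print(np.argmax(Q[0, 7]))  # expected: 1 (move right when at state 0, goal 7)

In the continuous-control settings the talk addresses, the table would be replaced by a function approximator over states and goals; the predictability-maximization perspective concerns which abstraction of the world such a goal-conditioned model should capture.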

Bio: Shixiang (Shane) Gu is a Research Scientist at Google Brain, where he mainly works on research problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. His recent research focuses on scalable RL methods that can solve difficult continuous-control problems in the real world, and it has been covered by the Google Research Blog and MIT Technology Review. He completed his PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where he was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During his PhD, he also interned and collaborated closely with Sergey Levine and Ilya Sutskever at UC Berkeley and Google Brain, and with Timothy Lillicrap at DeepMind. He holds a BASc in Engineering Science from the University of Toronto, where he wrote his thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms. He is a Japan-born Chinese Canadian. Having lived in Japan, China, Canada, the US, the UK, and Germany, he goes by multiple names: Shane Gu, Shixiang Gu, 顾世翔, and 顧世翔 (ぐう せいしょう).

https://ai.google/research/people/104824/