Learning Physically Simulated Tennis Skills from Broadcast Videos

Cited by: 18
Authors
Zhang, Haotian [1 ]
Yuan, Ye [2 ]
Makoviychuk, Viktor [2 ]
Guo, Yunrong [3 ]
Fidler, Sanja [3 ,4 ,5 ]
Peng, Xue Bin [3 ,6 ]
Fatahalian, Kayvon [1 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] NVIDIA, Santa Clara, CA USA
[3] NVIDIA, Toronto, ON, Canada
[4] Univ Toronto, Toronto, ON, Canada
[5] Vector Inst, Toronto, ON, Canada
[6] Simon Fraser Univ, Burnaby, BC, Canada
Source
ACM TRANSACTIONS ON GRAPHICS | 2023 / Vol. 42 / No. 4
Keywords
physics-based character animation; imitation learning; reinforcement learning;
DOI
10.1145/3592408
Chinese Library Classification
TP31 [Computer software]
Subject Classification Codes
081202; 0835
Abstract
We present a system that learns diverse, physically simulated tennis skills from large-scale demonstrations of tennis play harvested from broadcast videos. Our approach is built upon hierarchical models, combining a low-level imitation policy and a high-level motion planning policy to steer the character in a motion embedding learned from broadcast videos. When deployed at scale on large video collections that encompass a vast set of examples of real-world tennis play, our approach can learn complex tennis shotmaking skills and realistically chain together multiple shots into extended rallies, using only simple rewards and without explicit annotations of stroke types. To address the low quality of motions extracted from broadcast videos, we correct estimated motion with physics-based imitation, and use a hybrid control policy that overrides erroneous aspects of the learned motion embedding with corrections predicted by the high-level policy. We demonstrate that our system produces controllers for physically simulated tennis players that can hit the incoming ball to target positions accurately using a diverse array of strokes (serves, forehands, and backhands), spins (topspins and slices), and playing styles (one/two-handed backhands, left/right-handed play). Overall, our system can synthesize two physically simulated characters playing extended tennis rallies with simulated racket and ball dynamics. Code and data for this work are available at https://research.nvidia.com/labs/toronto-ai/vid2player3d/.
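The hierarchical design described in the abstract can be sketched as a two-level control loop: a high-level policy steers the character through a learned motion embedding and additionally predicts corrective offsets, while a low-level imitation policy tracks the decoded (and corrected) target pose with joint actions. The sketch below is an illustrative assumption, not the authors' released code; all class names, dimensions, and the linear "decoder" stand-in are hypothetical.

```python
# Hypothetical sketch of the paper's hierarchical control scheme.
# Dimensions and components are placeholders, not the real system.
import numpy as np

LATENT_DIM, POSE_DIM, ACTION_DIM = 64, 69, 28
rng = np.random.default_rng(0)

class MotionEmbedding:
    """Decodes a latent code into a target pose (stand-in for the learned motion embedding)."""
    def __init__(self):
        self.W = rng.standard_normal((POSE_DIM, LATENT_DIM)) * 0.01

    def decode(self, z):
        return self.W @ z

class HighLevelPolicy:
    """Maps task observations (ball state, character state) to a latent code
    steering the embedding, plus a pose correction that can override
    erroneous aspects of the decoded motion."""
    def act(self, task_obs):
        z = np.tanh(task_obs[:LATENT_DIM])
        correction = 0.05 * task_obs[LATENT_DIM:LATENT_DIM + POSE_DIM]
        return z, correction

class LowLevelPolicy:
    """Imitation policy: produces joint actions that track the target pose."""
    def act(self, char_state, target_pose):
        error = target_pose[:ACTION_DIM] - char_state[:ACTION_DIM]
        return np.clip(error, -1.0, 1.0)  # crude PD-like tracking action

def control_step(embedding, high, low, task_obs, char_state):
    z, correction = high.act(task_obs)
    target_pose = embedding.decode(z) + correction  # hybrid control: embedding + correction
    return low.act(char_state, target_pose)

action = control_step(MotionEmbedding(), HighLevelPolicy(), LowLevelPolicy(),
                      task_obs=rng.standard_normal(LATENT_DIM + POSE_DIM),
                      char_state=rng.standard_normal(ACTION_DIM))
```

In the actual system the embedding is learned from broadcast-video motion data and both policies are trained with reinforcement learning; the sketch only illustrates how the two levels and the correction term compose at inference time.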
Pages: 14