Learning Physically Simulated Tennis Skills from Broadcast Videos

Cited by: 18
Authors
Zhang, Haotian [1]
Yuan, Ye [2]
Makoviychuk, Viktor [2]
Guo, Yunrong [3]
Fidler, Sanja [3, 4, 5]
Peng, Xue Bin [3, 6]
Fatahalian, Kayvon [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] NVIDIA, Santa Clara, CA USA
[3] NVIDIA, Toronto, ON, Canada
[4] Univ Toronto, Toronto, ON, Canada
[5] Vector Inst, Toronto, ON, Canada
[6] Simon Fraser Univ, Burnaby, BC, Canada
Source
ACM Transactions on Graphics | 2023 / Vol. 42 / Issue 04
Keywords
physics-based character animation; imitation learning; reinforcement learning
DOI
10.1145/3592408
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We present a system that learns diverse, physically simulated tennis skills from large-scale demonstrations of tennis play harvested from broadcast videos. Our approach is built upon hierarchical models, combining a low-level imitation policy and a high-level motion planning policy to steer the character in a motion embedding learned from broadcast videos. When deployed at scale on large video collections that encompass a vast set of examples of real-world tennis play, our approach can learn complex tennis shotmaking skills and realistically chain together multiple shots into extended rallies, using only simple rewards and without explicit annotations of stroke types. To address the low quality of motions extracted from broadcast videos, we correct estimated motion with physics-based imitation, and use a hybrid control policy that overrides erroneous aspects of the learned motion embedding with corrections predicted by the high-level policy. We demonstrate that our system produces controllers for physically-simulated tennis players that can hit the incoming ball to target positions accurately using a diverse array of strokes (serves, forehands, and backhands), spins (topspins and slices), and playing styles (one/two-handed backhands, left/right-handed play). Overall, our system can synthesize two physically simulated characters playing extended tennis rallies with simulated racket and ball dynamics. Code and data for this work are available at https://research.nvidia.com/labs/toronto-ai/vid2player3d/.
Pages: 14