General and Task-Oriented Video Segmentation

Cited by: 0
Authors
Chen, Mu [1]
Li, Liulei [1]
Wang, Wenguan [2]
Quan, Ruijie [2]
Yang, Yi [2]
Affiliations
[1] Univ Technol Sydney, ReLER Lab, AAII, Ultimo, Australia
[2] Zhejiang Univ, ReLER Lab, CCAI, Hangzhou, Peoples R China
Keywords
Video segmentation; General solution; Task-orientation; INSTANCE; TRANSFORMER; ATTENTION; SHAPE
DOI
10.1007/978-3-031-72667-5_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present GVSEG, a general video segmentation framework that addresses four different video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) with an identical architectural design. There is currently a trend towards general video segmentation solutions that can be applied across multiple tasks, which streamlines research and simplifies deployment. However, the highly homogenized design of current frameworks, in which every element is kept uniform across tasks, can overlook the inherent diversity among tasks and lead to suboptimal performance. To tackle this, GVSEG: i) provides a holistic disentanglement and modeling of segment targets, examining them from the perspectives of appearance, position, and shape; and, on this basis, ii) reformulates the query initialization, matching, and sampling strategies in alignment with task-specific requirements. These architecture-agnostic innovations enable GVSEG to address each task effectively by accommodating the specific properties that characterize it. Extensive experiments on seven gold-standard benchmark datasets demonstrate that GVSEG surpasses all existing specialized/general solutions by a significant margin on four different video segmentation tasks.
Pages: 72-92
Page count: 21