Customization of Closed Captions via Large Language Models

Cited by: 1
Authors
Chavez, Mariana Arroyo [1]
Thompson, Bernard [1]
Feanny, Molly [1]
Alabi, Kafayat [1]
Kim, Minchan [1]
Ming, Lu [1]
Glasser, Abraham [1]
Kushalnagar, Raja [1]
Vogler, Christian [2]
Affiliations
[1] Gallaudet Univ, Sch Sci Technol Accessibil Math & Publ Hlth, Washington, DC 20002 USA
[2] Gallaudet Univ, Technol Access Program, Washington, DC 20002 USA
Funding
U.S. National Science Foundation
Keywords
Captions; Subtitles; Large Language Models; Deaf; Live TV
DOI
10.1007/978-3-031-62849-8_7
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study investigates the feasibility of employing artificial intelligence and large language models (LLMs) to customize closed captions/subtitles to match the personal needs of deaf and hard of hearing viewers. Drawing on recorded live TV samples, it compares user ratings of caption quality, speed, and understandability across five experimental conditions: unaltered verbatim captions, slowed-down verbatim captions, moderately and heavily edited captions via ChatGPT, and lightly edited captions by an LLM optimized for TV content by AppTek, LLC. Results across 16 deaf and hard of hearing participants show a significant preference for verbatim captions, both at original speeds and in the slowed-down version, over those edited by ChatGPT. However, a small number of participants also rated AI-edited captions as best. Despite the overall poor showing of AI, the results suggest that LLM-driven customization of captions on a per-user and per-video basis remains an important avenue for future research.
Pages: 50-58
Page count: 9