Customization of Closed Captions via Large Language Models

Cited by: 1

Authors:

Chavez, Mariana Arroyo [1]

Thompson, Bernard [1]

Feanny, Molly [1]

Alabi, Kafayat [1]

Kim, Minchan [1]

Ming, Lu [1]

Glasser, Abraham [1]

Kushalnagar, Raja [1]

Vogler, Christian [2]

Affiliations:

[1] Gallaudet Univ, Sch Sci Technol Accessibil Math & Publ Hlth, Washington, DC 20002 USA

[2] Gallaudet Univ, Technol Access Program, Washington, DC 20002 USA

Source:

COMPUTERS HELPING PEOPLE WITH SPECIAL NEEDS, PT II, ICCHP 2024 | 2024 / Vol. 14751

Funding:

U.S. National Science Foundation;

Keywords:

Captions; Subtitles; Large Language Models; Deaf; Live TV;

DOI:

10.1007/978-3-031-62849-8_7

Chinese Library Classification (CLC):

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This study investigates the feasibility of employing artificial intelligence and large language models (LLMs) to customize closed captions/subtitles to match the personal needs of deaf and hard of hearing viewers. Drawing on recorded live TV samples, it compares user ratings of caption quality, speed, and understandability across five experimental conditions: unaltered verbatim captions, slowed-down verbatim captions, moderately and heavily edited captions via ChatGPT, and lightly edited captions by an LLM optimized for TV content by AppTek, LLC. Results across 16 deaf and hard of hearing participants show a significant preference for verbatim captions, both at original speeds and in the slowed-down version, over those edited by ChatGPT. However, a small number of participants also rated AI-edited captions as best. Despite the overall poor showing of AI, the results suggest that LLM-driven customization of captions on a per-user and per-video basis remains an important avenue for future research.


Pages: 50-58

Page count: 9

