Token Imbalance Adaptation for Radiology Report Generation

Cited by: 0

Authors:

Wu, Yuexin [1]

Huang, I-Chan [2]

Huang, Xiaolei [1]

Affiliations:

[1] Univ Memphis, Memphis, TN 38152 USA

[2] St Jude Childrens Res Hosp, Memphis, TN USA

Source:

CONFERENCE ON HEALTH, INFERENCE, AND LEARNING, VOL 209 | 2023 / Vol. 209

Funding:

National Science Foundation (USA);

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit on frequent tokens. The token imbalance may dampen the robustness of radiology report generators, as complex medical terms appear less frequently but reflect more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets (IU X-RAY and MIMIC-CXR) of radiology report generation. To solve the challenge, we propose the Token Imbalance Adapter (TIMER), aiming to improve generation robustness on infrequent tokens. The model automatically leverages token imbalance by an unlikelihood loss and dynamically optimizes generation processes to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness both overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method has a major effect in adapting to token imbalance for radiology report generation.


Pages: 72-85

Page count: 14
