Automated Metadata Annotation: What Is and Is Not Possible with Machine Learning

Authors
Mingfang Wu [1]
Hans Brandhorst [2]
Maria-Cristina Marinescu [3]
Joaquim Moré López [3]
Margorie Hlava [4]
Joseph Busch [5]
Affiliations
[1] Australian Research Data Commons
[2] Iconclass
[3] Barcelona Supercomputing Center
[4] Access Innovations
[5] Taxonomy
Keywords
DOI
Not available
CLC number
TP181 [Automated reasoning, machine learning]
Discipline classification code
Abstract
Automated metadata annotation is only as good as the training dataset or the rules that are available for the domain. It is important to learn what type of content a pre-trained machine learning algorithm has been trained on in order to understand its limitations and potential biases. Consider what type of content is readily available to train an algorithm: what is popular and what is available. However, scholarly and historical content is often not available in consumable, homogenized, and interoperable formats at the large volume that machine learning requires. There are exceptions, such as science and medicine, where large, well-documented collections are available. This paper presents the current state of automated metadata annotation in cultural heritage and research data, discusses challenges identified from use cases, and proposes solutions.
Pages: 122-138
Page count: 17