Edge-Optimized Model for Multimedia Classification using Linguistic Metadata

Cited by: 0
Authors
Bharitkar, Sunil [1]
Paez, Thaddeus [2]
Affiliations
[1] Samsung Res Amer, Digital Media Solut Grp Audio Lab, Mountain View, CA 94043 USA
[2] Samsung Elect, Samsung Res Tijuana, Mexico City, DF, Mexico
Keywords
Metadata; text analysis; on-device classification; bag-of-words; latent semantic analysis; low-rank approximation; Transformers; RetNet; LSTM; Bayesian optimization
DOI
10.1109/ICASSPW62465.2024.10626175
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Language models are relevant to text analysis. Transfer learning enables fine-tuning of pre-trained large language model (LLM) architectures for various classification and prediction tasks. However, as shown in this paper, these fine-tuned LLMs are computationally intensive, require large amounts of memory, and have high inference latency, which can prevent their deployment in real-time applications on edge devices. This paper presents results from a joint optimization of a low-rank factorization of a text embedding model and a recurrent long short-term memory (LSTM) model, using linguistic metadata for a seventeen-class multimedia classification problem. A comparative study shows that this approach outperforms state-of-the-art large language models in latency and parameter count while achieving approximately the same accuracy as the larger models, enabling real-time inference on an edge device. As a result, the model performs real-time multimedia classification on a consumer TV.
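As a rough illustration of why a low-rank factorization of an embedding reduces parameter count, the sketch below factors a synthetic embedding matrix with a truncated SVD. It is not the paper's joint optimization with the LSTM: the matrix, its dimensions, the chosen rank, and the helper name low_rank_factorize are all assumptions made for this example.

```python
import numpy as np


def low_rank_factorize(embedding, rank):
    """Return factors (a, b) such that a @ b approximates `embedding` at the given rank.

    Hypothetical helper for illustration; uses a plain truncated SVD.
    """
    u, s, vt = np.linalg.svd(embedding, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (vocab_size, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, embed_dim)
    return a, b


# Synthetic stand-in for a text embedding matrix (assumed sizes, not from the paper):
# an exactly rank-8 matrix plus a small amount of noise.
rng = np.random.default_rng(0)
vocab_size, embed_dim, rank = 1000, 64, 8
embedding = rng.standard_normal((vocab_size, rank)) @ rng.standard_normal((rank, embed_dim))
embedding += 0.01 * rng.standard_normal((vocab_size, embed_dim))

a, b = low_rank_factorize(embedding, rank)

full_params = embedding.size        # vocab_size * embed_dim
factored_params = a.size + b.size   # rank * (vocab_size + embed_dim)
rel_error = np.linalg.norm(embedding - a @ b) / np.linalg.norm(embedding)

print(f"full parameters:     {full_params}")
print(f"factored parameters: {factored_params}")
print(f"relative error:      {rel_error:.4f}")
```

Storing the two factors takes rank * (vocab_size + embed_dim) values instead of vocab_size * embed_dim, so the saving grows as the rank shrinks relative to both dimensions. How much accuracy survives at a given rank depends on the real embedding and task, which this synthetic example does not measure.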
Pages: 269-273
Number of pages: 5