Edge-Optimized Model for Multimedia Classification using Linguistic Metadata

Authors
Bharitkar, Sunil [1]
Paez, Thaddeus [2]
Affiliations
[1] Samsung Res Amer, Digital Media Solut Grp Audio Lab, Mountain View, CA 94043 USA
[2] Samsung Elect, Samsung Res Tijuana, Mexico City, DF, Mexico
Keywords
Metadata; text analysis; on-device classification; bag-of-words; latent semantic analysis; low-rank approximation; Transformers; RetNet; LSTM; Bayesian optimization
DOI
10.1109/ICASSPW62465.2024.10626175
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Language models are relevant to text analysis. Transfer learning enables fine-tuning of pre-trained large language model (LLM) architectures for various classification and prediction tasks. However, as shown in this paper, these fine-tuned LLMs are computationally intensive, require large amounts of memory, and have high inference latency, which can prevent their deployment in real-time applications on edge devices. This paper presents results from the joint optimization of a low-rank factorization of a text embedding model and a recurrent long short-term memory (LSTM) model that uses linguistic metadata for a seventeen-class multimedia classification problem. A comparative study shows that this approach outperforms state-of-the-art LLMs in latency and parameter count while achieving approximately the same accuracy as the larger models, enabling real-time inference on an edge device. Consequently, the model performs real-time inference on a consumer TV for multimedia classification.
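
The low-rank factorization named in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper optimizes the factorization jointly with an LSTM, whereas the sketch below only factorizes a stand-alone matrix with a truncated SVD, which is an assumption made for illustration. The function name `low_rank_factorize`, the matrix sizes, and the random embedding matrix are all hypothetical.

```python
import numpy as np


def low_rank_factorize(embedding, rank):
    """Split a (vocab, dim) matrix into factors A (vocab, rank) and B (rank, dim).

    A @ B is the best rank-`rank` approximation of `embedding` in the
    Frobenius norm, obtained from a truncated SVD.
    """
    u, s, vt = np.linalg.svd(embedding, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # scale the kept left singular vectors
    b = vt[:rank, :]            # kept right singular vectors
    return a, b


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
vocab_size, embed_dim, rank = 1000, 64, 8

rng = np.random.default_rng(0)
embedding = rng.standard_normal((vocab_size, embed_dim))  # stand-in for a trained embedding

a, b = low_rank_factorize(embedding, rank)

full_params = embedding.size       # vocab_size * embed_dim
low_rank_params = a.size + b.size  # vocab_size * rank + rank * embed_dim
rel_error = np.linalg.norm(embedding - a @ b) / np.linalg.norm(embedding)

print("full parameters:     ", full_params)
print("low-rank parameters: ", low_rank_params)
print("relative error:      ", float(rel_error))
```

With these assumed sizes the two factors store far fewer values than the full matrix, which is the kind of parameter saving the abstract refers to. A random matrix has no low-rank structure, so the reconstruction error printed here says nothing about the accuracy reported in the paper.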
Pages: 269-273
Page count: 5