Specification of audio representations in audio-related standards: Three audio representations: channel-based, object-based, and scene-based

Cited by: 0
Author
Sugimoto, Takehiro [1]
Affiliation
[1] NHK Japan Broadcasting Corp, Sci & Technol Res Labs, 1-10-11 Kinuta, Setagaya Ku, Tokyo 1578510, Japan
Keywords
Audio representation; Loudspeaker layout; Audio-related standard; ITU-R; MPEG
DOI
10.1250/ast.e24.65
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Currently, there are three mainstream audio representations, namely channel-based audio, object-based audio, and scene-based audio. The features of content expression differ among these audio representations, the details of which have been specified in the International Telecommunication Union: Radiocommunication Sector (ITU-R) Recommendations. The effective use of these audio representations in accordance with what is to be expressed in the content requires a deep understanding of the technical specifications and capabilities of the audio representations. This review first traces the evolution of loudspeaker layouts developed in recent years, i.e., a history of multichannelization, which is indispensable for the understanding of audio representations. Then, the position of each audio representation among various audio-related standards is described and the method of adopting and implementing each audio representation in other audio-related standards is reviewed using the Moving Picture Experts Group (MPEG) standards as examples.
Pages: 311-319
Number of pages: 9
Related papers
50 in total
  • [41] Audio thumbnailing of popular music using chroma-based representations
    Bartsch, MA
    Wakefield, GH
    IEEE TRANSACTIONS ON MULTIMEDIA, 2005, 7 (01) : 96 - 104
  • [42] The Impact of Audio Input Representations on Neural Network based Music Transcription
    Cheuk, Kin Wai
    Agres, Kat
    Herremans, Dorien
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [43] MOVIE AUDIO SCENE RECOGNITION BASED ON WFST
    Yang, Jichen
    Cai, Min
    Li, Yanxiong
    Jin, Hai
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2016, : 77 - 80
  • [44] Scene determination based on video and audio features
    Lienhart, R
    Pfeiffer, S
    Effelsberg, W
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, PROCEEDINGS VOL 1, 1999, : 685 - 690
  • [45] Scene Determination Based on Video and Audio Features
    Silvia Pfeiffer
    Rainer Lienhart
    Wolfgang Effelsberg
    Multimedia Tools and Applications, 2001, 15 : 59 - 81
  • [46] SVM-based audio scene classification
    Jiang, HC
    Bai, JM
    Zhang, SW
    Xu, B
    PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 131 - 136
  • [47] Scene determination based on video and audio features
    Pfeiffer, S
    Lienhart, R
    Effelsberg, W
    MULTIMEDIA TOOLS AND APPLICATIONS, 2001, 15 (01) : 59 - 81
  • [48] Audio elements based auditory scene segmentation
    Lu, Lie
    Cai, Rui
    Hanjalic, Alan
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 4875 - 4878
  • [49] Deep Learning Based Audio Scene Classification
    Sophiya, E.
    Jothilakshmi, S.
    COMPUTATIONAL INTELLIGENCE, CYBER SECURITY AND COMPUTATIONAL MODELS: MODELS AND TECHNIQUES FOR INTELLIGENT SYSTEMS AND AUTOMATION, 2018, 844 : 98 - 109
  • [50] Scene determination based on video and audio features
    Intel Research Labs, Santa Clara, United States
    Int Conf Multimedia Comput Syst Proc, (685-690):