Feature selection strategies for automated classification of digital media content

Cited by: 5
Authors
Rocha, Rocio [1]
Cobo, Angel [2]
Affiliations
[1] Univ Cantabria, Dept Business Adm, E-39005 Santander, Spain
[2] Univ Cantabria, Dept Appl Math & Computat Sci, E-39005 Santander, Spain
Keywords
automatic classification; clustering; digital media; feature selection; machine learning; text mining
DOI
10.1177/0165551511412028
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes strategies for feature selection of digital news articles that allow an effective implementation of learning algorithms for the unsupervised classification of news articles. With the appropriate selection of a small subset of features a correct identification of related news can be achieved, thus enabling organizations and individual users to keep track of current events. The paper defines a quality measure of the discriminatory power of each feature and verifies that the selection of a feature subset with higher quality values allows obtaining good classification results. A Particle Swarm Optimization (PSO) based selection method is also proposed. Both proposals are validated on two collections of press clippings collated from news search services in digital media. Experimental results reveal that good classification accuracy can be achieved with small subsets of between 3 per cent and 6 per cent of the features.
Pages: 418-428
Page count: 11