Feature selection strategies for automated classification of digital media content

Cited by: 5
Authors
Rocha, Rocio [1]
Cobo, Angel [2]
Affiliations
[1] Univ Cantabria, Dept Business Adm, E-39005 Santander, Spain
[2] Univ Cantabria, Dept Appl Math & Computat Sci, E-39005 Santander, Spain
Keywords
automatic classification; clustering; digital media; feature selection; machine learning; text mining;
DOI
10.1177/0165551511412028
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes strategies for feature selection of digital news articles that allow an effective implementation of learning algorithms for the unsupervised classification of news articles. With the appropriate selection of a small subset of features a correct identification of related news can be achieved, thus enabling organizations and individual users to keep track of current events. The paper defines a quality measure of the discriminatory power of each feature and verifies that the selection of a feature subset with higher quality values allows obtaining good classification results. A Particle Swarm Optimization (PSO) based selection method is also proposed. Both proposals are validated on two collections of press clippings collated from news search services in digital media. Experimental results reveal that good classification accuracy can be achieved with small subsets of between 3 per cent and 6 per cent of the features.
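The filter-style idea in the abstract (score each feature, keep only a small top-ranked subset) can be illustrated with a minimal sketch. The abstract does not define the paper's quality measure, so the score below is an assumption: the variance of a term's TF-IDF weight across documents stands in for "discriminatory power". The function names (`tfidf_matrix`, `variance_score`, `select_features`) and the default `fraction=0.05` are hypothetical choices for illustration, with the fraction picked from the 3 to 6 per cent range the abstract reports. The PSO-based method is not sketched here.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    rows = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        rows.append([tf[t] / length * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


def variance_score(column):
    """Assumed stand-in for a feature quality measure: population variance."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def select_features(docs, fraction=0.05):
    """Keep the top `fraction` of terms, ranked by the assumed score."""
    vocab, rows = tfidf_matrix(docs)
    scores = {t: variance_score([row[j] for row in rows])
              for j, t in enumerate(vocab)}
    k = max(1, math.ceil(fraction * len(vocab)))
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k]


if __name__ == "__main__":
    # Toy tokenized "news articles" (made-up data, not from the paper).
    docs = [
        ["election", "vote", "the", "news"],
        ["election", "vote", "the", "report"],
        ["football", "goal", "the", "news"],
        ["football", "goal", "the", "report"],
    ]
    print(select_features(docs, fraction=0.5))
```

A term that occurs in every document, such as "the" in the toy data, gets zero inverse document frequency and therefore zero variance, so it is ranked last. The documents restricted to the selected terms could then be passed to any clustering algorithm for the unsupervised classification step.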
Pages: 418-428
Page count: 11