Generic features selection for structure classification of diverse styled scholarly articles

Cited by: 1
Authors
Waqas, Muhammad [1]
Anjum, Nadeem [1]
Affiliations
[1] Capital Univ Sci & Technol, Dept Comp Sci, ICT, Expressway,Kahuta Rd,Zone 5, Islamabad, Pakistan
Keywords
Features Engineering; Machine Learning; Research Article; Metadata Extraction; Text mining; KNOWLEDGE; SYSTEM;
DOI
10.1007/s11042-023-16128-9
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
The enormous growth of online research publications across diverse domains has led the research community to mine these scientific resources by searching online digital libraries and publishers' websites. A precise search should list the most relevant articles by applying semantic queries to a document's metadata and structural elements. Online search engines and digital libraries, however, offer only keyword-based search over the full body text, which returns an excessive number of results. Online research publishers therefore need to store the structural and metadata information of research articles in machine-readable form. In recent years, the research community has adopted different approaches to extracting structural information from research documents, such as rule-based heuristics and machine-learning-based approaches. Studies suggest that machine-learning-based techniques produce the best results for document structure extraction across publishers with diverse publication layouts. In this paper, we propose thirteen logical layout structural (LLS) components and identify an innovative two-stage set of generic features associated with them. This approach gives our technique an advantage over the state of the art in the structural classification of digital scientific articles with diverse publication styles. We applied chi-square (χ²) for feature selection, and the final results show that an SVM (kernel function) produced the best result, with an overall F-measure of 0.95.
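As a rough illustration of two quantities named in the abstract, the sketch below computes a Pearson chi-square score between one categorical feature and the class labels, and a per-class F-measure. It is a minimal sketch, not the authors' implementation: the feature names (`is_bold`, `has_digit`), the two class labels, and the toy data are hypothetical, the SVM classification step is omitted, and the abstract does not say how the overall F-measure was averaged.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic between a categorical feature and class labels."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Expected count if the feature were independent of the class.
            expected = value_count * label_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def f_measure(true_labels, predicted_labels, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four text lines labelled with a structural class.
labels = ["title", "title", "body", "body"]
is_bold = [1, 1, 0, 0]    # hypothetical feature that tracks the class exactly
has_digit = [1, 0, 1, 0]  # hypothetical feature unrelated to the class

# Rank features by chi-square score, highest first.
scores = {
    "is_bold": chi_square(is_bold, labels),
    "has_digit": chi_square(has_digit, labels),
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

A higher score indicates a stronger association between feature and class, so in this toy data `is_bold` ranks above `has_digit`. A feature-selection step would keep the top-scoring features before training a classifier.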
Pages: 16623 - 16655
Page count: 33
Related papers
50 in total
  • [1] Generic features selection for structure classification of diverse styled scholarly articles
    Muhammad Waqas
    Nadeem Anjum
    Multimedia Tools and Applications, 2024, 83 : 16623 - 16655
  • [2] Selection of diverse features with a diverse regularization
    Zhong, Weichan
    Chen, Xiaojun
    Wu, Qingyao
    Yang, Min
    Huang, Joshua Zhexue
    PATTERN RECOGNITION, 2021, 120
  • [3] Addressing Imbalance Problem for Multi Label Classification of Scholarly Articles
    Hafeez, Aiman
    Ali, Tariq
    Nawaz, Asif
    Rehman, Saif Ur
    Mudasir, Azhar Imran
    Alsulami, Abdulaziz A.
    Alqahtani, Ali
    IEEE ACCESS, 2023, 11 : 74500 - 74516
  • [4] HierClasSArt: Knowledge-Aware Hierarchical Classification of Scholarly Articles
    Alam, Mehwish
    Biswas, Russa
    Chen, Yiyi
    Dessi, Danilo
    Gesese, Genet Asefa
    Hoppe, Fabian
    Sack, Harald
    WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021), 2021 : 436 - 440
  • [5] Computation of generic features for object classification
    Hall, D
    Crowley, JL
    SCALE SPACE METHODS IN COMPUTER VISION, PROCEEDINGS, 2003, 2695 : 744 - 756
  • [6] Fuzzy classification of generic edge features
    Gao, QG
    Qing, D
    Lu, SW
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, PROCEEDINGS: IMAGE, SPEECH AND SIGNAL PROCESSING, 2000 : 668 - 671
  • [7] Classification Model for Scholarly Articles Based on Improved Graph Neural Network
    Xuejian H.
    Yuyang L.
    Tinghuai M.
    Data Analysis and Knowledge Discovery, 2022, 6 (10) : 93 - 102
  • [8] A modified mixture of experts network structure for ECG beats classification with diverse features
    Güler, I
    Übeyli, ED
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2005, 18 (07) : 845 - 856
  • [9] Metadiscourse variations in the generic structure of disciplinary research articles
    Kashiha, Hadi
    INTERNATIONAL REVIEW OF PRAGMATICS, 2021, 13 (02) : 193 - 212
  • [10] On the selection and classification of independent features
    Bressan, M
    Vitrià, J
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2003, 25 (10) : 1312 - 1317