Text-mining: Application development challenges

Authors
Varadarajan, S [1]
Kasravi, K [1]
Feldman, R [1]
Affiliation
[1] Elect Data Syst Corp, Troy, MI 48098 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper reviews the best practices and challenges for project managers and developers involved in implementing text-mining applications. With a focus on rule-based information extraction, and with references to actual cases, the authors share their experiences from having developed several text-mining applications in diverse industries. First, project management issues are discussed, including a process for capturing business requirements and mapping them into features and linguistic patterns, development of linguistic rules, rule development standards, performance metrics, and an evaluation methodology. Linguistic representations such as sub-syntactic, syntactic, semantic, and application-specific rules are identified. Special emphasis is placed on post-information extraction processing, such as improving the relevance of the extracted information, summarization models, techniques for handling typographical errors, resolution of temporal information, and anaphora resolution, along with a discussion of shallow vs. full parsing. Lastly, the paper discusses various utilities to help with the development of a text-mining application, such as feature analysis, visualization, source document pre-processing, and rule authoring tools.
Pages: 247-260
Page count: 14