Adapting CRISP-DM for idea mining: A data mining process for generating ideas using a textual dataset

Authors
Ayele, W.Y. [1]
Affiliations
[1] Stockholm University, Department of Computer and Systems Sciences, DSV Stockholm University
Keywords
CRISP-DM; CRISP-IM; Dynamic topic modeling; Idea evaluation; Idea generation; Idea mining evaluation
DOI
10.14569/IJACSA.2020.0110603
Abstract
Data mining project managers can benefit from using standard data mining process models. The benefits of using standard process models for data mining, such as the de facto and most popular one, the Cross-Industry Standard Process model for Data Mining (CRISP-DM), are reduced cost and time. Standard models also facilitate knowledge transfer and reuse of best practices, and minimize knowledge requirements. On the other hand, digital innovation is increasingly needed to unlock the potential of ever-growing textual data such as publications, patents, social media data, and documents of various forms. Furthermore, the introduction of cutting-edge machine learning tools and techniques enables the elicitation of ideas. The processing of unstructured textual data to generate new and useful ideas is referred to as idea mining. Existing literature about idea mining largely overlooks the utilization of standard data mining process models. Therefore, the purpose of this paper is to propose a reusable model to generate ideas, CRISP-DM for Idea Mining (CRISP-IM). The design and development of the CRISP-IM follow the design science approach. The CRISP-IM facilitates idea generation through the use of Dynamic Topic Modeling (DTM), an unsupervised machine learning technique, and subsequent statistical analysis on a dataset of scholarly articles. The adapted CRISP-IM can be used to guide the process of identifying trends in scholarly literature datasets, temporally organized patents, or any other textual dataset from any domain in order to elicit ideas. The ex-post evaluation of the CRISP-IM is left for future study. © Science and Information Organization.
Pages: 20-32
Number of pages: 12