Adapting CRISP-DM for idea mining: a data mining process for generating ideas using a textual dataset

Cited by: 0
Authors
Ayele W.Y. [1]
Affiliation
[1] Department of Computer and Systems Sciences (DSV), Stockholm University
Keywords
CRISP-DM; CRISP-IM; Dynamic topic modeling; Idea evaluation; Idea generation; Idea mining evaluation
DOI
10.14569/IJACSA.2020.0110603
Abstract
Data mining project managers can benefit from using standard data mining process models. Standard process models, such as the de facto standard and most popular one, the Cross-Industry Standard Process for Data Mining (CRISP-DM), reduce cost and time. Standard models also facilitate knowledge transfer and the reuse of best practices, and they minimize knowledge requirements. At the same time, digital innovation is increasingly needed to unlock the potential of ever-growing textual data such as publications, patents, social media data, and documents of various forms. Furthermore, the introduction of cutting-edge machine learning tools and techniques enables the elicitation of ideas. The processing of unstructured textual data to generate new and useful ideas is referred to as idea mining. Existing literature on idea mining largely overlooks the use of standard data mining process models. Therefore, the purpose of this paper is to propose a reusable model for generating ideas, CRISP-DM for Idea Mining (CRISP-IM). The CRISP-IM was designed and developed following the design science approach. It facilitates idea generation through the use of Dynamic Topic Modeling (DTM), unsupervised machine learning, and subsequent statistical analysis on a dataset of scholarly articles. The adapted CRISP-IM can guide the process of identifying trends in scholarly literature datasets, temporally organized patents, or any other textual dataset from any domain in order to elicit ideas. The ex-post evaluation of the CRISP-IM is left for future study. © Science and Information Organization.
Pages: 20-32
Number of pages: 12