A novel evolutionary data mining algorithm with applications to churn prediction

Cited by: 191
Authors
Au, WH [1]
Chan, KCC
Yao, X
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Keywords
churn prediction; customer retention; data mining; evolutionary computation; genetic algorithms
DOI
10.1109/TEVC.2003.819264
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification is an important topic in data mining research. Given a set of data records, each of which belongs to one of a number of predefined classes, the classification problem is concerned with the discovery of classification rules that can allow records with unknown class membership to be correctly classified. Many algorithms have been developed to mine large data sets for classification models and they have been shown to be very effective. However, when it comes to determining the likelihood of each classification made, many of them are not designed with such a purpose in mind. For this reason, they are not readily applicable to problems such as churn prediction. For such an application, the goal is not only to predict whether or not a subscriber would switch from one carrier to another; it is also important that the likelihood of the subscriber's doing so be predicted. The reason for this is that a carrier can then choose to provide special personalized offers and services to those subscribers who are predicted with higher likelihood to churn. Given its importance, we propose a new data mining algorithm, called data mining by evolutionary learning (DMEL), to handle classification problems in which the accuracy of each prediction made has to be estimated.
In performing its tasks, DMEL searches through the possible rule space using an evolutionary approach that has the following characteristics: 1) the evolutionary process begins with the generation of an initial set of first-order rules (i.e., rules with one conjunct/condition) using a probabilistic induction technique and, based on these rules, rules of higher order (two or more conjuncts) are obtained iteratively; 2) when identifying interesting rules, an objective interestingness measure is used; 3) the fitness of a chromosome is defined in terms of the probability that the attribute values of a record can be correctly determined using the rules it encodes; and 4) the likelihood of predictions (or classifications) made are estimated so that subscribers can be ranked according to their likelihood to churn. Experiments with different data sets showed that DMEL is able to effectively discover interesting classification rules. In particular, it is able to predict churn accurately under different churn rates when applied to real telecom subscriber data.
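The idea of starting from first-order rules and ranking subscribers by estimated churn likelihood can be illustrated with a minimal Python sketch. This is not the DMEL algorithm: it has no evolutionary search, no higher-order rules, and no interestingness measure. The toy records, the attribute names, and the choice to score a record by averaging the conditional probabilities of the rules it matches are all assumptions made for illustration.

```python
from collections import Counter

# Toy subscriber records (hypothetical data, for illustration only).
records = [
    {"plan": "basic", "calls_to_support": "high", "churn": True},
    {"plan": "basic", "calls_to_support": "low", "churn": False},
    {"plan": "premium", "calls_to_support": "high", "churn": True},
    {"plan": "premium", "calls_to_support": "low", "churn": False},
    {"plan": "basic", "calls_to_support": "high", "churn": True},
    {"plan": "premium", "calls_to_support": "low", "churn": False},
]


def first_order_rules(data, target="churn"):
    """Return {(attribute, value): P(target is True | attribute == value)}.

    Each key is a one-condition ("first-order") rule; the value is the
    empirical probability of churn among the records matching it.
    """
    matches = Counter()
    positives = Counter()
    for row in data:
        for attr, value in row.items():
            if attr == target:
                continue
            matches[(attr, value)] += 1
            if row[target]:
                positives[(attr, value)] += 1
    return {cond: positives[cond] / matches[cond] for cond in matches}


def churn_score(row, rules, target="churn"):
    """Average the probabilities of the rules the record matches.

    Averaging is an assumption of this sketch, not the paper's estimator.
    """
    probs = [
        rules[(attr, value)]
        for attr, value in row.items()
        if attr != target and (attr, value) in rules
    ]
    return sum(probs) / len(probs) if probs else 0.0


rules = first_order_rules(records)

# Rank subscribers from most to least likely to churn.
ranked = sorted(records, key=lambda r: churn_score(r, rules), reverse=True)
for row in ranked:
    print(round(churn_score(row, rules), 3), row)
```

A ranking like this is what lets a carrier target retention offers at the highest-scoring subscribers first, rather than treating every predicted churner alike.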
Pages: 532-545
Page count: 14