Customer feature selection from high-dimensional bank direct marketing data for uplift modeling

Cited by: 4
Authors
Hu, Jinping [1]
Affiliation
[1] Shenzhen Technology University, 3002 Lantian Rd, Shenzhen 518118, Guangdong, People's Republic of China
Keywords
Bank direct marketing; Feature selection; Redundant features; Relevant features; Uplift modeling; RELEVANCE; PREDICTION; CHURN
DOI
10.1057/s41270-022-00160-z
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
Uplift modeling estimates the incremental impact (i.e., uplift) of a marketing campaign on customer outcomes, and such models are essential to banks' direct marketing efforts. However, bank data are often high-dimensional, with hundreds to thousands of customer features, and keeping irrelevant and redundant features in an uplift model can be computationally inefficient and can degrade model performance. Banks therefore need to select a smaller set of features for uplift modeling, yet the feature selection literature has rarely focused on uplift modeling. This paper proposes several two-step feature selection approaches for uplift models, designed to identify highly relevant, low-redundancy feature subsets from high-dimensional banking data. Empirical experiments show that, with a selected set of only 20 out of 180 features, 68.6% of the uplift models perform as well as or better than models built on the complete feature set.
Pages: 160-171
Page count: 12
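
The abstract describes two-step selection (relevance first, then redundancy) without giving the procedure, so the following is only a minimal sketch of that general idea, not the paper's method. It assumes a relevance proxy (the gap between a feature's outcome correlation in the treated group and in the control group) and a simple pairwise-correlation redundancy filter; the function names, the threshold `max_corr`, and the synthetic data are all hypothetical.

```python
import numpy as np

def uplift_relevance(X, treatment, outcome):
    """Illustrative relevance proxy (an assumption, not the paper's score):
    |corr(feature, outcome | treated) - corr(feature, outcome | control)|."""
    treated = treatment == 1
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        r_t = np.corrcoef(X[treated, j], outcome[treated])[0, 1]
        r_c = np.corrcoef(X[~treated, j], outcome[~treated])[0, 1]
        scores[j] = abs(r_t - r_c)
    # Constant columns yield NaN correlations; treat them as irrelevant.
    return np.nan_to_num(scores)

def two_step_select(X, treatment, outcome, k=20, max_corr=0.8):
    """Step 1: rank features by the relevance proxy.
    Step 2: walk down the ranking, skipping any feature whose absolute
    correlation with an already selected feature reaches max_corr."""
    order = np.argsort(uplift_relevance(X, treatment, outcome))[::-1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    for j in order:
        if all(corr[j, s] < max_corr for s in selected):
            selected.append(int(j))
        if len(selected) == k:
            break
    return selected

# Synthetic data (hypothetical): feature 0 drives the treatment effect,
# feature 1 is a near-duplicate of feature 0, the rest are noise.
rng = np.random.default_rng(0)
n, p = 4000, 30
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)
treatment = rng.integers(0, 2, size=n)
prob = 0.1 + 0.3 * treatment * (X[:, 0] > 0)
outcome = (rng.random(n) < prob).astype(float)

selected = two_step_select(X, treatment, outcome, k=5)
print(selected)
```

On this synthetic data the redundancy step keeps only one of the two near-duplicate features, which is the behavior the relevance-then-redundancy design is meant to produce. A real study would replace the correlation-gap proxy with an uplift-specific relevance measure and validate the chosen subset against the full feature set.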