Novel Fuzzy Correlation Coefficient and Variable Selection Method for Fuzzy Regression Analysis Based on Distance Approach

Cited by: 3
Authors
Yoon, Jin Hee [1]
Kim, Dae Jong [2]
Koo, Yoo Young [3]
Affiliations
[1] Sejong Univ, Dept Math & Stat, Seoul 05006, South Korea
[2] Sejong Univ, Dept Business & Adm, Seoul 05006, South Korea
[3] Univ Coll, Yonsei Univ, Incheon 21983, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Fuzzy correlation coefficient; Fuzzy regression; Fuzzy variable selection method; L2 distance;
DOI
10.1007/s40815-023-01546-6
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In data analysis, methods that examine relationships between variables, such as correlation analysis and regression analysis, are very important. They are used to analyze influence and causal relationships among variables, and they serve as the basis for statistical analysis. They are also essential as preliminary analysis for machine learning methods such as deep learning: when analyzing the inputs and outputs of a deep learning model, highly correlated variables are selected first, and causal relationships are usually examined first with a basic method such as regression analysis. When data are observed as fuzzy data carrying ambiguous information, it is difficult to propose unique methods for these analyses because of the complexity involved. The application of fuzzy theory to correlation analysis for such data has not been studied effectively, and several existing studies address only cases where the data are not general fuzzy data or rely on interval estimation. As a result, the effectiveness of fuzzy theory has not been highlighted. In particular, selecting important variables is an essential step in multiple regression analysis. A variable that is significant in simple regression may not be significant in multiple regression because of its relationship with other variables. Therefore, not all variables that affect the dependent variable can be used as independent variables, and multiple regression analysis goes through a process of excluding some of them. Until now, however, fuzzy multiple regression analysis has been applied without a variable selection method, and the significance of important variables has not been emphasized.
In this paper, a fuzzy correlation coefficient and a multiple fuzzy regression analysis using variable selection methods are proposed. First, defuzzification and fuzzy ordering are defined. Then a fuzzy correlation coefficient is proposed using the L2 distance. Next, fuzzy sums of squares are defined to construct an F-statistic for testing the significance of the regression model. Using this F-statistic, the fuzzy R², and the fuzzy RMSE, several variable selection methods based on the distance approach are proposed. For the data analysis, foreign exchange reserve data and house price data of South Korea, which are important indicators of economic crisis, are used together with several financial variables. Financial data are mostly recorded as closing values, but a closing value cannot represent the whole period. Such data can therefore be treated as fuzzy data whose fluctuation is regarded as vagueness inherent in the data. The proposed fuzzy correlation coefficient and variable selection for fuzzy regression analysis are applied to these financial data.
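The abstract does not give the paper's definitions, so the following is only an illustrative sketch of how a distance-based correlation coefficient for fuzzy data might be computed. It assumes triangular fuzzy numbers written as `(l, m, r)` (for instance a period's low, closing, and high value) and an L2-type metric that integrates squared differences of the alpha-cut endpoints. The function names, the choice of metric, and the sample values are all assumptions made for illustration, not the authors' method or data.

```python
import math

def _lin_inner(a0, a1, b0, b1):
    """Integral over alpha in [0, 1] of the product of two linear functions
    that take the values (a0, a1) and (b0, b1) at alpha = 0 and alpha = 1."""
    return (2 * a0 * b0 + a0 * b1 + a1 * b0 + 2 * a1 * b1) / 6.0

def inner(u, v):
    """Assumed L2-type inner product of two (l, m, r) triples: left alpha-cut
    endpoints run from l to m, right endpoints run from r to m."""
    left = _lin_inner(u[0], u[1], v[0], v[1])
    right = _lin_inner(u[2], u[1], v[2], v[1])
    return left + right

def l2_distance(a, b):
    """Distance between two triangular fuzzy numbers under the assumed metric."""
    diff = tuple(x - y for x, y in zip(a, b))
    return math.sqrt(inner(diff, diff))

def fuzzy_mean(xs):
    """Component-wise mean of a list of (l, m, r) triples."""
    n = len(xs)
    return tuple(sum(x[k] for x in xs) / n for k in range(3))

def fuzzy_correlation(xs, ys):
    """Pearson-style ratio built from the assumed inner product.
    Reduces to the ordinary Pearson coefficient when l == m == r."""
    mx, my = fuzzy_mean(xs), fuzzy_mean(ys)
    dx = [tuple(x[k] - mx[k] for k in range(3)) for x in xs]
    dy = [tuple(y[k] - my[k] for k in range(3)) for y in ys]
    sxy = sum(inner(u, v) for u, v in zip(dx, dy))
    sxx = sum(inner(u, u) for u in dx)
    syy = sum(inner(v, v) for v in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    # Made-up (low, close, high) triples, for illustration only.
    reserves = [(1.0, 2.0, 3.0), (2.0, 3.0, 5.0), (4.0, 5.0, 6.0)]
    prices = [(2.0, 2.5, 4.0), (3.0, 4.0, 4.5), (5.0, 6.5, 7.0)]
    print(fuzzy_correlation(reserves, prices))
```

Because the ratio is built from an inner product, the Cauchy-Schwarz inequality keeps it between -1 and 1, and crisp data (where all three components coincide) give the usual Pearson coefficient. Other metrics or defuzzification rules would lead to different values.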
Pages: 2969-2985
Number of pages: 17