Sequential estimate for linear regression models with uncertain number of effective variables

Times cited: 8
Authors
Wang, Zhanfeng [1]
Chang, Yuan-chin Ivan [2]
Affiliations
[1] Univ Sci & Technol China, Dept Stat & Finance, Hefei 230026, Peoples R China
[2] Acad Sinica, Inst Stat Sci, Taipei 115, Taiwan
Funding
National Natural Science Foundation of China;
Keywords
Confidence set; Shrinkage estimation; Stochastic regression; Stopping time;
DOI
10.1007/s00184-012-0426-4
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
As a result of novel data collection technologies, it is now common to encounter data in which the number of explanatory variables collected is large, while the number of variables that actually contribute to the model remains small. Thus, a method that can identify the variables that affect the model without inferring the other, noneffective ones will make analysis much more efficient. Many methods have been proposed to resolve model selection problems under such circumstances; however, it is still unknown how large a sample is sufficient to identify those "effective" variables. In this paper, we apply a sequential sampling method so that the effective variables can be identified efficiently, and sampling stops as soon as the "effective" variables are identified and their corresponding regression coefficients are estimated with satisfactory accuracy, which is new to sequential estimation. Both fixed and adaptive designs are considered. The asymptotic properties of the estimates of the number of effective variables and of their coefficients are established, and the proposed sequential estimation procedure is shown to be asymptotically optimal. Simulation studies are conducted to illustrate the performance of the proposed estimation method, and a diabetes data set is used as an example.
Pages: 949-978
Number of pages: 30
Related papers
50 in total
  • [31] COMPARISON OF LOG-LINEAR AND REGRESSION MODELS FOR SYSTEMS OF DICHOTOMOUS VARIABLES
    KNOKE, D
    SOCIOLOGICAL METHODS & RESEARCH, 1975, 3 (04) : 416 - 434
  • [32] Fuzzification of linear regression models with indicator variables in medical decision making
    Bolotin, Arkady
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING, CONTROL & AUTOMATION JOINTLY WITH INTERNATIONAL CONFERENCE ON INTELLIGENT AGENTS, WEB TECHNOLOGIES & INTERNET COMMERCE, VOL 1, PROCEEDINGS, 2006, : 572 - 576
  • [33] Uncertain logistic regression models
    Gao, Jinling
    Gong, Zengtai
    AIMS MATHEMATICS, 2024, 9 (05): : 10478 - 10493
  • [34] A Non-Linear Regression Model to Estimate the Number of Defects for Software Testing Phase
    Prykhodko, Sergiy
    Prykhodko, Natalia
    Makarova, Lidiia
    Prykhodko, Kateryna
    Pukhalevych, Andrii
    Smykodub, Tatyana
    2019 IEEE 2ND UKRAINE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (UKRCON-2019), 2019, : 965 - 969
  • [35] Linear regression with fuzzy variables
    Varga, S
    Sabo, M
    STATE OF THE ART IN COMPUTATIONAL INTELLIGENCE, 2000, : 99 - 103
  • [36] Estimating Causal Effects in Linear Regression Models With Observational Data: The Instrumental Variables Regression Model
    Maydeu-Olivares, Alberto
    Shi, Dexin
    Fairchild, Amanda J.
    PSYCHOLOGICAL METHODS, 2020, 25 (02) : 243 - 258
  • [37] On the possibilistic approach to linear regression models involving uncertain, indeterminate or interval data
    Cerny, Michal
    Antoch, Jaromir
    Hladik, Milan
    INFORMATION SCIENCES, 2013, 244 : 26 - 47
  • [38] Delete-group Jackknife Estimate in Partially Linear Regression Models with Heteroscedasticity
    Jin-hong You
    Gemai Chen
    Acta Mathematicae Applicatae Sinica, 2003, 19 (4) : 599 - 610
  • [39] LINEAR COMPLEMENTARITY PROBLEMS WITH UNCERTAIN VARIABLES
    Du, Hongbo
    Yuan, Rui
    Mai, Xiaojun
    Din, Norrina
    JOURNAL OF NONLINEAR AND CONVEX ANALYSIS, 2024, 25 (03) : 633 - 644
  • [40] Reference evapotranspiration estimate with missing climatic data and multiple linear regression models
    Koc, Deniz Levent
    Can, Mueg Erkan
    PEERJ, 2023, 11