Complete Statistical Theory of Learning (Learning Using Statistical Invariants)

Cited by: 0
Authors
Vapnik, Vladimir [1 ]
Izmailov, Rauf [2 ]
Affiliations
[1] Columbia Univ, New York, NY 10025 USA
[2] Perspecta Labs, Basking Ridge, NJ USA
Keywords
Learning Theory; Weak convergence; Statistical Invariants; Complete solution of the learning problem; Reproducing Kernel Hilbert Space; Kernel Machines; Statistical Invariants for Support Vector Classification; Statistical Invariants for Support Vector Regression; Statistical Invariants for Neural Nets; Predicates; Symmetries; Invariants
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The statistical theory of learning considers methods of constructing approximations that converge to the desired function as the number of observations increases. This theory studies mechanisms that provide convergence in the space of functions in the L2 norm, i.e., the so-called strong mode of convergence. However, in Hilbert space, along with convergence in the space of functions, there also exists the so-called weak mode of convergence, i.e., convergence in the space of functionals. Under some conditions, the weak mode also implies convergence of the approximations to the desired function in the L2 norm, although such convergence relies on different mechanisms. The paper discusses new learning methods that use both modes of convergence (weak and strong) simultaneously. Such methods make it possible to (1) select an admissible subset of functions (i.e., the set of appropriate approximation functions) and (2) find the desired approximation within this admissible subset. Since only two modes of convergence exist in Hilbert space, we call the theory that uses both modes the complete statistical theory of learning. Along with the general reasoning, we describe new learning algorithms referred to as Learning Using Statistical Invariants (LUSI). LUSI algorithms were developed for sets of functions belonging to a Reproducing Kernel Hilbert Space (RKHS); they include a modified SVM method (LUSI-SVM). The paper also presents a LUSI modification of neural networks (LUSI-NN). LUSI methods require fewer training examples than standard approaches to achieve the same performance. In conclusion, the paper discusses the general (philosophical) framework of a new learning paradigm that includes the concept of intelligence.
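To make the two modes concrete, here is a minimal sketch of a LUSI-style estimator in an RKHS: kernel ridge regression whose solution is additionally forced to satisfy the empirical statistical invariants (1/l) sum_i psi_s(x_i) f(x_i) = (1/l) sum_i psi_s(x_i) y_i for chosen predicates psi_s. This is an illustrative reading of the abstract, not the authors' exact algorithm: the function names (lusi_kernel_regression, rbf_kernel), the RBF kernel, the ridge penalty, and the example predicates are all assumptions made for the demonstration.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # Gaussian (RBF) kernel matrix between two sets of points (illustrative choice).
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def lusi_kernel_regression(X, y, predicates, gamma=1.0, reg=1e-3):
    # Strong mode: minimize ||K a - y||^2 + reg * a^T K a over f(x) = sum_i a_i k(x, x_i).
    # Weak mode:   enforce (1/l) sum_i psi_s(x_i) f(x_i) = (1/l) sum_i psi_s(x_i) y_i.
    # The equality-constrained least-squares problem is solved via its KKT system.
    l = len(y)
    K = rbf_kernel(X, X, gamma)                     # (l, l) Gram matrix
    Psi = np.stack([psi(X) for psi in predicates])  # (m, l) predicate values
    A = Psi @ K / l                                 # invariant constraints on a
    b = Psi @ y / l                                 # empirical invariant targets
    H = K @ K + reg * K                             # normal-equations matrix
    m = len(predicates)
    kkt = np.block([[H, A.T],
                    [A, np.zeros((m, m))]])
    rhs = np.concatenate([K @ y, b])
    sol = np.linalg.solve(kkt, rhs)
    a = sol[:l]                                     # kernel expansion weights
    return lambda X_new: rbf_kernel(X_new, X, gamma) @ a

# Illustrative use: 1-D regression with two invariants (zeroth and first moments).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
predicates = [lambda Z: np.ones(len(Z)),   # psi_0 = 1: preserve the mean of y
              lambda Z: Z[:, 0]]           # psi_1 = x: preserve the first moment
f = lusi_kernel_regression(X, y, predicates, gamma=0.5, reg=1e-2)
print(f(np.array([[0.0], [1.5]])))         # predictions at two test points
```

With the two predicates in this example, the fitted function reproduces the training sample's zeroth and first empirical moments of y exactly (the weak mode), while the ridge objective drives the usual L2 fit (the strong mode); this is the sense in which both modes of convergence act simultaneously.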
Pages: 4-40
Page count: 37
Related Papers
50 records in total
  • [1] Rethinking statistical learning theory: learning using statistical invariants
    Vapnik, Vladimir
    Izmailov, Rauf
    MACHINE LEARNING, 2019, 108 (03): 381-423
  • [2] Complete Statistical Theory of Learning
    Vapnik, V. N.
    AUTOMATION AND REMOTE CONTROL, 2019, 80 (11): 1949-1975
  • [3] Two birational invariants in statistical learning theory
    Watanabe, Sumio
    SINGULARITIES IN GEOMETRY AND TOPOLOGY: STRASBOURG 2009, 2012, 20: 249-268
  • [4] Learning using statistical invariants with privileged information
    Yan, Xueqin
    Li, Chunna
    Shao, Yuanhai
    Meng, Yanhui
    INFORMATION SCIENCES, 2025, 709
  • [5] Learning using granularity statistical invariants for classification
    Zhu, Ting-Ting
    Li, Chun-Na
    Liu, Tian
    Shao, Yuan-Hai
    APPLIED INTELLIGENCE, 2024, 54 (08): 6667-6681
  • [6] Probability learning in statistical learning theory
    Feichtinger, G.
    METRIKA, 1971, 18 (01): 35-55
  • [7] Motion estimation using statistical learning theory
    Wechsler, H.
    Duric, Z.
    Li, F. Y.
    Cherkassky, V.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2004, 26 (04): 466-478
  • [8] A statistical approach for learning invariants: Application to image color correction and learning invariants to illumination
    Bascle, B.
    Bernier, O.
    Lemaire, V.
    NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2006, 4233: 294-303