How Much Can Machines Learn Finance from Chinese Text Data?

Cited by: 6
Authors
Zhou, Yang [1 ,2 ]
Fan, Jianqing [3 ,4 ,5 ]
Xue, Lirong [4 ]
Affiliations
[1] Fudan Univ, Inst Big Data, Shanghai 200433, Peoples R China
[2] Fudan Univ, MOE Lab Natl Dev & Intelligent Governance, Shanghai 200433, Peoples R China
[3] Capital Univ Econ & Business, Int Sch Econ & Management, Beijing 100070, Peoples R China
[4] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
[5] Fudan Univ, Sch Data Sci, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
machine learning; FarmPredict; factor model; sparse regression; textual analysis; INVESTOR SENTIMENT; STOCK; RETURNS; NUMBER; RISK; NEWS;
DOI
10.1287/mnsc.2022.01468
Chinese Library Classification
C93 [Management]
Discipline Classification Codes
12; 1201; 1202; 120202
Abstract
How much finance can we learn directly from text data? This paper presents a new framework for learning from textual data based on factor augmentation and sparsity regularization, called the factor-augmented regularized model for prediction (FarmPredict), which lets machines learn financial returns directly from news. FarmPredict allows the model itself to extract information directly from articles, without predefined inputs such as dictionaries or pretrained models as in most studies. Augmenting the predictors with factors learned in an unsupervised manner gives our method a "double-robust" feature: the machine learns to balance between individual words and text factors/topics. It also avoids the information loss that factor regression incurs through dimensionality reduction. We apply our model to the Chinese stock market, which has a large proportion of retail investors, using Chinese news data to predict financial returns. We show that positive sentiment scored by our FarmPredict approach from news generates on average 83 basis points (bps) of daily excess stock returns, and negative news has an adverse impact of 26 bps on the days of news announcements; both effects can last for a few days. This asymmetric effect aligns well with the short-sale constraints in the Chinese equity market. The results show that the machine-learned prediction provides sizeable predictive power, with an annualized return of up to 54% under a simple investment strategy. FarmPredict significantly outperforms other statistical and machine learning methods in both model prediction and portfolio performance. Our study demonstrates the ability of machines to learn from text data.
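The two-step idea the abstract describes (extract latent text factors without supervision, then run a sparse regression on the factors together with the remaining word-level predictors) can be illustrated with a minimal sketch. This is not the authors' implementation: the PCA/Lasso choices, the synthetic data, and all variable names here are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Synthetic stand-in for a document-term matrix: 200 articles, 50 word features
n, p, k = 200, 50, 3
F = rng.normal(size=(n, k))                  # latent text factors (unobserved)
B = rng.normal(size=(k, p))                  # factor loadings on words
X = F @ B + 0.5 * rng.normal(size=(n, p))    # observed word features
y = F[:, 0] + 0.3 * X[:, 0] + 0.1 * rng.normal(size=n)  # next-day returns

# Step 1: unsupervised factor extraction from the text features
pca = PCA(n_components=k)
F_hat = pca.fit_transform(X)

# Step 2: idiosyncratic word components left over after removing the factors
U_hat = X - pca.inverse_transform(F_hat)

# Step 3: sparse regression on the augmented predictor set [factors, residual words],
# letting the penalty decide how to balance factors against individual words
Z = np.hstack([F_hat, U_hat])
model = Lasso(alpha=0.01).fit(Z, y)
```

Regressing on `[F_hat, U_hat]` rather than on `F_hat` alone is what distinguishes factor augmentation from plain factor regression: the idiosyncratic word signal is retained instead of being discarded by the dimensionality reduction.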
Pages: 8962-8987 (27 pages)