Discovering the Arrow of Time in Machine Learning

Authors
Kasmire, J. [1]
Zhao, Anran [1]
Affiliations
[1] Univ Manchester, UK Data Serv & Cathie Marsh Inst, Manchester M13 9PL, Lancs, England
Keywords
machine learning; time; naive Bayes classification; recurrent neural networks; Twitter; social media data; automatic classification; INFORMATION
DOI
10.3390/info12110439
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine learning (ML) is increasingly useful as data grow in volume and accessibility. ML can perform tasks (e.g., categorisation, decision making, anomaly detection) through experience and without explicit instruction, even when the data are too vast, complex, highly variable, or full of errors to be analysed in other ways. Thus, ML is well suited to natural language, images, and other complex and messy data available in large and growing volumes. Selecting ML models for tasks depends on many factors, as models vary in the supervision they need, the error levels they tolerate, and their ability to account for order or temporal context, among many other things. Importantly, ML methods for tasks that use explicitly ordered or time-dependent data struggle with errors or data asymmetry. Most data are (implicitly) ordered or time-dependent, potentially allowing a hidden 'arrow of time' to affect ML performance on non-temporal tasks. This research explores the interaction of ML and implicit order by using two ML models to automatically classify (a non-temporal task) tweets (temporal data) under conditions that balance the volume and complexity of the data. Results show that performance was affected, suggesting that researchers should carefully consider time when matching appropriate ML models to tasks, even when time is only implicitly included.
Pages: 16