A survey on outlier detection methods applied on air quality data

Cited by: 0
Authors
Stroia-Vlad, Iuliana-Andreea [1]
Danciu, Gabriel Mihail [1]
Affiliations
[1] Transilvania Univ Brasov, Dept Elect & Comp, Brasov, Romania
Keywords
air pollution; time series; statistics; machine learning; regression
DOI
10.1109/isetc50328.2020.9301140
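Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract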
This paper presents a study of the impact of various time series prediction algorithms applied to air quality data. The data comes from measurements taken by several sensors once per minute. The research seeks a prediction algorithm based on fit functions. Traditional statistical models such as ARIMA (AutoRegressive Integrated Moving Average) and modern ones such as Facebook Prophet were used for comparison. The proposed method was also tested with different types of regression: linear, polynomial and spline. After comparing the selected algorithms on the given time series, spline regression was found to be the most accurate model. The purpose of the paper is to show that results differ depending on the algorithm used. The research studies air quality measurements received from various sensors: PM2.5, PM1, PM10, O3, CH2O, temperature, pressure and CO2. The study analyses sensor values over a period of several months, with over 43,000 measurements per month for each sensor. The paper discusses the data obtained, and accuracy is evaluated using various metrics.
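The comparison of linear, polynomial and spline regression described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data is synthetic (one simulated day of per-minute readings), the knot placement and polynomial degree are arbitrary choices, the helper names (`fit_polynomial`, `fit_regression_spline`, `spline_design`) are hypothetical, and RMSE stands in for the unspecified evaluation metrics.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equally sized arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; degree 1 is plain linear regression."""
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x)


def spline_design(x, knots):
    """Cubic regression-spline design matrix (truncated power basis)."""
    cols = [np.ones_like(x), x, x ** 2, x ** 3]
    # One extra column per knot: (x - k)^3 where x > k, else 0.
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)


def fit_regression_spline(x, y, knots):
    """Least-squares fit of a cubic regression spline with fixed knots."""
    design = spline_design(x, knots)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ beta


# Synthetic stand-in for one day of per-minute sensor readings (assumption:
# the real data set is not available here).
rng = np.random.default_rng(0)
x = np.arange(1440, dtype=float) / 1440.0  # rescaled to [0, 1) for stability
signal = 20 + 8 * np.sin(2 * np.pi * x) + 4 * np.sin(6 * np.pi * x)
y = signal + rng.normal(0.0, 1.0, size=x.size)

knots = np.linspace(0.0, 1.0, 10)[1:-1]  # 8 evenly spaced interior knots

scores = {
    "linear": rmse(y, fit_polynomial(x, y, 1)),
    "polynomial (degree 3)": rmse(y, fit_polynomial(x, y, 3)),
    "regression spline": rmse(y, fit_regression_spline(x, y, knots)),
}

for name, score in scores.items():
    print(f"{name:>22}: in-sample RMSE = {score:.3f}")
```

The scores here are in-sample fit errors, so a more flexible model can never score worse than a simpler one nested inside it. A fair comparison of forecasting accuracy would evaluate each model on held-out measurements.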
Pages: 23-26
Number of pages: 4