Comparison of Visual Datasets for Machine Learning

Cited by: 23
Authors
Gauen, Kent [1]
Dailey, Ryan [1]
Laiman, John [1]
Zi, Yuxiang [1]
Asokan, Nirmal [1]
Lu, Yung-Hsiang [1]
Thiruvathukal, George K. [2]
Shyu, Mei-Ling [3]
Chen, Shu-Ching [4]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Loyola Univ, Dept Comp Sci, Chicago, IL 60611 USA
[3] Univ Miami, Dept Elect & Comp Engn, Coral Gables, FL 33124 USA
[4] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
Funding
U.S. National Science Foundation
Keywords
OBJECT
DOI
10.1109/IRI.2017.59
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
One of the greatest technological improvements in recent years is the rapid progress using machine learning for processing visual data. Among all factors that contribute to this development, datasets with labels play crucial roles. Several datasets are widely reused for investigating and analyzing different solutions in machine learning. Many systems, such as autonomous vehicles, rely on components using machine learning for recognizing objects. This paper compares different visual datasets and frameworks for machine learning. The comparison is both qualitative and quantitative and investigates object detection labels with respect to size, location, and contextual information. This paper also presents a new approach creating datasets using real-time, geo-tagged visual data, greatly improving the contextual information of the data. The data could be automatically labeled by cross-referencing information from other sources (such as weather).
Pages: 346 - 355
Page count: 10
Related papers
50 in total
  • [31] Generating implicit object fragment datasets for machine learning
    Lopez, Alfonso
    Rueda, Antonio J.
    Segura, Rafael J.
    Ogayar, Carlos J.
    Navarro, Pablo
    Fuertes, Jose M.
    COMPUTERS & GRAPHICS-UK, 2024, 125
  • [32] Machine Learning Methods with Noisy, Incomplete or Small Datasets
    Caiafa, Cesar F.
    Sun, Zhe
    Tanaka, Toshihisa
    Marti-Puig, Pere
    Sole-Casals, Jordi
    APPLIED SCIENCES-BASEL, 2021, 11 (09)
  • [33] Machine learning to combat cyberattack: a survey of datasets and challenges
    Prasad, Arvind
    Chandra, Shalini
    JOURNAL OF DEFENSE MODELING AND SIMULATION-APPLICATIONS METHODOLOGY TECHNOLOGY-JDMS, 2023, 20 (04) : 577 - 588
  • [34] Revolt: Collaborative Crowdsourcing for Labeling Machine Learning Datasets
    Chang, Joseph Chee
    Amershi, Saleema
    Kamar, Ece
    PROCEEDINGS OF THE 2017 ACM SIGCHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'17), 2017 : 2334 - 2346
  • [35] Cautionary Guidelines for Machine Learning Studies with Combinatorial Datasets
    Zahrt, Andrew F.
    Henle, Jeremy J.
    Denmark, Scott E.
    ACS COMBINATORIAL SCIENCE, 2020, 22 (11) : 586 - 591
  • [36] On the genealogy of machine learning datasets: A critical history of ImageNet
    Denton, Emily
    Hanna, Alex
    Amironesei, Razvan
    Smart, Andrew
    Nicole, Hilary
    BIG DATA & SOCIETY, 2021, 8 (02)
  • [37] Machine Learning With Computer Networks: Techniques, Datasets, and Models
    Afifi, Haitham
    Pochaba, Sabrina
    Boltres, Andreas
    Laniewski, Dominic
    Haberer, Janek
    Paeleke, Leonard
    Poorzare, Reza
    Stolpmann, Daniel
    Wehner, Nikolas
    Redder, Adrian
    Samikwa, Eric
    Seufert, Michael
    IEEE ACCESS, 2024, 12 : 54673 - 54720
  • [38] Surgical Tool Datasets for Machine Learning Research: A Survey
    Rodrigues, Mark
    Mayo, Michael
    Patros, Panos
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (09) : 2222 - 2248
  • [39] A Review on Cyber Security Datasets for Machine Learning Algorithms
    Yavanoglu, Ozlem
    Aydos, Murat
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017 : 2186 - 2193
  • [40] Machine Learning Techniques for Heart Disease Datasets: A Survey
    Khan, Younas
    Qamar, Usman
    Yousaf, Nazish
    Khan, Aimal
    ICMLC 2019: 2019 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2019 : 21 - 29