Comparison of Visual Datasets for Machine Learning

Cited by: 23

Authors:

Gauen, Kent [1]

Dailey, Ryan [1]

Laiman, John [1]

Zi, Yuxiang [1]

Asokan, Nirmal [1]

Lu, Yung-Hsiang [1]

Thiruvathukal, George K. [2]

Shyu, Mei-Ling [3]

Chen, Shu-Ching [4]

Affiliations:

[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA

[2] Loyola Univ, Dept Comp Sci, Chicago, IL 60611 USA

[3] Univ Miami, Dept Elect & Comp Engn, Coral Gables, FL 33124 USA

[4] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA

Source:

2017 IEEE 18TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI 2017) | 2017

Funding:

U.S. National Science Foundation;

Keywords:

OBJECT;

DOI:

10.1109/IRI.2017.59

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

One of the greatest technological improvements in recent years is the rapid progress using machine learning for processing visual data. Among all factors that contribute to this development, datasets with labels play crucial roles. Several datasets are widely reused for investigating and analyzing different solutions in machine learning. Many systems, such as autonomous vehicles, rely on components using machine learning for recognizing objects. This paper compares different visual datasets and frameworks for machine learning. The comparison is both qualitative and quantitative and investigates object detection labels with respect to size, location, and contextual information. This paper also presents a new approach creating datasets using real-time, geo-tagged visual data, greatly improving the contextual information of the data. The data could be automatically labeled by cross-referencing information from other sources (such as weather).


Pages: 346-355

Page count: 10

