Data augmentation in microscopic images for material data mining

Cited by: 0
Authors
Boyuan Ma
Xiaoyan Wei
Chuni Liu
Xiaojuan Ban
Haiyou Huang
Hao Wang
Weihua Xue
Stephen Wu
Mingfei Gao
Qing Shen
Michele Mukeshimana
Adnan Omer Abuassba
Haokai Shen
Yanjing Su
Affiliations
[1] University of Science and Technology Beijing,Beijing Advanced Innovation Center for Materials Genome Engineering
[2] University of Science and Technology Beijing,School of Computer and Communication Engineering
[3] Beijing Key Laboratory of Knowledge Engineering for Materials Science,Institute for Advanced Materials and Technology
[4] University of Science and Technology Beijing,School of Materials Science and Engineering
[5] University of Science and Technology Beijing,School of Materials Science and Technology
[6] Liaoning Technical University,The Institute of Statistical Mathematics
[7] Research Organization of Information and Systems,Faculty of Engineering Sciences
[8] Tachikawa,College of Information Science and Engineering
[9] National Intellectual Property Administration,Key Lab of Petroleum Data Mining
[10] University of Burundi
[11] Faculty of Engineering and Technology
[12] Palestine Technical University – Kadoorie
[13] China University of Petroleum
[14] China University of Petroleum
Keywords
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Recent progress in material data mining has been driven by high-capacity models trained on large datasets. However, collecting experimental data (real data) is extremely costly owing to the amount of human effort and expertise required. Here, we develop a novel transfer learning strategy to address the problem of small or insufficient data. This strategy fuses real and simulated data and thereby augments the training data in a data mining procedure. For the specific task of grain instance image segmentation, the strategy generates synthetic data by fusing images obtained from simulating the physical mechanism of grain formation with the “image style” information in real images. The results show that a model trained with the synthetic data and only 35% of the real data achieves segmentation performance competitive with that of a model trained on all of the real data. Because the time required to perform grain simulation and to generate synthetic data is almost negligible compared with the effort of obtaining real data, the proposed strategy can exploit the strong predictive power of deep learning without significantly increasing the experimental burden of training data preparation.
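The data-mixing step described in the abstract (all synthetic data plus a fraction of the real data) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_training_set`, the file names, and the dataset sizes are hypothetical, and only the 35% fraction comes from the abstract. The simulation and style-fusion steps that produce the synthetic images are not shown.

```python
import random


def build_training_set(real_samples, synthetic_samples, real_fraction=0.35, seed=0):
    """Combine all synthetic samples with a random subset of the real samples.

    Hypothetical helper: real_fraction=0.35 mirrors the 35% figure quoted in
    the abstract; everything else here is an illustrative assumption.
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    n_real = round(len(real_samples) * real_fraction)
    real_subset = rng.sample(list(real_samples), n_real)
    mixed = real_subset + list(synthetic_samples)
    rng.shuffle(mixed)  # interleave real and synthetic samples
    return mixed


# Placeholder file names standing in for real and synthetic image/label pairs.
real = [f"real_{i:03d}.png" for i in range(100)]
synthetic = [f"synthetic_{i:03d}.png" for i in range(300)]
train = build_training_set(real, synthetic)
```

With these placeholder sizes, the mixed list holds every synthetic sample and roughly a third of the real ones, which is the composition the abstract reports as sufficient for competitive segmentation performance.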
Related papers
50 items in total
  • [21] Cycles Improve Conditional Generators: Synthesis and Augmentation for Data Mining
    Moore, Alexander M.
    Paffenroth, Randy Clinton
    Ngo, Ken T.
    Uzarski, Joshua R.
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2022, PT II, 2022, 13726 : 352 - 364
  • [22] Automated Data Augmentation Services Using Text Mining, Data Cleansing and Web Crawling Techniques
    Jacob, Matthias
    Kuscher, Alexander
    Plauth, Max
    Thiele, Christoph
    IEEE CONGRESS ON SERVICES 2008, PT I, PROCEEDINGS, 2008, : 136 - 143
  • [23] Scatter of fatigue data owing to material microscopic effects
    TANG XueSong
    Science China (Physics, Mechanics & Astronomy), 2014, (01) : 90 - 97
  • [24] Scatter of fatigue data owing to material microscopic effects
    XueSong Tang
    Science China Physics, Mechanics and Astronomy, 2014, 57 : 90 - 97
  • [25] Scatter of fatigue data owing to material microscopic effects
    Tang XueSong
    SCIENCE CHINA-PHYSICS MECHANICS & ASTRONOMY, 2014, 57 (01) : 90 - 97
  • [26] PatchMask: A Data Augmentation Strategy with Gaussian Noise in Hyperspectral Images
    Dou, Hong-Xia
    Lu, Xing-Shun
    Wang, Chao
    Shen, Hao-Zhen
    Zhuo, Yu-Wei
    Deng, Liang-Jian
    REMOTE SENSING, 2022, 14 (24)
  • [27] Generative Image Translation for Data Augmentation in Colorectal Histopathology Images
    Wei, Jerry
    Suriawinata, Arief
    Vaickus, Louis
    Ren, Bing
    Liu, Xiaoying
    Wei, Jason
    Hassanpour, Saeed
    MACHINE LEARNING FOR HEALTH WORKSHOP, VOL 116, 2019, 116 : 10 - +
  • [28] Geostatistical Simulation of Medical Images for Data Augmentation in Deep Learning
    Tuan D Pham
    IEEE ACCESS, 2019, 7 : 68752 - 68763
  • [29] Improvement of recognition rate using data augmentation with blurred images
    Ishikawa, Shiori
    Chiyonobu, Miho
    Iida, Sayaka
    Takata, Masami
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (09) : 12154 - 12165
  • [30] Combined Oriented Data Augmentation Method for Brain MRI Images
    Farhan, Ahmeed Suliman
    Khalid, Muhammad
    Manzoor, Umar
    IEEE ACCESS, 2025, 13 : 9981 - 9994