Surrogate- and invariance-boosted contrastive learning for data-scarce applications in science

Cited by: 9

Authors:

Loh, Charlotte [1]

Christensen, Thomas [2]

Dangovski, Rumen [1]

Kim, Samuel [1]

Soljacic, Marin [2]

Affiliations:

[1] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA

[2] MIT, Dept Phys, Cambridge, MA 02139 USA

Source:

NATURE COMMUNICATIONS | 2022 / Volume 13 / Issue 01

Funding:

National Science Foundation (USA);

Keywords:

MULTIPLE-SCATTERING THEORY;

DOI:

10.1038/s41467-022-31915-y

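Chinese Library Classification (CLC) number:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification code: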

07 ; 0710 ; 09 ;

Abstract:

Deep learning techniques usually require large quantities of training data and can be difficult to apply to scarce datasets. The authors propose a framework that combines contrastive and transfer learning to reduce the data required for training while maintaining prediction accuracy.

Deep learning techniques have been increasingly applied to the natural sciences, e.g., for property prediction and optimization or material discovery. A fundamental ingredient of such approaches is the vast quantity of labeled data needed to train the model. This poses severe challenges in data-scarce settings where obtaining labels requires substantial computational or labor resources. Noting that problems in the natural sciences often benefit from easily obtainable auxiliary information sources, we introduce surrogate- and invariance-boosted contrastive learning (SIB-CL), a deep learning framework that incorporates three inexpensive and easily obtainable auxiliary information sources to overcome data scarcity: abundant unlabeled data, prior knowledge of symmetries or invariances, and surrogate data obtained at near-zero cost. We demonstrate SIB-CL's effectiveness and generality on various scientific problems, e.g., predicting the density-of-states of 2D photonic crystals and solving the 3D time-independent Schrödinger equation. SIB-CL consistently results in orders-of-magnitude reductions in the number of labels needed to achieve the same network accuracies.
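
The abstract names prior knowledge of symmetries or invariances as one of the information sources used for contrastive learning. The following is only a rough sketch of how such an ingredient could look, not the authors' implementation: it assumes a SimCLR-style NT-Xent loss and a symmetry group of 90-degree rotations plus a mirror, neither of which is specified in the abstract. The names `rot90`, `random_symmetry`, and `nt_xent` are hypothetical helpers introduced for illustration.

```python
import math
import random


def rot90(grid):
    """Rotate a square 2D grid (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def random_symmetry(grid, rng):
    """Return a view of `grid` under a random element of the ASSUMED
    symmetry group (four rotations, optionally followed by a mirror)."""
    for _ in range(rng.randrange(4)):
        grid = rot90(grid)
    if rng.random() < 0.5:
        grid = [row[::-1] for row in grid]
    return [list(row) for row in grid]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def nt_xent(z_a, z_b, temperature=0.5):
    """NT-Xent contrastive loss for N positive pairs (z_a[i], z_b[i]).

    Embeddings are assumed to be L2-normalized, so the dot product
    equals the cosine similarity.
    """
    z = z_a + z_b                      # 2N embeddings in one list
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)          # index of the positive partner
        sims = [dot(z[i], z[k]) / temperature
                for k in range(2 * n) if k != i]
        pos = dot(z[i], z[j]) / temperature
        m = max(sims)                  # stabilize the log-sum-exp
        log_denominator = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_denominator - pos
    return total / (2 * n)


# Usage: two symmetry-related views of the same (made-up) sample form a positive pair.
rng = random.Random(0)
sample = [[0, 1], [2, 3]]
view_a = random_symmetry(sample, rng)
view_b = random_symmetry(sample, rng)
```

In a full pipeline, an encoder network would map `view_a` and `view_b` to embeddings before the loss is computed; the encoder, the surrogate data, and the transfer-learning stage are left out of this sketch.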

Pages: 12
