Introduction
Transfer learning, a technique for adapting trained neural network models to new datasets and tasks, is an indispensable tool in machine learning (ML). In scientific ML applications, transfer learning has recently become a popular component of training pipelines for tackling new problems and domains (Bets et al., 2024; Peng et al., 2024; Pathrudkar et al., 2024; Kim et al., 2021; Chen et al., 2025; Chen & Ong, 2021; Gupta et al., 2021). It belongs to a family of training procedures that optimize neural network models in multiple stages. In the first stage, one optimizes a model’s performance on a pretraining dataset, which is typically large and consists of a diverse or broad set of data from a general domain of interest. Transfer learning is the second stage: one re-optimizes the model using a separate dataset, typically smaller and more narrowly distributed, to improve model performance on a specific task or domain of interest.
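The two-stage procedure can be summarized in a few lines of code. Below is a minimal sketch in PyTorch, using toy data; `pretrain_set` and `target_set` are hypothetical stand-ins for a large, broad pretraining dataset and a small target-domain dataset, and the learning rates and epoch counts are illustrative rather than prescriptive.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins: a large, broad pretraining set and a small,
# narrowly distributed target-domain set.
pretrain_set = TensorDataset(torch.randn(2048, 16), torch.randn(2048, 1))
target_set = TensorDataset(torch.randn(64, 16), torch.randn(64, 1))

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

def run_stage(model, dataset, lr, epochs):
    """One optimization stage (pretraining or transfer) over a dataset."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in DataLoader(dataset, batch_size=32, shuffle=True):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

run_stage(model, pretrain_set, lr=1e-3, epochs=10)  # stage 1: pretraining
run_stage(model, target_set, lr=1e-4, epochs=10)    # stage 2: transfer learning
```

The same weights carry over between stages; in practice the second stage often uses a reduced learning rate, as sketched here, so that the small target dataset refines rather than overwrites what was learned during pretraining.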
ML-based approaches in electron microscopy have been shown to address a broad range of data analysis and modeling challenges in nanoscale characterization (Ede, 2021; Treder et al., 2022; Kalinin et al., 2022), spanning a wide variety of tasks and utilizing a broad family of training approaches. Image segmentation, the foundational task of classifying portions of micrographs into regions of different atomic structures, is particularly well studied, including phase-contrast HRTEM segmentation with U-Net models (Madsen et al., 2018; Sadre et al., 2021; Groschner et al., 2020; Rangel DaCosta et al., 2024), HRTEM and STEM segmentation via unsupervised learning (Wang et al., 2021; Wang et al., 2021), STEM segmentation with synthetic-data training (Eliasson & Erni, 2024; Lin et al., 2021), and few-shot training (Akers et al., 2021; Kaufmann et al., 2021).
Multi-stage training techniques like transfer learning could be especially useful for extending current ML applications in electron microscopy: well-curated experimental data suitable for developing ML models can be scarce and expensive to acquire, features of data can differ substantially across experiments, and performing consistent quantitative analysis across many different experiments can be difficult. Significant work in our field has been devoted to developing a precise understanding of the behavior of neural network models used for segmentation tasks, including their dependence on model architecture and training hyperparameters (Horwath et al., 2020; Sytwu et al., 2022; Kazimi & Sandfeld, 2025), experimental generalization (Sytwu et al., 2024), and behavior with respect to image noise (Leth Larsen et al., 2023). Much less is known, however, about how to understand model behavior after transfer learning, which necessarily introduces more decisions into already-complex training workflows. Creative and informed use of transfer learning can accelerate and expand the adoption of ML for scientific applications in electron microscopy by significantly reducing development costs and improving model performance on niche tasks. However, effective use of transfer learning and similar advanced training protocols requires a more nuanced understanding of the machine learning process and its myriad moving parts.
Here, we define a setting for transfer learning via the domain adaptation and domain shift interpretations (Zhuang et al., 2020): transfer learning is used when the data domain shifts or changes, as a way to adapt the model to a new target domain, thereby ‘transferring’ the model to that domain. In practice, this domain adaptation is performed as described above: one first trains a model on a large training set, and then re-trains the model on a small, new dataset in the target domain to refine its performance. Data generated similarly to the training data are deemed in-distribution; most neural network training approaches optimize performance only within the distribution of the training data. We typically cannot guarantee that a neural network model will generalize to domains which differ significantly from the training data, deemed out-of-distribution (OOD) data; model performance can degrade rapidly for hard-to-predict reasons in OOD settings, and determining a priori which data are and are not OOD is difficult. While model dependence on the pretraining dataset is complex (Entezari et al., 2023) and understanding OOD behavior across both phases of transfer learning can require a strong understanding of the data domains involved (Wenzel et al., 2022), transfer learning can circumvent OOD generalization challenges by re-adapting an already-trained model, avoiding the issue at the small cost of curating new training data.
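The OOD degradation and the transfer-learning remedy described above can be made concrete with a toy experiment. The sketch below is purely illustrative: the input shift of +3.0 is a hypothetical stand-in for a real domain shift (e.g., new imaging or noise conditions), and the model, task, and hyperparameters are not those used in this study.

```python
import torch
from torch import nn

# Toy regression task: the model only ever sees inputs from a standard
# normal "training domain"; shifted inputs emulate a domain shift.
def make_data(n, shift=0.0):
    x = torch.randn(n, 16) + shift
    y = torch.sin(x).sum(dim=1, keepdim=True)  # nonlinear ground truth
    return x, y

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
loss_fn = nn.MSELoss()

def fit(model, x, y, lr, steps):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

fit(model, *make_data(2048), lr=1e-3, steps=500)  # in-distribution training

with torch.no_grad():
    x_id, y_id = make_data(512)
    x_ood, y_ood = make_data(512, shift=3.0)  # shifted, OOD domain
    print("in-distribution loss:", loss_fn(model(x_id), y_id).item())
    print("OOD loss:", loss_fn(model(x_ood), y_ood).item())  # typically much worse

# Transfer-learning remedy: re-adapt on a small curated sample from the
# target (OOD) domain rather than relying on generalization.
fit(model, *make_data(64, shift=3.0), lr=1e-4, steps=200)
```

Because the network is only optimized over the training distribution, its extrapolation onto the shifted domain is uncontrolled; re-adapting on even a small target-domain sample restores performance at the cost of curating that sample.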
Overview of our study
Using a simulation-based study, we explore some common flavors of transfer learning procedures for the task of semantic segmentation in atomic-resolution HRTEM, analyzing the downstream effects on model performance arising from practical choices one must make in an ML workflow. We limit our discussion to the supervised training regime, in which labels exist for the training data and are used for optimization against some type of loss function. In particular, we specifically discuss the effect of the original training dataset on performance, and how one can reason about the relationship between the pretraining and transfer-learning domains. We use simulated datasets to provide an objective basis for comparative analysis of transfer learning procedures: with total control over the data-generating process and access to ground-truth information about atomic structures and their micrographs, we can accurately describe data domain shifts and absolutely quantify the corresponding shifts in model performance. We train over 10,500 U-Net models using transfer learning procedures on a series of simulated datasets, across a wide swath of training conditions and transfer learning strategies, and analyze their absolute performance as well as their generalization behavior. Specifically, we focus on transferring model applicability across three categories of domains: imaging conditions, noise conditions, and atomic structural distributions. Lastly, we provide some starting recipes and technical guidance for successfully employing transfer learning in electron microscopy.
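For supervised segmentation, transfer learning strategies differ mainly in which weights are re-optimized and how aggressively. The sketch below contrasts two common choices, full fine-tuning versus retraining only the decoder head, on a toy encoder-decoder stand-in; it is not the architecture, strategy set, or hyperparameters evaluated in this study, and `TinyUNet` is a hypothetical name.

```python
import torch
from torch import nn

class TinyUNet(nn.Module):
    """Toy encoder-decoder stand-in for a segmentation U-Net (no skip connections)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv2d(8, n_classes, 3, padding=1)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyUNet()
# ... assume `model` was pretrained, e.g., on simulated HRTEM images ...

# Strategy A: fine-tune all weights at a reduced learning rate.
opt_full = torch.optim.Adam(model.parameters(), lr=1e-4)

# Strategy B: freeze the encoder and retrain only the decoder head.
for p in model.encoder.parameters():
    p.requires_grad = False
opt_head = torch.optim.Adam(model.decoder.parameters(), lr=1e-3)

# Supervised segmentation loss: per-pixel cross-entropy against class labels.
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(4, 1, 64, 64)          # micrograph patches
y = torch.randint(0, 2, (4, 64, 64))   # per-pixel ground-truth labels
loss = loss_fn(model(x), y)
```

Freezing the encoder preserves the pretrained feature extractor and reduces the number of trainable parameters, which can help when the target dataset is very small; full fine-tuning is more flexible but depends more heavily on the pretraining domain and learning rate.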
- Bets, K. V., O’Driscoll, P. C., & Yakobson, B. I. (2024). Physics-Inspired Transfer Learning for ML-prediction of CNT Band Gaps from Limited Data. Npj Comput Mater, 10(1), 66. 10.1038/s41524-024-01247-0
- Peng, X., Liang, J., Wang, K., Zhao, X., Peng, Z., Li, Z., Zeng, J., Lan, Z., Lei, M., & Huang, D. (2024). Construction Frontier Molecular Orbital Prediction Model with Transfer Learning for Organic Materials. Npj Comput Mater, 10(1), 213. 10.1038/s41524-024-01403-6
- Pathrudkar, S., Thiagarajan, P., Agarwal, S., Banerjee, A. S., & Ghosh, S. (2024). Electronic Structure Prediction of Multi-Million Atom Systems through Uncertainty Quantification Enabled Transfer Learning. Npj Comput Mater, 10(1), 175. 10.1038/s41524-024-01305-7
- Kim, Y., Kim, Y., Yang, C., Park, K., Gu, G. X., & Ryu, S. (2021). Deep Learning Framework for Material Design Space Exploration Using Active Transfer Learning and Data Augmentation. Npj Comput Mater, 7(1), 140. 10.1038/s41524-021-00609-2
- Chen, W., Xu, Z., Wang, K., Gao, L., Song, A., & Ma, T. (2025). Transferable Machine Learning Model for Multi-Target Nanoscale Simulations in Hydrogen-Carbon System from Crystal to Amorphous. Npj Comput Mater, 11(1), 119. 10.1038/s41524-025-01629-y
- Chen, C., & Ong, S. P. (2021). AtomSets as a Hierarchical Transfer Learning Framework for Small and Large Materials Datasets. Npj Comput Mater, 7(1), 173. 10.1038/s41524-021-00639-w
- Gupta, V., Choudhary, K., Tavazza, F., Campbell, C., Liao, W., Choudhary, A., & Agrawal, A. (2021). Cross-Property Deep Transfer Learning Framework for Enhanced Predictive Analytics on Small Materials Data. Nat Commun, 12(1), 6595. 10.1038/s41467-021-26921-5
- Ede, J. M. (2021). Deep Learning in Electron Microscopy. Mach. Learn.: Sci. Technol., 2(1), 011004. 10.1088/2632-2153/abd614
- Treder, K. P., Huang, C., Kim, J. S., & Kirkland, A. I. (2022). Applications of Deep Learning in Electron Microscopy. Microscopy, 71(Supplement_1), i100–i115. 10.1093/jmicro/dfab043
- Kalinin, S. V., Ophus, C., Voyles, P. M., Erni, R., Kepaptsoglou, D., Grillo, V., Lupini, A. R., Oxley, M. P., Schwenker, E., Chan, M. K. Y., Etheridge, J., Li, X., Han, G. G. D., Ziatdinov, M., Shibata, N., & Pennycook, S. J. (2022). Machine Learning in Scanning Transmission Electron Microscopy. Nat Rev Methods Primers, 2(1), 1–28. 10.1038/s43586-022-00095-w
- Madsen, J., Liu, P., Kling, J., Wagner, J. B., Hansen, T. W., Winther, O., & Schiøtz, J. (2018). A Deep Learning Approach to Identify Local Structures in Atomic-Resolution Transmission Electron Microscopy Images. Advanced Theory and Simulations, 1(8), 1800037. 10.1002/adts.201800037
- Sadre, R., Ophus, C., Butko, A., & Weber, G. H. (2021). Deep Learning Segmentation of Complex Features in Atomic-Resolution Phase-Contrast Transmission Electron Microscopy Images. Microscopy and Microanalysis, 27(4), 804–814. 10.1017/S1431927621000167
- Groschner, C., Choi, C., & Scott, M. C. (2020). High Throughput Pipeline for Segmentation and Defect Identification. 10.5281/zenodo.3755011
- Rangel DaCosta, L., Sytwu, K., Groschner, C. K., & Scott, M. C. (2024). A Robust Synthetic Data Generation Framework for Machine Learning in High-Resolution Transmission Electron Microscopy (HRTEM). Npj Comput Mater, 10(1), 1–11. 10.1038/s41524-024-01336-0
- Wang, X., Li, J., Ha, H. D., Dahl, J. C., Ondry, J. C., Moreno-Hernandez, I., Head-Gordon, T., & Alivisatos, A. P. (2021). AutoDetect-mNP: An Unsupervised Machine Learning Algorithm for Automated Analysis of Transmission Electron Microscope Images of Metal Nanoparticles. JACS Au, 1(3), 316–327. 10.1021/jacsau.0c00030