Document Type
Article
Publication Date
9-23-2025
Publication Title
Computers, Materials and Continua
Abstract
Transfer learning is the predominant method for adapting models pre-trained on one task to new domains while preserving their internal architecture and augmenting them with the requisite layers of a deep neural network. Fine-tuning intricate pre-trained models on a sizable dataset demands significant resources and careful hyperparameter tuning. Most existing initialization methods focus mainly on gradient-flow problems, such as vanishing or exploding gradients, while other approaches require extra models and do not consider our setting, which is more practical. To address these problems, we propose gradient-free heuristic methods that initialize the weights of the newly added final fully connected layer in a neural network from a small set of training data with few classes. The approach partitions the output values of the pre-trained model on this small set into two separate intervals determined by the targets, framing the task as an optimization problem for each output neuron and class. The optimization selects the highest values as weights, taking their direction towards the respective classes into account. Furthermore, empirical experiments involving a variety of neural network models, tested across multiple benchmarks and domains, occasionally yield accuracies comparable to those achieved with gradient-descent methods while using only small subsets.
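The abstract only outlines the heuristic, so the following is a minimal, hypothetical sketch of the general idea rather than the authors' actual algorithm: for each class and each output neuron of the frozen pre-trained model, the responses on a small labelled subset are split into two groups by the targets, and the signed separation between the groups is used as the initial weight of the new fully connected layer. The function name and the use of per-feature means are assumptions for illustration.

```python
import numpy as np

def heuristic_fc_init(features, labels, num_classes):
    """Gradient-free initialization of a newly added final FC layer.

    features: (n_samples, n_features) outputs of the frozen pre-trained
              model on a small labelled subset.
    labels:   (n_samples,) integer class targets.
    Returns a (num_classes, n_features) weight matrix.

    Sketch only: each class/feature pair yields two groups of output
    values (in-class vs. rest); the signed gap between the two groups
    scores how well, and in which direction, that neuron points towards
    the class, and is taken as the initial weight.
    """
    n_features = features.shape[1]
    W = np.zeros((num_classes, n_features))
    for c in range(num_classes):
        in_class = features[labels == c]
        rest = features[labels != c]
        # Signed separation between the two target-determined groups:
        # positive -> the neuron fires higher for class c.
        W[c] = in_class.mean(axis=0) - rest.mean(axis=0)
    return W
```

A linear classifier initialized this way can then be used directly or refined; the abstract reports that, on occasion, such gradient-free initialization alone approaches gradient-descent accuracy on small subsets.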
First Page
4155
Last Page
4171
Volume
85
Issue
2
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Lolaev, Musulmon; Paul, Anand; and Kim, Jeonghong, "Heuristic Weight Initialization for Transfer Learning in Classification Problems" (2025). School of Public Health Faculty Publications. 525.
https://digitalscholar.lsuhsc.edu/soph_facpubs/525
10.32604/cmc.2025.064758
Included in
Artificial Intelligence and Robotics Commons, Biostatistics Commons, Data Science Commons, Theory and Algorithms Commons