Document Type

Article

Publication Date

9-23-2025

Publication Title

Computers, Materials and Continua

Abstract

Transfer learning is the predominant method for adapting models pre-trained on one task to new domains in deep neural networks, preserving their internal architecture while augmenting them with the layers the new task requires. Training such intricate pre-trained models on a sizable dataset demands significant resources and careful hyperparameter fine-tuning. Most existing initialization methods focus on gradient-flow problems such as vanishing or exploding gradients, while other approaches require additional models and do not consider our more practical setting. To address these problems, we suggest employing gradient-free heuristic methods to initialize the weights of the newly added final fully connected layer of a neural network from a small set of training data with few classes. The approach relies on partitioning the output values of the pre-trained model on this small set into two separate intervals determined by the targets. This process is framed as an optimization problem for each output neuron and class; the optimization selects the highest values as weights, taking into account their direction towards the respective classes. Empirical experiments involve a variety of neural network models tested across multiple benchmarks and domains, occasionally yielding accuracies comparable to those achieved with gradient-descent methods while using only small subsets.
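The sketch below illustrates the general idea of gradient-free initialization of a newly added final fully connected layer from a small labelled subset. It is a minimal illustration only, assuming NumPy, a hypothetical function name, and a simple separation score (the gap between per-class and out-of-class feature means); the exact scoring rule and selection procedure in the paper may differ.

import numpy as np

def heuristic_fc_init(features, labels, num_classes):
    """Gradient-free initialization of a new final fully connected layer.

    features : (n_samples, n_features) outputs of the frozen pre-trained model
               on a small labelled subset.
    labels   : (n_samples,) integer class targets.
    Returns a (num_classes, n_features) weight matrix.
    """
    n_features = features.shape[1]
    weights = np.zeros((num_classes, n_features))
    for c in range(num_classes):
        in_class = features[labels == c]    # samples of the target class
        out_class = features[labels != c]   # all remaining samples
        # Signed gap between the two per-feature value intervals; the sign
        # gives the weight's direction towards the class (assumed score).
        score = in_class.mean(axis=0) - out_class.mean(axis=0)
        # Keep only the strongest separations; zero out the rest.
        k = max(1, n_features // 10)        # assumed sparsity level
        top = np.argsort(np.abs(score))[-k:]
        weights[c, top] = score[top]
    return weights

# Example: initialize a 3-class head from 30 random 64-dimensional features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(30, 64))
labs = rng.integers(0, 3, size=30)
W = heuristic_fc_init(feats, labs, num_classes=3)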

First Page

4155

Last Page

4171

Volume

85

Issue

2

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.
