MODELS AND METHODS FOR TRAINING NEURAL NETWORKS WITH AN EXTENDED VECTOR OF VARYING PARAMETERS

Authors

  • Dmytro Zelentsov
  • Taras Shaptala

DOI:

https://doi.org/10.34185/1991-7848.itmm.2023.01.037

Keywords:

neural networks, neural network training task, multidimensional optimization, vector of variable parameters, gradient methods.

Abstract

A study of models and methods for training neural networks with an extended vector of varying parameters is presented. The training problem is formulated as a continuous multidimensional unconstrained optimization problem. The extended vector of varying parameters means that, in addition to the weight coefficients, it includes some parameters of the activation functions. Introducing these additional varying parameters does not change the architecture of the neural network, but it precludes the use of the backpropagation method. A number of gradient methods were therefore applied to solve the optimization problems. Different formulations of the optimization problem and methods for their solution were compared with respect to accuracy and efficiency criteria.
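To make the formulation concrete, the sketch below shows one possible reading of it; it is not the authors' implementation. A small one-hidden-layer network is trained by stacking weights, biases, and per-neuron activation-function slopes into a single extended parameter vector and minimizing the resulting unconstrained loss with plain gradient descent on finite-difference gradients, since backpropagation is taken not to apply. The network sizes, the parameterized sigmoid form, and the toy data are illustrative assumptions.

# Minimal sketch (assumed setup, not the authors' code): joint optimization of
# weights and activation-function slope parameters as one extended vector.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data, assumed purely for illustration.
X = rng.normal(size=(64, 3))
y = np.sin(X @ np.array([1.0, -2.0, 0.5]))[:, None]

n_in, n_hid, n_out = 3, 5, 1

def unpack(theta):
    """Split the extended parameter vector into weights, biases and slopes."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = theta[i:i + n_out]; i += n_out
    alpha = theta[i:i + n_hid]          # per-neuron slope of the sigmoid
    return W1, b1, W2, b2, alpha

def loss(theta):
    W1, b1, W2, b2, alpha = unpack(theta)
    h = 1.0 / (1.0 + np.exp(-alpha * (X @ W1 + b1)))   # parameterized sigmoid
    out = h @ W2 + b2
    return np.mean((out - y) ** 2)                     # mean squared error

def num_grad(f, theta, eps=1e-6):
    """Central-difference gradient, usable when backpropagation does not apply."""
    g = np.zeros_like(theta)
    for k in range(theta.size):
        e = np.zeros_like(theta); e[k] = eps
        g[k] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

# Extended vector: weights and biases plus activation slopes, optimized jointly.
theta = np.concatenate([
    rng.normal(scale=0.5, size=n_in * n_hid + n_hid + n_hid * n_out + n_out),
    np.ones(n_hid),                     # initial activation slopes
])
lr = 0.5
for step in range(500):
    theta -= lr * num_grad(loss, theta)

print("final MSE:", loss(theta))

The finite-difference descent above is only the simplest stand-in for the family of gradient methods mentioned in the abstract; any gradient-based optimizer operating on the full extended vector could be substituted.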



Published

2024-04-03

Section

Articles