Adaptive Momentum for Neural Network Optimization

Rashidi, Zana

Files

Rashidi_Zana_2019_MSc.pdf (2.98 MB)

Date

2020-05-11

Authors

Rashidi, Zana

Abstract

In this thesis, we develop a novel and efficient algorithm for optimizing neural networks, inspired by a recently proposed geodesic optimization algorithm. Our algorithm, which we call Stochastic Geodesic Optimization (SGeO), applies an adaptive coefficient on top of Polyak's Heavy Ball method, controlling how much weight is placed on the previous parameter update based on the change of direction in the optimization path. Experimental results on strongly convex functions with Lipschitz gradients and on deep autoencoder benchmarks show that SGeO reaches lower errors than established first-order methods and is competitive with K-FAC (Kronecker-Factored Approximate Curvature), a recent second-order method, reaching lower or similar errors. We also incorporate a Nesterov-style lookahead gradient into our algorithm (SGeO-N) and observe notable improvements. We believe that our research will open up new directions for high-dimensional neural network optimization, where combining the efficiency of first-order methods with the effectiveness of second-order methods is a promising avenue to explore.
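
The general idea described in the abstract, a Heavy Ball update whose momentum coefficient adapts to the change of direction, can be illustrated with a minimal sketch. This is not the SGeO update rule from the thesis: the coefficient formula below (base momentum scaled by the cosine alignment between the descent direction and the previous update) is an assumption chosen only for illustration, and the function names, step size, and test quadratic are all hypothetical.

```python
import math


def grad_quadratic(x, scales):
    """Gradient of f(x) = 0.5 * sum(s_i * x_i^2), a strongly convex quadratic."""
    return [s * xi for s, xi in zip(scales, x)]


def loss_quadratic(x, scales):
    """Value of f(x) = 0.5 * sum(s_i * x_i^2)."""
    return 0.5 * sum(s * xi * xi for s, xi in zip(scales, x))


def cosine(u, v, eps=1e-12):
    """Cosine of the angle between u and v; 0.0 if either vector is (near) zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u < eps or norm_v < eps:
        return 0.0
    return dot / (norm_u * norm_v)


def adaptive_heavy_ball(x0, scales, lr=0.05, base_momentum=0.9, steps=200):
    """Heavy Ball iteration with a direction-dependent momentum coefficient.

    ASSUMPTION: the coefficient rule is illustrative, not the thesis's formula.
    It keeps more of the previous update when that update still points along
    the current descent direction, and less when the path has turned.
    """
    x = list(x0)
    d = [0.0] * len(x)  # previous update to the parameters
    for _ in range(steps):
        g = grad_quadratic(x, scales)
        descent = [-gi for gi in g]
        align = cosine(descent, d)  # in [-1, 1]
        coeff = base_momentum * 0.5 * (1.0 + align)  # in [0, base_momentum]
        d = [coeff * di - lr * gi for di, gi in zip(d, g)]
        x = [xi + di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    scales = [1.0, 10.0]
    x_final = adaptive_heavy_ball([1.0, 1.0], scales)
    print("final loss:", loss_quadratic(x_final, scales))
```

With a fixed coefficient in place of `coeff`, the loop reduces to the classical Heavy Ball method, which makes it easy to compare the two on the same problem.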

Keywords

Computer science

URI

https://hdl.handle.net/10315/37485

Collections

Computer Science and Engineering
