VanishingGradients.md (636B)
1 # Vanishing Gradients 2 3 ML 550 4 5 **Definition:** Vanishing gradients is a neural network problem where lower levels (earlier hidden layers) have such small gradients that gradient steps make tiny changes and the model never converges upon an a good solution. 6 7 This is a very common problem as most of the time gradients get smaller and smaller. As such, this problem is much more common than [ExplodingGradients](ExplodingGradients.md) which primarly happens with RNNs. 8 9 ### Solutions 10 11 Use ReLU and better weight initialization (not gaussian distribution with std deviation of 1). 12 13 See [UnstableGradients](UnstableGradients.md) for more.