# (Artificial) Neural Networks (ANNs)

ML D5

**Definition:** Artificial neural networks are machine learning models that mimic biological neurons to complete some task.

By default the output layer has no activation function, so its outputs are unconstrained. A ReLU activation on the output layer forces the output to be non-negative. Softplus, a smooth version of ReLU, can be used instead and keeps the output strictly positive.
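
As a minimal sketch of the two output activations, the snippet below applies ReLU and softplus to a handful of made-up pre-activation values. The function names and the `raw_outputs` list are illustrative assumptions, not part of any particular library.

```python
import math


def relu(x: float) -> float:
    """ReLU: clips negative values to zero, so the result is >= 0."""
    return max(0.0, x)


def softplus(x: float) -> float:
    """Softplus: log(1 + exp(x)), a smooth approximation of ReLU."""
    # Numerically stable form: max(x, 0) + log(1 + exp(-|x|)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical raw (pre-activation) values from an output layer.
raw_outputs = [-3.0, -0.5, 0.0, 0.5, 3.0]

relu_outputs = [relu(z) for z in raw_outputs]
softplus_outputs = [softplus(z) for z in raw_outputs]
```

ReLU maps every negative input to exactly zero, while softplus approaches zero smoothly and stays slightly above ReLU everywhere (at 0 it equals ln 2).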

### Hidden Layer Count Selection

Deeper neural networks have better parameter efficiency. This means they need fewer neurons to model complex functions than shallower NNs do.
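
To make the bookkeeping concrete, here is a small sketch that counts the weights and biases of a fully connected network. The two layouts (20 inputs, 1 output, and the chosen hidden widths) are hypothetical numbers picked for illustration. The snippet only counts; it does not show that the deeper layout models a given function equally well.

```python
def count_params(layer_sizes):
    """Count weights plus biases of a fully connected network.

    layer_sizes lists the width of every layer, input and output included.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical layouts with 20 inputs and 1 output.
shallow = [20, 300, 1]       # one hidden layer, 300 hidden neurons
deep = [20, 50, 50, 50, 1]   # three hidden layers, 150 hidden neurons in total

shallow_params = count_params(shallow)  # 6601
deep_params = count_params(deep)        # 6201
```

Here the deeper layout uses half as many hidden neurons and slightly fewer parameters than the shallow one.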

### Neuron Count Per Layer

In most cases all hidden layers are the same size. Sometimes the layers are instead arranged as a pyramid, with each layer smaller than the one before, because each layer picks out different information that coalesces into higher-level information. Another common approach is to make the first hidden layer large and give all subsequent layers the same, smaller size.

In most cases, making all layers the same size is as accurate as a pyramid structure and leaves fewer hyperparameters to tune, which is a good thing.

In short: by default, make all hidden layers the same size. Sometimes the first hidden layer is larger and the rest share a smaller size. Sometimes a pyramid is used, but this increases the number of hyperparameters.
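
The three layouts above can be sketched as small helpers that return a list of hidden-layer widths. The function names and the linear shrink used for the pyramid are assumptions made for illustration; a pyramid could just as well shrink geometrically or use hand-picked widths.

```python
def uniform(n_layers, width):
    """Every hidden layer has the same width (one width to tune)."""
    return [width] * n_layers


def large_first(n_layers, first_width, rest_width):
    """A large first hidden layer followed by equal, smaller layers."""
    return [first_width] + [rest_width] * (n_layers - 1)


def pyramid(n_layers, first_width, last_width):
    """Widths shrink linearly from first_width down to last_width."""
    if n_layers == 1:
        return [first_width]
    step = (first_width - last_width) / (n_layers - 1)
    return [round(first_width - i * step) for i in range(n_layers)]


same = uniform(3, 100)            # [100, 100, 100]
big_first = large_first(3, 300, 100)  # [300, 100, 100]
shrinking = pyramid(3, 300, 100)  # [300, 200, 100]
```

The uniform layout needs a single width, the large-first layout needs two, and a free-form pyramid needs one width per layer, which is where the extra hyperparameters come from.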

### Count Info (Combined # of layers and neurons per layer)

Sometimes we use a "stretch pants" approach: select a model with more layers and neurons than needed, then use early stopping to prevent overfitting.
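
A minimal sketch of the early stopping half of this approach, assuming the validation loss is recorded once per epoch. The `patience` parameter, the function name, and the loss values are hypothetical; a real training loop would also save the weights at the best epoch and restore them after stopping.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (best_epoch, best_loss) under patience-based early stopping.

    Scanning stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_loss


# Hypothetical validation losses of an oversized model that starts to overfit.
val_losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.41, 0.47, 0.55, 0.70]

best_epoch, best_loss = early_stopping_epoch(val_losses, patience=3)
```

With these numbers the loss bottoms out at epoch 3, and the loop stops at epoch 6 before the later, worse epochs are reached.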

Generally, increasing the number of layers is better than increasing the number of neurons per layer.