# Overfitting

ML CH1

**Definition:** Overfitting occurs when a model performs well on the data it was trained on but fails to generalize to new data.

Generally, this is caused by a model that is too complex (lots of features) for the amount of training data available, or by training samples that contain too much noise. It can be addressed by simplifying the model (using fewer features), removing noise from the samples, or increasing the number of samples.

Reducing the risk of overfitting by simplifying or constraining a model is called regularization. We can either remove features or limit one or more of the model's degrees of freedom. For example, in linear regression (y = mx + b) we can restrict the slope m to a certain range. The model still has two degrees of freedom, but it is effectively simpler and thus, in some cases, generalizes better, depending on the training samples and the inputs it is asked to predict on.

Overfitting shows up when a model has a low error on the training data but a high [GeneralizationError](GeneralizationError.md) on the test set, meaning that the model is unable to generalize.

An easy way to think about overfitting is that the model becomes too good at fitting the training data, which limits its ability to generalize.
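
The ideas in this note can be sketched with a small NumPy experiment. Everything specific here is an assumption chosen for illustration: the underlying line y = 2x + 1, the noise level, the sample sizes, the degree-9 polynomial standing in for a "complex model", and the helper names `features`, `fit`, and `mse`. The L2 (ridge) penalty is one common way to limit a model's freedom; it is not the slope-range constraint described in this note, just a related technique that is easy to write down.

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_STD = 0.5  # assumed noise level, chosen only for illustration


def true_line(x):
    # Hypothetical ground truth: a straight line y = 2x + 1
    return 2.0 * x + 1.0


# Small, noisy training set and a larger test set drawn from the same process
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_line(x_train) + rng.normal(0.0, NOISE_STD, size=x_train.shape)
x_test = rng.uniform(-1.0, 1.0, size=500)
y_test = true_line(x_test) + rng.normal(0.0, NOISE_STD, size=x_test.shape)


def features(x, degree):
    # Polynomial features: columns are x^0, x^1, ..., x^degree
    return np.vander(x, degree + 1, increasing=True)


def fit(x, y, degree, l2_penalty=0.0):
    # Least squares with an optional L2 penalty, solved by appending
    # sqrt(l2_penalty) * I as extra rows (this penalizes every weight,
    # including the intercept, to keep the sketch short).
    X = features(x, degree)
    n_params = degree + 1
    X_aug = np.vstack([X, np.sqrt(l2_penalty) * np.eye(n_params)])
    y_aug = np.concatenate([y, np.zeros(n_params)])
    weights, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return weights


def mse(weights, x, y):
    # Mean squared error of the fitted polynomial on (x, y)
    predictions = features(x, len(weights) - 1) @ weights
    return float(np.mean((predictions - y) ** 2))


# Simple model: a straight line (2 parameters)
w_simple = fit(x_train, y_train, degree=1)
# Complex model: degree-9 polynomial (10 parameters for 10 samples)
w_complex = fit(x_train, y_train, degree=9)
# Same complex model, but with its weights constrained by an L2 penalty
w_ridge = fit(x_train, y_train, degree=9, l2_penalty=0.1)

simple_train, simple_test = mse(w_simple, x_train, y_train), mse(w_simple, x_test, y_test)
complex_train, complex_test = mse(w_complex, x_train, y_train), mse(w_complex, x_test, y_test)
ridge_train, ridge_test = mse(w_ridge, x_train, y_train), mse(w_ridge, x_test, y_test)

print(f"simple   train={simple_train:.4f}  test={simple_test:.4f}")
print(f"complex  train={complex_train:.4f}  test={complex_test:.4f}")
print(f"ridge    train={ridge_train:.4f}  test={ridge_test:.4f}")
```

The pattern to look for is the gap between the two columns: the degree-9 model can pass through all ten noisy training points, so its training error is close to zero, while its test error is higher than that of the straight line. The penalized model gives up some training accuracy and, in setups like this one, usually ends up with a test error closer to the simple model's, though the exact numbers depend on the seed, the noise level, and the penalty strength.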