commit 0567b0893f86d12df1d1cac09b216f0ca18b8eba
parent 265af28c5afa0ca616237b387c395026ec128139
Author: Andrew <andrewlaack1@gmail.com>
Date: Sun, 26 May 2024 10:02:38 -0500

Clean up

Diffstat:
D ; | 28 ----------------------------
1 file changed, 0 insertions(+), 28 deletions(-)
diff --git a/; b/;
@@ -1,28 +0,0 @@
-:ml:
-# Standardization
-
-ML CH2
-
-## Notes
-
-**Definition:** Standardization is the process of rescaling each value by subtracting the mean and dividing by the standard deviation.
-
-This is preferable in some cases because [[MinMaxScaling.md]] has issues with outliers. If one outlier is much larger than all other values, the max will be very large, squashing most values into a narrow range of low numbers, which can affect the accuracy of models.
-
-See [[FeatureScaling.md]] for more.
-
-Sample implementation:
-
-```python
-
-# Get number columns
-df = df.select_dtypes(include=['number'])
-
-for i in df:
-    mean = df[i].mean()
-    std = df[i].std()
-    df[i] = (df[i] - mean) / std
-
-print(df)
-
-```
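
The deleted note describes standardization and contrasts it with min-max scaling when an outlier is present. A minimal, self-contained sketch of that comparison follows. The helper names `standardize` and `min_max_scale`, and the `sample` frame with its `income` and `city` columns, are hypothetical and not part of the removed file. Unlike the note's snippet, this version keeps non-numeric columns instead of dropping them.

```python
import pandas as pd


def standardize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with each numeric column rescaled as (x - mean) / std."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    # pandas' std() is the sample standard deviation (ddof=1) by default.
    # A constant column has std 0 and would come out as NaN.
    out[numeric] = (out[numeric] - out[numeric].mean()) / out[numeric].std()
    return out


def min_max_scale(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with each numeric column rescaled to the range [0, 1]."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    col_min = out[numeric].min()
    col_max = out[numeric].max()
    out[numeric] = (out[numeric] - col_min) / (col_max - col_min)
    return out


# Hypothetical data: four similar values and one large outlier.
sample = pd.DataFrame(
    {
        "income": [30.0, 32.0, 35.0, 31.0, 1000.0],
        "city": ["a", "b", "c", "d", "e"],
    }
)

print(standardize(sample))
print(min_max_scale(sample))
```

With this sample, min-max scaling maps the outlier to 1 and compresses the four ordinary values into a sliver just above 0, which is the issue the note attributes to [[MinMaxScaling.md]]. The standardized column has mean 0 and sample standard deviation 1 and is not forced into a fixed range.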