notes

commit 0567b0893f86d12df1d1cac09b216f0ca18b8eba
parent 265af28c5afa0ca616237b387c395026ec128139
Author: Andrew <andrewlaack1@gmail.com>
Date:   Sun, 26 May 2024 10:02:38 -0500

Clean up

Diffstat:
 D ; | 28 ----------------------------
1 file changed, 0 insertions(+), 28 deletions(-)

diff --git a/; b/;
@@ -1,28 +0,0 @@
-:ml:
-# Standardization
-
-ML CH2
-
-## Notes
-
-**Definition:** Standardization rescales each value by subtracting the mean and then dividing by the standard deviation.
-
-This is preferable in some cases because [[MinMaxScaling.md]] is sensitive to outliers. If one outlier is much larger than all other values, the max becomes very large, which squeezes most values into a narrow range of low numbers and can affect the accuracy of models.
-
-See [[FeatureScaling.md]] for more.
-
-Sample implementation:
-
-```python
-
-# Get number columns
-df = df.select_dtypes(include=['number'])
-
-for i in df:
-    mean = df[i].mean()
-    std = df[i].std()
-    df[i] = (df[i] - mean) / std
-
-print(df)
-
-```
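
The outlier remark in the deleted note can be illustrated with a small sketch. The data, the column name `income`, and the helper names `min_max_scale` and `standardize` are made up for illustration and are not part of the original note. It assumes pandas is installed and relies on `Series.std()` using the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical data: four ordinary values and one large outlier.
df = pd.DataFrame({"income": [30.0, 35.0, 40.0, 45.0, 1000.0]})


def min_max_scale(col: pd.Series) -> pd.Series:
    """Map the column onto [0, 1] using its min and max."""
    return (col - col.min()) / (col.max() - col.min())


def standardize(col: pd.Series) -> pd.Series:
    """Subtract the mean and divide by the sample standard deviation."""
    return (col - col.mean()) / col.std()


scaled = min_max_scale(df["income"])
z = standardize(df["income"])

# With min-max scaling the outlier is pinned to 1.0 and the four
# ordinary values are squeezed into a small interval just above 0.
print(scaled)

# After standardization the column has mean 0 and sample std 1.
# The outlier still influences the mean and std used for scaling.
print(z)
```

Standardization does not remove the outlier's influence, since the mean and standard deviation are both computed from it, but the result is not forced into a fixed [0, 1] range set by the single largest value.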