commit 8c6928132ac1579f6abb7509965f7ca0eb9f7bdb
parent e1cd74d08f67bd0419a4100a1598f8800a2c7b3f
Author: Andrew <andrewlaack1@gmail.com>
Date: Fri, 17 May 2024 13:13:18 -0500

Took some notes

Diffstat:
A CorrelationCoefficient.md | 10 ++++++++++
M MachineLearning.md        |  1 +
M StratifiedSampling.md     |  2 ++
3 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/CorrelationCoefficient.md b/CorrelationCoefficient.md
@@ -0,0 +1,10 @@
+:ml: :ds: :stats:
+# Correlation Coefficient
+
+ML CH2
+
+## Notes
+
+**Definition:** The correlation coefficient is a number that measures the strength and direction of the linear relationship between two variables x and y.
+
+The highest value is 1 and the lowest is -1. A value of 1 means a perfect positive linear relationship between the two variables, and -1 means a perfect negative linear relationship.
diff --git a/MachineLearning.md b/MachineLearning.md
@@ -68,6 +68,7 @@ Concepts:
 [[RMSE.md]]
 [[MAE.md]]
 [[StratifiedSampling.md]]
+[[CorrelationCoefficient.md]]
 
 To do:
 
diff --git a/StratifiedSampling.md b/StratifiedSampling.md
@@ -8,3 +8,5 @@ ML CH2
 **Definition:** Stratified sampling is the process of selecting samples based on the likelihood of samples being from strata. This is often used when there are smaller sample sizes that can't guarantee an accurate representative sample for testing and training data.
 
 We then define some strata and try to ensure accurate representation from each grouping to get more generalizable data.
+
+Sampling each stratum in proportion to its share of the data is called proportionate allocation. The alternative, optimum (or disproportionate) allocation, sizes each stratum's sample to minimize variance (deviation).
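
The definition in the Correlation Coefficient note can be made concrete with a small calculation. The sketch below assumes the note means the standard (Pearson) coefficient, which the commit does not state explicitly. The function name `pearson_r` and the sample data are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = pearson_r(xs, [2 * x + 3 for x in xs])       # increasing line with an offset
r_down = pearson_r(xs, [-0.5 * x + 1 for x in xs])  # decreasing line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x^2, not linear
```

Here `r_up` and `r_down` come out at (approximately) 1 and -1 even though neither line passes through the origin, so the extremes indicate a perfect linear relationship rather than strict proportionality. `r_curve` is 0 although y is fully determined by x, which shows that the coefficient only captures linear dependence.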
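
Proportionate allocation, as described in the Stratified Sampling note, can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the method used in the book chapter the notes refer to. The helper `stratified_split`, the stratum labels, and the 20% test fraction are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), sampling every stratum at test_fraction."""
    rng = random.Random(seed)
    # Group row indices by their stratum label.
    by_stratum = defaultdict(list)
    for idx, label in enumerate(labels):
        by_stratum[label].append(idx)
    train_idx, test_idx = [], []
    for members in by_stratum.values():
        rng.shuffle(members)
        # Proportionate allocation: same fraction from each stratum,
        # rounded to a whole number of rows.
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical data: 100 rows in three strata with a 60/30/10 split.
labels = ["low"] * 60 + ["mid"] * 30 + ["high"] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
test_counts = {s: sum(labels[i] == s for i in test_idx)
               for s in ("low", "mid", "high")}
```

With these inputs the test set holds 12, 6, and 2 rows from the three strata, matching the 60/30/10 ratio of the full data. Because of the rounding step, the ratios are only approximate when a stratum is very small. In practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument would usually replace a hand-written helper like this.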