commit ace9cfa24cf3a5265cb3139363bd61d0a1c27069
parent b9b70067784cd72b9c44251b11bb5f1c76242e4b
Author: Andrew <andrewlaack1@gmail.com>
Date: Mon, 3 Jun 2024 22:17:30 -0500
Completed daily ml work
Diffstat:
6 files changed, 50 insertions(+), 1 deletion(-)
diff --git a/ClassificationProblem.md b/ClassificationProblem.md
@@ -10,4 +10,3 @@ ML 1
In other words, if there is a finite set of possible outcomes, it is a classification problem. Oftentimes this manifests as yes/no, but also could include much larger sets of possible values.
The alternative to this would be a [[RegressionProblem.md]] where the output is a continuous set of values.
-
diff --git a/ConfusionMatrix.md b/ConfusionMatrix.md
@@ -0,0 +1,8 @@
+:ml:
+# Confusion Matrix
+
+ML CH3
+
+## Notes
+
+**Definition:** A confusion matrix is a table that counts a classifier's predictions, broken down by actual class and predicted class, so it shows which classes the model confuses with one another.
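
As an illustration of the definition in ConfusionMatrix.md, here is a minimal sketch in plain Python. The function name `confusion_matrix`, the row/column orientation (rows are actual classes, columns are predicted classes), and the toy labels are assumptions made for this example, not part of the original notes.

```python
def confusion_matrix(actual, predicted, labels):
    """Count predictions per (actual, predicted) label pair.

    Rows follow the actual class, columns the predicted class
    (an orientation chosen for this sketch).
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical toy data: 1 = positive, 0 = negative.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted, labels=[0, 1])
# With labels=[0, 1] the layout is [[TN, FP], [FN, TP]].
tn, fp = cm[0]
fn, tp = cm[1]
print(cm)
```

With this layout, the off-diagonal cells hold the mistakes: `fp` counts negatives predicted as positive and `fn` counts positives predicted as negative.
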
diff --git a/CrossValidation.md b/CrossValidation.md
@@ -0,0 +1,10 @@
+:ml:
+# Cross-Validation
+
+ML CH3
+
+## Notes
+
+**Definition:** Cross-validation is the process of repeatedly splitting your data into training and validation subsets, training the model on one part and evaluating it on the part it has not seen.
+
+A common form of this is k-fold cross-validation. This splits the data into k folds (subsets) and trains k models, each on k-1 of the folds. It then checks the accuracy of each model on the one fold that was held out as its validation set. Averaging the k scores gives a more reliable performance estimate and helps to detect overfitting.
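
To make the k-fold procedure from CrossValidation.md concrete, the sketch below only generates the index splits: each fold serves once as the validation set while the remaining folds form the training set. The helper name `k_fold_indices` is hypothetical, and the sketch assumes no shuffling; in practice the data is often shuffled before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) once per fold."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


splits = list(k_fold_indices(n_samples=6, k=3))
for train, validation in splits:
    print("train:", train, "validation:", validation)
```

A model would be fit on each `train` list and scored on the matching `validation` list, and the k scores would then be averaged.
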
diff --git a/MachineLearning.md b/MachineLearning.md
@@ -83,6 +83,10 @@ Concepts:
[[KMeans.md]]
[[StochasticAlgorithm.md]]
[[Ensembles.md]]
+[[ConfusionMatrix.md]]
+[[CrossValidation.md]]
+[[Precision.md]]
+[[TruePositiveRate.md]]
To do:
diff --git a/Precision.md b/Precision.md
@@ -0,0 +1,14 @@
+:ml:
+# Precision of a classifier
+
+CH 3
+
+## Notes
+
+**Definition:** The precision of a classifier (classification model) is the accuracy of its positive predictions: the fraction of samples predicted positive that are actually positive.
+
+Here is the formula:
+
+precision = TP / (TP + FP)
+
+As can be seen, this ignores negative predictions (true negatives and false negatives); only true positives and false positives are counted.
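
A short worked example of the precision formula from Precision.md, as a sketch. The function name `precision`, the counts used, and the choice to return 0.0 when there are no positive predictions are assumptions for illustration.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    if tp + fp == 0:
        # No positive predictions were made; returning 0.0 is a convention
        # assumed for this sketch, not something the notes specify.
        return 0.0
    return tp / (tp + fp)


# Hypothetical counts: 9 true positives, 3 false positives.
p = precision(tp=9, fp=3)
print(p)
```

Note that the denominator needs parentheses: `tp / tp + fp` would evaluate as `1 + fp`.
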
diff --git a/TruePositiveRate.md b/TruePositiveRate.md
@@ -0,0 +1,14 @@
+:ml:
+# True Positive Rate (TPR) also known as recall and sensitivity
+
+ML CH3
+
+## Notes
+
+**Definition:** This is the fraction of actual positive instances that the classifier correctly identifies as positive.
+
+As such, we have the following equation:
+
+recall = TP / (TP + FN)
+
+This takes the number of true positives and divides it by the number of all actually positive samples (TP + FN).
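
The recall equation from TruePositiveRate.md can also be computed directly from label lists, as in the sketch below. The function name `recall`, the `positive` parameter, and the toy labels are hypothetical; returning 0.0 when there are no actual positives is an assumed convention.

```python
def recall(actual, predicted, positive=1):
    """True positive rate: TP / (TP + FN), counted from label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp + fn == 0:
        # No actual positives in the data (assumed convention).
        return 0.0
    return tp / (tp + fn)


# Hypothetical toy data: four actual positives, one of them missed.
actual = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 1]

score = recall(actual, predicted)
print(score)
```

The false positive in the last position does not affect the result, since recall only looks at samples whose actual label is positive.
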