notes

commit faeca6df8e1fcf28445dfebcfa449a7c6b28aa36
parent 4f6bc191bc4dcb2b37ab2e5764f99e4e112e2a09
Author: Andrew <andrewlaack1@gmail.com>
Date:   Sat, 11 May 2024 12:52:31 -0500

Taking notes ml ch1 almost done

Diffstat:
M CS202.md | 2 ++
A CanaryValue.md | 10 ++++++++++
A GeneralizationError.md | 12 ++++++++++++
M LearningRate.md | 2 ++
M MachineLearning.md | 3 +++
A Overfitting.md | 16 ++++++++++++++++
A SentinelValue.md | 23 +++++++++++++++++++++++
A Underfitting.md | 10 ++++++++++
8 files changed, 78 insertions(+), 0 deletions(-)

diff --git a/CS202.md b/CS202.md
@@ -16,3 +16,5 @@ This is the index for my cs 202 notes.
 [[BreadthFirstSearch.md]]
 [[Rvalue.md]]
 [[Lvalue.md]]
+[[SentinelValue.md]]
+[[CanaryValue.md]]
diff --git a/CanaryValue.md b/CanaryValue.md
@@ -0,0 +1,10 @@
+:cs202:
+# Canary Value
+
+CS202 SelfStudy
+
+## Notes
+
+**Definition:** A canary value is a known piece of dummy data placed next to a buffer in memory and validated at some later time to detect buffer overflows.
+
+When doing this, we write the dummy data into the memory adjacent to the buffer and later check that it is unchanged. A buffer overflow would overwrite the canary, so a changed value signals that an overflow occurred.
diff --git a/GeneralizationError.md b/GeneralizationError.md
@@ -0,0 +1,12 @@
+:ml:
+# Generalization Error
+
+ML CH1
+
+## Notes
+
+**Definition:** Generalization error, or out-of-sample error, is the error rate of a model on data that is not in the training set.
+
+When testing a model, it is important to split the samples into a training set and a test set. You train the model on the training set and then measure its error rate on the test set; that error rate is an estimate of the generalization error.
+
+It is common practice to use 80% of the data for training and 20% for testing. Sometimes a further portion of the data, called the holdout set, is set aside to give another layer of verification. This matters when several models are trained with different hyperparameters (e.g., learning rates) and the one that scores best on the 20% of test data is chosen: the choice itself has then been tuned to the test set, so the measured error is too optimistic. The holdout set is also referred to as the validation set, dev set, or development set. With it, you first train the candidate models on the training data, compare them all on the dev set, select the best one, and then evaluate it on the test set to estimate the generalization error.
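The train / dev / test workflow described in GeneralizationError.md can be sketched in plain Python. This is a minimal sketch and not part of the commit: the 60/20/20 split, the toy data, the 1-D k-nearest-neighbor model, and the names `split_dataset`, `knn_predict`, and `error_rate` are all assumptions made for illustration.

```python
import random


def split_dataset(samples, train_frac=0.6, dev_frac=0.2, seed=0):
    """Shuffle the samples and split them into train, dev, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_dev = round(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # whatever is left over
    return train, dev, test


def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (k odd)."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(1 for _, label in nearest if label)
    return votes * 2 > k


def error_rate(train, samples, k):
    """Fraction of (x, label) pairs in samples that are misclassified."""
    wrong = sum(1 for x, label in samples if knn_predict(train, x, k) != label)
    return wrong / len(samples)


# Hypothetical toy data: the label is True when x is at least 50.
data = [(x, x >= 50) for x in range(100)]
train, dev, test = split_dataset(data)

# Pick the hyperparameter k using the dev set only.
best_k = min([1, 3, 5], key=lambda k: error_rate(train, dev, k))

# Touch the test set once, at the end, to estimate generalization error.
generalization_error = error_rate(train, test, best_k)
print(best_k, generalization_error)
```

Because `best_k` is chosen on the dev set, the test set plays no part in model selection, so the final number is not biased by the tuning step.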
diff --git a/LearningRate.md b/LearningRate.md
@@ -11,3 +11,5 @@ ML L2
 See [[GradientDescentCode.md]] and [[GradientDescent.md]] for an example of when a learning rate would be used and an implementation of it.
 
 Additionally, learning rate in a higher level sense, with regard to online learning, is how quickly a model will adapt to new data.
+
+Constants like the learning rate are called "hyperparameters": values that are fixed before model training begins rather than learned by the model during training.
diff --git a/MachineLearning.md b/MachineLearning.md
@@ -62,6 +62,9 @@ Concepts:
 [[OfflineLearning.md]]
 [[OnlineLearning.md]]
 [[KNearestNeighbor.md]]
+[[Overfitting.md]]
+[[Underfitting.md]]
+[[GeneralizationError.md]]
 
 To do:
 
diff --git a/Overfitting.md b/Overfitting.md
@@ -0,0 +1,16 @@
+:ml:
+# Overfitting
+
+ML CH1
+
+## Notes
+
+**Definition:** Overfitting is when a model performs well on the data it was trained on but lacks the ability to generalize.
+
+Generally, this is caused by having a complex model with lots of features but either too few training samples or training samples that contain too much noise. This issue can be resolved by simplifying the model (decreasing the number of features), removing noise from the samples, or increasing the number of samples.
+
+Reducing the risk of overfitting by simplifying a model is called regularization. To do this, we can either remove features or limit one or more of the model's degrees of freedom. For example, in linear regression (mx+b) we can restrict the m value to a certain range. The model still has two degrees of freedom, but it is simpler and thus, in some cases, more generalizable, depending on the training samples and the inputs it is later run on.
+
+Overfitting can be seen when you train on the training data and find that the test set has a high [[GeneralizationError.md]], meaning that the model is unable to generalize.
+
+Overfitting can be thought of simply as making your model too good at the training data, which limits its ability to generalize.
diff --git a/SentinelValue.md b/SentinelValue.md
@@ -0,0 +1,23 @@
+:cs202:
+# Sentinel Value
+
+CS202 (personal learning)
+
+## Notes
+
+**Definition:** A sentinel value is a constant value used to end an execution loop.
+
+This is also referred to as a flag value, trip value, rogue value, signal value, or dummy data.
+
+For example, in a BFS algorithm, -1 can act as a sentinel that denotes a visited location. Here -1 is an in-band piece of data (valid based on type), but it is distinct from the legal data values (i.e., the non-negative values, if we are using a graph with non-negative weights).
+
+Another example that uses -1 as a sentinel value, here to signal that no match was found:
+
+```python3
+
+def find(arr, val):
+    for i, item in enumerate(arr):
+        if item == val:
+            return i
+    return -1
+```
diff --git a/Underfitting.md b/Underfitting.md
@@ -0,0 +1,10 @@
+:ml:
+# Underfitting
+
+ML CH1
+
+## Notes
+
+**Definition:** Underfitting is using a model that is too simple to learn the underlying structure of the data.
+
+See [[Overfitting.md]] for the opposite problem.
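The contrast between Underfitting.md and Overfitting.md can be illustrated by comparing training error with test error for models of different complexity. This is a rough sketch and not part of the commit: the noisy linear toy data and the three models (`fit_constant`, `fit_line`, `fit_memorizer`) are assumptions chosen for illustration, using only the standard library.

```python
import random


def make_samples(rng, n):
    """Hypothetical data: y = 2x + 1 plus Gaussian noise."""
    samples = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        samples.append((x, 2 * x + 1 + rng.gauss(0, 3)))
    return samples


def mse(model, samples):
    """Mean squared error of model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)


def fit_constant(train):
    """Too simple: always predicts the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Ordinary least-squares line m*x + b."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return lambda x: m * x + b


def fit_memorizer(train):
    """Too flexible: returns the target of the nearest training point."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


rng = random.Random(0)
train = make_samples(rng, 30)
test = make_samples(rng, 30)

results = {}
for name, fit in [("constant", fit_constant),
                  ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(train)
    results[name] = (mse(model, train), mse(model, test))
    print(name, results[name])
```

The memorizer reproduces every training target exactly, so its training error is zero while its test error is not, which is the overfitting signature described in the notes. The constant model has a high error on both sets, which is the underfitting case.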