notes


commit d27daa652ce245bac427d0d040769aac4e022414
parent 96f02ace3bd43a4b39b0a50b7a3f4f9aca469cde
Author: Andrew <andrewlaack1@gmail.com>
Date:   Tue,  4 Jun 2024 22:46:16 -0500

Completed notes for the day

Diffstat:
M GradientDescent.md           |  4 +++-
M LinearRegression.md          | 10 ++++++++++
M MachineLearning.md           |  3 +++
A MultilabelClassification.md  | 10 ++++++++++
A MultioutputClassification.md |  8 ++++++++
A PartialDerivative.md         | 10 ++++++++++
6 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/GradientDescent.md b/GradientDescent.md
@@ -11,8 +11,10 @@ General idea is to start with some $\theta$ (parameters) and keep changing it to
 More specifically, you pick a starting point, see what direction you should go to get closer to 0 the fastest. You then repeat this algorithm. It's not perfect, but it's fast.
-This is the algorithm used for [[LinearRegression.md]] to minimize the cost function (also defined in linear regression).
+This is a common algorithm for [[LinearRegression.md]] when there are many features or too many samples to fit in memory, either of which makes the closed-form solution for linear regression too slow or impractical.
 For a simple implementation of gradient descent using a [[LearningRate.md]] for third degree polynomials see [[GradientDescentCode.md]].
+When using gradient descent for linear regression, calculate the partial derivative of the cost function with respect to each parameter, then move each parameter in the direction opposite to the sign of its partial derivative.
+Batch gradient descent computes the gradient for all parameters at every step using all of the training samples.
diff --git a/LinearRegression.md b/LinearRegression.md
@@ -17,3 +17,13 @@ from sklearn.linear_model import LinearRegression
 model = LinearRegression()
 ```
+
+Note that the constant term of a linear regression model is referred to as the bias term or the intercept term.
+
+The normal equation for linear regression (the closed-form solution) is as follows:
+
+$\theta = (X^{T} X)^{-1} X^{T} y$
+
+Where $y$ is an $m \times 1$ vector of target values and $X$ is the matrix of inputs, one row per training instance, with a column of ones added for the intercept term.
+
+The closed-form approach works well when the number of features is not massive. If there are many features, or the training instances are too numerous to fit into memory, then [[GradientDescent.md]] is the better choice.
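To make the comparison in these two notes concrete, here is a minimal sketch that fits a one-feature linear model both ways: with the normal equation and with batch gradient descent. It is an illustration only, not the code from [[GradientDescentCode.md]]. The function names, the `learning_rate` and `n_steps` defaults, the sample data, and the choice of mean squared error as the cost function are all assumptions made for this example.

```python
# Sketch only: fit y = theta0 + theta1 * x two ways.
# Function names, defaults, and the MSE cost are assumptions for illustration.

def normal_equation(xs, ys):
    """Closed-form solution of (X^T X) theta = X^T y for a single feature.

    X has a column of ones (intercept) and a column holding xs, so
    X^T X = [[m, sx], [sx, sxx]] and X^T y = [sy, sxy].
    """
    m = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = m * sxx - sx * sx  # zero when every x is identical
    theta0 = (sxx * sy - sx * sxy) / det
    theta1 = (m * sxy - sx * sy) / det
    return theta0, theta1


def batch_gradient_descent(xs, ys, learning_rate=0.05, n_steps=5000):
    """Minimize mean squared error using every sample at every step."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(n_steps):
        # Prediction error for each training sample.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to each parameter.
        grad0 = (2.0 / m) * sum(errors)
        grad1 = (2.0 / m) * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient: a positive derivative lowers the parameter.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
print(normal_equation(xs, ys))
print(batch_gradient_descent(xs, ys))
```

Both calls should land on roughly the same intercept and slope. The closed-form version needs one pass over the data plus a small matrix solve, while the iterative version trades many cheap passes for not having to form or invert $X^{T} X$.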
diff --git a/MachineLearning.md b/MachineLearning.md
@@ -95,6 +95,9 @@ Concepts:
 [[MulticlassClassifier.md]]
 [[OneVersusAll.md]]
 [[OneVersusOne.md]]
+[[MultilabelClassification.md]]
+[[MultioutputClassification.md]]
+[[PartialDerivative.md]]
 To do:
diff --git a/MultilabelClassification.md b/MultilabelClassification.md
@@ -0,0 +1,10 @@
+:ml:
+# Multilabel Classification
+
+ML D2
+
+## Notes
+
+**Definition:** Multilabel classification is classification where each instance has several binary outputs, and more than one of them may be true at once.
+
+An example of this would be a person recognition model. Let's say we want to know if Bob, Jim, or Mary are in an image. If Bob and Jim are in the image, the model should return [true, true, false] or some similarly understandable output.
diff --git a/MultioutputClassification.md b/MultioutputClassification.md
@@ -0,0 +1,8 @@
+:ml:
+# Multioutput Classification
+
+ML D2
+
+## Notes
+
+**Definition:** Multioutput classification is a generalization of multilabel classification where each output can take more than two classes (each label is multiclass rather than binary).
diff --git a/PartialDerivative.md b/PartialDerivative.md
@@ -0,0 +1,10 @@
+:ml: :calc:
+# Partial Derivative
+
+ML D2
+
+## Notes
+
+**Definition:** The partial derivative is the derivative of a multivariate function with respect to a single variable, treating the other variables as constants.
+
+This is often used in [[GradientDescent.md]] to determine how each parameter needs to change.
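The partial derivative definition above can be checked numerically. The sketch below estimates a partial derivative with a central difference, nudging one coordinate while holding the others fixed. The helper `partial_derivative`, the step size `h`, and the example function `f` are hypothetical names chosen for this illustration and do not come from the notes.

```python
# Sketch only: numerical partial derivative via a central difference.
# `partial_derivative`, `h`, and `f` are illustrative assumptions.

def partial_derivative(f, point, index, h=1e-6):
    """Estimate df/dx_index at `point`, treating all other coordinates as constants."""
    up = list(point)
    down = list(point)
    up[index] += h    # nudge only the chosen variable upward
    down[index] -= h  # and downward; every other variable stays fixed
    return (f(*up) - f(*down)) / (2.0 * h)


def f(x, y):
    # Example function; analytically df/dx = 2*x*y and df/dy = x**2 + 3.
    return x ** 2 * y + 3.0 * y


point = [2.0, 5.0]
print(partial_derivative(f, point, 0))  # estimate of df/dx at (2, 5)
print(partial_derivative(f, point, 1))  # estimate of df/dy at (2, 5)
```

The sign of each estimate is what gradient descent acts on: a positive partial derivative means increasing that variable increases the function, so the update moves that variable down.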