notes


commit e7816522f701354b60125d78469103c62086b736
parent 7974bb6da9e46e6f4f5d2ec6a13142e4c06a1b53
Author: Andrew <andrewlaack1@gmail.com>
Date:   Wed, 13 Nov 2024 20:28:21 -0600

Took notes on Ch 1 of DL book

Diffstat:
A LossFunction.md               | 12 ++++++++++++
M MachineLearning.md            |  4 ++--
M ReinforcementLearning.md      |  4 +---
M TemporalDifferenceLearning.md |  8 +++++---
A UtilityFunction.md            |  8 ++++++++
5 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/LossFunction.md b/LossFunction.md
@@ -0,0 +1,12 @@
+:ml: :dl:
+# Loss Function
+
+Ch 1
+
+## Notes
+
+**Definition:** A loss function is a function from E -> R where E is the set of all events (outcomes) and R is the set of all real numbers where the function describes how bad a given event E is.
+
+When I say 'event' this is in the most general of senses. In the case of RL this could simply be a state and in supervised learning this could be a prediction based on a sample.
+
+When defining a loss function, we are stipulating how bad a result is.
diff --git a/MachineLearning.md b/MachineLearning.md
@@ -35,11 +35,11 @@ Deep Learning With Python (Francois Chollet):
 Ch 1:

 * [RepresentationLearning](RepresentationLearning.md)
-* LossFunction - Cost Function - Objective Function
+* [LossFunction](LossFunction.md)
+* [UtilityFunction](UtilityFunction.md)

 Ch 2:

-FINISH CH 1 NOTES FIRST!!!

 ISL Python:
diff --git a/ReinforcementLearning.md b/ReinforcementLearning.md
@@ -1,8 +1,6 @@
 :ml: :index: :rl:
 # Reinforcement Learning

-Reinforcement Learning Index
-
 Reinforcement Learning: An Introduction (Sutton & Barto)

 Chapter 1 (Introduction)
@@ -28,7 +26,7 @@ L1
 L2

 * [MarkovDecisionProcesses](MarkovDecisionProcesses.md)
-* [MarkovAssumption](MarkovAssumption.md) - Also referred to as Markov Property
+* [MarkovAssumption](MarkovAssumption.md)
 * [DiscountFactor](DiscountFactor.md)
 * [MarkovRewardProcess](MarkovRewardProcess.md)
 * [MarkovProcess](MarkovProcess.md)
diff --git a/TemporalDifferenceLearning.md b/TemporalDifferenceLearning.md
@@ -1,8 +1,10 @@
 :ml: :rl:
-#
-
+# Temporal Difference Learning
+L4
 ## Notes

-**Definition:**
+**Definition:** Temporal difference learning is a reinforcement learning process where we update the estimate of being in any given state by using the discounted value of next steps.
+
+This is different than MC because it does not require us to finish the episode, instead we can rely upon other states to calculate our expected return.
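The definition added to TemporalDifferenceLearning.md can be made concrete with a minimal tabular sketch. It assumes the simplest one-step variant, usually called TD(0), and places a Monte Carlo update next to it for contrast. The function names, the step size `alpha`, the discount `gamma`, and the toy states and rewards are all hypothetical and are not taken from the notes.

```python
# Illustrative sketch only: alpha, gamma, and the toy states/rewards are
# assumptions, not values from the notes.


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Nudge values[state] toward reward + gamma * values[next_state]."""
    td_target = reward + gamma * values[next_state]  # bootstrapped estimate
    td_error = td_target - values[state]
    values[state] += alpha * td_error
    return td_error


def mc_update(values, state, rewards_until_end, alpha=0.1, gamma=0.9):
    """Monte Carlo contrast: needs every reward up to the end of the episode."""
    full_return = sum(gamma ** k * r for k, r in enumerate(rewards_until_end))
    values[state] += alpha * (full_return - values[state])
    return full_return


td_values = {"A": 0.0, "B": 1.0}
# A single observed transition A -> B with reward 0.5 is enough for TD(0).
td_error = td0_update(td_values, "A", reward=0.5, next_state="B")

mc_values = {"A": 0.0, "B": 1.0}
# MC has to wait for the full reward sequence from A to the terminal state.
mc_return = mc_update(mc_values, "A", rewards_until_end=[0.5, 1.0])
```

The TD update only reads the current estimate for the next state, which is why it can run before the episode ends, while the MC update cannot be computed until the whole reward sequence is known.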
diff --git a/UtilityFunction.md b/UtilityFunction.md
@@ -0,0 +1,8 @@
+:dl: :ml: :rl:
+# Utility Function
+
+Ch 1
+
+## Notes
+
+**Definition:** A utility function is a function from E -> R where E is the set of events, R is the set of real numbers, and the mapping describes how good the event is.
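The loss and utility definitions in this commit both describe a map from events to real numbers, differing only in whether larger values mean worse or better. A small sketch may help, taking the supervised-learning reading of 'event' from the notes (a prediction based on a sample). Squared error as the loss, and utility as the negated loss, are illustrative assumptions; the notes do not prescribe either, and the function names are hypothetical.

```python
# Illustrative sketch: squared error and the negation convention are
# assumptions, chosen only to make the E -> R mappings concrete.


def squared_error_loss(prediction: float, target: float) -> float:
    """Loss: larger values mean a worse event (here, a worse prediction)."""
    return (prediction - target) ** 2


def utility_from_loss(loss_value: float) -> float:
    """Utility: larger values mean a better event; here the negated loss."""
    return -loss_value


# Each event is a (prediction, target) pair.
events = [(3.0, 1.0), (1.5, 1.0), (1.0, 1.0)]
losses = [squared_error_loss(p, t) for p, t in events]
utilities = [utility_from_loss(value) for value in losses]
```

Under this convention the event with the smallest loss is also the event with the largest utility, so minimizing one is equivalent to maximizing the other.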