commit e7816522f701354b60125d78469103c62086b736
parent 7974bb6da9e46e6f4f5d2ec6a13142e4c06a1b53
Author: Andrew <andrewlaack1@gmail.com>
Date: Wed, 13 Nov 2024 20:28:21 -0600
Took notes on Ch 1 of DL book
Diffstat:
5 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/LossFunction.md b/LossFunction.md
@@ -0,0 +1,12 @@
+:ml: :dl:
+# Loss Function
+
+Ch 1
+
+## Notes
+
+**Definition:** A loss function is a function L: E -> R, where E is the set of all events (outcomes) and R is the set of all real numbers, that describes how bad a given event e in E is.
+
+When I say 'event', I mean it in the most general sense. In RL this could simply be a state, and in supervised learning it could be a prediction made for a sample.
+
+When defining a loss function, we are stipulating how bad a result is.
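
As a minimal sketch of the definition above, assume the supervised case where an 'event' is a prediction paired with its target. Squared error is my choice of example here, not something the notes prescribe, and the name `squared_error` is hypothetical.

```python
def squared_error(prediction: float, target: float) -> float:
    """Map one event (a prediction and its target) to a real number.

    Assumption: squared error is used purely as an illustration of a loss.
    """
    return (prediction - target) ** 2


# A worse prediction maps to a larger real number, i.e. a "worse" event.
close = squared_error(2.9, 3.0)
far = squared_error(1.0, 3.0)
print(close < far)
```

A perfect prediction gives a loss of 0, and the loss grows as the prediction moves away from the target.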
diff --git a/MachineLearning.md b/MachineLearning.md
@@ -35,11 +35,11 @@ Deep Learning With Python (Francois Chollet):
Ch 1:
* [RepresentationLearning](RepresentationLearning.md)
-* LossFunction - Cost Function - Objective Function
+* [LossFunction](LossFunction.md)
+* [UtilityFunction](UtilityFunction.md)
Ch 2:
-FINISH CH 1 NOTES FIRST!!!
ISL Python:
diff --git a/ReinforcementLearning.md b/ReinforcementLearning.md
@@ -1,8 +1,6 @@
:ml: :index: :rl:
# Reinforcement Learning
-Reinforcement Learning Index
-
Reinforcement Learning: An Introduction (Sutton & Barto)
Chapter 1 (Introduction)
@@ -28,7 +26,7 @@ L1
L2
* [MarkovDecisionProcesses](MarkovDecisionProcesses.md)
-* [MarkovAssumption](MarkovAssumption.md) - Also referred to as Markov Property
+* [MarkovAssumption](MarkovAssumption.md)
* [DiscountFactor](DiscountFactor.md)
* [MarkovRewardProcess](MarkovRewardProcess.md)
* [MarkovProcess](MarkovProcess.md)
diff --git a/TemporalDifferenceLearning.md b/TemporalDifferenceLearning.md
@@ -1,8 +1,10 @@
:ml: :rl:
-#
-
+# Temporal Difference Learning
+L4
## Notes
-**Definition:**
+**Definition:** Temporal difference learning is a reinforcement learning process where we update the estimated value of being in a given state using the observed reward plus the discounted value estimate of the next state.
+
+This differs from Monte Carlo (MC) methods because it does not require us to finish the episode; instead, we rely on the current value estimates of other states to estimate our expected return.
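
The update described above can be sketched as a one-step (TD(0)) rule. This is a minimal sketch under my own assumptions: a tabular value estimate stored in a dict, a step size `alpha`, and a discount factor `gamma`. The function name `td0_update` and the state labels are hypothetical.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Move V(state) toward the one-step target r + gamma * V(next_state).

    Assumptions: `values` is a dict of state -> estimated value,
    `alpha` is the step size, `gamma` is the discount factor.
    """
    td_target = reward + gamma * values[next_state]
    values[state] += alpha * (td_target - values[state])
    return values


# One observed transition A -> B with reward 0.5; no finished episode needed.
values = {"A": 0.0, "B": 1.0}
td0_update(values, "A", reward=0.5, next_state="B", alpha=0.5, gamma=0.5)
print(values["A"])  # 0.5
```

Only a single transition is needed for the update, which is the contrast with MC: the estimate for the next state stands in for the rest of the return.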
diff --git a/UtilityFunction.md b/UtilityFunction.md
@@ -0,0 +1,8 @@
+:dl: :ml: :rl:
+# Utility Function
+
+Ch 1
+
+## Notes
+
+**Definition:** A utility function is a function U: E -> R, where E is the set of events and R is the set of real numbers, that describes how good a given event e in E is.