commit 704f29bb22a5ea3ebd5f11c55172f0178f241672
parent 5d3850ee3d06254acdae4ca2cbd1429f2618dc7b
Author: Andrew <andrewlaack1@gmail.com>
Date: Sun, 10 Nov 2024 10:46:59 -0600
Took some notes on RL
Diffstat:
5 files changed, 45 insertions(+), 6 deletions(-)
diff --git a/BellmanEquation.md b/BellmanEquation.md
@@ -0,0 +1,10 @@
+:rl: :ml:
+# Bellman Equation
+
+L2
+
+## Notes
+
+**Definition:** The Bellman equation states that the value of acting optimally from the current state equals the immediate reward of the best current choice plus the (discounted, expected) value of acting optimally from the state that choice leads to.
+
+This is intuitive, but it is the basis for our ability to do dynamic programming: it is the formal statement of optimal substructure, and without that property an optimal solution cannot be built up from the solutions to subproblems.
diff --git a/DynamicProgramming.md b/DynamicProgramming.md
@@ -0,0 +1,13 @@
+:rl: :ml: :algorithms:
+# Dynamic Programming
+
+L3
+
+## Notes
+
+**Definition:** Dynamic programming is the idea that we can break a problem down into subproblems, solve each subproblem once and store its result, and then combine the stored results to find the problem's overall solution.
+
+There are two necessary conditions for a problem to be solvable via DP:
+
+1. [OptimalSubstructure](OptimalSubstructure.md)
+2. [OverlappingSubproblems](OverlappingSubproblems.md)
diff --git a/OptimalSubstructure.md b/OptimalSubstructure.md
@@ -0,0 +1,8 @@
+:rl: :ml: :algorithms:
+# Optimal Substructure
+
+L3
+
+## Notes
+
+**Definition:** Optimal substructure is a property of a problem such that an optimal solution to the whole problem can be constructed from optimal solutions to its subproblems.
diff --git a/OverlappingSubproblems.md b/OverlappingSubproblems.md
@@ -0,0 +1,8 @@
+:ml: :rl: :algorithms:
+# Overlapping Subproblems
+
+L3
+
+## Notes
+
+**Definition:** Overlapping subproblems is a property of a problem such that the same subproblems recur many times, so solving each one once and reusing the stored result is more efficient than recomputing it every time it appears.
diff --git a/ReinforcementLearning.md b/ReinforcementLearning.md
@@ -34,9 +34,9 @@ L2
* [MarkovProcess](MarkovProcess.md)
* [Return](Return.md)
* [Policy](Policy.md)
-* State-ValueFunction
-* Action-ValueFunction
-* BellmanEquation
-* ControlTheory (lookup)
-* StateTransitionMatrix
-* OptimalControl (lookup)
+* [BellmanEquation](BellmanEquation.md)
+
+L3
+* [DynamicProgramming](DynamicProgramming.md)
+* [OptimalSubstructure](OptimalSubstructure.md)
+* [OverlappingSubproblems](OverlappingSubproblems.md)