commit df6e3b8782c131ec64360c5a8cc3b23aefd894cf
parent cbb41d56fd32d89cc0f9e920150039251d5fb5ed
Author: Andrew <andrewlaack1@gmail.com>
Date: Wed, 30 Oct 2024 20:39:07 -0500
Took some notes
Diffstat:
5 files changed, 49 insertions(+), 11 deletions(-)
diff --git a/DiscreteMath.md b/DiscreteMath.md
@@ -175,4 +175,10 @@ Unit 9.4 (Closures of Relations)
- [Closure](Closure.md)
Unit 9.5 (Equivalence Relations)
- - EquivalenceRelation
+ - [EquivalenceRelation](EquivalenceRelation.md)
+ - [EquivalenceClass](EquivalenceClass.md) ([a] notation)
+ - [Representative](Representative.md)
+ - [Partition](Partition.md)
+
+Unit 9.6 (Partial Orderings)
+ -
diff --git a/EquivalenceRelation.md b/EquivalenceRelation.md
@@ -0,0 +1,8 @@
+:discrete:
+# Equivalence Relation
+
+Ch 9.5
+
+## Notes
+
+**Definition:** An equivalence relation on a set A is a relation R that is reflexive (xRx), symmetric (xRy -> yRx), and transitive (xRy and yRz -> xRz) for all x, y, z in A.
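
The three properties in the definition can be checked by brute force on a small finite set. The following is a minimal sketch, not part of the notes themselves; the function names `is_equivalence_relation`, `same_remainder_mod_3`, and `less_or_equal` are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_equivalence_relation(elements, related):
    """Return True if `related` is reflexive, symmetric, and transitive on `elements`."""
    elements = list(elements)
    # Reflexive: every x is related to itself.
    reflexive = all(related(x, x) for x in elements)
    # Symmetric: whenever xRy holds, yRx must hold too.
    symmetric = all(
        related(y, x)
        for x, y in product(elements, repeat=2)
        if related(x, y)
    )
    # Transitive: whenever xRy and yRz hold, xRz must hold too.
    transitive = all(
        related(x, z)
        for x, y, z in product(elements, repeat=3)
        if related(x, y) and related(y, z)
    )
    return reflexive and symmetric and transitive


def same_remainder_mod_3(x, y):
    """Hypothetical example relation: x and y differ by a multiple of 3."""
    return (x - y) % 3 == 0


def less_or_equal(x, y):
    """Hypothetical counterexample: reflexive and transitive, but not symmetric."""
    return x <= y
```

On `range(6)`, `same_remainder_mod_3` passes all three checks, while `less_or_equal` fails the symmetry check (0 <= 1 but not 1 <= 0).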
diff --git a/ReinforcementLearning.md b/ReinforcementLearning.md
@@ -1,16 +1,29 @@
-:ml:
+:ml: :index:
# Reinforcement Learning
-ML L1
+Reinforcement Learning Index
-## Notes
+Reinforcement Learning: An Introduction (Sutton & Barto)
-**Definition:** Reinforce good behavior and punish bad behavior to get closer to goals.
+Chapter 1 (Introduction)
+* MarkovDecisionProcesses
+* Exploit
+* Explore
+* Policy
+* RewardSignal
+* ValueFunction
+* Model
+* EvolutionaryMethods (do not learn while interacting with the environment)
-If there is not an 'optimal' way to do something, you train the system to do stuff that works. Think like a dog, you let the dog do what it wants and then reinforce good behavior and punish bad behavior.
+Stanford Lectures
-This would probably be what you would want to create a chess model.
-
-In reinforcement learning the learning system is called an agent. This agent can observe the environment and perform actions to be rewarded.
-
-The strategy followed by the agent is called the 'policy'.
+L1
+* CreditAssignmentProblem
+* ImitationLearning (separate)
+* MarkovAssumption
+* MDP
+* POMDP
+* ModelFree
+* Bandits
+* Evaluation
+* Control
diff --git a/Representative.md b/Representative.md
@@ -0,0 +1,10 @@
+:discrete:
+# Representative
+
+Ch 9.5
+
+## Notes
+
+**Definition:** A representative is any element of an equivalence class chosen to describe the class.
+
+Often the least nonnegative residue is used as the representative (for example, for congruence classes mod m, the representatives 0, 1, ..., m-1).
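
For congruence mod m, picking the smallest nonnegative member of a class as its representative can be sketched as below. This is an illustrative sketch, not part of the notes; it assumes m > 0, and the helper names `least_nonnegative_representative` and `class_members` are hypothetical.

```python
def least_nonnegative_representative(a, m):
    """Return the representative of [a] mod m that lies in 0..m-1 (assumes m > 0)."""
    # For a positive modulus, Python's % always returns a value in [0, m),
    # even when a is negative.
    return a % m


def class_members(a, m, low, high):
    """List the members of [a] mod m that fall in the inclusive range low..high."""
    return [x for x in range(low, high + 1) if (x - a) % m == 0]
```

For instance, -7 and 17 get the representatives 3 and 2 mod 5, and `class_members(2, 5, -10, 10)` lists the members of [2] between -10 and 10.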
diff --git a/index.md b/index.md
@@ -28,6 +28,7 @@ This is the index for my main note classifications. I will maintain this as a ho
[[Physics.md]]
[[Assembly.md]]
[[Vocabulary.md]]
+[[ReinforcementLearning.md]]
## Things to Learn More About