Took some notes - notes - Unnamed repository; edit this file 'description' to name the repository.

commit df6e3b8782c131ec64360c5a8cc3b23aefd894cf
parent cbb41d56fd32d89cc0f9e920150039251d5fb5ed
Author: Andrew <andrewlaack1@gmail.com>
Date:   Wed, 30 Oct 2024 20:39:07 -0500

Took some notes

Diffstat:
M DiscreteMath.md  | 8 +++++++-
A EquivalenceRelation.md  | 8 ++++++++
M ReinforcementLearning.md  | 33 +++++++++++++++++++++++----------
A Representative.md  | 10 ++++++++++
M index.md  | 1 +

5 files changed, 49 insertions(+), 11 deletions(-)
diff --git a/DiscreteMath.md b/DiscreteMath.md
@@ -175,4 +175,10 @@ Unit 9.4 (Closures of Relations)
 	- [Closure](Closure.md)
 
 Unit 9.5 (Equivalence Relations)
-	- EquivalenceRelation
+	- [EquivalenceRelation](EquivalenceRelation.md)
+	- [EquivalenceClass](EquivalenceClass.md) ([a] notation)
+	- [Representative](Representative.md)
+	- [Partition](Partition.md)
+
+Unit 9.6 (Partial Orderings)
+	- 
diff --git a/EquivalenceRelation.md b/EquivalenceRelation.md
@@ -0,0 +1,8 @@
+:discrete: 
+# Equivalence Relation
+
+Ch 9.5
+
+## Notes
+
+**Definition:** An equivalence relation is a relation that is reflexive (xRx), symmetric (xRy -> yRx), and transitive (xRy and yRz -> xRz).
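
Reviewer's aside on the definition added in `EquivalenceRelation.md`: the three properties can be checked by brute force on a small finite set. The sketch below is illustrative only and is not part of the commit; the helper names `is_equivalence`, `congruent_mod_3`, and `less_or_equal` are hypothetical.

```python
from itertools import product


def is_equivalence(elements, related):
    """Return True if `related` is reflexive, symmetric, and transitive on `elements`."""
    elements = list(elements)
    # Reflexive: xRx for every x.
    reflexive = all(related(x, x) for x in elements)
    # Symmetric: xRy implies yRx.
    symmetric = all(
        related(y, x)
        for x, y in product(elements, repeat=2)
        if related(x, y)
    )
    # Transitive: xRy and yRz imply xRz.
    transitive = all(
        related(x, z)
        for x, y, z in product(elements, repeat=3)
        if related(x, y) and related(y, z)
    )
    return reflexive and symmetric and transitive


def congruent_mod_3(x, y):
    # Hypothetical example relation: x and y differ by a multiple of 3.
    return (x - y) % 3 == 0


def less_or_equal(x, y):
    # Hypothetical counterexample: reflexive and transitive, but not symmetric.
    return x <= y


print(is_equivalence(range(6), congruent_mod_3))
print(is_equivalence(range(6), less_or_equal))
```

Congruence mod 3 passes all three checks, while `<=` fails symmetry, so it is not an equivalence relation. The check is cubic in the size of the set, which is fine for small hand examples.
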
diff --git a/ReinforcementLearning.md b/ReinforcementLearning.md
@@ -1,16 +1,29 @@
-:ml:
+:ml: :index:
 # Reinforcement Learning
 
-ML L1
+Reinforcement Learning Index
 
-## Notes
+Reinforcement Learning: An Introduction (Sutton & Barto)
 
-**Definition:** Reinforce good behavior and punish bad behavior to get closer to goals. 
+Chapter 1 (Introduction)
+* MarkovDecisionProcesses
+* Exploit
+* Explore
+* Policy
+* RewardSignal
+* ValueFunction
+* Model
+* EvolutionaryMethods (do not learn while interacting with the environment)
 
-If there is not an 'optimal' way to do something, you train the system to do stuff that works. Think like a dog, you let the dog do what it wants and then reinforce good behavior and punish bad behavior. 
+Stanford Lectures
 
-This would probably be what you would want to create a chess model.
-
-In reinforcement learning the learning system is called an agent. This agent can observe the environment and perform actions to be rewarded.
-
-The strategy followed by the agent is called the 'policy'.
+L1
+* CreditAssignmentProblem
+* ImitationLearning (separate)
+* MarkovAssumption
+* MDP
+* POMDP
+* ModelFree
+* Bandits
+* Evaluation
+* Control
diff --git a/Representative.md b/Representative.md
@@ -0,0 +1,10 @@
+:discrete: 
+# Representative
+
+Ch 9.5
+
+## Notes
+
+**Definition:** A representative is any element of an equivalence class chosen to describe the class.
+
+Often we use the least nonnegative residue for this (for example, for the equivalence classes of congruence mod n).
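
Reviewer's aside on the note in `Representative.md`: for congruence mod n, one common choice of representative is the least nonnegative residue, meaning the value in 0..n-1. The sketch below is illustrative only and is not part of the commit; `least_residue` and `classes_mod` are hypothetical helper names, and n is assumed to be a positive integer.

```python
def least_residue(a, n):
    """Return the representative of a's class mod n in the range 0..n-1 (assumes n > 0)."""
    # For a positive modulus, Python's % always returns a value in 0..n-1,
    # even when a is negative.
    return a % n


def classes_mod(n, elements):
    """Group `elements` into equivalence classes mod n, keyed by representative."""
    classes = {}
    for a in elements:
        classes.setdefault(least_residue(a, n), []).append(a)
    return classes


print(classes_mod(3, range(-3, 6)))
# {0: [-3, 0, 3], 1: [-2, 1, 4], 2: [-1, 2, 5]}
```

Any element of a class would serve as its representative; picking the value in 0..n-1 just gives each class one canonical name.
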
diff --git a/index.md b/index.md
@@ -28,6 +28,7 @@ This is the index for my main note classifications. I will maintain this as a ho
 [[Physics.md]]
 [[Assembly.md]]
 [[Vocabulary.md]]
+[[ReinforcementLearning.md]]
 
 ## Things to Learn More About
