# Reinforcement Learning

Reinforcement Learning: An Introduction (Sutton & Barto)

Chapter 1 (Introduction)
* [Markov Decision Processes](MarkovDecisionProcesses.md)
* [Exploit](Exploit.md)
* [Explore](Explore.md)
* [Policy](Policy.md)
* [Reward Signal](RewardSignal.md)
* [Value Function](ValueFunction.md)
* [Model](Model.md)
* [Evolutionary Methods](EvolutionaryMethods.md)

DeepMind UCL Lectures

L1
* [Credit Assignment Problem](CreditAssignmentProblem.md)
* [Imitation Learning](ImitationLearning.md)
* [Markov Assumption](MarkovAssumption.md)
* [Partially Observable Markov Decision Process](PartiallyObservableMarkovDecisionProcess.md)
* [Model Free](ModelFree.md)
* [Bandits](Bandits.md)
* [Evaluation](Evaluation.md)

L2
* [Markov Decision Processes](MarkovDecisionProcesses.md)
* [Markov Assumption](MarkovAssumption.md)
* [Discount Factor](DiscountFactor.md)
* [Markov Reward Process](MarkovRewardProcess.md)
* [Markov Process](MarkovProcess.md)
* [Return](Return.md)
* [Policy](Policy.md)
* [Bellman Equation](BellmanEquation.md)

L3
* [Dynamic Programming](DynamicProgramming.md)
* [Optimal Substructure](OptimalSubstructure.md)
* [Overlapping Subproblems](OverlappingSubproblems.md)

L4
* [Model Free](ModelFree.md)
* [Episodic](Episodic.md)
* [Episode](Episode.md)
* [Monte Carlo Learning](MonteCarloLearning.md)
* [Incremental Mean](IncrementalMean.md)
* [Temporal Difference Learning](TemporalDifferenceLearning.md)
* [Frequency Heuristic](FrequencyHeuristic.md)
* [Recency Heuristic](RecencyHeuristic.md)
* [Eligibility Traces](EligibilityTraces.md)

L5
* [Model Free](ModelFree.md)
* [Monte Carlo Learning](MonteCarloLearning.md)
* [Temporal Difference Learning](TemporalDifferenceLearning.md)
* [On Policy Learning](OnPolicyLearning.md)
* [Off Policy Learning](OffPolicyLearning.md)
* EpsilonGreedy