# Bandits

L1

**Definition:** Bandits are a class of RL problems in which an agent repeatedly chooses from a set of actions, each of which yields a reward drawn from an unknown probability distribution.

Basically, there is a set of actions, you pick one, you receive a reward... that's all.

A bandit is an MDP with only one state: the chosen action affects only the immediate reward, never a future state.
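
To make the definition concrete, here is a minimal sketch of the interaction loop. Everything in it is an assumption for illustration and not part of the definition: the class name `BernoulliBandit`, the helper `run_random_agent`, the 0/1 reward distribution, and the arm probabilities. The agent picks actions uniformly at random and only keeps a running sample mean of the reward it has seen for each action.

```python
import random


class BernoulliBandit:
    """Hypothetical k-armed bandit: action i pays 1 with probability probs[i], else 0."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)        # reward distributions, unknown to the agent
        self.rng = random.Random(seed)

    @property
    def n_actions(self):
        return len(self.probs)

    def pull(self, action):
        # No state is involved: the reward depends only on the chosen action.
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


def run_random_agent(bandit, steps, seed=1):
    """Pick actions uniformly at random; track the sample-mean reward per action."""
    rng = random.Random(seed)
    counts = [0] * bandit.n_actions
    means = [0.0] * bandit.n_actions
    for _ in range(steps):
        action = rng.randrange(bandit.n_actions)
        reward = bandit.pull(action)
        counts[action] += 1
        # Incremental update of the sample mean for this action.
        means[action] += (reward - means[action]) / counts[action]
    return counts, means


# Assumed arm probabilities, chosen only for this example.
bandit = BernoulliBandit([0.2, 0.5, 0.8])
counts, means = run_random_agent(bandit, steps=3000)
```

With enough pulls, each entry of `means` should approach the hidden probability of its action, which is the information a smarter agent would exploit when deciding what to pull next.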
