26/12/2020 - multi-armed bandit with epsilon-greedy
currently obsessed with this
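multi-armed bandit with epsilon-greedy
it's all about the explore-exploit dilemma: keep trying arms to learn their payouts, or stick with the one that looks best so far
a stronger solution than epsilon-greedy is UCB1 (its regret grows only logarithmically with the number of trials), though it's not the only good option
you can just read lazyprogrammer's annotated code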
https://github.com/lazyprogrammer/machine_learning_examples/blob/master/ab_testing/epsilon_greedy.py
took it for a spin: typed everything out by hand and played around with the number of trials
https://repl.it/@wongluyi/bandit-algorithm#bandit.py
where's the reinforcement part though? is it the reward?
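on the reward question, here's a minimal sketch of epsilon-greedy to poke at. it is not lazyprogrammer's code: the names Bandit and run, the eps value and the win rates are all made up for illustration. the reward is the only feedback the agent gets, and the update step that folds it into the running estimate is where the learning happens.

```python
import random


class Bandit:
    """One arm that pays 1 with probability p, else 0 (hypothetical setup)."""

    def __init__(self, p):
        self.p = p            # true win rate, hidden from the agent
        self.estimate = 0.0   # running mean of the rewards seen so far
        self.pulls = 0

    def pull(self):
        return 1 if random.random() < self.p else 0

    def update(self, reward):
        # incremental mean: nudge the estimate toward the new reward
        self.pulls += 1
        self.estimate += (reward - self.estimate) / self.pulls


def run(probs, eps=0.1, trials=10000, seed=0):
    random.seed(seed)
    bandits = [Bandit(p) for p in probs]
    total = 0
    for _ in range(trials):
        if random.random() < eps:
            # explore: pick any arm at random
            arm = random.randrange(len(bandits))
        else:
            # exploit: pick the arm with the best estimate so far
            arm = max(range(len(bandits)), key=lambda i: bandits[i].estimate)
        reward = bandits[arm].pull()
        bandits[arm].update(reward)   # the reward drives the learning
        total += reward
    return bandits, total


if __name__ == "__main__":
    bandits, total = run([0.2, 0.5, 0.75])
    for b in bandits:
        print(f"true p={b.p} estimate={b.estimate:.3f} pulls={b.pulls}")
    print("total reward:", total)
```

try changing eps and trials: a bigger eps spends more pulls exploring, a smaller one locks onto whichever arm looks best early.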
todoist: natural-language recurring due dates, rotating subtasks or projects, and karma points
i use recurring due dates + projects + subtasks
because i have 101 things going on, this helps me decide which ones i want to cover ground on faster, and rotate among the tasks. it also gamifies hitting your goals with 'karma points'