Sunday, November 1, 2020

In the context of reinforcement learning, we show that a specific Monte Carlo control scheme is monotonic, provided that Q(a, pi) is well estimated in the exploration stage. See https://drive.google.com/file/d/11Aa92Mr3nMF1Gxa5r0kIiHfg-9wn_rkI/view?usp=sharing
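
To make the claim concrete, here is a minimal Python sketch of one generic Monte Carlo control loop. It is not the scheme from the linked document. The toy MDP, the reward table, the noise bound, and all function names are assumptions made up for illustration. It reads "monotonic" as "the true value of the policy never decreases from one iteration to the next", which may differ from the exact statement in the write-up. The exploration stage estimates Q by averaging noisy returns for every state-action pair, and the improvement step is greedy with respect to those estimates.

```python
import random

# Hypothetical toy MDP (not from the post): states 0..2, terminal state 3.
TERMINAL = 3
REWARD = [[1.0, 2.5], [1.0, 4.0], [2.0, 0.0]]  # REWARD[state][action]
NOISE = 0.05  # bound on the uniform noise added to each observed reward


def step(state, action, rng):
    """Action a moves s to min(s + 1 + a, TERMINAL) and yields a noisy reward."""
    reward = REWARD[state][action] + rng.uniform(-NOISE, NOISE)
    return min(state + 1 + action, TERMINAL), reward


def rollout_return(state, action, policy, rng):
    """Return of one episode: take `action` in `state`, then follow `policy`."""
    total = 0.0
    while True:
        state, reward = step(state, action, rng)
        total += reward
        if state == TERMINAL:
            return total
        action = policy[state]


def estimate_q(policy, rng, episodes=200):
    """Exploration stage: average sampled returns for every (state, action)."""
    return [
        [sum(rollout_return(s, a, policy, rng) for _ in range(episodes)) / episodes
         for a in (0, 1)]
        for s in range(TERMINAL)
    ]


def true_value(policy, state=0):
    """Exact, noise-free value of `policy` from `state` (used only for checking)."""
    total = 0.0
    while state != TERMINAL:
        action = policy[state]
        total += REWARD[state][action]
        state = min(state + 1 + action, TERMINAL)
    return total


def monte_carlo_control(policy, iterations=3, seed=0):
    """Alternate Monte Carlo estimation of Q with greedy policy improvement."""
    rng = random.Random(seed)
    values = [true_value(policy)]
    for _ in range(iterations):
        q = estimate_q(policy, rng)
        policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(TERMINAL)]
        values.append(true_value(policy))
    return policy, values


final_policy, values = monte_carlo_control([1, 0, 1])
print(final_policy, values)
```

In this toy problem the gaps between the true Q values (at least 0.5) are larger than the worst-case estimation error, so the greedy step always picks the truly better action and the start-state value goes 2.5, 4.5, 5.0, 5.0. If the noise bound is raised until the estimates can reorder the actions, the greedy step may pick a worse action and the value can drop, which is why the quality of the Q estimate matters.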


Our next ML study group meeting will take place on Monday the 8th of October. I'll cover the contraction theorem. See relevant s...