Reliable machine learning

Sunday, November 1, 2020

In the context of reinforcement learning, we show that a specific Monte Carlo control scheme is monotonic, provided that Q(a, pi) is well estimated during the exploration stage. The full write-up is here: https://drive.google.com/file/d/11Aa92Mr3nMF1Gxa5r0kIiHfg-9wn_rkI/view?usp=sharing
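
To make the claim concrete, here is a minimal Python sketch of Monte Carlo control on a toy problem. It is not the scheme from the linked write-up: the chain MDP, the reward table, and every function name below are hypothetical, and "monotonic" is read here as "the value of the policy never decreases from one iteration to the next", which is an assumption about the intended meaning. The toy is deterministic, so a single rollout per state-action pair gives Q exactly; that is the idealized case of a well-estimated Q.

```python
# Toy episodic chain MDP (hypothetical, for illustration only).
# States 0..4 are non-terminal, state 5 is terminal.
# Action 0 moves forward by 1, action 1 moves forward by 2,
# so every episode terminates.
TERMINAL = 5
ACTIONS = (0, 1)
# REWARD[action][state]: made-up rewards for this sketch.
REWARD = {
    0: [1.0, 0.0, 2.0, 0.0, 1.0],
    1: [0.0, 3.0, 0.0, 1.0, 0.0],
}


def step(state, action):
    """Deterministic transition: return (next_state, reward)."""
    next_state = min(state + 1 + action, TERMINAL)
    return next_state, REWARD[action][state]


def rollout(state, action, policy):
    """Undiscounted return of taking `action` in `state`, then following `policy`."""
    total = 0.0
    while True:
        state, reward = step(state, action)
        total += reward
        if state == TERMINAL:
            return total
        action = policy[state]


def estimate_q(policy, n_rollouts=1):
    """Exploration stage: Monte Carlo estimate of Q under `policy` for every (state, action)."""
    q = {}
    for s in range(TERMINAL):
        for a in ACTIONS:
            returns = [rollout(s, a, policy) for _ in range(n_rollouts)]
            q[(s, a)] = sum(returns) / n_rollouts
    return q


def improve(q):
    """Greedy policy with respect to the estimated Q."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(TERMINAL)]


def value(policy, state=0):
    """Return obtained by following `policy` from `state`."""
    return rollout(state, policy[state], policy)


def monte_carlo_control(policy, n_iters=5):
    """Alternate estimation and greedy improvement, recording the start-state value."""
    values = [value(policy)]
    for _ in range(n_iters):
        q = estimate_q(policy)
        policy = improve(q)
        values.append(value(policy))
    return policy, values


if __name__ == "__main__":
    final_policy, values = monte_carlo_control([1] * TERMINAL)
    print(final_policy, values)
```

With exact Q estimates the recorded values form a non-decreasing sequence. If the rewards were noisy and `n_rollouts` too small, the greedy step could pick a worse action, which is where the "well estimated" condition matters.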
