
LightZero

Image Source : opendilab:lightzero
Project Description
-
This is a project I participated in during my internship at SenseTime Group Limited.
-
This project is part of OpenDILab.
-
This project focuses on combining Monte Carlo Tree Search (MCTS) with deep reinforcement learning.
-
This project aims to implement various state-of-the-art algorithms, ranging from AlphaZero to the MuZero series.
-
More information can be found in the GitHub repository and the paper.
My Contribution
-
Reproduced the MuZero algorithm, which extends tree-search techniques to environments with unknown transition dynamics.
-
Implemented Sampled MuZero, an extension of MuZero that enables learning in domains with arbitrarily complex action spaces by planning over sampled actions.
-
Reproduced Stochastic MuZero, which incorporates the stochastic nature of the environment into the tree search.
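
The three algorithms above share the same tree-search core: at each node, the search picks the child that maximizes a pUCT score combining a value estimate with a prior-weighted exploration bonus. The snippet below is a minimal sketch of that selection rule, not LightZero's actual code. The constants follow the values reported in the MuZero paper; the `children` dictionary layout and the function names are hypothetical, chosen only for illustration, and the value normalization that MuZero applies to Q is omitted.

```python
import math

# Exploration constants reported in the MuZero paper; treat them as tunable.
C1 = 1.25
C2 = 19652.0


def puct_score(q_value, prior, parent_visits, child_visits, c1=C1, c2=C2):
    """Score of one child: value estimate plus a prior-weighted exploration bonus."""
    exploration = math.sqrt(parent_visits) / (1 + child_visits)
    exploration *= c1 + math.log((parent_visits + c2 + 1) / c2)
    return q_value + prior * exploration


def select_action(children, parent_visits):
    """Pick the action whose child has the highest pUCT score.

    `children` maps action -> dict with keys "q", "prior", "visits"
    (a hypothetical layout used only in this sketch).
    """
    return max(
        children,
        key=lambda a: puct_score(
            children[a]["q"],
            children[a]["prior"],
            parent_visits,
            children[a]["visits"],
        ),
    )


# A rarely visited child with a high prior outranks a well-explored one.
children = {
    "left": {"q": 0.50, "prior": 0.2, "visits": 10},
    "right": {"q": 0.45, "prior": 0.8, "visits": 1},
}
best = select_action(children, parent_visits=11)
```

In this toy example the search prefers `"right"`: its value estimate is slightly lower, but it has been visited only once and carries a much larger prior, so the exploration bonus dominates.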
Algorithm Framework
-
MuZero

-
Sampled MuZero

-
Stochastic MuZero
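
To make the "planning over sampled actions" idea in Sampled MuZero concrete, here is a small, hedged sketch. Instead of expanding every action at a node, the search draws K actions from the policy and plans only over that subset. The sketch assumes the actions are sampled from the policy itself, in which case (as I understand the paper) the prior used in the search reduces to the empirical distribution of the samples. The function names and the toy discrete policy are illustrative assumptions, not LightZero's API.

```python
import random
from collections import Counter


def sample_actions(policy, k, rng):
    """Draw k actions with replacement from a discrete policy {action: prob}."""
    actions = list(policy)
    weights = [policy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=k)


def sampled_prior(samples):
    """Empirical distribution over the sampled actions.

    Assumption of this sketch: actions were sampled from the policy itself,
    so this empirical distribution stands in for the search prior.
    """
    counts = Counter(samples)
    total = len(samples)
    return {action: n / total for action, n in counts.items()}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
policy = {"a0": 0.6, "a1": 0.3, "a2": 0.1}  # toy policy, illustrative only
samples = sample_actions(policy, k=8, rng=rng)
prior = sampled_prior(samples)
```

Only the actions that appear in `samples` become children of the node, which keeps the branching factor bounded by K even when the full action space is very large or continuous.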

Experimental Results
-
MuZero

-
Sampled MuZero

-
Stochastic MuZero
