LightZero

Project Description

I contributed to this project during my internship at SenseTime Group Limited.
The project is part of OpenDILab.
It focuses on combining Monte Carlo Tree Search (MCTS) with deep reinforcement learning.
It aims to implement a range of state-of-the-art algorithms, from AlphaZero to the MuZero series.
More information can be found in the project's GitHub repository and paper.

Reproduced the MuZero algorithm, which extends AlphaZero-style tree search to environments with unknown transition dynamics.
Implemented the Sampled MuZero method, an extension of MuZero that enables learning in domains with arbitrarily complex action spaces by planning over sampled actions.
Reproduced the Stochastic MuZero method, which incorporates the stochastic nature of the environment into the tree search process.
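
The idea of planning over sampled actions, as in Sampled MuZero, can be illustrated with a minimal sketch. This is not LightZero's code: the function name `sample_action_priors` and the toy policy are hypothetical. It assumes the actions are sampled from the policy itself, in which case the search prior over the sampled actions reduces to their empirical frequencies.

```python
import random
from collections import Counter


def sample_action_priors(policy, num_samples, rng):
    """Draw `num_samples` actions from `policy` and return search priors.

    `policy` maps each action to its probability. Hypothetical helper:
    the tree search would expand only the returned actions, using their
    empirical frequencies as priors instead of the full policy.
    """
    actions = list(policy)
    weights = [policy[a] for a in actions]
    # Sample with replacement, so likely actions may be drawn repeatedly.
    draws = rng.choices(actions, weights=weights, k=num_samples)
    counts = Counter(draws)
    # Empirical distribution over the distinct sampled actions.
    return {action: count / num_samples for action, count in counts.items()}


if __name__ == "__main__":
    toy_policy = {"left": 0.5, "right": 0.3, "jump": 0.2}  # assumed toy policy
    priors = sample_action_priors(toy_policy, num_samples=20, rng=random.Random(0))
    for action, prior in sorted(priors.items()):
        print(f"{action}: {prior:.2f}")
```

Because only the sampled subset is expanded, the cost of each search step depends on the number of samples rather than on the size of the full action space.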