OpenDILabCommunity/CartPole-v0-MuZero

@@ -21,7 +21,7 @@ model-index:
       type: CartPole-v0
     metrics:
     - type: mean_reward
-      value: 198.6 +/- 4.2
+      value: 200.0 +/- 0.0
       name: mean_reward
 ---
@@ -32,7 +32,7 @@ model-index:
 This implementation applies **MuZero** to the OpenAI/Gym/Box2d **CartPole-v0** environment using [LightZero](https://github.com/opendilab/LightZero) and [DI-engine](https://github.com/opendilab/di-engine).
-**LightZero** is an efficient, easy-to-understand open-source toolkit that merges Monte Carlo Tree Search (MCTS) with Deep Reinforcement Learning (RL), simplifying their integration for developers and researchers.
+**LightZero** is an efficient, easy-to-understand open-source toolkit that merges Monte Carlo Tree Search (MCTS) with Deep Reinforcement Learning (RL), simplifying their integration for developers and researchers. More details are in paper [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://huggingface.co/papers/2310.08348).
 ## Model Usage
 ### Install the Dependencies
@@ -139,13 +139,16 @@ push_model_to_hub(
     github_repo_url="https://github.com/opendilab/LightZero",
     github_doc_model_url=None,
     github_doc_env_url=None,
-    installation_guide="pip3 install DI-engine[common_env,video] LightZero",
+    installation_guide='''
+pip3 install DI-engine[common_env,video]
+pip3 install LightZero
+''',
     usage_file_by_git_clone="./muzero/cartpole_muzero_deploy.py",
     usage_file_by_huggingface_ding="./muzero/cartpole_muzero_download.py",
     train_file="./muzero/cartpole_muzero.py",
     repo_id="OpenDILabCommunity/CartPole-v0-MuZero",
-    platform_info="[DI-engine](https://github.com/opendilab/di-engine) and [LightZero](https://github.com/opendilab/LightZero)",
-    model_description="**LightZero** is a lightweight, efficient, and easy-to-understand open-source algorithm toolkit that combines Monte Carlo Tree Search (MCTS) and Deep Reinforcement Learning (RL).",
+    platform_info="[LightZero](https://github.com/opendilab/LightZero) and [DI-engine](https://github.com/opendilab/di-engine)",
+    model_description="**LightZero** is an efficient, easy-to-understand open-source toolkit that merges Monte Carlo Tree Search (MCTS) with Deep Reinforcement Learning (RL), simplifying their integration for developers and researchers. More details are in paper [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://huggingface.co/papers/2310.08348).",
     create_repo=False
 )
@@ -288,7 +291,7 @@ exp_config = {
 - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/CartPole-v0-MuZero/blob/main/replay.mp4)
 <!-- Provide the size information for the model. -->
 - **Parameters total size:** 13548.13 KB
-- **Last Update Date:** 2023-12-05
+- **Last Update Date:** 2023-12-11
 ## Environments
 <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
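
The `mean_reward` change in this diff (from `198.6 +/- 4.2` to `200.0 +/- 0.0`) can be compared programmatically. The following is a minimal sketch, assuming the metric is always stored as a `mean +/- std` string as it appears in the model card metadata. The helper `parse_mean_reward` and the `RewardStat` type are hypothetical names for illustration and are not part of LightZero or DI-engine. The constant reflects the 200-step episode cap of Gym's CartPole-v0, where each step yields a reward of 1.

```python
import re
from typing import NamedTuple


class RewardStat(NamedTuple):
    """Mean and standard deviation of an evaluation reward (hypothetical type)."""
    mean: float
    std: float


# Matches strings such as "198.6 +/- 4.2" (assumed format of the metric value).
_REWARD_PATTERN = re.compile(
    r"\s*([-+]?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)\s*"
)


def parse_mean_reward(value: str) -> RewardStat:
    """Split a 'mean +/- std' string into its two numeric parts."""
    match = _REWARD_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected mean_reward format: {value!r}")
    return RewardStat(float(match.group(1)), float(match.group(2)))


# Values taken from the removed and added lines of the diff.
old = parse_mean_reward("198.6 +/- 4.2")
new = parse_mean_reward("200.0 +/- 0.0")

# CartPole-v0 caps episodes at 200 steps with +1 reward per step.
CARTPOLE_V0_MAX_RETURN = 200.0

improvement = new.mean - old.mean
at_cap = new.mean == CARTPOLE_V0_MAX_RETURN and new.std == 0.0
```

Under these assumptions, `at_cap` being true indicates that every evaluation episode behind the updated metric reached the maximum return, which is consistent with the zero standard deviation reported in the new value.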