Evaluation Results
PushT Environment
| Method | Avg Sum Reward | Avg Max Reward | Success % | Eval Time (s) | Eval Ep Time (s) |
|---|---|---|---|---|---|
| DOT | 98.61 | 0.927 | 67.3% | 949.14 | 0.949 |
| Diffusion PushT | 104.84 | 0.955 | 65.4% | 730.87 | 1.462 |
| vqbet_pusht | 87.22 | 0.817 | 57.0% | 465.85 | 0.932 |
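The number of evaluation episodes per method is not listed. As a rough sketch, assuming "Eval Ep Time (s)" is the total "Eval Time (s)" divided by the number of episodes, the episode count can be estimated from the two timing columns. The function name `implied_episodes` is illustrative and not part of any library.

```python
# (eval_time_s, eval_ep_time_s) copied from the PushT results above.
pusht_timings = {
    "DOT": (949.14, 0.949),
    "Diffusion PushT": (730.87, 1.462),
    "vqbet_pusht": (465.85, 0.932),
}


def implied_episodes(eval_time_s: float, eval_ep_time_s: float) -> int:
    """Estimate the episode count, ASSUMING ep time = total time / episodes."""
    return round(eval_time_s / eval_ep_time_s)


episodes = {name: implied_episodes(*t) for name, t in pusht_timings.items()}
for name, n in episodes.items():
    print(f"{name}: ~{n} episodes")
```

If the assumption holds, DOT was evaluated on about twice as many episodes as the other two methods, which is worth keeping in mind when comparing the success percentages.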
PushT Keypoints Environment
| Method | Avg Sum Reward | Avg Max Reward | Success % | Eval Time (s) | Eval Ep Time (s) |
|---|---|---|---|---|---|
| Diffusion | 101.66 | 0.969 | 71.0% | 63.35 | 0.127 |
| DOT | 129.63 | 0.921 | 44.7% | 382.24 | 0.382 |
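To get a feel for how much of the gap in success rate could be sampling noise, the sketch below computes a binomial standard error for each success percentage. The episode counts are not stated in this README; the values 500 and 1000 are assumptions, chosen because Eval Time divided by Eval Ep Time comes out close to them. It also assumes episodes are independent.

```python
import math

# (success_rate, ASSUMED number of evaluation episodes)
keypoints_results = {
    "Diffusion": (0.710, 500),   # episode count is an assumption
    "DOT": (0.447, 1000),        # episode count is an assumption
}


def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a success rate p estimated from n independent episodes."""
    return math.sqrt(p * (1.0 - p) / n)


stderr = {name: binomial_stderr(p, n) for name, (p, n) in keypoints_results.items()}
for name, se in stderr.items():
    p = keypoints_results[name][0]
    print(f"{name}: {100 * p:.1f}% +/- {100 * se:.1f} pp (1 standard error)")
```

Under these assumptions both standard errors are around 2 percentage points or less, so the difference between the two methods on this environment is far larger than the sampling noise.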
PushT Training
- Policy: DOT
- Dataset: lerobot/pusht
- Environment: PushT-v0
- Batch Size: 24
- Training Steps: 1,000,000
- Logging: Every 1,000 steps
- Evaluation Frequency: Every 10,000 steps
- Checkpoint Saving: Every 50,000 steps
- Random Seed: 100,000
- Workers: 24
- Mixed Precision (AMP): Enabled
- Device: CUDA
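The settings above imply a fixed number of log lines, evaluations, and checkpoints over the run. A minimal sketch of that arithmetic follows. The `TrainSchedule` class and its field names are hypothetical and are not lerobot configuration keys. It assumes each event fires at every multiple of its interval, including the final step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSchedule:
    # Hypothetical container; values copied from the training settings above.
    batch_size: int = 24
    steps: int = 1_000_000
    log_every: int = 1_000
    eval_every: int = 10_000
    save_every: int = 50_000

    def summary(self) -> dict:
        """Derive event counts, assuming events fire at every interval multiple."""
        return {
            "samples_seen": self.batch_size * self.steps,
            "log_events": self.steps // self.log_every,
            "evaluations": self.steps // self.eval_every,
            "checkpoints": self.steps // self.save_every,
        }


summary = TrainSchedule().summary()
print(summary)
```

With these values the run processes 24,000,000 training samples and produces 1,000 log entries, 100 evaluations, and 20 checkpoints.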
PushT Keypoints Training
- Policy: DOT
- Dataset: lerobot/pusht_keypoints
- Environment: PushT-v0
- Batch Size: 24
- Training Steps: 1,000,000
- Logging: Every 1,000 steps
- Evaluation Frequency: Every 10,000 steps
- Checkpoint Saving: Every 50,000 steps
- Random Seed: 100,000
- Workers: 24
- Mixed Precision (AMP): Enabled
- Device: CUDA
- Training Horizon: 30
- Inference Horizon: 30