devdharpatel/tla-InvertedDoublePendulum-v2

Reinforcement Learning

InvertedDoublePendulum-v2

deep-reinforcement-learning

devdharpatel committed on Nov 3

Commit

f100dad

1 Parent(s): e2a2843

Update README

Files changed (1)

README.md +88 -3

@@ -1,3 +1,88 @@
----
-license: bsd-3-clause
----

+---
+license: bsd-3-clause
+tags:
+- InvertedDoublePendulum-v2
+- reinforcement-learning
+- decisions
+- TLA
+- deep-reinforcement-learning
+model-index:
+- name: TLA
+  results:
+  - metrics:
+    - type: mean_reward
+      value: 9356.67
+      name: mean_reward
+    - type: Action Repetition
+      value: .7522
+      name: Action Repetition
+    - type: Average Decisions
+      value: 247.76
+      name: Average Decisions
+    task:
+      type: OpenAI Gym
+      name: OpenAI Gym
+    dataset:
+      name: InvertedDoublePendulum-v2
+      type: InvertedDoublePendulum-v2
+  Paper: https://arxiv.org/abs/2305.18701
+  Code: https://github.com/dee0512/Temporally-Layered-Architecture
+---
+# Temporally Layered Architecture: InvertedDoublePendulum-v2
+These are 10 trained models, one per seed (**seeds 0-9**), of the **[Temporally Layered Architecture (TLA)](https://github.com/dee0512/Temporally-Layered-Architecture)** agent playing **InvertedDoublePendulum-v2**.
+## Model Sources
+**Repository:** [https://github.com/dee0512/Temporally-Layered-Architecture](https://github.com/dee0512/Temporally-Layered-Architecture)
+**Paper:** [https://doi.org/10.1162/neco_a_01718](https://doi.org/10.1162/neco_a_01718)
+**Arxiv:** [arxiv.org/abs/2305.18701](https://arxiv.org/abs/2305.18701)
+# Training Details:
+Using the repository:
+```
+python main.py --env_name <environment> --seed <seed>
+```
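A minimal sketch of how the ten seeds (0-9) could be trained, assuming one `main.py` invocation per seed and that the commands are run from the root of the cloned repository. Only the `--env_name` and `--seed` flags come from the command above; the helper name `training_commands` is hypothetical, and the commands are printed rather than executed.

```python
import shlex


def training_commands(env_name, seeds):
    """Build one `python main.py` command line per seed (nothing is executed)."""
    return [
        ["python", "main.py", "--env_name", env_name, "--seed", str(seed)]
        for seed in seeds
    ]


# Assumption: the ten models correspond to seeds 0 through 9, one run each.
commands = training_commands("InvertedDoublePendulum-v2", range(10))
for cmd in commands:
    print(shlex.join(cmd))
```

To launch the runs instead of printing them, each list can be passed to `subprocess.run(cmd, check=True)`.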
+# Evaluation:
+Download the models folder and place it in the same directory as the cloned repository.
+Using the repository:
+```
+python eval.py --env_name <environment>
+```
+## Metrics:
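+**mean_reward:** Mean reward over the 10 seeds.
+**action_repetition:** Proportion of actions that are equal to the previous action (listed as `Action Repetition` in the metadata).
+**mean_decisions:** Number of decisions required, counted as neural network/model forward passes (listed as `Average Decisions` in the metadata).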
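The metric definitions above can be illustrated with a small sketch. This is not the repository's evaluation code: the function names and toy data are hypothetical, and it assumes that repetition is measured only over actions that have a predecessor, so the first action is excluded from the denominator.

```python
from statistics import mean


def action_repetition(actions):
    """Fraction of actions equal to the immediately preceding action.

    Assumption: the first action has no predecessor, so it is not counted.
    """
    if len(actions) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(actions, actions[1:]) if cur == prev)
    return repeats / (len(actions) - 1)


def mean_reward(per_seed_rewards):
    """Average of the per-seed rewards."""
    return mean(per_seed_rewards)


# Hypothetical toy data, not results from the trained models.
toy_actions = [0.5, 0.5, 0.5, -0.2, -0.2]
toy_rewards = [10.0, 12.0, 14.0]

print(action_repetition(toy_actions))
print(mean_reward(toy_rewards))
```

For multi-dimensional actions, each action would need to be a hashable or directly comparable value such as a tuple for the equality check to behave as intended.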
+# Citation
+The paper can be cited with the following BibTeX entry:
+## BibTeX:
+```
+@article{10.1162/neco_a_01718,
+    author = {Patel, Devdhar and Sejnowski, Terrence and Siegelmann, Hava},
+    title = "{Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures}",
+    journal = {Neural Computation},
+    pages = {1-30},
+    year = {2024},
+    month = {10},
+    issn = {0899-7667},
+    doi = {10.1162/neco_a_01718},
+    url = {https://doi.org/10.1162/neco_a_01718},
+    eprint = {https://direct.mit.edu/neco/article-pdf/doi/10.1162/neco_a_01718/2474695/neco_a_01718.pdf},
+}
+```
+## APA:
+```
+Patel, D., Sejnowski, T., & Siegelmann, H. (2024). Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Computation, 1-30.
+```