devdharpatel
committed on
Update Readme
README.md
CHANGED
@@ -4,12 +4,13 @@ tags:
|
|
4 |
- Pendulum-v1
|
5 |
- Reinforcement-Learning
|
6 |
- Decisions
|
|
|
7 |
model-index:
|
8 |
- name: TLA
|
9 |
results:
|
10 |
- metrics:
|
11 |
- type: mean_reward
|
12 |
-
value: -154.92 +/- 31.97
|
13 |
name: mean_reward
|
14 |
- type: action_repetition
|
15 |
value: 70.32%
|
@@ -24,4 +25,59 @@ model-index:
|
|
24 |
name: Pendulum-v1
|
25 |
type: Pendulum-v1
|
26 |
---
|
27 |
-
# Temporally Layered Architecture: Pendulum-v1
|
4 |
- Pendulum-v1
|
5 |
- Reinforcement-Learning
|
6 |
- Decisions
|
7 |
+
- TLA
|
8 |
model-index:
|
9 |
- name: TLA
|
10 |
results:
|
11 |
- metrics:
|
12 |
- type: mean_reward
|
13 |
+
value: '-154.92 +/- 31.97'
|
14 |
name: mean_reward
|
15 |
- type: action_repetition
|
16 |
value: 70.32%
|
|
|
25 |
name: Pendulum-v1
|
26 |
type: Pendulum-v1
|
27 |
---
|
28 |
+
# Temporally Layered Architecture: Pendulum-v1
|
29 |
+
|
30 |
+
This repository contains 10 trained models, one per seed (**seeds 0-9**), of the **[Temporally Layered Architecture (TLA)](https://github.com/dee0512/Temporally-Layered-Architecture)** agent playing **Pendulum-v1**.
|
31 |
+
|
32 |
+
## Model Sources
|
33 |
+
|
34 |
+
**Repository:** [https://github.com/dee0512/Temporally-Layered-Architecture](https://github.com/dee0512/Temporally-Layered-Architecture)
|
35 |
+
**Paper:** [https://doi.org/10.1162/neco_a_01718](https://doi.org/10.1162/neco_a_01718)
|
36 |
+
|
37 |
+
# Training Details:
|
38 |
+
Using the repository:
|
39 |
+
|
40 |
+
```
|
41 |
+
python main.py --env_name <environment> --seed <seed>
|
42 |
+
```
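
Since the models cover seeds 0-9, the command above has to be run once per seed. A minimal sketch of how those ten invocations could be generated follows. The helper `training_commands` is hypothetical and not part of the repository; only the `main.py` script and the `--env_name` and `--seed` flags come from the command shown above.

```python
import shlex


def training_commands(env_name, seeds=range(10), script="main.py"):
    """Build one training command per seed (hypothetical helper).

    Only the script name and the --env_name / --seed flags are taken
    from the README; everything else is illustrative.
    """
    return [
        ["python", script, "--env_name", env_name, "--seed", str(seed)]
        for seed in seeds
    ]


if __name__ == "__main__":
    for cmd in training_commands("Pendulum-v1"):
        # Print the command; to launch it, pass `cmd` to subprocess.run.
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps them safe to hand to `subprocess.run(cmd, check=True)` without shell quoting issues.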
|
43 |
+
|
44 |
+
# Evaluation:
|
45 |
+
|
46 |
+
Using the repository:
|
47 |
+
|
48 |
+
```
|
49 |
+
python eval.py --env_name <environment>
|
50 |
+
```
|
51 |
+
|
52 |
+
## Metrics:
|
53 |
+
|
54 |
+
**mean_reward:** Mean reward over 10 seeds
|
55 |
+
**action_repetition:** Percentage of actions that are equal to the previous action
|
56 |
+
**mean_decisions:** Mean number of decisions required, where each decision is one neural network (model) forward pass
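
The metric definitions above can be illustrated with a short sketch. It is not the repository's evaluation code: the function names are hypothetical, the action-repetition denominator (the number of consecutive action pairs) is an assumption, and reading the `+/-` term in the reported reward as a standard deviation across seeds is likewise an assumption.

```python
from statistics import mean, stdev


def action_repetition(actions):
    """Percentage of actions equal to the action taken one step earlier.

    Assumption: the percentage is taken over consecutive action pairs,
    so an episode with n actions has n - 1 comparisons.
    """
    if len(actions) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(actions, actions[1:]) if cur == prev)
    return 100.0 * repeats / (len(actions) - 1)


def summarize_over_seeds(per_seed_values):
    """Mean and sample standard deviation of one metric across seeds.

    Assumption: the reported "+/-" term is a standard deviation.
    """
    return mean(per_seed_values), stdev(per_seed_values)


if __name__ == "__main__":
    # Illustrative action sequence, not data from the trained models.
    episode_actions = [0.5, 0.5, 0.5, -1.0, -1.0]
    print(f"action_repetition: {action_repetition(episode_actions):.2f}%")
```

In the example sequence, three of the four consecutive pairs repeat the previous action, so the sketch reports 75.00%.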
|
57 |
+
|
58 |
+
|
59 |
+
# Citation
|
60 |
+
|
61 |
+
The paper can be cited with the following BibTeX entry or APA reference:
|
62 |
+
|
63 |
+
## BibTeX:
|
64 |
+
|
65 |
+
```
|
66 |
+
@article{10.1162/neco_a_01718,
|
67 |
+
author = {Patel, Devdhar and Sejnowski, Terrence and Siegelmann, Hava},
|
68 |
+
title = "{Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures}",
|
69 |
+
journal = {Neural Computation},
|
70 |
+
pages = {1-30},
|
71 |
+
year = {2024},
|
72 |
+
month = {10},
|
73 |
+
issn = {0899-7667},
|
74 |
+
doi = {10.1162/neco_a_01718},
|
75 |
+
url = {https://doi.org/10.1162/neco\_a\_01718},
|
76 |
+
eprint = {https://direct.mit.edu/neco/article-pdf/doi/10.1162/neco\_a\_01718/2474695/neco\_a\_01718.pdf},
|
77 |
+
}
|
78 |
+
```
|
79 |
+
|
80 |
+
## APA:
|
81 |
+
```
|
82 |
+
Patel, D., Sejnowski, T., & Siegelmann, H. (2024). Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Computation, 1-30.
|
83 |
+
```
|