ArthurFischel committed on
Commit
1ab224b
1 Parent(s): aff5e71

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -12,6 +12,14 @@ pinned: true
12
 
13
  ![image info](./mlion.png)
14
 
15
+ # 7/27/23 RogueGPT
16
+ RogueGPT is an attempt not only to instruction-tune LLM-powered agents (treating LLMs as reasoning engines) for tasks in the MiniHack environment, but also to explore reinforcement learning and continuous learning for embodied agents inside environments using only LLMs, so that the lessons learned can be abstracted to other modalities.
17
+
18
+ I want to use small LLMs and a focused dataset so I can get a clear idea of how the moving parts perform and what data is necessary beyond large-scale general knowledge. My working assumption is that a carefully curated dataset, tailored specifically to continuous learning in an embodied agent, can produce desirable results even in models with fewer than a billion parameters.
19
+
20
+ My rough strategy is to use the TinyStories dataset, a trajectory-based dataset using only the human Monk trajectories, and select categories from the NetHack Wiki. I plan to run ablations to see which datasets are critical. Once basic instruction tuning is working, so the model can follow short, simple instructions, I will attempt to implement some combination of ideas from papers I've been interested in.
21
+
22
+
23
  # 7/25/23
24
  https://astralcodexten.substack.com/p/were-not-platonists-weve-just-learned
25
  intelligence explosion