awacke1 committed on
Commit 5c1c468 · verified · 1 Parent(s): 73c62c4

Update README.md

Files changed (1)
  1. README.md +58 -6
README.md CHANGED
@@ -1,5 +1,5 @@
1
  ---
2
- title: TalkingAIResearcher
3
  emoji: 🏆🏆🏆
4
  colorFrom: red
5
  colorTo: purple
@@ -8,14 +8,66 @@ sdk_version: 1.41.1
8
  app_file: app.py
9
  pinned: true
10
  license: mit
11
- short_description: TalkingAIResearcher
12
  ---
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
15
 
16
- #OPENAI_API_KEY=your_key
17
- #ANTHROPIC_API_KEY=your_key
18
- #HF_KEY=your_key
19
 
20
  Features:
21
 
 
1
  ---
2
+ title: DeepResearchEvaluator
3
  emoji: 🏆🏆🏆
4
  colorFrom: red
5
  colorTo: purple
 
8
  app_file: app.py
9
  pinned: true
10
  license: mit
11
+ short_description: Deep Research Evaluator for Long Horizon Learning Tasks
12
  ---
13
 
 
14
 
15
+ A Deep Research Evaluator is a conceptual AI system designed to analyze and synthesize information from extensive research literature, such as arXiv papers, to learn about specific topics and generate code applicable to long-horizon tasks in AI. This involves understanding complex subjects, identifying relevant methodologies, and implementing solutions that require planning and execution over extended sequences.
16
+
17
+ Key Topics and Related Papers:
18
+
19
+ Long-Horizon Task Planning in Robotics:
20
+
21
+ "MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model"
22
+ Authors: Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song
23
+ This paper introduces a method that decomposes complex tasks at multiple levels to enhance planning capabilities using open-source large language models.
24
+ ARXIV
25
+
26
+ "ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning"
27
+ Authors: Zhehua Zhou, Jiayang Song, Kunpeng Yao, Zhan Shu, Lei Ma
28
+ The study presents a framework that improves LLM-based planning through an iterative self-refinement process, enhancing feasibility and correctness in task plans.
29
+ ARXIV
30
+
31
+ Skill-Based Reinforcement Learning:
32
+
33
+ "Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks"
34
+ Authors: Haoqi Yuan, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao Dong, Zongqing Lu
35
+ This research focuses on building multi-task agents in open-world environments by learning basic skills and planning over them to accomplish long-horizon tasks efficiently.
36
+ ARXIV
37
+
38
+ "SkillTree: Explainable Skill-Based Deep Reinforcement Learning for Long-Horizon Control Tasks"
39
+ Authors: Yongyan Wen, Siyuan Li, Rongchang Zuo, Lei Yuan, Hangyu Mao, Peng Liu
40
+ The paper proposes a framework that integrates a differentiable decision tree within the high-level policy to generate skill embeddings, enhancing explainability in decision-making for complex tasks.
41
+ ARXIV
42
+
43
+ Neuro-Symbolic Approaches:
44
+
45
+ "Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation"
46
+ Authors: Jie-Jing Shao, Hao-Ran Hao, Xiao-Wen Yang, Yu-Feng Li
47
+ This work introduces a framework that combines data-driven learning and symbolic-based reasoning to enable long-horizon planning through abductive imitation learning.
48
+ ARXIV
49
+
50
+ "CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning"
51
+ Authors: [Authors not specified]
52
+ The study presents a method that utilizes large language models to translate constraints into formal specifications, facilitating long-horizon task and motion planning.
53
+ ARXIV
54
+
55
+ Evaluation Frameworks for AI Models:
56
+
57
+ "ASI: Accuracy-Stability Index for Evaluating Deep Learning Models"
58
+ Authors: Wei Dai, Daniel Berleant
59
+ The paper introduces the Accuracy-Stability Index (ASI), a quantitative measure that incorporates both accuracy and stability for assessing deep learning models.
60
+ ARXIV
61
+
62
+ "Benchmarks for Deep Off-Policy Evaluation"
63
+ Authors: Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
64
+ This research provides a collection of policies that, in conjunction with existing offline datasets, can be used for benchmarking off-policy evaluation in deep learning.
65
+ ARXIV
66
+
67
+ These topics and papers contribute to the development of AI systems capable of understanding research literature and applying the acquired knowledge to complex, long-horizon tasks, thereby advancing the field of artificial intelligence.
68
+
69
+ ---
70
+
71
 
72
  Features:
73