Update README.md
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-title:
+title: DeepResearchEvaluator
 emoji: 🏆🏆🏆
 colorFrom: red
 colorTo: purple
@@ -8,14 +8,66 @@ sdk_version: 1.41.1
 app_file: app.py
 pinned: true
 license: mit
-short_description:
+short_description: Deep Research Evaluator for Long Horizon Learning Tasks
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
-
-
-
+A Deep Research Evaluator is a conceptual AI system designed to analyze and synthesize information from extensive research literature, such as arXiv papers, to learn about specific topics and generate code applicable to long-horizon tasks in AI. This involves understanding complex subjects, identifying relevant methodologies, and implementing solutions that require planning and execution over extended sequences.
+
+Key Topics and Related Papers:
+
+Long-Horizon Task Planning in Robotics:
+
+"MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model"
+Authors: Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song
+This paper introduces a method that decomposes complex tasks at multiple levels to enhance planning capabilities using open-source large language models.
+ARXIV
+
+"ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning"
+Authors: Zhehua Zhou, Jiayang Song, Kunpeng Yao, Zhan Shu, Lei Ma
+The study presents a framework that improves LLM-based planning through an iterative self-refinement process, enhancing the feasibility and correctness of task plans.
+ARXIV
+
+Skill-Based Reinforcement Learning:
+
+"Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks"
+Authors: Haoqi Yuan, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao Dong, Zongqing Lu
+This research focuses on building multi-task agents in open-world environments by learning basic skills and planning over them to accomplish long-horizon tasks efficiently.
+ARXIV
+
+"SkillTree: Explainable Skill-Based Deep Reinforcement Learning for Long-Horizon Control Tasks"
+Authors: Yongyan Wen, Siyuan Li, Rongchang Zuo, Lei Yuan, Hangyu Mao, Peng Liu
+The paper proposes a framework that integrates a differentiable decision tree into the high-level policy to generate skill embeddings, enhancing the explainability of decision-making for complex tasks.
+ARXIV
+
+Neuro-Symbolic Approaches:
+
+"Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation"
+Authors: Jie-Jing Shao, Hao-Ran Hao, Xiao-Wen Yang, Yu-Feng Li
+This work introduces a framework that combines data-driven learning with symbolic reasoning to enable long-horizon planning through abductive imitation learning.
+ARXIV
+
+"CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning"
+Authors: [Authors not specified]
+The study presents a method that uses large language models to translate constraints into formal specifications, facilitating long-horizon task and motion planning.
+ARXIV
+
+Evaluation Frameworks for AI Models:
+
+"ASI: Accuracy-Stability Index for Evaluating Deep Learning Models"
+Authors: Wei Dai, Daniel Berleant
+The paper introduces the Accuracy-Stability Index (ASI), a quantitative measure that incorporates both accuracy and stability when assessing deep learning models.
+ARXIV
+
+"Benchmarks for Deep Off-Policy Evaluation"
+Authors: Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
+This research provides a collection of policies that, together with existing offline datasets, can be used to benchmark off-policy evaluation in deep learning.
+ARXIV
+
+These topics and papers contribute to the development of AI systems that can understand research literature and apply the acquired knowledge to complex, long-horizon tasks.
+
+---
+
 
 Features:
 
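The README added in this commit describes a system that retrieves and summarizes arXiv papers on long-horizon topics. As a minimal sketch of that retrieval step (not taken from the Space's actual `app.py`; all function names here are hypothetical), one could query the public arXiv Atom API and pull titles and authors from the feed:

```python
# Hypothetical sketch: fetching paper metadata the way a Deep Research
# Evaluator might, via the public arXiv export API (Atom XML responses).
import urllib.parse
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

def build_arxiv_query(topic: str, max_results: int = 5) -> str:
    """Build a search URL for the public arXiv export API."""
    params = urllib.parse.urlencode({
        "search_query": f"all:{topic}",
        "start": 0,
        "max_results": max_results,
    })
    return f"http://export.arxiv.org/api/query?{params}"

def parse_arxiv_feed(atom_xml: str) -> list[dict]:
    """Extract title and author names from an arXiv Atom feed."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        papers.append({
            "title": entry.findtext("atom:title", "", ATOM_NS).strip(),
            "authors": [a.findtext("atom:name", "", ATOM_NS)
                        for a in entry.findall("atom:author", ATOM_NS)],
        })
    return papers
```

Fetching `build_arxiv_query("long-horizon task planning")` and passing the response body to `parse_arxiv_feed` yields a list of `{"title": ..., "authors": [...]}` records that a downstream summarizer could rank or cite.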