Update README.md
README.md
CHANGED
@@ -8,8 +8,10 @@ license: apache-2.0
 
 ## Model Information
 
+We release the HTML pruner model used in **HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieval Results in RAG Systems**.
+
 <p align="left">
-
+Useful links: 📝 <a href="https://arxiv.org/abs/2411.02959" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/" target="_blank">Hugging Face</a> • 🧩 <a href="https://github.com/plageon/SlimPLM" target="_blank">Github</a>
 </p>
 
 We propose HtmlRAG, which uses HTML instead of plain text as the format of external knowledge in RAG systems. To tackle the long context brought by HTML, we propose **Lossless HTML Cleaning** and **Two-Step Block-Tree-Based HTML Pruning**.