zstanjj committed on
Commit
c2419fb
•
1 Parent(s): d40ab27

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -8,8 +8,10 @@ license: apache-2.0
 
 ## Model Information
 
+We release the HTML pruner model used in **HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieval Results in RAG Systems**.
+
 <p align="left">
-• 📝 <a href="https://arxiv.org/abs/2411.02959" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/" target="_blank">Hugging Face</a> • 🧩 <a href="https://github.com/plageon/SlimPLM" target="_blank">Github</a>
+Useful links: 📝 <a href="https://arxiv.org/abs/2411.02959" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/" target="_blank">Hugging Face</a> • 🧩 <a href="https://github.com/plageon/SlimPLM" target="_blank">Github</a>
 </p>
 
 We propose HtmlRAG, which uses HTML instead of plain text as the format of external knowledge in RAG systems. To tackle the long context brought by HTML, we propose **Lossless HTML Cleaning** and **Two-Step Block-Tree-Based HTML Pruning**.