Text Classification
fastText
English
kenhktsui committed on
Commit
2962a95
1 Parent(s): 9fb5293

Update README.md

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -6,7 +6,7 @@ library_name: fasttext
 pipeline_tag: text-classification
 inference: false
 ---
-# 📚llm-data-textbook-quality-fasttext-classifer-v2
+# 📚llm-data-textbook-quality-fasttext-classifier-v2
 
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/60e50ce5350d181892d5a636/acAPg-_NawdIfE2XXwcgc.png)
@@ -22,8 +22,9 @@ There are 3 labels instead of 2 labels, as it offers higher granularity of educa
 - Mid (Middle 25-75% educational value)
 - Low (Bottom 25% educational value)
 
-A detailed report/ paper will follow when more downstream experiments of this classifier become available.
-The classifier had been applied to various pretraining dataset. See [**Benchmark**](https://huggingface.co/kenhktsui/llm-data-textbook-quality-fasttext-classifer-v2#benchmark)
+A detailed report/ paper will follow when more downstream experiments of this classifier become available.
+About the validation of this classifier. See [**Analysis**](https://huggingface.co/kenhktsui/llm-data-textbook-quality-fasttext-classifer-v2#%F0%9F%93%88analysis).
+The classifier had been applied to various pretraining dataset. See [**Benchmark**](https://huggingface.co/kenhktsui/llm-data-textbook-quality-fasttext-classifer-v2#%F0%9F%93%8Abenchmark)
 
 ⚡ Model is built on fasttext - it can classify more than 2000 examples per second in CPU, and so it can be used **on-the-fly** during pretraining.
 