nielsr (HF staff) committed
Commit 9f9c21f · verified · 1 Parent(s): d2a8ead

Add pipeline tag and link to Github repository


This PR adds `pipeline_tag: text-generation` to the model card metadata, which will improve the model's discoverability on the Hugging Face Hub. It also adds a link to the GitHub repository.
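
For reference, the model card front matter after this change reads as follows (reconstructed directly from the diff below; nothing here is new content):

```yaml
---
base_model:
- meta-llama/Llama-3.1-8B
library_name: transformers
license: mit
pipeline_tag: text-generation
---
```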

Files changed (1):
1. README.md (+7 -9)
README.md CHANGED

@@ -1,23 +1,23 @@
 ---
-license: mit
-library_name: transformers
 base_model:
 - meta-llama/Llama-3.1-8B
+library_name: transformers
+license: mit
+pipeline_tag: text-generation
 ---
+
 # TokenButler
 <!-- markdownlint-disable first-line-h1 -->
 <!-- markdownlint-disable html -->
 <!-- markdownlint-disable no-duplicate-header -->
 
-
-
 <div align="center">
   <img src="https://github.com/abdelfattah-lab/TokenButler/blob/main/figs/tokenbutlerlogo.png?raw=true" width="50%" alt="TokenButler" />
 </div>
 <hr>
 <div align="center" style="line-height: 1;">
   <!-- Paper Badge -->
-  <a href="https://github.com/abdelfattah-lab/TokenButler/blob/main/TokenButler_Draft.pdf" target="_blank" style="margin: 2px;">
+  <a href="https://arxiv.org/abs/2503.07518" target="_blank" style="margin: 2px;">
     <img alt="Paper"
       src="https://img.shields.io/badge/Paper-View-orange?logo=readthedocs&logoColor=white"
       style="display: inline-block; vertical-align: middle;"/>
@@ -32,10 +32,9 @@ base_model:
 
 <br>
 
-
 The collection of TokenButler models can be found [here](https://huggingface.co/collections/akhauriyash/tokenbutler-67cf181b5762d0d60e5f312b). To run the `meta-llama/Llama-3.1-8B` model, follow:
 
-```
+```python
 from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
 
 question = "If millionaires have butlers, why don't million dollar language models have a butler too? I think its because "
@@ -52,7 +51,7 @@ print(response[0]['generated_text'][len(question):])
 
 Note that the 'default' configured sparsity is 50%. Further, there is a 'sliding window' of 128 and 8 'anchor tokens'. To 'change' the sparsity, you can use the following function after loading the model. Please note that the 'fixed' is the only supported strategy at the moment, which 'fixes' the sparsity of each layer (except the first) at the 'pc' (percentage) mentioned. This can also be found at `test_hf.py`. Sliding window and anchor tokens can be changed in a similar manner.
 
-```
+```python
 def set_sparsity(model, sparsity):
     for module in model.modules():
         if module.__class__.__name__.__contains__("AttentionExperimental"):
@@ -63,7 +62,6 @@ def set_sparsity(model, sparsity):
 model = set_sparsity(model, "fixed_60pc")
 ```
 
-
 # Predictor Architecture
 <div align="center">
   <img src="https://github.com/abdelfattah-lab/TokenButler/blob/main/figs/mainfig.png?raw=true" width="100%" alt="TokenButlerFigure" />