Text Generation
English
Eval Results
d-matrix committed on
Commit
93cadd1
1 Parent(s): 55072b7

Create README.md

draft of model card

Files changed (1)
  1. README.md +80 -0
README.md ADDED
---
license: apache-2.0
datasets:
- wikitext
- ptb_text_only
language:
- en
metrics:
- perplexity
pipeline_tag: text-generation
model-index:
- name: distilgpt2
  results:
  - task:
      type: text-generation
    dataset:
      name: penn_treebank
      type: ptb_text_only
    metrics:
    - name: perplexity@BASELINE
      type: dmx-perplexity
      value: 63.45857238769531
    - name: perplexity@FALLBACK
      type: dmx-perplexity
      value: 64.36720275878906
  - task:
      type: text-generation
    dataset:
      name: wikitext2
      type: wikitext-2-raw-v1
    metrics:
    - name: perplexity@BASELINE
      type: dmx-perplexity
      value: 46.05925369262695
    - name: perplexity@FALLBACK
      type: dmx-perplexity
      value: 46.570838928222656
---
This is a quantized version of [DistilGPT2](https://huggingface.co/distilbert/distilgpt2). We provide the following two quantization configurations:

BASELINE: everything kept in the original format; numerically equivalent to the original model.

FALLBACK: Linear and Conv1D layers quantized to BFP16, with approximation functions added for LayerNorm, GELU, and Softmax.
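For intuition, BFP16 is a block floating-point format: a group of values shares a single exponent while each element keeps only a low-bit mantissa. The following is a minimal NumPy sketch of that idea only; the block size, bit widths, and rounding mode are illustrative assumptions, not d-Matrix's actual implementation:

```python
import numpy as np

def bfp_quantize(x, block_size=8, mantissa_bits=8):
    """Toy block floating point: each block of values shares one exponent,
    and each element keeps only a low-bit signed mantissa."""
    x = np.asarray(x, dtype=np.float64).ravel()
    pad = (-x.size) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    out = np.zeros_like(blocks)
    for i, blk in enumerate(blocks):
        max_mag = np.abs(blk).max()
        if max_mag == 0.0:
            continue  # all-zero block stays zero
        shared_exp = np.floor(np.log2(max_mag))
        # quantization step chosen so the largest element fits in
        # (mantissa_bits - 1) magnitude bits plus a sign bit
        step = 2.0 ** (shared_exp - (mantissa_bits - 2))
        out[i] = np.round(blk / step) * step
    return out.reshape(-1)[: x.size]

weights = np.random.default_rng(0).normal(size=17)
quantized = bfp_quantize(weights)
print(np.max(np.abs(quantized - weights)))  # small per-block rounding error
```

Because the step size scales with each block's largest magnitude, the rounding error stays small relative to that block's values, which is why weight tensors tolerate this format well.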
### Usage Example

Prerequisites:
- Install dmx-mltools: `pip install dmx-mltools`
- Clone this repo and `cd` into it.
```python
>>> import os
>>> import torch
>>> from mltools import dmx
>>> from transformers import pipeline
>>> import evaluate
>>> from datasets import load_dataset

# Get model
>>> my_hf_token = os.environ.get("Dmatrix_HF_Token")
>>> device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

>>> pipe = pipeline(
...     "text-generation",
...     model="d-matrix/distilgpt2",
...     device=device,
...     use_auth_token=my_hf_token,
... )
>>> pipe.model = dmx.Model(pipe.model, monkey_patched=False, hf=True, input_names=["input_ids", "labels"])

# Configure quantization formats
>>> pipe.model.transform("FALLBACK.yaml")

# Evaluate
>>> perplexity = evaluate.load("d-matrix/dmx_perplexity", module_type="metric")
>>> input_texts = load_dataset("ptb_text_only", "penn_treebank", split="test")["sentence"]
>>> pipe.model.eval()
>>> results = perplexity.compute(model=pipe.model.body, references=input_texts)
>>> print(results)
{'loss': 4.164604187011719, 'perplexity': 64.36720275878906}
```
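The `perplexity` field in the result is the exponential of the reported mean cross-entropy `loss`, which you can verify from the printed numbers; the same values also let you estimate the quality cost of the FALLBACK configuration relative to BASELINE (all constants below are copied from this card):

```python
import math

# values from the result dict printed above (PTB, FALLBACK)
loss = 4.164604187011719
perplexity = 64.36720275878906
assert abs(math.exp(loss) - perplexity) < 1e-2  # perplexity == exp(loss)

# relative perplexity increase of FALLBACK over BASELINE on PTB
baseline, fallback = 63.45857238769531, 64.36720275878906
print(f"{(fallback - baseline) / baseline:.2%}")  # prints 1.43%
```

So quantization under the FALLBACK configuration costs about 1.4% perplexity on Penn Treebank (and about 1.1% on WikiText-2, by the same calculation).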