rdiehlmartinez committed on
Commit
e8e3158
·
verified ·
1 Parent(s): cf4f54e

Update README.md

  1. README.md +9 -6
README.md CHANGED
@@ -3,14 +3,14 @@ title: Perplexity
3
  emoji: 🤗
4
  colorFrom: blue
5
  colorTo: red
6
- sdk: gradio
7
- sdk_version: 3.19.1
8
- app_file: app.py
9
  pinned: false
10
  tags:
11
  - evaluate
12
  - metric
13
  description: >-
 
 
14
  Perplexity (PPL) is one of the most common metrics for evaluating language
15
  models. It is defined as the exponentiated average negative log-likelihood of
16
  a sequence, calculated with exponent base `e`.
@@ -21,8 +21,10 @@ description: >-
21
 
22
  # Metric Card for Perplexity
23
 
24
- ## Metric Description
25
- Given a model and an input text sequence, perplexity measures how likely the model is to generate the input text sequence.
 
 
26
 
27
  As a metric, it can be used to evaluate how well the model has learned the distribution of the text it was trained on.
28
 
@@ -39,7 +41,7 @@ The metric takes a list of text as input, as well as the name of the model used
39
 
40
  ```python
41
  from evaluate import load
42
- perplexity = load("perplexity", module_type="metric")
43
  results = perplexity.compute(predictions=predictions, model_id='gpt2')
44
  ```
45
 
@@ -50,6 +52,7 @@ results = perplexity.compute(predictions=predictions, model_id='gpt2')
50
  - **batch_size** (int): the batch size to run texts through the model. Defaults to 16.
51
  - **add_start_token** (bool): whether to add the start token to the texts, so the perplexity can include the probability of the first word. Defaults to True.
52
  - **device** (str): device to run on, defaults to `cuda` when available
 
53
 
54
  ### Output Values
55
  This metric outputs a dictionary with the perplexity scores for the text input in the list, and the average perplexity.
 
3
  emoji: 🤗
4
  colorFrom: blue
5
  colorTo: red
6
+ sdk: static
 
 
7
  pinned: false
8
  tags:
9
  - evaluate
10
  - metric
11
  description: >-
12
+ This is a fork of the Hugging Face evaluate library's implementation of perplexity.
13
+
14
  Perplexity (PPL) is one of the most common metrics for evaluating language
15
  models. It is defined as the exponentiated average negative log-likelihood of
16
  a sequence, calculated with exponent base `e`.
 
21
 
22
  # Metric Card for Perplexity
23
 
24
+ > This is a fork of the Hugging Face `evaluate` library's implementation of perplexity.
25
+
26
+
27
+ ## Metric Description
+ Given a model and an input text sequence, perplexity measures how likely the model is to generate the input text sequence.
28
 
29
  As a metric, it can be used to evaluate how well the model has learned the distribution of the text it was trained on.
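
The definition used in this card (the exponentiated average negative log-likelihood, base `e`) can be illustrated with a minimal sketch. It assumes the per-token probabilities are already available; the helper name `perplexity_from_log_probs` and the probability values are hypothetical and are not part of this metric's API.

```python
import math

def perplexity_from_log_probs(token_log_probs):
    """Return exp of the average negative log-likelihood (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Hypothetical probabilities a model might assign to each token of a 4-token text.
token_probs = [0.25, 0.5, 0.125, 0.25]
log_probs = [math.log(p) for p in token_probs]

ppl = perplexity_from_log_probs(log_probs)

# A model that spreads probability uniformly over N choices at every step
# has perplexity N, which is why lower values indicate a better fit.
uniform_ppl = perplexity_from_log_probs([math.log(1 / 10)] * 5)
```

Here the average negative log-likelihood is `2 * ln(2)`, so `ppl` comes out at about 4: the model is, on average, as uncertain as if it were choosing uniformly among four tokens.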
30
 
 
41
 
42
  ```python
43
  from evaluate import load
44
+ perplexity = load("pico-lm/perplexity")
45
  results = perplexity.compute(predictions=predictions, model_id='gpt2')
46
  ```
47
 
 
52
  - **batch_size** (int): the batch size to run texts through the model. Defaults to 16.
53
  - **add_start_token** (bool): whether to add the start token to the texts, so the perplexity can include the probability of the first word. Defaults to True.
54
  - **device** (str): device to run on, defaults to `cuda` when available
55
+ - **trust_remote_code** (bool): whether to allow loading models that ship their own modeling code; enables running the metric on custom models.
56
 
57
  ### Output Values
58
  This metric outputs a dictionary with the perplexity scores for the text input in the list, and the average perplexity.
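
As a rough sketch of how such a dictionary might be consumed: the key names `perplexities` and `mean_perplexity` below follow the upstream `evaluate` perplexity metric and are an assumption for this fork. The dictionary is hand-written for illustration and is not real model output.

```python
from statistics import mean

# Hand-written stand-in for the returned dictionary; the key names and
# values are assumptions for illustration, not real model output.
results = {"perplexities": [12.0, 48.0, 30.0], "mean_perplexity": 30.0}
predictions = ["first text", "second text", "third text"]

# Pair each input text with its score (scores are assumed to follow input order).
per_text = dict(zip(predictions, results["perplexities"]))

# The reported average is assumed to be the arithmetic mean of the per-text scores.
recomputed_mean = mean(results["perplexities"])

# Lower perplexity means the model found the text more predictable.
most_predictable = min(per_text, key=per_text.get)
```

Keeping the per-text scores alongside the average makes it easier to spot a single outlier text that dominates the mean.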