nielsr (HF staff) committed
Commit df856e8 · verified · 1 Parent(s): 00165d7

Add pipeline tag and usage example


This PR ensures the model can be found at https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending. It also adds a sample usage example from the GitHub README.
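For quick reference, the heart of the added usage example looks like this (a condensed sketch of the snippet in the diff below, assuming the `hypencoder_cb` package from the linked repository is installed):

```python
from hypencoder_cb.modeling.hypencoder import Hypencoder, HypencoderDualEncoder, TextEncoder
from transformers import AutoTokenizer

# Load the dual encoder and its tokenizer from the Hub.
dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.6_layer")
tokenizer = AutoTokenizer.from_pretrained("jfkback/hypencoder.6_layer")

query_encoder: Hypencoder = dual_encoder.query_encoder
passage_encoder: TextEncoder = dual_encoder.passage_encoder

query_inputs = tokenizer(["how many states are there in india"], return_tensors="pt", padding=True, truncation=True)
passage_inputs = tokenizer(["India has 28 states and 8 union territories."], return_tensors="pt", padding=True, truncation=True)

# The query encoder produces a q-net (a small per-query network); the passage
# encoder produces 768-dimensional embeddings that the q-net scores.
q_nets = query_encoder(input_ids=query_inputs["input_ids"], attention_mask=query_inputs["attention_mask"]).representation
passage_embeddings = passage_encoder(input_ids=passage_inputs["input_ids"], attention_mask=passage_inputs["attention_mask"]).representation

# q-nets expect (num_queries, num_items_per_query, input_hidden_size).
scores = q_nets(passage_embeddings.unsqueeze(1))  # shape (1, 1, 1)
```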

Files changed (1)
  1. README.md +57 -6
README.md CHANGED
@@ -1,20 +1,20 @@
  ---
- library_name: transformers
  datasets:
  - microsoft/ms_marco
  language:
  - en
- base_model:
- - google-bert/bert-base-uncased
  ---

  # Model Card
  This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).

-
  ## Model Details
- This is a Hypencoder Dual Enocder. It contains two trunks the text encoder and Hypencoder. The text encoder converts items into 768 dimension vectors while the Hypencoder converts text into a small neural network which takes the 768 dimension vector from the text encoder as input. This small network is then used to output a relevance score. To use this model please take a look at the [Github](https://github.com/jfkback/hypencoder-paper) page which contains the required code and details on how to run the model.
-

  ### Model Variants
  We released the four models used in the paper. Each model is identical except the small neural networks, which we refer to as q-nets, have different numbers of hidden layers.
@@ -26,6 +26,57 @@ We released the four models used in the paper. Each model is identical except th
  | [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
  | [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |

  ## Citation
  **BibTeX:**
  ```
 
  ---
+ base_model:
+ - google-bert/bert-base-uncased
  datasets:
  - microsoft/ms_marco
  language:
  - en
+ library_name: transformers
+ pipeline_tag: feature-extraction
+ license: mit
  ---

  # Model Card
  This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).

  ## Model Details
+ This is a Hypencoder Dual Encoder. It contains two trunks: the text encoder and the Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts text into a small neural network that takes the 768-dimensional vector from the text encoder as input. This small network is then used to output a relevance score. To use this model, please take a look at the [GitHub](https://github.com/jfkback/hypencoder-paper) page, which contains the required code and details on how to run the model.

  ### Model Variants
  We released the four models used in the paper. Each model is identical except the small neural networks, which we refer to as q-nets, have different numbers of hidden layers.
 
@@ -26,6 +26,57 @@
  | [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
  | [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |

+ ## Usage
+
+ ```python
+ from hypencoder_cb.modeling.hypencoder import Hypencoder, HypencoderDualEncoder, TextEncoder
+ from transformers import AutoTokenizer
+
+ dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.6_layer")
+ tokenizer = AutoTokenizer.from_pretrained("jfkback/hypencoder.6_layer")
+
+ query_encoder: Hypencoder = dual_encoder.query_encoder
+ passage_encoder: TextEncoder = dual_encoder.passage_encoder
+
+ queries = [
+     "how many states are there in india",
+     "when do concussion symptoms appear",
+ ]
+
+ passages = [
+     "India has 28 states and 8 union territories.",
+     "Concussion symptoms can appear immediately or up to 72 hours after the injury.",
+ ]
+
+ query_inputs = tokenizer(queries, return_tensors="pt", padding=True, truncation=True)
+ passage_inputs = tokenizer(passages, return_tensors="pt", padding=True, truncation=True)
+
+ q_nets = query_encoder(input_ids=query_inputs["input_ids"], attention_mask=query_inputs["attention_mask"]).representation
+ passage_embeddings = passage_encoder(input_ids=passage_inputs["input_ids"], attention_mask=passage_inputs["attention_mask"]).representation
+
+ # The passage_embeddings has shape (2, 768), but the q_nets expect the shape
+ # (num_queries, num_items_per_query, input_hidden_size) so we need to reshape
+ # the passage_embeddings.
+
+ # In the simple case where each q_net only takes one passage, we can just
+ # reshape the passage_embeddings to (num_queries, 1, input_hidden_size).
+ passage_embeddings_single = passage_embeddings.unsqueeze(1)
+ scores = q_nets(passage_embeddings_single) # Shape (2, 1, 1)
+ # [
+ #   [[-12.1192]],
+ #   [[-13.5832]]
+ # ]
+
+ # In the case where each q_net takes both passages we can reshape the
+ # passage_embeddings to (num_queries, 2, input_hidden_size).
+ passage_embeddings_double = passage_embeddings.repeat(2, 1).reshape(2, 2, -1)
+ scores = q_nets(passage_embeddings_double) # Shape (2, 2, 1)
+ # [
+ #   [[-12.1192], [-32.7046]],
+ #   [[-34.0934], [-13.5832]]
+ # ]
+ ```
+
  ## Citation
  **BibTeX:**
  ```