Add pipeline tag and usage example
#1 by nielsr (HF staff) - opened

README.md CHANGED
@@ -1,20 +1,20 @@
 ---
-
+base_model:
+- google-bert/bert-base-uncased
 datasets:
 - microsoft/ms_marco
 language:
 - en
-
-
+library_name: transformers
+pipeline_tag: feature-extraction
+license: mit
 ---
 
 # Model Card
 This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).
 
-
 ## Model Details
-This is a Hypencoder Dual
-
+This is a Hypencoder Dual Encoder. It contains two trunks: the text encoder and the Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts text into a small neural network that takes the 768-dimensional vector from the text encoder as input. This small network is then used to output a relevance score. To use this model, please take a look at the [GitHub](https://github.com/jfkback/hypencoder-paper) page, which contains the required code and details on how to run the model.
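+
+In rough pseudocode, the scoring path looks like this (a conceptual sketch only: the real classes live in the GitHub repository above, and the q-net activation and hidden width used here are illustrative assumptions):
+
+```python
+import torch
+
+# Conceptual sketch of Hypencoder scoring (illustrative, not the repo's implementation).
+# passage_vec: the 768-dimensional text-encoder output for one passage.
+passage_vec = torch.randn(768)
+
+# The Hypencoder maps the *query* to the weights of a small MLP, the q-net.
+# Two hidden layers are shown; the hidden width of 768 is an assumption.
+W1, b1 = torch.randn(768, 768), torch.randn(768)
+W2, b2 = torch.randn(768, 768), torch.randn(768)
+w_out, b_out = torch.randn(768), torch.randn(())
+
+# Applying the query's q-net to the passage vector yields a relevance score.
+h = torch.relu(passage_vec @ W1 + b1)
+h = torch.relu(h @ W2 + b2)
+score = h @ w_out + b_out  # a single scalar
+```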
 
 ### Model Variants
 We released the four models used in the paper. Each model is identical except that the small neural networks, which we refer to as q-nets, have different numbers of hidden layers.
@@ -26,6 +26,57 @@
 | [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
 | [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |
 
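+Any variant loads the same way; only the repository name changes (a minimal sketch mirroring the Usage example below):
+
+```python
+from hypencoder_cb.modeling.hypencoder import HypencoderDualEncoder
+
+# Swap in any checkpoint from the table above, e.g. the 8-layer q-net model.
+dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.8_layer")
+```
+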
+## Usage
+
+```python
+from hypencoder_cb.modeling.hypencoder import Hypencoder, HypencoderDualEncoder, TextEncoder
+from transformers import AutoTokenizer
+
+dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.6_layer")
+tokenizer = AutoTokenizer.from_pretrained("jfkback/hypencoder.6_layer")
+
+query_encoder: Hypencoder = dual_encoder.query_encoder
+passage_encoder: TextEncoder = dual_encoder.passage_encoder
+
+queries = [
+    "how many states are there in india",
+    "when do concussion symptoms appear",
+]
+
+passages = [
+    "India has 28 states and 8 union territories.",
+    "Concussion symptoms can appear immediately or up to 72 hours after the injury.",
+]
+
+query_inputs = tokenizer(queries, return_tensors="pt", padding=True, truncation=True)
+passage_inputs = tokenizer(passages, return_tensors="pt", padding=True, truncation=True)
+
+q_nets = query_encoder(input_ids=query_inputs["input_ids"], attention_mask=query_inputs["attention_mask"]).representation
+passage_embeddings = passage_encoder(input_ids=passage_inputs["input_ids"], attention_mask=passage_inputs["attention_mask"]).representation
+
+# passage_embeddings has shape (2, 768), but the q_nets expect the shape
+# (num_queries, num_items_per_query, input_hidden_size), so we need to
+# reshape the passage embeddings.
+
+# In the simple case where each q_net only takes one passage, we can just
+# reshape the passage embeddings to (num_queries, 1, input_hidden_size).
+passage_embeddings_single = passage_embeddings.unsqueeze(1)
+scores = q_nets(passage_embeddings_single)  # Shape (2, 1, 1)
+# [
+#   [[-12.1192]],
+#   [[-13.5832]]
+# ]
+
+# In the case where each q_net takes both passages, we can reshape the
+# passage embeddings to (num_queries, 2, input_hidden_size).
+passage_embeddings_double = passage_embeddings.repeat(2, 1).reshape(2, 2, -1)
+scores = q_nets(passage_embeddings_double)  # Shape (2, 2, 1)
+# [
+#   [[-12.1192], [-32.7046]],
+#   [[-34.0934], [-13.5832]]
+# ]
+```
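+
+Higher scores mean more relevant: in the example output above, each query assigns its highest score to its own passage. Ranking passages for a query is therefore a descending sort over the scores (a small follow-on sketch, assuming `scores` is the (2, 2, 1) tensor from the last step):
+
+```python
+import torch
+
+# Rank both passages for each query; higher score = more relevant.
+ranking = torch.argsort(scores.squeeze(-1), dim=-1, descending=True)
+print(ranking)  # tensor([[0, 1], [1, 0]]): each query ranks its own passage first
+```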
+
 ## Citation
 **BibTeX:**
 ```