---
base_model:
- google-bert/bert-base-uncased
datasets:
- microsoft/ms_marco
language:
- en
library_name: transformers
pipeline_tag: feature-extraction
license: mit
---
# Model Card
This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).
## Model Details
This is a Hypencoder Dual Encoder. It contains two trunks: a text encoder and a Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts query text into a small neural network that takes the 768-dimensional vector from the text encoder as input and outputs a relevance score. To use this model, please take a look at the [GitHub](https://github.com/jfkback/hypencoder-paper) page, which contains the required code and details on how to run the model.
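As a rough picture of how scoring works, below is a minimal, illustrative sketch of applying a hypernetwork-generated q-net, represented as explicit weight and bias tensors, to a single passage embedding. All names here are hypothetical and the ReLU activation is an assumption for illustration; see the GitHub repository for the actual implementation.

```python
import torch

def apply_qnet(weights, biases, passage_embedding):
    # Hypothetical sketch: run a (768,) passage embedding through a small
    # MLP whose parameters were produced by the Hypencoder for one query.
    h = passage_embedding
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b
        if i < len(weights) - 1:
            h = torch.relu(h)  # activation choice is illustrative only
    return h  # final layer maps to a single relevance score

# Illustrative shapes for a q-net with one 768-wide hidden layer:
weights = [torch.randn(768, 768), torch.randn(768, 1)]
biases = [torch.randn(768), torch.randn(1)]
score = apply_qnet(weights, biases, torch.randn(768))  # shape (1,)
```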
### Model Variants
We release the four models used in the paper. The models are identical except for their small neural networks, which we refer to as q-nets: each variant's q-net has a different number of hidden layers.
| Huggingface Repo | Number of Layers |
|:------------------:|:------------------:|
| [jfkback/hypencoder.2_layer](https://huggingface.co/jfkback/hypencoder.2_layer) | 2 |
| [jfkback/hypencoder.4_layer](https://huggingface.co/jfkback/hypencoder.4_layer) | 4 |
| [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
| [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |
## Usage
```python
# hypencoder_cb is provided by the code at https://github.com/jfkback/hypencoder-paper
from hypencoder_cb.modeling.hypencoder import Hypencoder, HypencoderDualEncoder, TextEncoder
from transformers import AutoTokenizer
dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.6_layer")
tokenizer = AutoTokenizer.from_pretrained("jfkback/hypencoder.6_layer")
query_encoder: Hypencoder = dual_encoder.query_encoder
passage_encoder: TextEncoder = dual_encoder.passage_encoder
queries = [
"how many states are there in india",
"when do concussion symptoms appear",
]
passages = [
"India has 28 states and 8 union territories.",
"Concussion symptoms can appear immediately or up to 72 hours after the injury.",
]
query_inputs = tokenizer(queries, return_tensors="pt", padding=True, truncation=True)
passage_inputs = tokenizer(passages, return_tensors="pt", padding=True, truncation=True)
q_nets = query_encoder(input_ids=query_inputs["input_ids"], attention_mask=query_inputs["attention_mask"]).representation
passage_embeddings = passage_encoder(input_ids=passage_inputs["input_ids"], attention_mask=passage_inputs["attention_mask"]).representation
# passage_embeddings has shape (2, 768), but the q_nets expect input of
# shape (num_queries, num_items_per_query, input_hidden_size), so the
# passage embeddings must be reshaped.
# In the simple case where each q_net scores a single passage, we just
# reshape the passage_embeddings to (num_queries, 1, input_hidden_size).
passage_embeddings_single = passage_embeddings.unsqueeze(1)
scores = q_nets(passage_embeddings_single) # Shape (2, 1, 1)
# [
# [[-12.1192]],
# [[-13.5832]]
# ]
# In the case where each q_net scores both passages, we reshape the
# passage_embeddings to (num_queries, 2, input_hidden_size).
passage_embeddings_double = passage_embeddings.repeat(2, 1).reshape(2, 2, -1)
scores = q_nets(passage_embeddings_double) # Shape (2, 2, 1)
# [
# [[-12.1192], [-32.7046]],
# [[-34.0934], [-13.5832]]
# ]
```
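Given the `scores` tensor from the last call above (shape `(2, 2, 1)`, one score per query-passage pair), turning scores into a per-query ranking is just a sort along the passage axis. A minimal sketch, assuming the variables from the snippet above are in scope:

```python
import torch

per_pair_scores = scores.squeeze(-1)  # (2, 2): entry [i, j] scores query i vs. passage j
ranked = torch.argsort(per_pair_scores, dim=1, descending=True)
# ranked[i] lists passage indices for query i, most relevant first.
# With the example scores above this yields [[0, 1], [1, 0]]: each
# query ranks its matching passage highest.
```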
## Citation
**BibTeX:**
```bibtex
@misc{killingback2025hypencoderhypernetworksinformationretrieval,
title={Hypencoder: Hypernetworks for Information Retrieval},
author={Julian Killingback and Hansi Zeng and Hamed Zamani},
year={2025},
eprint={2502.05364},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2502.05364},
}
``` |