---
base_model:
- google-bert/bert-base-uncased
datasets:
- microsoft/ms_marco
language:
- en
library_name: transformers
pipeline_tag: feature-extraction
license: apache-2.0
---
# Model Card

This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).
## Model Details

This is a Hypencoder Dual Encoder. It contains two trunks: a text encoder and a Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts text into a small neural network that takes the 768-dimensional vector from the text encoder as input and outputs a relevance score. To use this model, please take a look at the [GitHub](https://github.com/jfkback/hypencoder-paper) page, which contains the required code and details on how to run the model.
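
To make the scoring flow concrete, here is a minimal PyTorch sketch of how a query-generated small network scores a document vector. All names, the hidden width, and the layer count below are illustrative only; the official implementation lives in the GitHub repository above.

```python
# Illustrative sketch only -- not the official API from the Hypencoder repository.
import torch

EMB_DIM = 768  # dimensionality of the text encoder's document vectors

# Stand-ins for the outputs of the two trunks:
#   - doc_vec: a document embedding produced by the text encoder
#   - (w1, b1, w2, b2): weights of a small network generated by the Hypencoder
#     from the query text (a single hidden layer here for brevity)
doc_vec = torch.randn(1, EMB_DIM)
w1, b1 = torch.randn(EMB_DIM, EMB_DIM), torch.randn(EMB_DIM)
w2, b2 = torch.randn(EMB_DIM, 1), torch.randn(1)

def qnet_score(doc: torch.Tensor) -> torch.Tensor:
    """Apply the query-specific small network to a document vector."""
    hidden = torch.relu(doc @ w1 + b1)  # hidden layer built from query-generated weights
    return hidden @ w2 + b2             # scalar relevance score

print(qnet_score(doc_vec))  # tensor of shape (1, 1): the query-document score
```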
### Model Variants

We released the four models used in the paper. The models are identical except for the number of hidden layers in their small neural networks, which we refer to as q-nets.
| Huggingface Repo | Number of Layers |
|:------------------:|:------------------:|
| [jfkback/hypencoder.2_layer](https://huggingface.co/jfkback/hypencoder.2_layer) | 2 |
| [jfkback/hypencoder.4_layer](https://huggingface.co/jfkback/hypencoder.4_layer) | 4 |
| [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
| [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |
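
As a starting point, the sketch below shows how one of these checkpoints might be loaded. The `HypencoderDualEncoder` import path is an assumption based on the GitHub repository linked above; consult its README for the exact usage and scoring code.

```python
# Assumed usage: the import path and class name below are taken from the GitHub
# repository (https://github.com/jfkback/hypencoder-paper) and may differ --
# see its README for the authoritative example.
from transformers import AutoTokenizer
from hypencoder_cb.modeling.hypencoder import HypencoderDualEncoder  # assumed import

repo_id = "jfkback/hypencoder.6_layer"  # pick any of the variants in the table above

dual_encoder = HypencoderDualEncoder.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

query_inputs = tokenizer("how do hypernetworks work", return_tensors="pt")
passage_inputs = tokenizer(
    "A hypernetwork is a network that generates weights for another network.",
    return_tensors="pt",
)
# Scoring a query-passage pair with the generated q-net is demonstrated in the
# repository's README; the forward-call signature is not reproduced here.
```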
## Citation

**BibTeX:**
```
@misc{killingback2025hypencoderhypernetworksinformationretrieval,
  title={Hypencoder: Hypernetworks for Information Retrieval},
  author={Julian Killingback and Hansi Zeng and Hamed Zamani},
  year={2025},
  eprint={2502.05364},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2502.05364},
}
```