---
base_model:
- google-bert/bert-base-uncased
datasets:
- microsoft/ms_marco
language:
- en
library_name: transformers
pipeline_tag: feature-extraction
license: apache-2.0
---
# Model Card
This is an official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).
## Model Details
This is a Hypencoder Dual Encoder. It contains two trunks: a text encoder and a Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts text into a small neural network that takes the 768-dimensional vector from the text encoder as input and outputs a relevance score. To use this model, please see the [GitHub](https://github.com/jfkback/hypencoder-paper) repository, which contains the required code and details on how to run the model.
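To make the mechanism concrete, here is a minimal PyTorch sketch of the scoring step. The tensor names, shapes, and the ReLU activation are illustrative assumptions; in the real model the q-net weights are produced by the Hypencoder from the query text (see the GitHub repository for the actual implementation).

```python
import torch

# Illustrative sketch: the Hypencoder emits the weights of a small per-query
# MLP (the "q-net"), which is applied to document vectors from the text encoder.
hidden_dim = 768  # matches the text encoder's output dimension

# Stand-ins for the two trunks' outputs (the real model computes these from text).
doc_vector = torch.randn(1, hidden_dim)   # document vector from the text encoder
w1 = torch.randn(hidden_dim, hidden_dim)  # hypothetical q-net hidden-layer weights
b1 = torch.randn(hidden_dim)
w_out = torch.randn(hidden_dim, 1)        # hypothetical q-net output layer
b_out = torch.randn(1)

# Apply the query-specific q-net to the document vector to get a relevance score.
h = torch.relu(doc_vector @ w1 + b1)
score = h @ w_out + b_out                 # shape [1, 1]
```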
### Model Variants
We released the four models used in the paper. The models are identical except for the number of hidden layers in the small neural networks, which we refer to as q-nets. Loading any variant works the same way, as sketched below the table.
| Hugging Face Repo | Number of q-net Layers |
|:------------------:|:------------------:|
| [jfkback/hypencoder.2_layer](https://huggingface.co/jfkback/hypencoder.2_layer) | 2 |
| [jfkback/hypencoder.4_layer](https://huggingface.co/jfkback/hypencoder.4_layer) | 4 |
| [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
| [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |
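A hedged usage sketch, assuming the `hypencoder_cb` package from the GitHub repository above is installed; the class and attribute names follow that repository but should be checked against it:

```python
from hypencoder_cb.modeling.hypencoder import HypencoderDualEncoder  # assumed import path
from transformers import AutoTokenizer

dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.6_layer")
tokenizer = AutoTokenizer.from_pretrained("jfkback/hypencoder.6_layer")

query = "how many states are there in india"
passage = "India has 28 states and 8 union territories."

query_inputs = tokenizer(query, return_tensors="pt")
passage_inputs = tokenizer(passage, return_tensors="pt")

# The query encoder returns a q-net; the passage encoder returns a dense vector.
q_nets = dual_encoder.query_encoder(
    input_ids=query_inputs["input_ids"],
    attention_mask=query_inputs["attention_mask"],
).representation
passage_embeddings = dual_encoder.passage_encoder(
    input_ids=passage_inputs["input_ids"],
    attention_mask=passage_inputs["attention_mask"],
).representation

# Applying the q-net to the passage embedding yields a relevance score.
scores = q_nets(passage_embeddings)
```

Because the query representation is itself a small network, the final call runs a forward pass of the q-net over the passage embedding rather than computing a dot product between two vectors.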
## Citation
**BibTeX:**
```
@misc{killingback2025hypencoderhypernetworksinformationretrieval,
      title={Hypencoder: Hypernetworks for Information Retrieval},
      author={Julian Killingback and Hansi Zeng and Hamed Zamani},
      year={2025},
      eprint={2502.05364},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2502.05364},
}
```