---
language:
- en
library_name: CountGD
license: mit
tags:
- computer-vision
- counting
- grounding-dino
- model_hub_mixin
- multi-modal
- open-vocabulary
- pytorch_model_hub_mixin
- transformers
---

# CountGD

A Multi-Modal Open-World Counting Model for counting objects in an image with text and image prompts.

For more details, please check out the following links:

- Project page: https://www.robots.ox.ac.uk/~vgg/research/countgd/
- Code: https://github.com/niki-amini-naieni/CountGD
- Demo: https://huggingface.co/spaces/nikigoli/countgd
- Paper: https://arxiv.org/pdf/2407.04619

![Sample prediction](https://www.robots.ox.ac.uk/~vgg/research/countgd/images/teaser-improved.png)

## Architecture

![CountGD Architecture](https://www.robots.ox.ac.uk/~vgg/research/countgd/images/architecture.png)

## Citation

```
@inproceedings{AminiNaieni24,
  author = "Amini-Naieni, N. and Han, T. and Zisserman, A.",
  title = "CountGD: Multi-Modal Open-World Counting",
  booktitle = "NeurIPS",
  year = "2024",
}
```