|
--- |
|
language: |
|
- en |
|
thumbnail: null |
|
tags: |
|
- image-classification |
|
- multimodal |
|
- tensorflow |
|
license: mit |
|
datasets: |
|
- flex_civ_vi_screens |
|
pipeline_tag: image-classification |
|
library_name: tensorflow |
|
--- |
|
|
|
# Multimodal Classification Model (BM-v1) |
|
|
|
This model combines text and image inputs to predict player moves from in-game screenshots of the popular 4X strategy game Civilization VI. At inference time, the screenshots are supplied directly and the accompanying text inputs are generated by an LLM. |
|
|
|
## Model Details |
|
|
|
- **Developed by:** BeakerStreet |
|
- **Model type:** Multimodal Classification Model |
|
- **Language(s):** English |
|
- **License:** MIT |
|
|
|
## Uses |
|
|
|
Predicts the moves a player is likely to make, drawn from the complete sample space of all observed player moves, based on a provided screenshot and its associated text. The model can be fine-tuned to predict specific types of move (scouting, build orders, settle or don't settle). |
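
One way to prepare data for fine-tuning on move types is to collapse the fine-grained move labels into coarse categories. The sketch below is illustrative only: the label names, the `prefix_rest` naming scheme, and the helpers `to_move_type` and `relabel` are assumptions, not part of the released model or dataset.

```python
from collections import Counter

# Hypothetical mapping from a label prefix to a coarse move type.
# The real label vocabulary is not documented here, so this is an assumption.
MOVE_TYPE_BY_PREFIX = {
    "scout": "scouting",
    "build": "build_order",
    "settle": "settle",
}


def to_move_type(move_label, default="other"):
    """Map a fine-grained label such as 'build_warrior' to a coarse move type."""
    prefix = move_label.split("_", 1)[0]
    return MOVE_TYPE_BY_PREFIX.get(prefix, default)


def relabel(samples):
    """Replace the move label in each (screenshot, text, move) triple with its type."""
    return [(screenshot, text, to_move_type(move)) for screenshot, text, move in samples]


# Hypothetical samples: (screenshot path, LLM-generated text, observed move).
samples = [
    ("turn_001.png", "Scout is two tiles from unexplored fog.", "scout_north"),
    ("turn_002.png", "Capital has no production queued.", "build_warrior"),
    ("turn_003.png", "Settler stands on a river tile.", "settle_city"),
]

coarse = relabel(samples)
print(Counter(label for _, _, label in coarse))
```

The relabelled triples could then serve as targets for a smaller classification head, with the class count equal to the number of coarse move types.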
|
|
|
### Direct Use |
|
|
|
Given a screenshot and its associated text, the model predicts the moves a player is likely to make, drawn from the complete sample space of all observed player moves. |
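
As a rough illustration of how a prediction over the move space might be read, the sketch below turns per-move scores into a ranked list of likely moves. It assumes the classifier emits one logit per observed move; the model's actual output format is not documented here, and the labels, logits, and the helper `top_k_moves` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_moves(logits, move_labels, k=3):
    """Return the k most probable (label, probability) pairs, most likely first."""
    if len(logits) != len(move_labels):
        raise ValueError("expected one logit per move label")
    probs = softmax(logits)
    ranked = sorted(zip(move_labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical move vocabulary and model output for a single screenshot/text pair.
move_labels = ["scout_north", "build_warrior", "settle_city", "build_builder"]
logits = [2.0, 0.5, 1.0, -1.0]

for label, prob in top_k_moves(logits, move_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

Ranking the full distribution, rather than taking only the single highest-scoring class, matches the description of predicting the likely moves over the whole sample space.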
|
|