Multimodal Classification Model (BM-v1)

This model combines text and image inputs to predict player moves from in-game screenshots of the popular 4X strategy game Civilization VI. At inference time, the screenshot is supplied directly and the accompanying text input is generated from it by an LLM.

Model Details

  • Developed by: BeakerStreet
  • Model type: Multimodal Classification Model
  • Language(s): English
  • License: MIT

Uses

Predicts the likely moves a player will make, drawn from a complete sample space of all (observed) player moves, based on a provided screenshot and associated text. Can be fine-tuned to predict specific types of move (scouting, build orders, settle/doesn't settle).

Direct Use

Predicts the likely moves a player will make, from a complete sample space of all player moves, based on a provided screenshot and associated text.
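
As a rough illustration of what "predicting likely moves from a sample space" can look like on the output side, the sketch below turns a vector of classifier scores into a ranked list of moves using a softmax and a top-k cut. It is not BM-v1's actual code: the function name top_k_moves, the label names, and the score values are all hypothetical, and the sketch assumes the model emits one raw score (logit) per move class.

```python
import math


def top_k_moves(logits, move_labels, k=3):
    """Return the k most probable moves as (label, probability) pairs.

    Hypothetical helper: assumes one raw score per move class.
    """
    if len(logits) != len(move_labels):
        raise ValueError("logits and move_labels must have the same length")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort moves from most to least probable and keep the first k.
    ranked = sorted(zip(move_labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical label set and scores, for illustration only.
labels = ["scout", "build_settler", "settle_city", "build_warrior"]
scores = [2.0, 0.5, 1.0, -1.0]

for move, prob in top_k_moves(scores, labels, k=2):
    print(f"{move}: {prob:.3f}")
```

A model fine-tuned for a narrower task (for example settle/doesn't settle) would use the same ranking step with a smaller label list.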
