SentenceTransformer based on Mihaiii/gte-micro-v4

This is a sentence-transformers model finetuned from Mihaiii/gte-micro-v4 on the mtg_cards-2025-04-04 dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Mihaiii/gte-micro-v4
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: mtg_cards-2025-04-04
  • Language: en
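
The similarity function listed above is cosine similarity. As a minimal sketch in plain NumPy, independent of the Sentence Transformers library, it could be computed as follows. The function name `cosine_similarity_matrix` and the toy 3-dimensional vectors are illustrative assumptions; real embeddings from this model have 384 dimensions.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a (n, d) and b (m, d)."""
    # Scale every row to unit length, then take dot products.
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy vectors standing in for embeddings (hypothetical values).
vectors = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 0.0, 0.0]])
scores = cosine_similarity_matrix(vectors, vectors)
print(scores.shape)
```

Because each row is normalized first, the result depends only on the direction of the vectors, not on their length.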

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
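
The Pooling module above has only `pooling_mode_mean_tokens` enabled, which means the sentence embedding is the average of the token embeddings. A rough NumPy sketch of that idea follows; it assumes padded positions are excluded through an attention mask, and `mean_pool` is a hypothetical helper rather than the library's own implementation.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for sequences with no real tokens.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# One sequence with two real tokens and one padding token (toy values).
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
pooled = mean_pool(tokens, mask)
print(pooled.shape)
```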

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("philipp-zettl/gte-micro-v4-mtg")
# Run inference
sentences = [
    '141a031d-f899-497b-adf7-4af142078085_0367fac8-6990-4544-ac7d-ed363b55a9cf',
    "Title: Quirion Explorer\nCost: {1}{G}\nColors: ['G']\nType: Creature — Elf Druid Scout\nDesc: {T}: Add one mana of any color that a land an opponent controls could produce.",
    "Title: Savage Hunger\nCost: {2}{G}\nColors: ['G']\nType: Enchantment — Aura\nDesc: Enchant creature\nEnchanted creature gets +1/+0 and has trample.\nCycling {2} ({2}, Discard this card: Draw a card.)",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
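
The example inputs above follow a fixed text layout (Title, Cost, Colors, Type, Desc, one field per line). The sketch below shows one way to build such a string from card fields before encoding. `format_card` is a hypothetical helper, and skipping the Cost and Colors lines when they are empty is an assumption inferred from the land sample in the training data, not a documented rule.

```python
from typing import Optional, Sequence


def format_card(
    title: str,
    type_line: str,
    desc: str,
    cost: Optional[str] = None,
    colors: Optional[Sequence[str]] = None,
) -> str:
    """Build a card description in the layout used by the example inputs."""
    lines = [f"Title: {title}"]
    # Assumption: cards without a mana cost or colors omit those lines.
    if cost:
        lines.append(f"Cost: {cost}")
    if colors:
        lines.append(f"Colors: {list(colors)}")
    lines.append(f"Type: {type_line}")
    lines.append(f"Desc: {desc}")
    return "\n".join(lines)


card_text = format_card(
    title="Hindering Light",
    cost="{W}{U}",
    colors=["U", "W"],
    type_line="Instant",
    desc="Counter target spell that targets you or a permanent you control.\nDraw a card.",
)
print(card_text)
```

The resulting string can be passed to `model.encode` in place of the hand-written entries in `sentences`.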

Evaluation

Metrics

Semantic Similarity

Metric            sts-dev   sts-test
pearson_cosine    0.5888    0.586
spearman_cosine   0.6572    0.6549
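
The two metrics above compare predicted cosine similarities with the gold scores: Pearson measures linear correlation, while Spearman is the Pearson correlation of the ranks. A small NumPy sketch of both follows; the helper names and the `gold` and `predicted` arrays are made up for illustration and are not taken from the evaluation data.

```python
import numpy as np


def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Linear correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def average_ranks(values: np.ndarray) -> np.ndarray:
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for value in np.unique(values):
        tied = values == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold scores and predicted similarities.
gold = np.array([-1.0, -1.0, 0.25, 0.5, 0.75])
predicted = np.array([-0.2, -0.1, 0.1, 0.3, 0.9])
print(pearson(gold, predicted), spearman(gold, predicted))
```

Spearman rewards getting the ordering of pairs right even when the predicted values are not on the same scale as the gold scores.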

Training Details

Training Dataset

mtg_cards-2025-04-04

  • Dataset: mtg_cards-2025-04-04 at a35ccc4
  • Size: 2,839,738 training samples
  • Columns: uuid, sentence_1, sentence_2, image_1, image_2, and score
  • Approximate statistics based on the first 1000 samples:
    • uuid (string): min 49 tokens, mean 56.99 tokens, max 65 tokens
    • sentence_1 (string): min 17 tokens, mean 69.4 tokens, max 180 tokens
    • sentence_2 (string): min 15 tokens, mean 68.59 tokens, max 166 tokens
    • image_1 (string): min 53 tokens, mean 58.17 tokens, max 64 tokens
    • image_2 (string): min 52 tokens, mean 58.28 tokens, max 64 tokens
    • score (float): min -1.0, mean -0.43, max 0.5
  • Samples:
    Sample 1
      uuid: 08f9b863-10b7-46d6-badd-97381e6c7c5e_4330efa7-a11b-4776-9fb0-1cae8aed67b1
      sentence_1:
        Title: Blast Zone
        Type: Land
        Desc: This land enters with a charge counter on it.
        {T}: Add {C}.
        {X}{X}, {T}: Put X charge counters on this land.
        {3}, {T}, Sacrifice this land: Destroy each nonland permanent with mana value equal to the number of charge counters on this land.
      sentence_2:
        Title: Tom van de Logt Bio (2000)
        Type: Card
        Desc: Quarterfinalist Tom van de Logt posted a perfect 6—0 record during the Standard portion of this year's World Championships. The 19-year-old Groesbeek, Holland native was playing a deck that had a big impact on the metagame this year, "Replenish." This deck used cards like Attunement and Frantic Search to put powerful enchantments, such as Parallax Wave and Opalescence, into the graveyard and then used Replenish to put them all back into play at once.
      image_1: https://cards.scryfall.io/normal/front/0/8/08f9b863-10b7-46d6-badd-97381e6c7c5e.jpg?1674423042
      image_2: https://cards.scryfall.io/normal/front/4/3/4330efa7-a11b-4776-9fb0-1cae8aed67b1.jpg?1562767017
      score: 0.25
    Sample 2
      uuid: abe9cf1e-d398-41e0-8b11-afe1015e4fd9_40cb67f7-b4e1-423b-8f55-d44ed383e778
      sentence_1:
        Title: Coral Net
        Cost: {U}
        Colors: ['U']
        Type: Enchantment — Aura
        Desc: Enchant green or white creature
        Enchanted creature has "At the beginning of your upkeep, sacrifice this creature unless you discard a card."
      sentence_2:
        Title: Silumgar Butcher
        Cost: {4}{B}
        Colors: ['B']
        Type: Creature — Zombie Djinn
        Desc: Exploit (When this creature enters, you may sacrifice a creature.)
        When this creature exploits a creature, target creature gets -3/-3 until end of turn.
      image_1: https://cards.scryfall.io/normal/front/a/b/abe9cf1e-d398-41e0-8b11-afe1015e4fd9.jpg?1562631469
      image_2: https://cards.scryfall.io/normal/front/4/0/40cb67f7-b4e1-423b-8f55-d44ed383e778.jpg?1562785294
      score: -1.0
    Sample 3
      uuid: 3dd13408-b4db-42e7-bf3c-d46716538a7c_05a6dc90-3997-4911-8bd6-854c85eca35b
      sentence_1:
        Title: Rishadan Brigand
        Cost: {4}{U}
        Colors: ['U']
        Type: Creature — Human Pirate
        Desc: Flying
        When this creature enters, each opponent sacrifices a permanent of their choice unless they pay {3}.
        This creature can block only creatures with flying.
      sentence_2:
        Title: Banishing Stroke
        Cost: {5}{W}
        Colors: ['W']
        Type: Instant
        Desc: Put target artifact, creature, or enchantment on the bottom of its owner's library.
        Miracle {W} (You may cast this card for its miracle cost when you draw it if it's the first card you drew this turn.)
      image_1: https://cards.scryfall.io/normal/front/3/d/3dd13408-b4db-42e7-bf3c-d46716538a7c.jpg?1632145390
      image_2: https://cards.scryfall.io/normal/front/0/5/05a6dc90-3997-4911-8bd6-854c85eca35b.jpg?1723433851
      score: -1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
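
    CosineSimilarityLoss with an MSE loss function compares the cosine similarity of the two sentence embeddings with the gold score and penalizes the squared difference. The NumPy sketch below illustrates that computation on toy 2-dimensional embeddings; `cosine_similarity_mse` and all values are illustrative assumptions, not the library's implementation.

```python
import numpy as np


def cosine_similarity_mse(emb_a: np.ndarray, emb_b: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between row-wise cosine similarity and gold scores."""
    dot = np.sum(emb_a * emb_b, axis=1)
    norms = np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
    cosine = dot / norms
    return float(np.mean((cosine - labels) ** 2))


# Two toy sentence pairs: identical directions and opposite directions.
emb_a = np.array([[1.0, 0.0], [0.0, 1.0]])
emb_b = np.array([[1.0, 0.0], [0.0, -1.0]])
labels = np.array([0.25, -1.0])  # hypothetical gold scores
loss = cosine_similarity_mse(emb_a, emb_b, labels)
print(loss)
```

    Since the scores in this dataset range down to -1.0, the loss pushes dissimilar pairs toward opposite directions rather than merely orthogonal ones.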
    

Evaluation Dataset

mtg_cards-2025-04-04

  • Dataset: mtg_cards-2025-04-04 at a35ccc4
  • Size: 74,730 evaluation samples
  • Columns: uuid, sentence_1, sentence_2, image_1, image_2, and score
  • Approximate statistics based on the first 1000 samples:
    • uuid (string): min 50 tokens, mean 56.9 tokens, max 65 tokens
    • sentence_1 (string): min 14 tokens, mean 68.44 tokens, max 181 tokens
    • sentence_2 (string): min 15 tokens, mean 69.49 tokens, max 179 tokens
    • image_1 (string): min 52 tokens, mean 58.22 tokens, max 64 tokens
    • image_2 (string): min 52 tokens, mean 58.21 tokens, max 64 tokens
    • score (float): min -1.0, mean -0.44, max 0.75
  • Samples:
    Sample 1
      uuid: 6bdd8645-aee9-44cb-acaa-2674f55cdf2f_b34bb149-2e50-462e-8b83-5c8339bb3aff
      sentence_1:
        Title: Syr Cadian, Knight Owl
        Cost: {3}{W}{W}
        Colors: ['W']
        Type: Legendary Creature — Bird Knight
        Desc: Knightlifelink (Damage dealt by Knights you control also causes you to gain that much life.)
        {W}: Syr Cadian gains vigilance until end of turn. Activate only from sunrise to sunset.
        {B}: Syr Cadian gains flying until end of turn. Activate only from sunset to sunrise.
      sentence_2:
        Title: Non-Human Cannonball
        Cost: {2}{R}
        Colors: ['R']
        Type: Artifact Creature — Clown Robot
        Desc: When this creature dies, roll a six-sided die. If the result is 4 or less, this creature deals that much damage to you.
      image_1: https://cards.scryfall.io/normal/front/6/b/6bdd8645-aee9-44cb-acaa-2674f55cdf2f.jpg?1664317187
      image_2: https://cards.scryfall.io/normal/front/b/3/b34bb149-2e50-462e-8b83-5c8339bb3aff.jpg?1673917877
      score: 0.25
    Sample 2
      uuid: 860f4304-38f1-4c2f-a122-2590619522fd_08d6db9b-b2da-4148-aa49-8c2fecac6e32
      sentence_1:
        Title: Hindering Light
        Cost: {W}{U}
        Colors: ['U', 'W']
        Type: Instant
        Desc: Counter target spell that targets you or a permanent you control.
        Draw a card.
      sentence_2:
        Title: Gleam of Resistance
        Cost: {4}{W}
        Colors: ['W']
        Type: Instant
        Desc: Creatures you control get +1/+2 until end of turn. Untap those creatures.
        Basic landcycling {1}{W} ({1}{W}, Discard this card: Search your library for a basic land card, reveal it, put it into your hand, then shuffle.)
      image_1: https://cards.scryfall.io/normal/front/8/6/860f4304-38f1-4c2f-a122-2590619522fd.jpg?1712353583
      image_2: https://cards.scryfall.io/normal/front/0/8/08d6db9b-b2da-4148-aa49-8c2fecac6e32.jpg?1573505575
      score: 0.25
    Sample 3
      uuid: 91b448f4-aa0c-42c7-a771-e8dd20e0520c_46f810c2-310e-42f5-ab1f-d56396cf5124
      sentence_1:
        Title: Practiced Tactics
        Cost: {W}
        Colors: ['W']
        Type: Instant
        Desc: Choose target attacking or blocking creature. Practiced Tactics deals damage to that creature equal to twice the number of creatures in your party. (Your party consists of up to one each of Cleric, Rogue, Warrior, and Wizard.)
      sentence_2:
        Title: Anointer Priest
        Cost: {1}{W}
        Colors: ['W']
        Type: Creature — Human Cleric
        Desc: Whenever a creature token you control enters, you gain 1 life.
        Embalm {3}{W} ({3}{W}, Exile this card from your graveyard: Create a token that's a copy of it, except it's a white Zombie Human Cleric with no mana cost. Embalm only as a sorcery.)
      image_1: https://cards.scryfall.io/normal/front/9/1/91b448f4-aa0c-42c7-a771-e8dd20e0520c.jpg?1604192922
      image_2: https://cards.scryfall.io/normal/front/4/6/46f810c2-310e-42f5-ab1f-d56396cf5124.jpg?1599769231
      score: 0.25
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • log_level_replica: passive
  • log_on_each_node: False
  • logging_nan_inf_filter: False
  • push_to_hub: True
  • resume_from_checkpoint: ./models/gte-micro-v4-mtg/
  • hub_model_id: philipp-zettl/gte-micro-v4-mtg
  • hub_always_push: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: passive
  • log_on_each_node: False
  • logging_nan_inf_filter: False
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: ./models/gte-micro-v4-mtg/
  • hub_model_id: philipp-zettl/gte-micro-v4-mtg
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: True
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.3315 -
0.0113 500 1.4254 - - -
0.0225 1000 0.3809 - - -
0.0338 1500 0.3494 - - -
0.0451 2000 0.3481 - - -
0.0563 2500 0.3466 - - -
0.0676 3000 0.3475 - - -
0.0789 3500 0.3467 - - -
0.0901 4000 0.3467 - - -
0.1014 4500 0.348 - - -
0.1127 5000 0.3469 0.3448 0.6769 -
0.1240 5500 0.3493 - - -
0.1352 6000 0.3463 - - -
0.1465 6500 0.3457 - - -
0.1578 7000 0.3449 - - -
0.1690 7500 0.3432 - - -
0.1803 8000 0.3424 - - -
0.1916 8500 0.3443 - - -
0.2028 9000 0.344 - - -
0.2141 9500 0.3466 - - -
0.2254 10000 0.3421 0.3449 0.6726 -
0.2366 10500 0.3422 - - -
0.2479 11000 0.3439 - - -
0.2592 11500 0.3454 - - -
0.2704 12000 0.3476 - - -
0.2817 12500 0.3461 - - -
0.2930 13000 0.3483 - - -
0.3043 13500 0.344 - - -
0.3155 14000 0.3496 - - -
0.3268 14500 0.3448 - - -
0.3381 15000 0.3462 0.3442 0.6632 -
0.3493 15500 0.3446 - - -
0.3606 16000 0.3443 - - -
0.3719 16500 0.3444 - - -
0.3831 17000 0.3452 - - -
0.3944 17500 0.3467 - - -
0.4057 18000 0.3439 - - -
0.4169 18500 0.3437 - - -
0.4282 19000 0.3426 - - -
0.4395 19500 0.3435 - - -
0.4507 20000 0.3453 0.3443 0.6550 -
0.4620 20500 0.3439 - - -
0.4733 21000 0.3434 - - -
0.4846 21500 0.3477 - - -
0.4958 22000 0.3471 - - -
0.5071 22500 0.3468 - - -
0.5184 23000 0.3453 - - -
0.5296 23500 0.3447 - - -
0.5409 24000 0.3441 - - -
0.5522 24500 0.3459 - - -
0.5634 25000 0.3431 0.3447 0.6558 -
0.5747 25500 0.3435 - - -
0.5860 26000 0.3464 - - -
0.5972 26500 0.3436 - - -
0.6085 27000 0.3446 - - -
0.6198 27500 0.3401 - - -
0.6310 28000 0.347 - - -
0.6423 28500 0.3412 - - -
0.6536 29000 0.3427 - - -
0.6648 29500 0.3423 - - -
0.6761 30000 0.3407 0.3418 0.6612 -
0.6874 30500 0.3404 - - -
0.6987 31000 0.3413 - - -
0.7099 31500 0.3434 - - -
0.7212 32000 0.3437 - - -
0.7325 32500 0.3442 - - -
0.7437 33000 0.3413 - - -
0.7550 33500 0.3441 - - -
0.7663 34000 0.3387 - - -
0.7775 34500 0.3416 - - -
0.7888 35000 0.3409 0.3392 0.6554 -
0.8001 35500 0.3414 - - -
0.8113 36000 0.338 - - -
0.8226 36500 0.3385 - - -
0.8339 37000 0.3391 - - -
0.8451 37500 0.3381 - - -
0.8564 38000 0.3372 - - -
0.8677 38500 0.3391 - - -
0.8790 39000 0.3404 - - -
0.8902 39500 0.3399 - - -
0.9015 40000 0.3413 0.3376 0.6572 -
0.9128 40500 0.3408 - - -
0.9240 41000 0.342 - - -
0.9353 41500 0.3389 - - -
0.9466 42000 0.3375 - - -
0.9578 42500 0.3378 - - -
0.9691 43000 0.3386 - - -
0.9804 43500 0.3377 - - -
0.9916 44000 0.3362 - - -
-1 -1 - - - 0.6549
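
The Epoch column in the log above can be related to the Step column with simple arithmetic. The sketch below assumes a single device, no gradient accumulation, and that every training sample lands in exactly one batch of 64; the `no_duplicates` batch sampler may shift the real count slightly, so treat the result as an approximation.

```python
import math

train_samples = 2_839_738  # training set size reported above
batch_size = 64            # per_device_train_batch_size

# Assumption: one device, so one optimizer step consumes one batch.
steps_per_epoch = math.ceil(train_samples / batch_size)


def epoch_fraction(step: int) -> float:
    """Fraction of the epoch completed at a given step, rounded like the log."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)
print(epoch_fraction(500), epoch_fraction(5000), epoch_fraction(44000))
```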

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 4.0.2
  • Transformers: 4.50.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size: 19.2M params (Safetensors, tensor type F32)