AP-MAE-SC2-7B

This model is currently anonymized for the paper review process.

The AP-MAE transformer model design and configuration are available in the reproduction package attached to the submission.

This version of AP-MAE is trained on the attention head outputs produced by StarCoder2-7B during inference. The inference task used to generate these attention outputs is fill-in-the-middle (FiM) token prediction for a randomly masked section of length 3-10 in Java code, with exactly 256 tokens of surrounding context.
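
For illustration, below is a minimal sketch of how such per-head attention outputs can be collected with the Hugging Face transformers library. The checkpoint name, the FIM special tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`), and the toy Java snippet are assumptions for the example; this is not the paper's actual data-generation pipeline from the reproduction package.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed StarCoder2-7B checkpoint; eager attention so attention maps are returned.
checkpoint = "bigcode/starcoder2-7b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    output_attentions=True,
    attn_implementation="eager",
)

# Hypothetical Java snippet with a masked middle section; in the actual setup the
# masked section is 3-10 long and is surrounded by exactly 256 context tokens.
prefix = "public int add(int a, int b) { return "
suffix = "; }"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len); these per-head attention maps are the
# kind of attention outputs AP-MAE is trained on.
attention_maps = outputs.attentions
```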

Usage:

```python
from ap_mae import APMAE

model = APMAE.from_pretrained("LaughingLogits/AP-MAE-SC2-7B")
```
Model size: 127M params · Tensor type: F32 · Format: Safetensors
