---
license: apache-2.0
language:
- hu
base_model:
- state-spaces/mamba-130m-hf
pipeline_tag: text-generation
tags:
- Transformers
- mamba
---

# PULI-HuBA 130M

PULI-HuBA 130M is a monolingual Hungarian foundation model based on the Mamba architecture (https://huggingface.co/state-spaces/mamba-130m-hf).

Model architecture:

```
MambaForCausalLM(
  (backbone): MambaModel(
    (embeddings): Embedding(52000, 768)
    (layers): ModuleList(
      (0-23): 24 x MambaBlock(
        (norm): MambaRMSNorm(768, eps=1e-05)
        (mixer): MambaMixer(
          (conv1d): Conv1d(1536, 1536, kernel_size=(4,), stride=(1,), padding=(3,), groups=1536)
          (act): SiLU()
          (in_proj): Linear(in_features=768, out_features=3072, bias=False)
          (x_proj): Linear(in_features=1536, out_features=80, bias=False)
          (dt_proj): Linear(in_features=48, out_features=1536, bias=True)
          (out_proj): Linear(in_features=1536, out_features=768, bias=False)
        )
      )
    )
    (norm_f): MambaRMSNorm(768, eps=1e-05)
  )
  (lm_head): Linear(in_features=768, out_features=52000, bias=False)
)
```

## Training Data (Pretraining)

The model was pretrained on a Hungarian corpus of approximately 3.48 billion tokens that was filtered for toxic content, deduplicated, and semantically segmented.

## Training Details

- License: Apache 2.0
- Hardware: 4 × NVIDIA A100 (80 GB) GPUs
- Year of training: 2024
- Input/output: text only
- Parameter count: 130 million
- Available model size: single variant
- Data type: float32
- Batch size: 10 per GPU
- Learning rate: 3e-4
- Reference: GitHub issue

## Ethical Considerations

Concerns: the model may generate biased, factually incorrect, or otherwise harmful content.

## Usage Example

To generate text with this model using Hugging Face's `pipeline`, use the following Python code:

```python
from transformers import pipeline

# Load the model
model_name = "NYTK/PULI-HuBA130M"

# Initialize the text generation pipeline
generator = pipeline("text-generation", model=model_name)

# Generate text with the recommended parameters.
# The example prompt is in Hungarian: "The fact that my mother tongue is
# Hungarian, and that I speak, think, and write in Hungarian, is the greatest
# event of my life, to which nothing compares."
output = generator(
    "Az a tény, hogy anyanyelvem magyar, és magyarul beszélek, gondolkozom, írok, életem legnagyobb eseménye, melyhez nincs fogható.",
    max_length=156,
    do_sample=True,
    repetition_penalty=1.35,
    temperature=0.2,
    top_k=100,
    top_p=0.99,
    truncation=True,
)

# Print the generated text
print(output[0]["generated_text"])
```

## Contact

If you have any questions, please contact me: madarasz.gabor@nytud.hun-ren.hu or gabor.madarasz@gmail.com
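
## Loading the Model Directly

As an alternative to the `pipeline` API shown in the usage example, the model can also be loaded directly with `AutoTokenizer` and `AutoModelForCausalLM`. The snippet below is a minimal sketch, not an official recipe from the model authors: it reuses the model id and the recommended sampling parameters from the usage example above, and printing the model reproduces the architecture summary at the top of this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "NYTK/PULI-HuBA130M"

# Load the tokenizer and the float32 weights
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Printing the model reproduces the architecture summary shown above
print(model)

# Same Hungarian prompt as in the pipeline example
prompt = (
    "Az a tény, hogy anyanyelvem magyar, és magyarul beszélek, gondolkozom, "
    "írok, életem legnagyobb eseménye, melyhez nincs fogható."
)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate with the same recommended sampling parameters
output_ids = model.generate(
    **inputs,
    max_length=156,
    do_sample=True,
    repetition_penalty=1.35,
    temperature=0.2,
    top_k=100,
    top_p=0.99,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

With identical parameters, this should behave like the pipeline call above, while exposing the tokenizer and model objects for finer-grained control.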