Adapter info

  • This is a LoRA adapter trained on a dataset of only 360 Vietnamese sentences, whose "text" column follows a format like the examples below (a small formatting sketch comes after them):

      > \<s\>\[INST\] "Bạn bè có phúc cùng chia."\[\/INST\] Bạn bè có phúc cùng chia. Có họa trốn sạch chạy đi phương nào? Tay trắng làm nên… mấy chục ngàn bạc nợ. \<\/s\>
    
      or
    
      > \<s\>\[INST\] Ai bảo chăn trâu là khổ. \[\/INST\] Ai bảo chăn trâu là khổ. Tôi chăn chồng còn khổ hơn trâu. Trâu đi trâu biêt đường về. Chồng đi không biết dường về như trâu. \<\/s\>
    
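  • For illustration only (not part of the original card), one way to build such a "text" value from a prompt/response pair could look like the following sketch; the helper name and wrapping logic are assumptions based on the examples above.

    # Hypothetical helper: wraps a prompt/response pair in the
    # <s>[INST] ... [/INST] ... </s> template shown above.
    def to_text_column(prompt: str, response: str) -> str:
        return f"<s>[INST] {prompt} [/INST] {response} </s>"

    print(to_text_column(
        "Ai bảo chăn trâu là khổ.",
        "Ai bảo chăn trâu là khổ. Tôi chăn chồng còn khổ hơn trâu.",
    ))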

Training procedure

  • The following bitsandbytes quantization config was used during training (an equivalent BitsAndBytesConfig is sketched after this list):
    • load_in_8bit: False
    • load_in_4bit: True
    • llm_int8_threshold: 6.0
    • llm_int8_skip_modules: None
    • llm_int8_enable_fp32_cpu_offload: False
    • llm_int8_has_fp16_weight: False
    • bnb_4bit_quant_type: nf4
    • bnb_4bit_use_double_quant: False
    • bnb_4bit_compute_dtype: float16
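
  • For reference, the settings above map onto a transformers BitsAndBytesConfig roughly as follows; this is a reconstruction from the list, not the original training script.

    import torch
    from transformers import BitsAndBytesConfig

    # Reconstructed from the quantization settings listed above
    bnb_config = BitsAndBytesConfig(
        load_in_8bit=False,
        load_in_4bit=True,
        llm_int8_threshold=6.0,
        llm_int8_skip_modules=None,
        llm_int8_enable_fp32_cpu_offload=False,
        llm_int8_has_fp16_weight=False,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=False,
        bnb_4bit_compute_dtype=torch.float16,
    )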

Usage

  • import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "NousResearch/llama-2-7b-chat-hf"      # base model
    adapters_name = "dtthanh/llama-2-7b-und-lora-2.7"   # this LoRA adapter
    
    print(f"Starting to load the model {model_name} into memory")
    
    # Load the base model in bfloat16 on GPU 0
    m = AutoModelForCausalLM.from_pretrained(
        model_name,
        # load_in_4bit=True,  # optionally quantize the base model to 4-bit
        torch_dtype=torch.bfloat16,
        device_map={"": 0},
    )
    
    # Apply the LoRA adapter and merge its weights into the base model
    m = PeftModel.from_pretrained(m, adapters_name)
    m = m.merge_and_unload()
    
    tok = AutoTokenizer.from_pretrained(model_name)
    tok.pad_token_id = 18610  # _***
    
    print(f"Successfully loaded the model {model_name} into memory")
    
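  • A minimal generation sketch (not from the original card), reusing m and tok from above and following the [INST] prompt format of the training data; the prompt and sampling parameters are illustrative assumptions.

    # Prompt in the same [INST] ... [/INST] format as the training data;
    # the tokenizer adds the leading <s> (BOS) token automatically.
    prompt = "[INST] Ai bảo chăn trâu là khổ. [/INST]"
    inputs = tok(prompt, return_tensors="pt").to(m.device)
    
    with torch.no_grad():
        out = m.generate(
            **inputs,
            max_new_tokens=64,            # illustrative settings, not tuned
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            pad_token_id=tok.pad_token_id,
        )
    print(tok.decode(out[0], skip_special_tokens=True))
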
  • PEFT 0.4.0
