
Model Details

  • Developed by: Tuan Pham (FPTU HCM student)
  • Model type: Llama 2 7B, decoder-only
  • Finetuned from models:
    • meta-llama/Llama-2-7b
    • bkai-foundation-models/vietnamese-llama2-7b-120GB
    • yeen214/llama2_7b_merge_orcafamily
  • Bilingual support: English and Vietnamese

Model Description

This model is proof that a single person can fine-tune their own model to reach SOTA performance.

Uses

Prompt template

[SYSTEM_PROMPT]

 ####### Instruction:
[INPUT]

 %%%%%%% Response:
[RESPONSE]

Keeping the system prompt in English is recommended.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline


def prompt_format(system_prompt, instruction):
    prompt = f"""{system_prompt}

 ####### Instruction:
{instruction}

 %%%%%%% Response:
"""
    return prompt

system_prompt = """
You're an AI Large Language Model developed(created) by an AI developer named Tuấn, the architecture of you is decoder-based LM, your task are to think loudly step by step before give a good and relevant response
to the user request, answer in the language the user preferred.

The AI has been trained to answer questions, provide recommendations, and help with decision making. The AI thinks outside the box and follows the user requests
"""
instruction = "Xin chào"

formatted_prompt = prompt_format(system_prompt, instruction)
print(formatted_prompt)

model_name = "1TuanPham/T-Llama"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16,
                                             use_cache=True,
                                             device_map="auto"
                                             )
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
  # Llama models have no dedicated pad token; fall back to EOS.
  output_default = pipe(formatted_prompt, pad_token_id=tokenizer.eos_token_id, max_new_tokens=128)

Example outputs:

> Xin chào ("Hello")

Hello! I am an AI developed by an AI developer named Tuấn. I was designed to help users by answering questions, making recommendations, and supporting decision making.
I can assist you by coming up with good, relevant answers to your questions.

> Việt Nam có cà phê gì nổi tiếng thế giới? ("Which Vietnamese coffee is world-famous?")

Vietnam's robusta coffee is famous around the globe. It is grown mainly in highland regions such as Lâm Đồng, Gia Lai, and Kon Tum, and has a bolder, more bitter flavor than Arabica.
Robusta is commonly used for espresso and traditional phin coffee.

> Tôi có 500$ hãy build cho tôi một bộ PC ("I have $500, build me a PC")

To build a PC with 500 USD, you can start by choosing the main components such as the CPU, GPU, memory, and storage. You can then add extra components such as speakers, a keyboard, and a mouse to complete the system.

Here are some suggestions for putting together a PC with 500 USD:

1. CPU: An Intel Core i5-12400 or AMD Ryzen 5 5600X is a good choice to start with.

2. GPU: An Nvidia GeForce GTX 1650 or AMD Radeon RX 570 is a good choice to start with.

3. Memory: 8GB of RAM is a good start.

4. Storage: A 500 GB SSD is a good start.

5. Keyboard and mouse: A keyboard and mouse can be bought separately or as a set.

6. Speakers: Speakers can be bought separately or as a bundle with the keyboard and mouse.

> Cho tôi 3 thức uống để giúp ngủ ngon hơn ("Give me 3 drinks to help me sleep better")

Sure! Here are three drinks that may help you sleep better:

1. Honey and lemon: Mix a spoonful of honey with a spoonful of lemon juice and drink it before going to bed. Honey has a soothing effect, while lemon contains antioxidants that may improve sleep quality.

2. Herbal tea: Some herbal teas, such as chamomile, lavender, and lotus, have a relaxing effect and help you drift off to sleep. Try brewing a warm pot of herbal tea and drinking it before bed.

3. Warm milk: Warm milk contains tryptophan, a hormone that helps you feel comfortable and relaxed. It can also help boost production of melatonin, the hormone that helps you fall asleep. Try drinking a cup of warm milk before bed to sleep better.

Coding tasks were thoroughly cleaned out of the translated data (the translation pipeline would otherwise translate code definitions and variable names), so in theory the model should still be able to code normally by drawing on the coding capabilities learned from the English coding examples.

I would not recommend running any code generated from this model.

> Viết cho tôi một đoạn code bằng python xây dựng một ứng dụng bằng Flask hiển thị 'Xin chào thế giới' trên màn hình ("Write me Python code for a Flask application that displays 'Xin chào thế giới' on the screen")

To write a Python snippet that builds a Flask application displaying 'Xin chào thế giới' on the screen, you can follow these steps:

1. Install Flask and the required libraries:

  ```
  pip install flask
  ```

2. Create a new Flask application:

  ```
  app = Flask(__name__)
  ```

3. Add a route that displays 'Xin chào thế giới':

  ```
  @app.route('/')
  def hello():
     return 'Xin chào thế giới!'
  ```

4. Initialize the Flask application:

  ```
  app.run()
  ```

5. Run the Flask application:

  ```
  app.run(debug=True)
  ```

6. Open a browser and navigate to the IP address or hostname of the machine or server where you started the Flask application.
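For reference, a minimal self-contained version of the app the model was asked for might look like the sketch below. This is an editorial example, not model output; note that the model's snippets above never import Flask.

```python
# Minimal Flask app that serves 'Xin chào thế giới!' ("Hello world!").
# Editorial sketch for reference -- not generated by the model.
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Xin chào thế giới!'

# To serve it locally: app.run(debug=True), then open http://localhost:5000
```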

These examples are sub-optimal as they were generated with greedy decoding to ensure reproducibility. Here's the recommended configuration:

  • top_k=50
  • top_p=0.5
  • temperature=0.5
  • penalty_alpha=0.7
  • repetition_penalty=1.15
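Assuming `pipe` and `formatted_prompt` from the quickstart above, these settings can be collected and passed straight to the pipeline call. (Enabling `do_sample` alongside `penalty_alpha` is my assumption; the card lists only the values above.)

```python
# Recommended decoding settings from the list above, gathered as
# keyword arguments for a transformers pipeline / generate() call.
recommended = {
    "do_sample": True,          # assumption: sampling enabled for top_p/temperature
    "top_k": 50,
    "top_p": 0.5,
    "temperature": 0.5,
    "penalty_alpha": 0.7,
    "repetition_penalty": 1.15,
    "max_new_tokens": 128,
}
# output = pipe(formatted_prompt, **recommended)
```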

Here is an example using the above generation config:

> Tôi buồn ngủ quá, cơ mà thứ 7 phải thi rồi, nay là rạng sáng thứ 6. Tôi có nên đi ngủ không trong khi tôi mới xong chưa được 1/3 bài để ôn? ("I'm so sleepy, but my exam is on Saturday and it's already early Friday morning. Should I go to sleep when I've covered less than 1/3 of the material?")

If you feel tired and need to rest, going to bed early is very important for your health. However, if you still want to finish part of the material to prepare for the upcoming exam,
weigh the time you have left and prioritize the most important tasks before going to sleep. If you feel you have done enough, spend a little time relaxing or doing other
activities to help reduce stress and boost your alertness the next morning. Ultimately, the final decision depends on how serious the exam is and on your time-management
skills.

# temperature=0.7
> Em ăn cơm chưa? ("Have you eaten yet?")

No, I don't eat rice. I am a machine language model developed by an AI developer named Tuấn, trained to answer questions and provide helpful feedback to users.
My goal is to support users by thinking things through before giving the most relevant and accurate responses possible.

> Viết cho tôi một ứng dụng flask in ra màn hình 'Xin chào Việt Nam!' ("Write me a Flask application that prints 'Xin chào Việt Nam!' to the screen")

To write a Flask application that displays "Xin chào Việt Nam!" on the screen, we can follow these steps:

1. Install Flask and the required libraries such as BeautifulSoup4 and jsonify. You can use pip to install them:

```
pip install Flask
pip install beautifulsoup4
pip install jsonify
```

2. Create a new file named `app.py` in your directory. Here is the basic sample code for the application:

```python
from bs4 import BeautifulSoup
from jsonify import jsonify
import requests

app = Flask(__name__)

@app.route('/')
def home():
   html_string = """<h1>Chào mừng đến với Việt Nam!</h1>
                         <p>Đây là một trang web dành cho những ai muốn tìm hiểu về đất nước Việt Nam xinh đẹp.</p>"""
   soup = BeautifulSoup(html_string, 'html.parser')
   print(soup)
   return soup

if __name__ == '__main__':
   app.run()
```

3. Start the application by running `app.py`:

```
python app.py
```

4. Open a web browser and navigate to the application's IP address or hostname (e.g. http://localhost:5000). The home page will display the message "Xin chào Việt Nam!".

Note that this is just a simple example; you can customize the application to meet your specific needs.

> Liệt kê cho tôi 5 thứ cần thiết để đem đi cắm trại ("List 5 essentials to bring camping")

Sure! Here are five essential items for a camping trip:
1. Tent - A sturdy, comfortable tent provides shelter from the weather and insects.
2. Sleeping bag - A warm, comfortable sleeping bag helps you sleep well outdoors.
3. Portable stove - A portable stove, such as a camping kettle or alcohol stove, lets you cook and keep your food warm.
4. Flashlight - A powerful flashlight lights up your campsite at night.
5. Map and compass - A map and compass help you navigate accurately to your campsite and find your way back if needed.

> Viết cho tôi một bài thơ về thành phố Đà Lạt ("Write me a poem about the city of Đà Lạt")

City of mist, where nature rests in peace,
Where flowers show their colors and green trees thrive,
An oasis in the heart of the land, a fairyland,
Where every worry dissolves into the air.

Little roads weave through gently rolling hills,
And ancient houses with warm brown walls,
Where friendly locals greet you with a smile,
And birdsong chirps like nature's own hello.

Streams flow softly, waterfalls tumble down,
And sparkling lakes, spectacular sights,
Where stillness reigns, where peace reigns,
Where the soul is healed and feelings are set free.

Here, time seems to stand still,
Where memories remain intact,
Where lasting bonds are nurtured,
Where love and peace come first.

The world outside seems far away,
But here, you need only take a step,
Where the sky is vast, the clouds tower high,
And brilliant golden sunbeams warm our hearts.

Oh, city of mist, where nature rests in peace,
Where dreams come true, where hearts brim with joy,
Where love and peace reign, where love and peace reign.

Note: 120GB of Vietnamese pre-training data may not be enough for general questions about Vietnamese events.

Training Details

Hardware Type:

  • GPU: NVIDIA Tesla P100 16GB
  • System RAM: 32GB

Hours used: ~47.5 days (approx.)

Training Data

  • BactrianX
  • OpenOrca_translated
  • WizardLM_70k_translated
  • TigerLabMathInstruct_translated_vi
  • GradeSchoolMathInstruct_translated
  • vilm_lima-vi
  • MTEngVietnamese
  • databricks_dolly15k_translated
  • AlpacaCleaned_translated
  • databricks_dolly15k
  • OpenOrca
  • GradeSchoolMathInstruct
  • AlpacaCleaned
  • WebglmQA

Training Procedure

  • Learning rate: 2e-5, cosine schedule
  • Optimizer: PagedLion8bit
  • QLoRA: rank 64, 4-bit quantization
  • Batch size: 2
  • Gradient accumulation: 128
  • Effective batch size: 256
  • 250k examples, 70% Vietnamese / 30% English, for 3.37 epochs
  • 350k examples, 60% Vietnamese / 40% English, for 1.4 epochs
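As a quick sanity check, the effective batch size and the approximate optimizer-step counts per phase follow directly from the numbers above:

```python
# Effective batch = per-device batch size x gradient-accumulation steps.
per_device_batch = 2
grad_accum_steps = 128
effective_batch = per_device_batch * grad_accum_steps  # 2 * 128 = 256

# Approximate optimizer steps per training phase (examples * epochs / batch).
steps_phase1 = round(250_000 * 3.37 / effective_batch)  # ~3.3k steps
steps_phase2 = round(350_000 * 1.4 / effective_batch)   # ~1.9k steps
```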

Training loss

Last 10 runs: *(plot not shown; each segment represents 12 hours of training)*

All run plots: *(plot not shown)*

Not the clearest way to visualize it, admittedly.

Evaluation


The model currently sits in the top 5 on the VMLU benchmark.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|--------|-------|
| Avg. | 54.34 |
| AI2 Reasoning Challenge (25-Shot) | 54.18 |
| HellaSwag (10-Shot) | 76.48 |
| MMLU (5-Shot) | 47.98 |
| TruthfulQA (0-shot) | 46.47 |
| Winogrande (5-shot) | 71.27 |
| GSM8k (5-shot) | 29.64 |

Citation

@online{t-llama,
  author = {Pham Minh Tuan},
  title = {T-Llama: A New Language Model for Vietnamese}, % Placeholder
  year = 2024,
  url = {https://github.com/vTuanpham/Vietnamese_QA_System}
}
Model size: 6.85B params (safetensors, BF16)