Introduction

We introduce Llama-3-Motif, a new language model family from Moreh, specialized in Korean and English.
Llama-3-Motif-102B-Instruct is a chat model fine-tuned from the Llama-3-Motif-102B base model.

Training Platform

Llama-3-Motif-102B was trained on the MoAI platform. Refer to the link for more information.

Quick Usage

The base model is not served directly. Instead, you can chat with Llama-3-Motif-102B-Instruct through our Model hub.

Details

More details will be provided in the upcoming technical report.

Release Date

2024.12.02

Benchmark Results

Provider	Model	kmmlu_direct score	Note
Moreh	Llama-3-Motif-102B	64.74	+
Meta	Llama3-70B-instruct	54.5*
Meta	Llama3.1-70B-instruct	52.1*
Meta	Llama3.1-405B-instruct	65.8*
Alibaba	Qwen2-72B-instruct	64.1*
OpenAI	GPT-4-0125-preview	59.95*
OpenAI	GPT-4o-2024-05-13	64.11**
Google	gemini pro	50.18*
LG	exaone 3.0	44.5*	+
Naver	HyperCLOVA X	53.4*	+
Upstage	SOLAR-10.7B	41.65*	+

* : Community report
** : Measured by Moreh
+ : Claimed to have better capability in Korean
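
For a quick read of the table, the scores can be ranked programmatically. The sketch below only copies the figures listed above and does not re-run any benchmark; the names `SCORES`, `rank`, and `gap_to` are illustrative. Because the numbers come from mixed sources (community reports and Moreh's own measurement), the ranking is indicative rather than a controlled comparison.

```python
# kmmlu_direct scores copied verbatim from the table above.
SCORES = {
    "Llama-3-Motif-102B": 64.74,
    "Llama3-70B-instruct": 54.5,
    "Llama3.1-70B-instruct": 52.1,
    "Llama3.1-405B-instruct": 65.8,
    "Qwen2-72B-instruct": 64.1,
    "GPT-4-0125-preview": 59.95,
    "GPT-4o-2024-05-13": 64.11,
    "gemini pro": 50.18,
    "exaone 3.0": 44.5,
    "HyperCLOVA X": 53.4,
    "SOLAR-10.7B": 41.65,
}


def rank(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def gap_to(scores, model, baseline="Llama-3-Motif-102B"):
    """Return the baseline score minus the given model's score, in points."""
    return round(scores[baseline] - scores[model], 2)


if __name__ == "__main__":
    for position, (model, score) in enumerate(rank(SCORES), start=1):
        print(f"{position:2d}. {model:<24} {score:6.2f}")
```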

How to use

We do not recommend using the base model directly!

Use with vLLM

Refer to this link to install vLLM.

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Set tensor_parallel_size to the number of GPUs you have available
model = LLM("moreh/Llama-3-Motif-102B", tensor_parallel_size=4)
tokenizer = AutoTokenizer.from_pretrained("moreh/Llama-3-Motif-102B")
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

messages_batch = [tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)]

# vLLM does not support the Hugging Face generation_config, so the sampling parameters are set explicitly below
sampling_params = SamplingParams(max_tokens=512, temperature=0, repetition_penalty=1.0, stop_token_ids=[tokenizer.eos_token_id])
responses = model.generate(messages_batch, sampling_params=sampling_params)

print(responses[0].outputs[0].text)
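
The `tensor_parallel_size=4` setting can be sanity-checked with a back-of-the-envelope estimate of weight memory. The sketch below is not from Moreh. It assumes roughly 102 billion parameters (inferred from the model name), 16-bit weights, 80 GB GPUs, and about 90% of each GPU's memory being usable. It counts weights only, so the KV cache and activations need additional headroom. The helper names are hypothetical, and rounding up to a power of two is a common heuristic, since the tensor-parallel degree generally has to divide the number of attention heads evenly.

```python
import math


def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params, bytes_per_param=2,
                         gpu_mem_gb=80.0, usable_fraction=0.9):
    """Smallest GPU count whose usable memory holds the weights (weights only)."""
    usable_per_gpu = gpu_mem_gb * usable_fraction
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / usable_per_gpu)


def round_up_to_power_of_two(n):
    """Round a positive integer up to the nearest power of two."""
    return 1 << (n - 1).bit_length()


# Assumed parameter count, inferred from the "102B" in the model name.
ASSUMED_PARAMS = 102_000_000_000

if __name__ == "__main__":
    needed = min_gpus_for_weights(ASSUMED_PARAMS)
    print(f"weights: {weight_memory_gb(ASSUMED_PARAMS):.0f} GB")
    print(f"minimum GPUs for weights: {needed}")
    print(f"suggested tensor_parallel_size: {round_up_to_power_of_two(needed)}")
```

Under these assumptions the weights alone take about 204 GB, which needs at least three 80 GB GPUs and rounds up to four. Smaller GPUs or longer contexts would push the number higher.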

Use with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "moreh/Llama-3-Motif-102B"

# All generation settings are read from generation_config.json
model = AutoModelForCausalLM.from_pretrained(model_id).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

messages_batch = tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)
input_ids = tokenizer(messages_batch, padding=True, return_tensors='pt')['input_ids'].cuda()

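outputs = model.generate(input_ids)

# Decode the generated token IDs back to text and print the result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))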
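
Both examples rely on `tokenizer.apply_chat_template` to turn the `messages` list into a single prompt string. As a rough illustration, the pure-Python sketch below renders that string under the assumption that this model keeps the stock Llama 3 chat template. The actual template ships with the tokenizer and may differ, so treat `render_llama3_prompt` (a hypothetical helper, not part of any library) as explanatory only and keep calling `apply_chat_template` in real code.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Render messages in the stock Llama 3 chat format (assumed, not verified
    against this model's tokenizer)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, and an end-of-turn marker.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

if __name__ == "__main__":
    print(render_llama3_prompt(messages))
```

With `add_generation_prompt=True` the string ends in an open assistant header, which is why the model continues with a reply rather than another user turn.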
