|
--- |
|
license: mit |
|
language: |
|
- ko |
|
- en |
|
base_model: MLP-KTLim/llama-3-Korean-Bllossom-8B |
|
--- |
|
# Model Card for llama-3-Korean-Bllossom-8B (AWS Neuron)
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
|
This model is an AWS Neuron compiled version (neuron-cc 2.14) of the Korean fine-tuned model [MLP-KTLim/llama-3-Korean-Bllossom-8B](https://huggingface.co/MLP-KTLim/llama-3-Korean-Bllossom-8B). It is intended for deployment on Amazon EC2 Inferentia2 instances and Amazon SageMaker. For detailed information about the model and its license, please refer to the original MLP-KTLim/llama-3-Korean-Bllossom-8B model page.
|
|
|
## Model Details |
|
|
|
This model is compiled with neuronx-cc version 2.14.
|
It can be deployed with the [v1.0-hf-tgi-0.0.24-pt-2.1.2-inf-neuronx-py310](https://github.com/aws/deep-learning-containers/releases?q=tgi+AND+neuronx&expanded=true) container on a SageMaker endpoint; this inference Docker image is intended for use on SageMaker only.
|
|
|
This model can be deployed to an Amazon SageMaker endpoint by following this guide: [Deploying a model stored in S3 to SageMaker INF2](https://github.com/aws-samples/aws-ai-ml-workshop-kr/blob/master/neuron/hf-optimum/04-Deploy-Llama3-8B-HF-TGI-Docker-On-INF2/notebook/03-deploy-llama-3-neuron-moel-inferentia2-from-S3.ipynb)
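For orientation, the TGI Neuronx container used in the guide above is configured mostly through environment variables. The sketch below builds such a configuration as a plain dict; the variable names follow the Hugging Face TGI Neuronx container conventions, but the specific values (core count, sequence lengths, batch size) are illustrative assumptions and must be matched to how your Neuron artifacts were compiled:

```python
# Sketch: environment variables commonly passed to the HF TGI Neuronx
# container when deploying a pre-compiled Neuron model on SageMaker.
# The default values below are assumptions, not verified settings --
# they must agree with the shapes used at neuron-compilation time.
def tgi_neuronx_env(model_id: str, num_cores: int = 2,
                    max_input_len: int = 2048, max_total_len: int = 4096):
    return {
        "HF_MODEL_ID": model_id,
        "HF_NUM_CORES": str(num_cores),        # NeuronCores to shard across
        "HF_AUTO_CAST_TYPE": "fp16",
        "MAX_BATCH_SIZE": "4",
        "MAX_INPUT_LENGTH": str(max_input_len),
        "MAX_TOTAL_TOKENS": str(max_total_len),
    }

env = tgi_neuronx_env("MLP-KTLim/llama-3-Korean-Bllossom-8B")
```

A dict like `env` would then be passed as the `environment` argument when creating the SageMaker model object, as shown in the linked notebook.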
|
|
|
For details on Neuron compilation and deployment, you can refer to [Serving on Amazon EC2 Inferentia2 with a Docker image from Amazon ECR](https://github.com/aws-samples/aws-ai-ml-workshop-kr/blob/master/neuron/hf-optimum/04-Deploy-Llama3-8B-HF-TGI-Docker-On-INF2/README-NeuronCC-2-14.md)
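Once the TGI container is running (on EC2 or SageMaker), it serves the standard TGI `/generate` REST API. The snippet below only constructs a request payload in that format; the endpoint hostname and generation parameters are hypothetical placeholders:

```python
import json

# Sketch: build a JSON payload for TGI's /generate endpoint.
# The prompt and the parameter values are illustrative only.
def build_generate_payload(prompt: str, max_new_tokens: int = 256) -> str:
    return json.dumps({
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": 0.7,
        },
    }, ensure_ascii=False)

# POST this payload to http://<your-inf2-host>:8080/generate
payload = build_generate_payload("서울의 유명한 관광지를 알려주세요.")
```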
|
|
|
|
|
## Hardware |
|
|
|
As minimum hardware, you can use an Amazon EC2 inf2.xlarge instance; more powerful instances in the family include inf2.8xlarge, inf2.24xlarge, and inf2.48xlarge.
|
Detailed information is available at [Amazon EC2 Inf2 Instances](https://aws.amazon.com/ec2/instance-types/inf2/).
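When choosing an instance size, the available NeuronCore count matters because it bounds the tensor-parallel degree the model can be sharded across. The mapping below reflects the AWS Inf2 instance documentation (each Inferentia2 chip exposes 2 NeuronCores); verify against the linked page before relying on it:

```python
# NeuronCores available per Inf2 instance size (2 cores per
# Inferentia2 chip), per the AWS Inf2 instance documentation.
INF2_NEURON_CORES = {
    "inf2.xlarge": 2,
    "inf2.8xlarge": 2,
    "inf2.24xlarge": 12,
    "inf2.48xlarge": 24,
}

def cores_for(instance_type: str) -> int:
    """Return the NeuronCore count for a given Inf2 instance type."""
    return INF2_NEURON_CORES[instance_type]
```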
|
|
|
|
|
## Model Card Contact |
|
|
|
Gonsoo Moon, [email protected] |