|
--- |
|
license: llama3 |
|
language: |
|
- ja |
|
- en |
|
tags: |
|
- moe |
|
- japanese |
|
- sql |
|
--- |
|
### モデルの説明(English explanation is below.) |
|
このモデルは、MergeKitツールを使用して作成されたMixture of Experts (MoE) 言語モデルをGGUF形式で量子化したものです。 |
|
|
|
量子化していないものは [こちら](https://huggingface.co/keitokei1994/swallow-3-8B-sqlcoder-2x8B) |
|
|
|
デモは [こちら](https://huggingface.co/spaces/keitokei1994/SQL_LLM) |
|
|
|
|
|
### モデルの詳細 |
|
- **モデル名**: swallow-3-8B-sqlcoder-2x8B-GGUF |
|
- **モデルアーキテクチャ**: Mixture of Experts (MoE) |
|
- **ベースモデル**: |
|
- [aixsatoshi/Llama3-Swallow-8B-instruct-vector-merged](https://huggingface.co/aixsatoshi/Llama3-Swallow-8B-instruct-vector-merged) |
|
- [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b) |
|
- **マージツール**: MergeKit |
|
|
|
このMoEモデルは、Llama3-Swallow-8B-instruct-vector-mergedの日本語能力とLlama-3-sqlcoder-8bのSQL生成能力を組み合わせることで、より強力で多機能な言語モデルを目指しています。 |
|
#### 特徴 |
|
- 日本語と英語の両方に対応 |
|
- Llama3-Swallow-8B-instruct-vector-mergedによる優れた日本語処理能力 |
|
- Llama-3-sqlcoder-8bによる高度なSQL生成と処理能力 |
|
#### 要求スペック |
|
Q4_K_M量子化モデルであれば、RTX 3060 12GBでフルロード可能です。
|
筆者はWSL2やGoogle Colaboratory Proでの作成後、llama.cppとLM Studioにて動作確認を行っています。
|
|
|
--- |
|
### Model Description |
|
This model is a GGUF-format quantization of a Mixture of Experts (MoE) language model created using the MergeKit tool.
|
The unquantized version can be found [here](https://huggingface.co/keitokei1994/swallow-3-8B-sqlcoder-2x8B).

A demo is available [here](https://huggingface.co/spaces/keitokei1994/SQL_LLM).
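For programmatic use, the GGUF file can be fetched directly from the Hugging Face Hub. The snippet below is a minimal sketch: the repository id is assumed to match the model name (`keitokei1994/swallow-3-8B-sqlcoder-2x8B-GGUF`), and the filename follows the usual GGUF naming convention; check the repository's file list for the exact name.

```python
from huggingface_hub import hf_hub_download

# Download one quantized variant of the model.
# NOTE: repo_id and filename are assumptions based on the model name;
# verify the actual filename in the repository's "Files" tab.
model_path = hf_hub_download(
    repo_id="keitokei1994/swallow-3-8B-sqlcoder-2x8B-GGUF",
    filename="swallow-3-8B-sqlcoder-2x8B-Q4_K_M.gguf",
)
print(model_path)  # local path to the downloaded GGUF file
```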
|
### Model Details |
|
- **Model Name**: swallow-3-8B-sqlcoder-2x8B-GGUF |
|
- **Model Architecture**: Mixture of Experts (MoE) |
|
- **Base Models**: |
|
- [aixsatoshi/Llama3-Swallow-8B-instruct-vector-merged](https://huggingface.co/aixsatoshi/Llama3-Swallow-8B-instruct-vector-merged) |
|
- [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b) |
|
- **Merge Tool**: MergeKit |
|
|
|
This MoE model aims to create a more powerful and versatile language model by combining the Japanese language capabilities of Llama3-Swallow-8B-instruct-vector-merged with the SQL generation abilities of Llama-3-sqlcoder-8b. |
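The exact merge recipe is not reproduced in this card, but a MergeKit MoE merge of this kind is typically driven by a small YAML configuration that lists the two experts and the prompts used to steer routing. The sketch below is hypothetical: the `positive_prompts`, `gate_mode`, and output path are illustrative placeholders, not the author's actual settings.

```python
import subprocess
from pathlib import Path

# Hypothetical mergekit-moe configuration combining the two base models.
# The routing prompts and gate_mode below are illustrative guesses,
# not the configuration actually used for this model.
config = """\
base_model: aixsatoshi/Llama3-Swallow-8B-instruct-vector-merged
gate_mode: hidden          # route tokens by hidden-state similarity
dtype: bfloat16
experts:
  - source_model: aixsatoshi/Llama3-Swallow-8B-instruct-vector-merged
    positive_prompts:
      - "日本語で質問に答えてください。"   # general Japanese instructions
  - source_model: defog/llama-3-sqlcoder-8b
    positive_prompts:
      - "Write a SQL query for the following request."
"""

Path("moe_config.yaml").write_text(config, encoding="utf-8")

# mergekit-moe <config> <output-dir> builds the merged 2x8B model.
subprocess.run(
    ["mergekit-moe", "moe_config.yaml", "swallow-3-8B-sqlcoder-2x8B"],
    check=True,
)
```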
|
#### Features |
|
- Support for both Japanese and English languages |
|
- Excellent Japanese processing capabilities from Llama3-Swallow-8B-instruct-vector-merged |
|
- Advanced SQL generation and processing capabilities from Llama-3-sqlcoder-8b |
|
#### System Requirements |
|
The Q4_K_M quantized model can be fully loaded on an RTX 3060 12GB.
|
The author built the model on WSL2 and Google Colaboratory Pro and verified it with llama.cpp and LM Studio.
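As a rough illustration of local use with llama.cpp's Python bindings (`llama-cpp-python`), the sketch below loads the Q4_K_M file with all layers offloaded to the GPU and asks for a SQL query in Japanese. The filename, schema, and prompt are examples only; adjust `n_ctx` and `n_gpu_layers` to your hardware.

```python
from llama_cpp import Llama

# Load the Q4_K_M quantization; n_gpu_layers=-1 offloads every layer,
# which fits in 12 GB of VRAM (e.g. an RTX 3060) as noted above.
# The filename is an assumption; use the GGUF file you actually downloaded.
llm = Llama(
    model_path="swallow-3-8B-sqlcoder-2x8B-Q4_K_M.gguf",
    n_gpu_layers=-1,
    n_ctx=4096,
)

# Example: ask for a SQL query in Japanese against a hypothetical schema.
messages = [
    {"role": "system", "content": "あなたはSQLに詳しいアシスタントです。"},
    {"role": "user", "content": (
        "テーブル users(id, name, created_at) から、"
        "2024年に登録したユーザー数を数えるSQLを書いてください。"
    )},
]
result = llm.create_chat_completion(messages=messages, max_tokens=256, temperature=0.1)
print(result["choices"][0]["message"]["content"])
```

The same GGUF file can also be loaded directly in LM Studio or run with the llama.cpp CLI without any Python code.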