hatakeyama-llm-team (Team Hatakeyama)

kanhatakeyama

updated a dataset 3 months ago

hatakeyama-llm-team/PMC

Viewer • Updated Oct 1 • 819k • 3.23k

kanhatakeyama

updated a dataset 5 months ago

hatakeyama-llm-team/BumpoRikai

Viewer • Updated Aug 6 • 30.4k • 44 • 2

kaisugi

posted an update 6 months ago

🚀 Llama-3-ELYZA-JP-8B

ELYZA, Inc. has developed two Japanese large language models (LLMs) based on Meta's "Llama 3" series: "Llama-3-ELYZA-JP-70B" with 70 billion parameters and "Llama-3-ELYZA-JP-8B" with 8 billion parameters. Both models received additional pre-training and post-training to significantly improve their Japanese language capabilities.

Key Points:

Performance:
- Llama-3-ELYZA-JP-70B is reported to surpass global models such as GPT-4, Claude 3 Sonnet, and Gemini 1.5 Flash on Japanese benchmarks.
- Llama-3-ELYZA-JP-8B is reported to match models like GPT-3.5 Turbo and Claude 3 Haiku despite having only 8 billion parameters.

Availability:
- The 8B model is available on Hugging Face Hub and can be used for both research and commercial purposes under the Llama 3 Community License.

Methodology:
- ELYZA enhanced the Japanese performance of the Llama 3 models through additional training with high-quality Japanese corpora and Instruction Tuning with proprietary datasets.

Benchmarks:
- Evaluations using ELYZA Tasks 100 and Japanese MT-Bench showed significant improvements in Japanese language generation.

Inference Speed:
- To address inference speed issues due to model size, ELYZA implemented Speculative Decoding, which achieved up to 1.6 times faster inference for the 70B model.
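
The post does not describe how ELYZA implemented Speculative Decoding, so the following is only a minimal sketch of the general idea in its greedy-verification form: a cheap draft model proposes a few tokens, and the large target model keeps the longest prefix it agrees with. The names `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical, and the two "models" are toy stand-in functions rather than real LLMs.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a context to its greedy next token


def greedy_decode(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Baseline: one call to the model per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding: the draft proposes up to k tokens, the target verifies them."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1) The cheap draft model proposes tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2) The target checks every proposed position. A real system does this
        #    in one batched forward pass, which is where the speedup comes from;
        #    here it is a plain loop for clarity.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # First mismatch: keep the agreed prefix plus the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            out.extend(proposal)  # every proposed token was accepted
    return out


# Toy stand-ins (assumptions for illustration only).
def toy_target(ctx: List[Token]) -> Token:
    return (sum(ctx) * 31 + len(ctx)) % 50


def toy_draft(ctx: List[Token]) -> Token:
    guess = toy_target(ctx)
    # Disagree with the target whenever the context length is a multiple of 3.
    return guess if len(ctx) % 3 else (guess + 1) % 50


fast = speculative_decode(toy_target, toy_draft, [1, 2, 3], max_new=20)
slow = greedy_decode(toy_target, [1, 2, 3], max_new=20)
print(fast == slow)
```

With greedy verification the output is identical to decoding with the target model alone; the draft model only changes how many target passes are needed. The actual speedup depends on how often the draft agrees with the target.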

Overall, ELYZA's models demonstrate state-of-the-art performance in Japanese language tasks and are optimized for both efficiency and effectiveness.

Model URLs:
- elyza/Llama-3-ELYZA-JP-8B
- elyza/Llama-3-ELYZA-JP-8B-AWQ
- elyza/Llama-3-ELYZA-JP-8B-GGUF
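
The entries above are Hub repository IDs rather than full URLs. As a small sketch, assuming the standard huggingface.co layout (models at the root, datasets under `datasets/`, Spaces under `spaces/`), they can be expanded as follows. The helper name `hub_url` is hypothetical.

```python
HUB = "https://huggingface.co"


def hub_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the web URL for a Hub repository ID such as 'elyza/Llama-3-ELYZA-JP-8B'."""
    prefixes = {"model": "", "dataset": "datasets/", "space": "spaces/"}
    if repo_type not in prefixes:
        raise ValueError(f"unknown repo_type: {repo_type!r}")
    return f"{HUB}/{prefixes[repo_type]}{repo_id}"


for repo_id in (
    "elyza/Llama-3-ELYZA-JP-8B",
    "elyza/Llama-3-ELYZA-JP-8B-AWQ",
    "elyza/Llama-3-ELYZA-JP-8B-GGUF",
):
    print(hub_url(repo_id))
```

The same helper covers the datasets in this feed, for example `hub_url("hatakeyama-llm-team/PMC", "dataset")`.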

Blog post (in Japanese):
https://note.com/elyza/n/n360b6084fdbd

kaisugi

posted an update 6 months ago

🚀 KARAKURI LM 8x7B Instruct v0.1

KARAKURI Inc. has publicly released "KARAKURI LM 8x7B Instruct v0.1", which it describes as the first Japanese-developed large language model (LLM) to support function calling and Retrieval-Augmented Generation (RAG). Acting as an AI agent, the model can handle tasks across various applications autonomously, which is said to significantly reduce implementation costs compared to traditional models.

Model Features:
- Capable of autonomously choosing optimal documents and databases for various tasks.
- Applied extensively in customer support for automating responses and processes, analyzing Voice of Customer (VoC), and predicting optimal outreach timings.

Model URL:
karakuri-ai/karakuri-lm-8x7b-instruct-v0.1

Detailed press release (in Japanese):
https://karakuri.ai/seminar/news/karakuri-lm-8x7b-instruct-v0-1/

kaisugi

posted an update 6 months ago

🚀 Sarashina1-65B

SB Intuitions has announced the release of Japanese large language models (LLMs) with 7 billion, 13 billion, and 65 billion parameters to aid academic and industrial research and development. The company plans to develop a 390-billion-parameter model by the end of 2024. The models, named Sarashina1 and Sarashina2, show significant performance improvements, especially Sarashina2, which is an enhanced version of Sarashina1.

Performance evaluations using five Japanese language datasets reveal that Sarashina2 outperforms other models, including continued pre-trained models. The name "Sarashina" originates from a historical diary linked to the headquarters' location in Tokyo's Takeshiba area, symbolizing the company's ambition to create globally utilized models from Japan.

Model URLs:
- sbintuitions/sarashina1-65b
- sbintuitions/sarashina2-13b

Detailed press release (in Japanese):
https://www.sbintuitions.co.jp/news/press/20240614_01/

kaisugi

posted an update 6 months ago

🚀 llava-calm2-siglip

CyberAgent Inc. has announced the public release of "llava-calm2-siglip," a 7.5-billion-parameter Vision Language Model (VLM) for Japanese, available for commercial use. The model, trained primarily on a high-quality Japanese dataset, is accessible on Hugging Face Hub under an Apache-2.0 license. The release aims to advance Japanese-specific VLMs, which are far scarcer than English-centric models.

Model URL:
cyberagent/llava-calm2-siglip

Demo URL:
cyberagent/llava-calm2-preview

Detailed press release (in Japanese): https://www.cyberagent.co.jp/news/detail/id=30344

kaisugi

updated a Space 7 months ago

📉 README • Running

misdelivery

updated a model 7 months ago

hatakeyama-llm-team/tokenizer_65000

Updated Jun 1 • 1

kanhatakeyama

updated 2 datasets 7 months ago

hatakeyama-llm-team/CommonCrawlPDFJa

Viewer • Updated May 28 • 142k • 41 • 1

hatakeyama-llm-team/AutoPreferenceDataset

Viewer • Updated May 20 • 119k • 37 • 1

misdelivery

updated 4 datasets 7 months ago

kaisugi

posted an update 7 months ago

🚀 Stockmark-100b

Stockmark Inc. has developed and released "Stockmark-LLM-100b", a 100-billion-parameter model that is one of Japan's largest commercial-scale large language models (LLMs). According to the company, the model significantly reduces hallucinations and provides accurate responses to complex business-related queries. Developed from scratch with a focus on Japanese business data, it aims to be reliable in high-stakes business environments. It is open-source and available for commercial use.

Key highlights:
- The model reduces hallucinations (confidently stated but incorrect responses that AI models sometimes generate).
- Stockmark-LLM-100b can answer basic business questions and specialized queries in industries like manufacturing.
- The model is reported to surpass GPT-4-turbo in accuracy on business-specific queries.
- Evaluation benchmarks (VicunaQA) show high performance.
- Fast inference speed, generating 100-character Japanese text in 1.86 seconds.
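
To put the quoted speed figure in more familiar units, here is a quick back-of-the-envelope conversion. It uses only the numbers stated above (100 characters in 1.86 seconds); the linear extrapolation to longer outputs is an assumption for illustration, and `chars_per_second` is a hypothetical helper name.

```python
def chars_per_second(chars: int, seconds: float) -> float:
    """Average generation throughput in characters per second."""
    return chars / seconds


rate = chars_per_second(100, 1.86)
print(f"{rate:.1f} characters/second")  # about 53.8

# Assumption: generation time grows linearly with output length.
estimated_seconds = 400 / rate
print(f"{estimated_seconds:.2f} s for a 400-character answer")
```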

Model URLs:
- stockmark/stockmark-100b
- stockmark/stockmark-100b-instruct-v0.1

Detailed press release (in Japanese): https://stockmark.co.jp/news/20240516

namiuchi

updated a dataset 9 months ago

hatakeyama-llm-team/CommonCrawl_wet_v2

Viewer • Updated Mar 26 • 482k • 79 • 1

kanhatakeyama

updated a dataset 9 months ago

hatakeyama-llm-team/japanese2010

Viewer • Updated Mar 21 • 3.66M • 472 • 2

kanhatakeyama

updated a model 10 months ago

hatakeyama-llm-team/bertopic_model

Updated Mar 12

Team Hatakeyama

AI & ML interests

Recent Activity

hatakeyama-llm-team's activity

hatakeyama-llm-team/PMC

hatakeyama-llm-team/BumpoRikai

README

hatakeyama-llm-team/tokenizer_65000

hatakeyama-llm-team/CommonCrawlPDFJa

hatakeyama-llm-team/AutoPreferenceDataset

hatakeyama-llm-team/AutoGeneratedJapaneseQA-other

hatakeyama-llm-team/AutoGeneratedJapaneseQA-CC

hatakeyama-llm-team/WikiBookJa

hatakeyama-llm-team/AutoGeneratedJapaneseQA

hatakeyama-llm-team/CommonCrawl_wet_v2

hatakeyama-llm-team/japanese2010

hatakeyama-llm-team/bertopic_model

Team members 19
