metadata
language: fr
license: mit
tags:
  - causal-lm
  - fr
datasets:
  - c4
  - The Pile

Quantized Cedille/fr-boris with 8-bit weights

This is a version of Cedille's GPT-J (fr-boris), a model with 6 billion parameters, modified so that you can generate text with it and fine-tune it in Colab or on an equivalent desktop GPU (e.g. a single 1080 Ti). Inspired by GPT-J 8bit.
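To see why 8-bit weights matter for a single consumer GPU, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. It assumes exactly 6 billion parameters and counts only raw weight storage; activations, optimizer state, and any quantization metadata are ignored, so real usage will be higher. The helper name `weight_memory_gb` is made up for this illustration.

```python
# Rough estimate of weight storage only. Assumptions: exactly 6e9 parameters,
# no activations, no optimizer state, no per-block quantization metadata.
N_PARAMS = 6_000_000_000  # "6 billion parameters" from the description (approximate)


def weight_memory_gb(n_params: int, bits_per_weight: int) -> float:
    """Return the storage needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


for label, bits in [("float32", 32), ("float16", 16), ("8-bit", 8)]:
    print(f"{label:>8}: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take about 24 GB in float32, 12 GB in float16, and 6 GB at 8 bits. A 1080 Ti has 11 GB of memory, so only the 8-bit variant leaves headroom for activations and fine-tuning state.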

Here's how to run it: colab

This model can be loaded with the GPTJForCausalLM class from transformers:

from transformers import GPTJForCausalLM
model = GPTJForCausalLM.from_pretrained("gustavecortal/fr-boris-8bit")

fr-boris

Boris is a 6B parameter autoregressive language model based on the GPT-J architecture and trained using the mesh-transformer-jax codebase.

Boris was trained on around 78B tokens of French text from the C4 dataset.

Links