Pico-OpenLAiNN-500M-GGUF 🤗
Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing the newest and biggest Pico-OpenLAiNN model. This LLM was trained on the full 32B tokens that the entire Pico-OpenLAiNN family is trained on.
These are the GGUF quants of the models. For the original models, you can find them here
Models Overview
- Pico-OpenLAiNN-100: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are extremely limited.
- Pico-OpenLAiNN-250: The middle child of the Pico-OpenLAiNN family. It's still tiny at 250M parameters but is more capable than the 100M parameter model.
- Pico-OpenLAiNN-500: My current "heavyweight" model. It has 500M parameters and is the most capable of the Pico-OpenLAiNN models.
Pretraining Details
This specific version of Pico-OpenLAiNN was trained on just 32B tokens of the FineWeb dataset.
Other information:
- Compatibility: Built to be compatible with existing projects that use Llama 2's tokenizer and architecture.
- Ease of Use: No need to reinvent the wheel. These models are ready to be plugged into your applications.
- Open Source: Fully open source, so you can tweak, tune, and twist them to your heart's content.
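As a rough illustration of plugging a GGUF quant into an application, here is a minimal sketch that assumes the `llama-cpp-python` package is installed and that one of the quant files has already been downloaded. The function name `complete`, the file path you pass in, and the `n_ctx=512` context size are all placeholders chosen for this example, not values taken from this repository.

```python
from pathlib import Path


def complete(model_path: str, prompt: str, max_tokens: int = 64) -> str:
    """Generate a short continuation of `prompt` from a local GGUF file."""
    path = Path(model_path)
    if not path.is_file():
        # Fail early with a clear message before loading the inference library.
        raise FileNotFoundError(f"GGUF file not found: {path}")

    # Imported lazily so the path check above works without the dependency.
    from llama_cpp import Llama

    # n_ctx is an assumed context size; adjust it to the model's actual limit.
    llm = Llama(model_path=str(path), n_ctx=512, verbose=False)
    result = llm(prompt, max_tokens=max_tokens)
    # The completion text lives in the first entry of the "choices" list.
    return result["choices"][0]["text"]
```

Calling `complete("path/to/your-quant.gguf", "Once upon a time")` with a real file path should return the generated continuation as a string. Which quant to pick is a trade-off between file size and output quality.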
Benchy :3
Tasks | Value | | Stderr |
---|---|---|---|
arc_challenge | 0.1903 | ± | 0.115 |
arc_easy | 0.4617 | ± | 0.0102 |
boolq | 0.6034 | ± | 0.0086 |
hellaswag | 0.3400 | ± | 0.0047 |
lambada_openai | 0.3670 | ± | 0.0067 |
piqa | 0.6795 | ± | 0.0109 |
winogrande | 0.4925 | ± | 0.0141 |
Future Plans
- More Models: I'm currently training the bigger siblings of this model, including a 1B parameter version and beyond. 2-4 billion parameter versions are planned. These will be released as OpenLAiNN.
- New architecture: This is still up in the air and in development. I will release it if I deem it to be actually useful, so stay tuned. It will likely be named FLaRE-LAiNN.
- Paper: A detailed paper will be made available for those interested in the details.
Credit Where Credit's Due
If you find these models useful and decide to use them, a link to this repository would be highly appreciated. I am a one-man show running this. Thanks 🤗
Contact
If you have questions, please reach out to me at [email protected]