---
license: apache-2.0
pipeline_tag: text-generation
language:
- it
- en
tags:
- chat
- minerva-7b
- gguf
- instruct
- dpo
base_model:
- sapienzanlp/Minerva-7B-instruct-v1.0
library_name: transformers
---

<div style="text-align: center; display: flex; flex-direction: column; align-items: center;">
  <img src="https://huggingface.co/sapienzanlp/Minerva-7B-instruct-v1.0/resolve/main/minerva-logo.png" style="max-width: 550px; height: auto;">
</div>

# Model Card for GGUF version of Minerva-7B-instruct-v1.0

Minerva is the first family of **LLMs pretrained from scratch on Italian**, developed by [Sapienza NLP](https://nlp.uniroma1.it) in the context of the [Future Artificial Intelligence Research (FAIR)](https://fondazione-fair.it/) project, in collaboration with [CINECA](https://www.cineca.it/) and with additional contributions from [Babelscape](https://babelscape.com) and the [CREATIVE](https://nlp.uniroma1.it/creative/) PRIN Project. Notably, the Minerva models are truly open (data and model) Italian-English LLMs, with approximately half of the pretraining data consisting of Italian text. The full technical report is available at [https://nlp.uniroma1.it/minerva/blog/2024/11/26/tech-report](https://nlp.uniroma1.it/minerva/blog/2024/11/26/tech-report).

## Description

This is the model card for the GGUF conversion of [**Minerva-7B-instruct-v1.0**](https://huggingface.co/sapienzanlp/Minerva-7B-instruct-v1.0), a 7-billion-parameter model trained on almost 2.5 trillion tokens (1.14 trillion in Italian, 1.14 trillion in English, and 200 billion in code). This repository contains the model weights in float32 and float16 formats, as well as quantized versions in 8-bit, 6-bit, and 4-bit precision.

**Important**: This model is compatible with [llama.cpp](https://github.com/ggerganov/llama.cpp) built from commit `6fe624783166e7355cec915de0094e63cd3558eb` (5 November 2024) or later.
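
As a quick-start sketch, a quantized file from this repository can be downloaded and run with the `llama-cli` binary that ships with llama.cpp. The repository id and GGUF filename below are illustrative assumptions; match them against the actual file list of this repository:

```shell
# Download one quantized variant (repo id and filename are assumptions;
# check this repository's "Files" tab for the exact names).
huggingface-cli download sapienzanlp/Minerva-7B-instruct-v1.0 \
  minerva-7b-instruct-v1.0.Q4_K_M.gguf --local-dir .

# Start an interactive chat session. Requires llama.cpp built from commit
# 6fe624783166e7355cec915de0094e63cd3558eb (5 November 2024) or later.
./llama-cli -m minerva-7b-instruct-v1.0.Q4_K_M.gguf -cnv
```

Lower-bit variants (6-bit, 4-bit) trade some output quality for a smaller memory footprint, while the float16 and float32 files preserve the original weights.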