---
base_model:
- tiiuae/falcon-11B
library_name: transformers
tags:
- mergekit
- merge
- lazymergekit
- tiiuae/falcon-11B
license: apache-2.0
language:
- ro
---
# Falcon2-5.5B-Romanian
This model is a pruned version of [tiiuae/falcon-11B](https://huggingface.co/tiiuae/falcon-11B), produced with [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was pruned using the passthrough merge method.
### Models Merged
The following models were included in the merge:
* [tiiuae/falcon-11B](https://huggingface.co/tiiuae/falcon-11B)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
slices:
  - sources:
      - model: tiiuae/falcon-11B
        layer_range: [0, 24]
  - sources:
      - model: tiiuae/falcon-11B
        layer_range: [55, 59]
merge_method: passthrough
dtype: bfloat16
```
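The merge can be reproduced from this configuration, either with the `mergekit-yaml` CLI or programmatically. The snippet below is a minimal sketch of the latter, assuming the YAML above is saved as `config.yml`; the class and function names (`MergeConfiguration`, `MergeOptions`, `run_merge`) follow mergekit's README and may differ between versions, and the output path is hypothetical.
```python
# Minimal sketch: re-running this passthrough merge via mergekit's Python API.
# Assumes the YAML configuration above is saved as config.yml.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Falcon2-5.5B-Romanian",  # hypothetical output directory
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU for the merge if one is available
        copy_tokenizer=True,             # copy the base model's tokenizer into the output
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```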
[PruneMe](https://github.com/arcee-ai/PruneMe) was used to analyze layer similarity on 2,000 samples from the Romanian (ro) subset of wikimedia/wikipedia. The layer ranges to prune were chosen from this analysis to preserve performance while reducing model size.
![Layer Similarity Plot](https://cdn-uploads.huggingface.co/production/uploads/660c0a02cf274b3ab77dd6b7/YBR_4MSxX8Jex6AEDMdLg.png)
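For context, the sketch below illustrates the idea behind this kind of analysis: score each block of consecutive layers by how little it changes the hidden states, then prune the most redundant block. It is an illustrative approximation rather than the actual PruneMe code; the sample text, block size, and scoring function are assumptions based on the configuration above.
```python
# Illustrative sketch of block-redundancy scoring (not the PruneMe implementation).
# Assumption: enough GPU/CPU memory to load tiiuae/falcon-11B in bfloat16.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-11B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The actual analysis used ~2,000 samples from the wikimedia/wikipedia (ro) subset.
text = "Limba română este o limbă indo-europeană din grupul limbilor romanice."
block_size = 31  # matches the [24, 55) layer span dropped in the config above

with torch.no_grad():
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    hidden = model(**enc, output_hidden_states=True).hidden_states  # (num_layers + 1) tensors

def block_distance(h_in, h_out):
    """Mean angular distance between hidden states entering and leaving a block."""
    cos = torch.nn.functional.cosine_similarity(h_in.float(), h_out.float(), dim=-1)
    return torch.arccos(cos.clamp(-1.0, 1.0)).mean().item()

num_layers = len(hidden) - 1
scores = [
    (start, block_distance(hidden[start], hidden[start + block_size]))
    for start in range(num_layers - block_size + 1)
]
start, dist = min(scores, key=lambda s: s[1])
print(f"Most redundant block: layers [{start}, {start + block_size}) "
      f"with mean angular distance {dist:.4f}")
```
To run the pruned model itself, the standard `transformers` text-generation pipeline can be used: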
```python
from transformers import AutoTokenizer
import transformers
import torch

# Load the pruned model through the text-generation pipeline
model = "ssmits/Falcon2-5.5B-Romanian"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
)

# Generate a single sampled completion
sequences = pipeline(
    "Can you explain the concepts of Quantum Computing?",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
💥 **Falcon LLMs require PyTorch 2.0 for use with `transformers`!**
For fast inference with Falcon, check out [Text Generation Inference](https://github.com/huggingface/text-generation-inference)! Read more in this [blog post](https://huggingface.co/blog/falcon).
## Direct Use
Research on large language models, and use as a foundation for further specialization and fine-tuning for specific use cases (e.g., summarization, text generation, chatbots).
## Out-of-Scope Use
Production use without adequate assessment of risks and mitigations, and any use cases that may be considered irresponsible or harmful.
## Bias, Risks, and Limitations
Falcon2-5.5B was trained mostly on English data, but also on German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, and Swedish. It will not generalize appropriately to other languages. Furthermore, because it was trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.
## Recommendations
We recommend that users of Falcon2-5.5B consider fine-tuning it for their specific set of tasks of interest, and that guardrails and appropriate precautions be taken for any production use.