--- base_model: - tiiuae/falcon-11B library_name: transformers tags: - mergekit - merge - lazymergekit - tiiuae/falcon-11B license: apache-2.0 language: - ro --- # sliced This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). ## Merge Details ### Merge Method This model was pruned using the passthrough merge method. ### Models Merged The following models were included in the merge: * [tiiuae/falcon-11B](https://huggingface.co/tiiuae/falcon-11B) ### Configuration The following YAML configuration was used to produce this model: ```yaml slices: - sources: - model: tiiuae/falcon-11B layer_range: [0, 24] - sources: - model: tiiuae/falcon-11B layer_range: [55, 59] merge_method: passthrough dtype: bfloat16 ``` [PruneMe](https://github.com/arcee-ai/PruneMe) has been utilized using the wikimedia/wikipedia Romanian (ro) subset by investigating layer similarity with 2000 samples. The layer ranges for pruning were determined based on this analysis to maintain performance while reducing model size. ![Layer Similarity Plot](https://cdn-uploads.huggingface.co/production/uploads/660c0a02cf274b3ab77dd6b7/YBR_4MSxX8Jex6AEDMdLg.png) ```python from transformers import AutoTokenizer, AutoModelForCausalLM import transformers import torch model = "ssmits/Falcon2-5.5B-Romanian" tokenizer = AutoTokenizer.from_pretrained(model) pipeline = transformers.pipeline( "text-generation", model=model, tokenizer=tokenizer, torch_dtype=torch.bfloat16, ) sequences = pipeline( "Can you explain the concepts of Quantum Computing?", max_length=200, do_sample=True, top_k=10, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id, ) for seq in sequences: print(f"Result: {seq['generated_text']}") ``` 💥 **Falcon LLMs require PyTorch 2.0 for use with `transformers`!** For fast inference with Falcon, check-out [Text Generation Inference](https://github.com/huggingface/text-generation-inference)! Read more in this [blogpost]((https://huggingface.co/blog/falcon). ## Direct Use Research on large language models; as a foundation for further specialization and finetuning for specific usecases (e.g., summarization, text generation, chatbot, etc.) ## Out-of-Scope Use Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful. ## Bias, Risks, and Limitations Falcon2-5.5B is trained mostly on English, but also German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish. It will not generalize appropriately to other languages. Furthermore, as it is trained on a large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online. ## Recommendations We recommend users of Falcon2-5.5B to consider finetuning it for the specific set of tasks of interest, and for guardrails and appropriate precautions to be taken for any production use.