metadata
language:
  - en
license: mit
library_name: adapter-transformers
tags:
  - QLoRA
  - Adapters
  - llms
  - Transformers
  - Fine-Tuning
  - PEFT
  - SFTTrainer
  - Open-Source
  - LoRA
  - Attention
  - code
  - Falcon-7b
datasets:
  - squad
  - tiiuae/falcon-refinedweb
  - UCLNLP/adversarial_qa
  - avnishkr/trimpixel
pipeline_tag: question-answering

🚀 Falcon-QAMaster

Falcon-7b-QueAns is a chatbot-like model for question answering. It was built by fine-tuning Falcon-7B on the SQuAD, Adversarial_qa, and Trimpixel (self-made) datasets. This repo includes only the QLoRA adapters produced by fine-tuning with 🤗's peft package, not the base model weights.

Model Summary

  • Model Type: Causal decoder-only
  • Language(s): English
  • Base Model: Falcon-7B (License: Apache 2.0)
  • Dataset: SQuAD (License: cc-by-4.0), Adversarial_qa (License: cc-by-sa-4.0), Falcon-RefinedWeb (odc-by), Trimpixel (Self-Made)
  • License(s): Apache 2.0 inherited from "Base Model" and "Dataset"

Why use Falcon-7B?

  • It outperforms comparable open-source models (e.g., MPT-7B, StableLM, RedPajama), thanks to being trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. See the OpenLLM Leaderboard.
  • It features an architecture optimized for inference, with FlashAttention (Dao et al., 2022) and multiquery attention (Shazeer et al., 2019).
  • It is made available under a permissive Apache 2.0 license allowing for commercial use, without any royalties or restrictions.

⚠️ This is a version fine-tuned specifically for question answering. If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at Falcon-7B-Instruct.

🔥 Looking for an even more powerful model? Falcon-40B is Falcon-7B's big brother!

Model Details

The model was fine-tuned in 4-bit precision using 🤗 peft adapters, transformers, and bitsandbytes. Training used Low-Rank Adaptation (LoRA), specifically the QLoRA variant, in which the base model is quantized to 4 bits and only the small adapter matrices are trained. The run took approximately 12 hours on a workstation with a single NVIDIA T4 GPU with 25 GB of available memory. See the attached [Colab Notebook] used to train the model.
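Because only the adapters are published, using the model means loading the base Falcon-7B weights and attaching the adapter on top. The following is a minimal sketch of one way to do that, not the author's exact code. It assumes `transformers`, `peft`, `bitsandbytes`, and `torch` are installed and a CUDA GPU is available. The function name `load_qa_model` and its `adapter_id` parameter are hypothetical; pass the Hub ID or local path of this adapter repository yourself.

```python
BASE_MODEL_ID = "tiiuae/falcon-7b"  # base model named in this card


def load_qa_model(adapter_id, base_model_id=BASE_MODEL_ID):
    """Load the 4-bit base model and attach the QLoRA adapter (illustrative helper)."""
    # Heavy imports are kept inside the function so the module can be
    # imported on machines without a GPU stack installed.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Mirror the 4-bit settings listed under "Training procedure".
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=False,
        bnb_4bit_compute_dtype=torch.float16,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Older transformers releases may additionally need trust_remote_code=True
    # for Falcon; that is an assumption to verify against your installed version.
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )

    # Wrap the quantized base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

A call would look like `model, tokenizer = load_qa_model("<adapter-repo-or-local-path>")`, after which the usual `tokenizer(...)` and `model.generate(...)` calls apply.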

Model Date

July 13, 2023

An open-source Falcon-7B large language model fine-tuned on the SQuAD, Adversarial_qa, and Trimpixel datasets for question answering. The QLoRA technique was used to fine-tune the model on a consumer-grade GPU, with SFTTrainer driving the training loop.

Datasets

  1. Dataset used: SQuAD; dataset size: 87599; training steps: 350

  2. Dataset used: Adversarial_qa; dataset size: 30000; training steps: 400

  3. Dataset used: Trimpixel; dataset size: 1757; training steps: 400

Training procedure

The following bitsandbytes quantization config was used during training:

  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float16
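The settings above map directly onto the keyword arguments of `BitsAndBytesConfig` in transformers. The sketch below shows one way to express them in code; it assumes `torch` and `transformers` are installed when the builder is called. The names `QUANT_CONFIG` and `build_bnb_config` are illustrative and do not come from the original training notebook.

```python
# Quantization settings copied from the list above.
QUANT_CONFIG = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float16",  # resolved to torch.float16 below
}


def build_bnb_config(settings=QUANT_CONFIG):
    """Turn the plain settings dict into a BitsAndBytesConfig (illustrative helper)."""
    # Imported lazily so the dict can be inspected without torch installed.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(settings)
    # The card lists the dtype by name; look up the matching torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The resulting object can be passed as `quantization_config=build_bnb_config()` to `AutoModelForCausalLM.from_pretrained`. Since `load_in_4bit` is true, the `llm_int8_*` entries are carried along at their listed values but are not expected to affect the 4-bit path.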

Framework versions

  • PEFT 0.4.0.dev0