---
tags:
  - autotrain
  - text-generation
  - peft
  - chain-of-thought
  - finetuned
library_name: transformers
base_model: tiiuae/Falcon3-3B-Instruct
widget:
  - messages:
      - role: user
        content: What is your favorite condiment?
---

# Model Card for FalconMind3B

FalconMind3B is a fine-tuned open-source model trained to excel at chain-of-thought reasoning. It is designed to work through tasks step by step, producing logical, structured responses for a wide range of applications.

## Model Details

### Model Description

FalconMind3B is a fine-tuned variant of the tiiuae/Falcon3-3B-Instruct model. It leverages chain-of-thought reasoning techniques to handle complex tasks requiring step-by-step thinking. The fine-tuning process was conducted using PEFT/LoRA on the Hugging Face AutoTrain platform.

- **Developed by:** Faris Allafi
- **Model type:** Text-generation (causal language modeling)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** tiiuae/Falcon3-3B-Instruct
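
For reference, here is a minimal sketch of what a PEFT/LoRA setup for a fine-tune like this could look like. The rank, alpha, dropout, and target modules below are illustrative assumptions; the hyperparameters of the actual AutoTrain run are not published:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model named above
base = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-3B-Instruct")

# Illustrative LoRA hyperparameters -- the actual AutoTrain run may have differed
lora_config = LoraConfig(
    r=16,            # adapter rank (assumed)
    lora_alpha=32,   # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical attention projections
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # only the adapter weights are trainable
```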

## Uses

### Direct Use

This model is designed for text generation tasks that require logical reasoning, including problem-solving, code explanations, and general Q&A applications.

### Downstream Use

FalconMind3B can be fine-tuned further for specific tasks in education, programming, or other domains requiring detailed step-by-step reasoning.
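
As a sketch of such downstream fine-tuning, one could attach a fresh LoRA adapter and train with the standard `transformers` Trainer. The toy corpus, rank, and training arguments below are placeholders, not a recommended recipe:

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_path = "CoolCreator/FalconMind3b"
tokenizer = AutoTokenizer.from_pretrained(model_path)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Toy domain corpus -- replace with your own step-by-step reasoning data
corpus = Dataset.from_dict({"text": ["Q: 2 + 2? Let's think step by step. 2 + 2 = 4. A: 4."]})
tokenized = corpus.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                       remove_columns=["text"])

# Attach a fresh LoRA adapter (rank/alpha are illustrative)
model = get_peft_model(
    AutoModelForCausalLM.from_pretrained(model_path),
    LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"),
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="falconmind3b-domain", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```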

### Out-of-Scope Use

This model is not suitable for tasks requiring real-time interaction or for applications that rely on languages other than English.

## Bias, Risks, and Limitations

FalconMind3B was fine-tuned on synthetic datasets, which may introduce biases or limit generalization. Test the model on your specific use cases to confirm it behaves reliably.

### Recommendations

Users should be aware of potential biases and limitations when applying the model in high-stakes or sensitive scenarios.

## How to Get Started with the Model

Use the code below to get started with the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "CoolCreator/FalconMind3b"

# Load tokenizer and model (device_map="auto" places weights on GPU if available)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype="auto"
).eval()

# Define chat messages
messages = [
    {"role": "user", "content": "hi"}
]

# Apply the chat template and generate a response
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

print(response)  # e.g. "Hello! How can I assist you today?"
```
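
Because the model is tuned for chain-of-thought reasoning, it tends to respond best when asked to reason step by step. A short usage example, reusing the tokenizer and model loaded above (the prompt wording is illustrative, not prescribed by the model card):

```python
# Ask for explicit step-by-step reasoning
messages = [
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. "
                                "What is its average speed? Think step by step."}
]
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```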