Model Card for yaser-j/gemma3-4b-syn-adapter

A LoRA adapter built for the ORKG Ask synthesis use case. It expects a research question and a list of five (5) abstracts!

The model supports 13 languages and 4 different language tones.

Model Details

Fine-tuned using Unsloth on a dataset created with larger LLMs such as GPT-4o.

Languages supported are:

  • English
  • Spanish
  • German
  • Dutch
  • French
  • Italian
  • Portuguese
  • Russian
  • Chinese
  • Japanese
  • Korean
  • Arabic
  • Farsi

Language tones supported:

  • Researcher
  • Adult
  • Teenager
  • Child

Model Description

The language tones are described as follows:

  1. Child (10–11 years old):

    • Simple, short sentences and basic, accurate explanations.
    • No advanced jargon.
    • Everyday examples that tie into the research findings.
  2. Teenager:

    • Casual, engaging manner; relevant slang used in moderation.
    • Emphasis on interesting and emotionally engaging research findings.
    • Relatable explanations, referencing everyday scenarios or pop culture where applicable.
  3. Adult:

    • Concise detail delivered in a polished, clear tone.
    • Moderate, non-technical vocabulary where possible.
    • Essential context and logical flow, focusing on practical applications of research.
  4. Researcher:

    • Formal, precise language with clear references to methodologies or data.
    • Discipline-specific terminology as needed.
    • Balanced, objective presentation of research complexities.

The system prompt of the model is:

Generate a comprehensive answer to the given research question (but no more than three/four sentences)
solely based on the content provided.
Cite the number of the content referenced for each claim like this:
[1] for a single reference or [2][3] for multiple references.
Generate the synthesis in the "{language}" language, and phrase the complexity of the text to be suitable for a/an {level}.
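
For illustration, a minimal Python sketch of filling the {language} and {level} placeholders; the constant and variable names are assumptions, not part of the released code:

```python
# A sketch of filling the system prompt placeholders; names are assumptions.
SYSTEM_PROMPT = (
    "Generate a comprehensive answer to the given research question "
    "(but no more than three/four sentences)\n"
    "solely based on the content provided.\n"
    "Cite the number of the content referenced for each claim like this:\n"
    "[1] for a single reference or [2][3] for multiple references.\n"
    'Generate the synthesis in the "{language}" language, and phrase the '
    "complexity of the text to be suitable for a/an {level}."
)

system_prompt = SYSTEM_PROMPT.format(language="English", level="researcher")
```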

The user prompt should look like this:

# Research Question: {{ question }}
# Abstracts:
Abstract #1:
 Title Here
Abstract text here

Abstract #2:
 Title Here
Abstract text here

Abstract #3:
 Title Here
Abstract text here

Abstract #4:
 Title Here
Abstract text here

Abstract #5:
 Title Here
Abstract text here

# Answer with inline-citations:
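
A minimal sketch of assembling that user prompt from a question and five (title, abstract) pairs; the function name is an assumption:

```python
# A sketch of building the user prompt shown above; names are assumptions.
def build_user_prompt(question: str, abstracts: list[tuple[str, str]]) -> str:
    if len(abstracts) != 5:
        raise ValueError("The adapter expects exactly five abstracts.")
    parts = [f"# Research Question: {question}", "# Abstracts:"]
    for i, (title, text) in enumerate(abstracts, start=1):
        parts.append(f"Abstract #{i}:\n {title}\n{text}\n")
    parts.append("# Answer with inline-citations:")
    return "\n".join(parts)
```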

The model should be used in chat mode; alternatively, apply the chat template (see the tokenizer) and feed the formatted prompt to a standard generation endpoint.
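
For example, a minimal inference sketch with Unsloth (which the adapter was trained with); loading the adapter directly through FastModel and the decoding settings are assumptions to adapt to your setup:

```python
# A minimal inference sketch; loading details are assumptions -- verify locally.
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "yaser-j/gemma3-4b-syn-adapter",  # LoRA adapter on top of its Gemma 3 base
    max_seq_length=4096,
    load_in_4bit=True,
)

question = "What are the effects of X on Y?"  # placeholder question
abstracts = [(f"Title {i}", f"Abstract text {i}") for i in range(1, 6)]

messages = [
    {"role": "system", "content": system_prompt},  # built in the sketch above
    {"role": "user", "content": build_user_prompt(question, abstracts)},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```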

Training Details

LoRA details

r=16
finetune_vision_layers=False     # turned off for text-only training
finetune_language_layers=True    # should be left on
finetune_attention_modules=True  # attention is good for GRPO
finetune_mlp_modules=True        # should always be left on
lora_alpha=32
lora_dropout=0
seed=42
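
These settings map onto Unsloth's FastModel.get_peft_model call roughly as follows (a sketch: the base-model ID and sequence length are assumptions, and seed corresponds to Unsloth's random_state argument):

```python
# A sketch of the LoRA configuration above; base-model ID is an assumption.
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "unsloth/gemma-3-4b-it",  # assumed Gemma 3 4B base
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastModel.get_peft_model(
    model,
    r=16,
    finetune_vision_layers=False,    # text-only training
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    lora_alpha=32,
    lora_dropout=0,
    random_state=42,                 # "seed" in the list above
)
```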

SFT details

per_device_train_batch_size=4
gradient_accumulation_steps=8
warmup_steps=5
num_train_epochs=1
learning_rate=2e-4
bf16=True
optim="adamw_torch_fused"
weight_decay=0.01
lr_scheduler_type="linear"
seed=42
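
These hyperparameters correspond roughly to the following TRL SFTTrainer setup (a sketch: model and tokenizer come from the LoRA sketch above, and dataset stands in for the GPT-4o-generated training set):

```python
# A sketch of the SFT setup above using TRL; dataset handling is omitted.
from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # chat-formatted synthesis examples (assumption)
    args=SFTConfig(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        optim="adamw_torch_fused",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=42,
    ),
)
```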

Training loss was computed on the responses only (prompt tokens were masked out).
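
In Unsloth this is typically done with train_on_responses_only, which masks everything except the model's responses out of the loss; a sketch, with Gemma-style turn markers that should be verified against the tokenizer's chat template:

```python
# A sketch of responses-only loss masking via Unsloth's helper.
from unsloth.chat_templates import train_on_responses_only

trainer = train_on_responses_only(
    trainer,
    instruction_part="<start_of_turn>user\n",
    response_part="<start_of_turn>model\n",
)
trainer.train()
```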

Model Card Contact

ORKG Ask Team - [email protected]

Framework versions

  • PEFT 0.14.0