FunctionGemma 270M - E-Commerce Multi-Agent Router

Fine-tuned version of google/functiongemma-270m-it for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.

Model Description

This model demonstrates how FunctionGemma can be adapted beyond mobile actions for multi-agent orchestration in enterprise systems. It intelligently routes natural language customer queries to the appropriate specialized agent with 89.4% accuracy.

Key Achievement: Replacing brittle rule-based routing (52-58% accuracy) with learned intelligence using only 1.47M trainable parameters (0.55% of the model).

Architecture

Base Model: google/functiongemma-270m-it (270M parameters)
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Trainable Parameters: 1,474,560 (0.55%)
LoRA Rank: 16
LoRA Alpha: 32
Target Modules: q_proj, k_proj, v_proj, o_proj

Training Details

Training Data: 12,550 synthetic customer queries (balanced across 7 agents)
Training Time: 45 minutes on Google Colab T4 GPU
Framework: Hugging Face Transformers + PEFT + TRL
Quantization: 4-bit NF4 during training
Optimizer: paged_adamw_8bit
Learning Rate: 2e-4
Epochs: 3
Batch Size: 4 (effective 16 with gradient accumulation)

Intended Use

Primary Use Case

Multi-agent customer support routing for e-commerce platforms:

Route queries to order management, product search, returns, payments, account, technical support agents
Maintain conversation context across multi-turn interactions
Enable intelligent task switching

Supported Agents

The model routes queries to 7 specialized agents:

Order Management (route_to_order_agent) - Track orders, update delivery, cancel orders
Product Search (route_to_search_agent) - Search catalog, check availability, recommendations
Product Details (route_to_details_agent) - Specifications, reviews, comparisons
Returns & Refunds (route_to_returns_agent) - Initiate returns, process refunds, exchanges
Account Management (route_to_account_agent) - Update profile, manage addresses, security
Payment Support (route_to_payment_agent) - Resolve payment issues, update methods, billing
Technical Support (route_to_technical_agent) - Fix app/website issues, login problems

Out-of-Scope Use

❌ General-purpose chatbot (use base Gemma models instead)
❌ Direct dialogue generation (this is a routing model)
❌ More than 20 agents (context window limitations)
❌ Non-customer-support domains without fine-tuning

Performance

Test Set Results

Overall Accuracy: 89.40% (1,684/1,883 correct)

Per-Agent Performance:
  order_management      92.3%  (251/272)
  product_search        91.1%  (257/282)
  product_details       94.7%  (233/246)
  returns_refunds       88.2%  (238/270)
  account_management    85.1%  (229/269)
  payment_support       89.5%  (241/269)
  technical_support     87.0%  (234/269)

Comparison to Baselines

Approach	Accuracy	Latency	Memory
Keyword Matching	52-58%	5ms	Negligible
Rule-based (100 rules)	65-70%	8ms	Negligible
BERT Classifier (300M)	82-85%	45ms	400 MB
This Model (LoRA)	89.4%	127ms	2.1 GB
GPT-4 API (zero-shot)	85-90%	2500ms	Cloud

Latency Breakdown (T4 GPU)

Routing Decision: 127ms average
Agent Execution: ~52ms average
Total End-to-End: ~179ms average

How to Use

Installation

pip install transformers peft torch accelerate bitsandbytes

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Load LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")

# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""

# Route a query
query = "Where is my order?"

prompt = f"""<start_of_turn>user
{agent_declarations}

User query: {query}<end_of_turn>
<start_of_turn>model
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>

Production Deployment (4-bit Quantization)

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "google/functiongemma-270m-it",
    quantization_config=quant_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "scionoftech/functiongemma-270m-ecommerce-router"
)

# Result: 180 MB model, 132ms latency, 89.1% accuracy

Parsing Function Calls

import re

def extract_agent_function(response: str) -> str:
    """Extract function name from FunctionGemma output."""
    match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
    return match.group(1) if match else "unknown"

# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent

Training Procedure

Dataset Preparation

Generated 12,550 synthetic examples with linguistic variations:

# Example training format
{
    "query": "Please track my package ASAP",
    "function": "route_to_order_agent",
    "agent": "order_management"
}

Variations included:

Polite forms: "Please", "Could you", "Can you"
Casual starters: "Hey", "Hi", "Um"
Urgency markers: "ASAP", "urgently", "immediately"
Edge cases and ambiguous queries

Training Configuration

from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Training args
training_args = TrainingArguments(
    output_dir="./functiongemma-ecommerce-router",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    weight_decay=0.01,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=20,
    eval_strategy="epoch",
    save_strategy="epoch"
)

Training Results

Final Training Loss: 0.0182
Final Validation Loss: 0.0198
Training Time: 45 minutes (T4 GPU)
Peak Memory: 11.2 GB

Limitations and Biases

Known Limitations

Ambiguous Queries: 10.6% error rate concentrated in genuinely ambiguous queries
- Example: "I need help" (could be any agent)
- Mitigation: Implement confidence-based clarification (confidence < 0.7)
Context Dependency: Requires conversation state management for multi-turn interactions
- Solution: Use durable workflow orchestrators (Temporal, Cadence)
Agent Confusion: Most common misclassifications:
- Returns ↔ Order Management (12 cases)
- Account ↔ Payment (8 cases)
- Technical ↔ Product Details (6 cases)
Language: Trained only on English queries
- For multilingual support, fine-tune on translated datasets

Biases

Domain-Specific: Trained exclusively on e-commerce customer support
Synthetic Data: Generated examples may not capture all real-world variations
Agent Distribution: Balanced training may not reflect real query distributions

Ethical Considerations

Misrouting Impact: Incorrect routing may frustrate customers or delay issue resolution
Recommendation: Implement fallback to human agents for low-confidence predictions
Privacy: Model doesn't store user data; conversation state managed externally
Fairness: Ensure equal routing performance across user demographics

Citation

If you use this model in your research or production systems, please cite:

@misc{functiongemma-ecommerce-router,
  author = {Sai Kumar Yava},
  title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}

@article{functiongemma2025,
  title={FunctionGemma: Bringing bespoke function calling to the edge},
  author={Google DeepMind},
  year={2025},
  url={https://blog.google/technology/developers/functiongemma/}
}

Acknowledgments

Google DeepMind for FunctionGemma base model
Hugging Face for PEFT and Transformers libraries
The open-source AI community

License

This model inherits the Gemma license from the base model. See Gemma Terms of Use.

Commercial Use: Permitted under Gemma license terms.

Related Resources

Training Notebook: Google Colab
GitHub Repository: Complete code
Dataset: Training data
Base Model: google/functiongemma-270m-it

Updates

2025-12-25: Initial release - 89.4% routing accuracy on e-commerce customer support

Questions? Open an issue on GitHub

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for scionoftech/functiongemma-e-commerce-tool-calling

Base model

google/functiongemma-270m-it

Adapter

(6)

this model

Dataset used to train scionoftech/functiongemma-e-commerce-tool-calling

Evaluation results

Routing Accuracy on E-commerce Customer Support Routing
self-reported

89.400
Macro F1 Score on E-commerce Customer Support Routing
self-reported

89.000