FunctionGemma 270M - E-Commerce Multi-Agent Router
Fine-tuned version of google/functiongemma-270m-it for intelligent routing of customer queries across 7 specialized agents in e-commerce customer support systems.
Model Description
This model demonstrates how FunctionGemma can be adapted beyond mobile actions for multi-agent orchestration in enterprise systems. It intelligently routes natural language customer queries to the appropriate specialized agent with 89.4% accuracy.
Key Achievement: Replacing brittle rule-based routing (52-58% accuracy) with learned intelligence using only 1.47M trainable parameters (0.55% of the model).
Architecture
- Base Model: google/functiongemma-270m-it (270M parameters)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Trainable Parameters: 1,474,560 (0.55%)
- LoRA Rank: 16
- LoRA Alpha: 32
- Target Modules: q_proj, k_proj, v_proj, o_proj
Training Details
- Training Data: 12,550 synthetic customer queries (balanced across 7 agents)
- Training Time: 45 minutes on Google Colab T4 GPU
- Framework: Hugging Face Transformers + PEFT + TRL
- Quantization: 4-bit NF4 during training
- Optimizer: paged_adamw_8bit
- Learning Rate: 2e-4
- Epochs: 3
- Batch Size: 4 (effective 16 with gradient accumulation)
Intended Use
Primary Use Case
Multi-agent customer support routing for e-commerce platforms:
- Route queries to order management, product search, returns, payments, account, technical support agents
- Maintain conversation context across multi-turn interactions
- Enable intelligent task switching
Supported Agents
The model routes queries to 7 specialized agents:
- Order Management (
route_to_order_agent) - Track orders, update delivery, cancel orders - Product Search (
route_to_search_agent) - Search catalog, check availability, recommendations - Product Details (
route_to_details_agent) - Specifications, reviews, comparisons - Returns & Refunds (
route_to_returns_agent) - Initiate returns, process refunds, exchanges - Account Management (
route_to_account_agent) - Update profile, manage addresses, security - Payment Support (
route_to_payment_agent) - Resolve payment issues, update methods, billing - Technical Support (
route_to_technical_agent) - Fix app/website issues, login problems
Out-of-Scope Use
- โ General-purpose chatbot (use base Gemma models instead)
- โ Direct dialogue generation (this is a routing model)
- โ More than 20 agents (context window limitations)
- โ Non-customer-support domains without fine-tuning
Performance
Test Set Results
Overall Accuracy: 89.40% (1,684/1,883 correct)
Per-Agent Performance:
order_management 92.3% (251/272)
product_search 91.1% (257/282)
product_details 94.7% (233/246)
returns_refunds 88.2% (238/270)
account_management 85.1% (229/269)
payment_support 89.5% (241/269)
technical_support 87.0% (234/269)
Comparison to Baselines
| Approach | Accuracy | Latency | Memory |
|---|---|---|---|
| Keyword Matching | 52-58% | 5ms | Negligible |
| Rule-based (100 rules) | 65-70% | 8ms | Negligible |
| BERT Classifier (300M) | 82-85% | 45ms | 400 MB |
| This Model (LoRA) | 89.4% | 127ms | 2.1 GB |
| GPT-4 API (zero-shot) | 85-90% | 2500ms | Cloud |
Latency Breakdown (T4 GPU)
- Routing Decision: 127ms average
- Agent Execution: ~52ms average
- Total End-to-End: ~179ms average
How to Use
Installation
pip install transformers peft torch accelerate bitsandbytes
Quick Start
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"google/functiongemma-270m-it",
device_map="auto",
torch_dtype=torch.bfloat16
)
# Load LoRA adapters
model = PeftModel.from_pretrained(
base_model,
"scionoftech/functiongemma-270m-ecommerce-router"
)
tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m-it")
# Define available agents
agent_declarations = """<start_function_declaration>
route_to_order_agent(): Track, update, or cancel customer orders
route_to_search_agent(): Search products, check availability
route_to_details_agent(): Get product specifications and reviews
route_to_returns_agent(): Handle returns, refunds, exchanges
route_to_account_agent(): Manage user profile and settings
route_to_payment_agent(): Resolve payment and billing issues
route_to_technical_agent(): Fix app, website, login issues
<end_function_declaration>"""
# Route a query
query = "Where is my order?"
prompt = f"""<start_of_turn>user
{agent_declarations}
User query: {query}<end_of_turn>
<start_of_turn>model
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
outputs = model.generate(
**inputs,
max_new_tokens=30,
do_sample=False,
pad_token_id=tokenizer.eos_token_id
)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
# Output: <function_call>route_to_order_agent</function_call>
Production Deployment (4-bit Quantization)
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
# 4-bit quantization config
quant_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
# Load with quantization
base_model = AutoModelForCausalLM.from_pretrained(
"google/functiongemma-270m-it",
quantization_config=quant_config,
device_map="auto"
)
model = PeftModel.from_pretrained(
base_model,
"scionoftech/functiongemma-270m-ecommerce-router"
)
# Result: 180 MB model, 132ms latency, 89.1% accuracy
Parsing Function Calls
import re
def extract_agent_function(response: str) -> str:
"""Extract function name from FunctionGemma output."""
match = re.search(r'<function_call>([a-zA-Z_]+)</function_call>', response)
return match.group(1) if match else "unknown"
# Usage
agent = extract_agent_function(response)
print(f"Route to: {agent}")
# Output: Route to: route_to_order_agent
Training Procedure
Dataset Preparation
Generated 12,550 synthetic examples with linguistic variations:
# Example training format
{
"query": "Please track my package ASAP",
"function": "route_to_order_agent",
"agent": "order_management"
}
Variations included:
- Polite forms: "Please", "Could you", "Can you"
- Casual starters: "Hey", "Hi", "Um"
- Urgency markers: "ASAP", "urgently", "immediately"
- Edge cases and ambiguous queries
Training Configuration
from transformers import TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig
# LoRA config
lora_config = LoraConfig(
r=16,
lora_alpha=32,
target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
lora_dropout=0.05,
bias="none",
task_type="CAUSAL_LM"
)
# Training args
training_args = TrainingArguments(
output_dir="./functiongemma-ecommerce-router",
num_train_epochs=3,
per_device_train_batch_size=4,
gradient_accumulation_steps=4,
learning_rate=2e-4,
lr_scheduler_type="cosine",
warmup_ratio=0.1,
weight_decay=0.01,
bf16=True,
optim="paged_adamw_8bit",
logging_steps=20,
eval_strategy="epoch",
save_strategy="epoch"
)
Training Results
- Final Training Loss: 0.0182
- Final Validation Loss: 0.0198
- Training Time: 45 minutes (T4 GPU)
- Peak Memory: 11.2 GB
Limitations and Biases
Known Limitations
Ambiguous Queries: 10.6% error rate concentrated in genuinely ambiguous queries
- Example: "I need help" (could be any agent)
- Mitigation: Implement confidence-based clarification (confidence < 0.7)
Context Dependency: Requires conversation state management for multi-turn interactions
- Solution: Use durable workflow orchestrators (Temporal, Cadence)
Agent Confusion: Most common misclassifications:
- Returns โ Order Management (12 cases)
- Account โ Payment (8 cases)
- Technical โ Product Details (6 cases)
Language: Trained only on English queries
- For multilingual support, fine-tune on translated datasets
Biases
- Domain-Specific: Trained exclusively on e-commerce customer support
- Synthetic Data: Generated examples may not capture all real-world variations
- Agent Distribution: Balanced training may not reflect real query distributions
Ethical Considerations
- Misrouting Impact: Incorrect routing may frustrate customers or delay issue resolution
- Recommendation: Implement fallback to human agents for low-confidence predictions
- Privacy: Model doesn't store user data; conversation state managed externally
- Fairness: Ensure equal routing performance across user demographics
Citation
If you use this model in your research or production systems, please cite:
@misc{functiongemma-ecommerce-router,
author = {Sai Kumar Yava},
title = {FunctionGemma 270M Fine-tuned for E-Commerce Multi-Agent Routing},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/scionoftech/functiongemma-270m-ecommerce-router}},
}
@article{functiongemma2025,
title={FunctionGemma: Bringing bespoke function calling to the edge},
author={Google DeepMind},
year={2025},
url={https://blog.google/technology/developers/functiongemma/}
}
Acknowledgments
- Google DeepMind for FunctionGemma base model
- Hugging Face for PEFT and Transformers libraries
- The open-source AI community
License
This model inherits the Gemma license from the base model. See Gemma Terms of Use.
Commercial Use: Permitted under Gemma license terms.
Related Resources
- Training Notebook: Google Colab
- GitHub Repository: Complete code
- Dataset: Training data
- Base Model: google/functiongemma-270m-it
Updates
- 2025-12-25: Initial release - 89.4% routing accuracy on e-commerce customer support
Questions? Open an issue on GitHub
Model tree for scionoftech/functiongemma-e-commerce-tool-calling
Base model
google/functiongemma-270m-itDataset used to train scionoftech/functiongemma-e-commerce-tool-calling
Evaluation results
- Routing Accuracy on E-commerce Customer Support Routingself-reported89.400
- Macro F1 Score on E-commerce Customer Support Routingself-reported89.000